From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.text.pandoc/7531
Path: news.gmane.org!not-for-mail
From: John MacFarlane <fiddlosopher-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Newsgroups: gmane.text.pandoc
Subject: Re: Decoupling citeproc and highlighting-kate from pandoc
Date: Sat, 28 Sep 2013 14:34:33 -0700
Message-ID: <20130928213433.GB45256@Johns-MacBook-Pro.local>
References: <fd109d63-2add-4956-b1f3-dbaf706634fb@googlegroups.com>
 <20130913062445.GA95508@Johns-MacBook-Pro.local>
 <b01350cd-72e7-49e5-aa50-6c25382376f6@googlegroups.com>
 <20130927182634.GA37542@dhcp-128-32-252-11.lips.berkeley.edu>
 <cfd2f13e-48f4-4b0e-8415-a68597649e17@googlegroups.com>
 <20130928071616.GA42338@Johns-MacBook-Pro.local>
 <20130928074012.GA42449@Johns-MacBook-Pro.local>
 <09c521a8-1f44-474b-8829-60dd95cf0f94@googlegroups.com>
 <20130928193409.GA45018@Johns-MacBook-Pro.local>
 <a86067ec-6b4a-437b-b2f0-3ee666209dae@googlegroups.com>
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
In-Reply-To: <a86067ec-6b4a-437b-b2f0-3ee666209dae-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
X-PGP-Key: http://johnmacfarlane.net/jgm.asc
User-Agent: Mutt/1.5.21 (2010-09-15)
Content-Disposition: inline
Xref: news.gmane.org gmane.text.pandoc:7531
Archived-At: <http://permalink.gmane.org/gmane.text.pandoc/7531>

How is the language represented in a biblatex file?  Or is it an
external parameter that comes from the locale or something else?

+++ Nick Bart [Sep 28 13 13:37 ]:
>    unTitlecase: Brilliant.
>    Forgot one important detail, though: unTitlecase should only be applied
>    if the language is English, i.e., if an entry's biblatex hyphenation
>    field is one of american, british, canadian, english, australian,
>    newzealand, USenglish or UKenglish.
>    From the biblatex manual (v 2.7a), "4.6.4 Miscellaneous Commands", item
>    "\MakeSentenceCase":
>    > By default, converting to sentence case is enabled for the following
>    > language identifiers: american, british, canadian, english, australian,
>    > newzealand as well as the aliases USenglish and UKenglish.
>    On Saturday, September 28, 2013 7:34:09 PM UTC, fiddlosopher wrote:
> 
>      +++ Nick Bart [Sep 28 13 03:52 ]:
>      >    A few additional ideas/issues:
>      >    Title vs. sentence case
>      >    =======================
>      >    bibtex and biblatex expect titles in title case in the database,
>      >    converting these to sentence case if required by a particular style,
>      >    except for strings protected by {}.
>      >    CSL does the opposite: It expects titles in sentence case, converting
>      >    these to title case if required, except for certain stop words.
>      >    I think it makes sense for bibtex2pandoc to convert bibtex/biblatex
>      >    titles to sentence case for further consumption by CSL; i.e., to lower
>      >    case, except strings protected by {}.
>      >        - ex.:
>      >        title = {{JFK}: The {CIA}, {Vietnam}, and the Plot to Assassinate {John F. Kennedy}}
>      >        becomes
>      >        title: 'JFK: The CIA, Vietnam, and the plot to assassinate John F. Kennedy'
>      >        - to be discussed: should strings inside commands be converted, too?
>      >         i.e., should
>      >         title = {An Analysis of \textit{For Whom the Bell Tolls}}
>      >         become
>      >         title: 'An Analysis of <i>for whom the bell tolls</i>'
>      >         and only extra protection
>      >         title = {An Analysis of {\textit{For Whom the Bell Tolls}}}
>      >         would yield
>      >         title: 'An Analysis of <i>For Whom the Bell Tolls</i>'
>      I've implemented the unTitlecase transformation.  (This depends on a
>      recent patch to pandoc that causes the LaTeX reader to insert Span
>      when we have a bare group {Like This}.)
>      I don't know about the "to be discussed" -- currently strings inside
>      emphasis, etc., will retain their case.  I think that's probably
>      right.
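For readers who want to see the transformation spelled out, here is a rough
sketch in Python. It is not the bibtex2pandoc code: that works on pandoc's
AST (using the Span inserted for bare groups), whereas this sketch walks
the raw field value with the outer field delimiters already removed. The
function name untitlecase is hypothetical. Keeping the capital on the first
word and on the first word after a colon is inferred from the JFK example
above, and LaTeX commands such as \textit are not handled at all.

```python
def untitlecase(value):
    """Lower-case a BibTeX title outside {}-protected groups.

    The first word, and the first word after a colon, keep their initial
    capital. Braces are dropped from the output. Hypothetical helper for
    illustration; LaTeX commands are not interpreted.
    """
    out = []
    depth = 0            # current brace nesting depth
    keep_capital = True  # True at the start and right after a colon
    in_word = False      # True while inside a run of alphanumerics
    for ch in value:
        if ch == "{":
            depth += 1
            continue
        if ch == "}":
            depth = max(depth - 1, 0)
            continue
        if ch.isalnum():
            if depth > 0:
                out.append(ch)          # protected: keep case as written
            elif not in_word and keep_capital:
                out.append(ch)          # sentence-initial letter
            else:
                out.append(ch.lower())
            in_word = True
            keep_capital = False
        else:
            out.append(ch)
            in_word = False
            if ch == ":":
                keep_capital = True
    return "".join(out)
```

Applied to the JFK title above, this yields the expected sentence-case
string; a word that needs its capital (a proper noun, say) still has to be
protected with braces in the database.
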
>      >    Corporate authors
>      >    =================
>      >    Example: author = {{National Aeronautics and Space Administration}},
>      >    Current bibtex2pandoc output:
>      >    - author:
>      >      - family: Aeronautics
>      >        given:
>      >        - National
>      >      - family: Administration
>      >        given:
>      >        - Space
>      >    Expected:
>      >    - author:
>      >      - literal: 'National Aeronautics and Space Administration'
>      I've added support for this.
>      >    Literal "and" in institution, organization, publisher, location, etc.
>      >    =====================================================================
>      >    If code includes
>      >
>      >      getLiteralList "publisher" ==> setList "publisher"
>      >    publisher = {Holt, Rinehart {and} Winston}
>      >    Current bibtex2pandoc output:
>      >      publisher:
>      >      - 'Holt, Rinehart'
>      >      - Winston
>      >    Expected:
>      >      publisher:
>      >      - 'Holt, Rinehart and Winston'
>      >    Also noticed, however, that pandoc-citeproc does not seem to like it
>      >    when actual multiple publishers occur:
>      >      publisher:
>      >      - 'Univ. of Toronto Press'
>      >      - Routledge
>      >    throws an error:
>      >    pandoc-citeproc: Error parsing references: when expecting a String,
>      >    encountered Array instead
>      >    This needs to be checked, but if CSL does not allow multiple
>      >    publishers, the code above would need to be changed back to
>      >      getField "publisher" ==> setField "publisher"
>      >    I see that multiple publishers (and locations) are being discussed at
>      >    various Zotero/CSL/citeproc-js forums, but don't get a clear picture
>      >    yet.
>      The literal {and} will not cause breaking of author lists or literal
>      lists.
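The splitting rule can be sketched as follows, in Python for illustration
only. The helper name split_literal_list is hypothetical and this is not
the actual bibtex2pandoc code; it simply splits on the word "and" when it
occurs at brace depth 0, so a braced {and} (or a fully braced corporate
name) stays in one piece. Matching "and" case-sensitively and collapsing
runs of whitespace are simplifications of this sketch.

```python
def split_literal_list(value):
    """Split a BibTeX list field on the keyword `and` at brace depth 0.

    Hypothetical helper: words are whitespace-separated, and an `and`
    inside braces (depth > 0, or written as `{and}`) never splits.
    Braces are removed from the returned items.
    """
    items, current, depth = [], [], 0
    for word in value.split():
        if depth == 0 and word == "and":
            items.append(" ".join(current))
            current = []
        else:
            current.append(word)
        # Update nesting depth after looking at the word itself.
        depth += word.count("{") - word.count("}")
    items.append(" ".join(current))
    return [item.replace("{", "").replace("}", "") for item in items if item]
```

With this, "Holt, Rinehart {and} Winston" gives a single item, while
"Univ. of Toronto Press and Routledge" gives two.
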
>      I'll leave the multiple publishers question open for now.  Either
>      bibtex2pandoc or pandoc-citeproc should be changed, not sure which.
>      >    Names
>      >    =====
>      >    Can bibtex2pandoc be expected to distinguish all five name components
>      >    (family, given, suffix, non-dropping-particle, dropping-particle)?
>      >    biblatex provides one useful additional bit of info for name parsing
>      >    via the "useprefix" switch inside the "options" field:
>      >    From the biblatex manual (v 2.7a), "3.1.3 Entry Options"
>      >    > Whether the name prefix (von, van, of, da, de, della, etc.) is
>      >    > considered when printing the last name in citations. This also affects
>      >    > the sorting and formatting of the bibliography as well as the
>      >    > generation of certain types of labels. If this option is enabled,
>      >    > biblatex always precedes the last name with the prefix. For example,
>      >    > Ludwig van Beethoven would be cited as Beethoven and alphabetized as
>      >    > Beethoven, Ludwig van by default. If this option is enabled, he is
>      >    > cited as van Beethoven and alphabetized as Van Beethoven, Ludwig
>      >    > instead. With Biber, this option is also settable on a per-type basis.
>      I used a rough-and-ready algorithm, probably not the same one biber
>      or biblatex uses.  Basically, the only time my code gives you a
>      non-dropping particle is when you have a name in the format
>         B A, D E
>      in which case B is the particle and A the family name.
>      I don't have any support now for dropping-particle or suffix.
>      This needs to be improved, but I'd need to know better what the
>      algorithm is.
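A rough sketch of that heuristic, in Python for illustration only; it is a
reading of the description above, not the actual code. In "B A, D E" it
takes B as the non-dropping particle and A as the family name (as in
"van Beethoven, Ludwig"). The name parse_name, the dict keys chosen to
mirror the YAML output shown earlier, and the handling of every other
shape (no comma, or more than two words before the comma) are assumptions.
Suffixes and dropping particles are not handled.

```python
def parse_name(name):
    """Parse one BibTeX name into CSL-style parts (rough heuristic).

    Only the "Particle Family, Given ..." shape yields a
    non-dropping-particle. Hypothetical helper; suffixes and dropping
    particles are ignored.
    """
    if "," in name:
        last, first = (part.strip() for part in name.split(",", 1))
        last_words = last.split()
        given = first.split()
        if len(last_words) == 2:
            particle, family = last_words
            return {
                "non-dropping-particle": particle,
                "family": family,
                "given": given,
            }
        return {"family": last, "given": given}
    # No comma: assume the last word is the family name.
    words = name.split()
    if not words:
        return {}
    return {"family": words[-1], "given": words[:-1]}
```

Note that without the comma form the particle ends up among the given
names ("Ludwig van Beethoven"), which is one of the places the heuristic
would need to improve.
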
>      >    Inline formatting
>      >    =================
>      >    It probably doesn't play a role for pandoc, but in the interest of
>      >    portability, shouldn't inline formatting in a CSL-YAML file or metadata
>      >    section rather use
>      >        <i> and </i> for italics
>      >        <b> and </b> for bold
>      >        <sub> and </sub> for subscript
>      >        <sup> and </sup> for superscript
>      >        <span style="font-variant:small-caps;"> and </span> for smallcaps
>      >    instead of markdown formatting?
>      >    (see [1]https://github.com/jgm/pandoc/issues/931,
>      >    [2]https://www.zotero.org/support/kb/rich_text_bibliography)
>      No, because the YAML file is standard pandoc metadata, formatted in
>      markdown.  When it is read by pandoc-citeproc, this will be converted
>      where possible to CSL metadata.
>      >    Small Caps
>      >    ==========
>      >    Speaking of small caps: Any plans to add some markdownish formatting
>      >    commands for small caps to pandoc?
>      No idea what a natural syntax would be.
> 
> References
> 
>    1. https://github.com/jgm/pandoc/issues/931
>    2. https://www.zotero.org/support/kb/rich_text_bibliography