From: John MacFarlane
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Newsgroups: gmane.text.pandoc
Subject: Re: Decoupling citeproc and highlighting-kate from pandoc
Date: Sat, 28 Sep 2013 14:34:33 -0700
Message-ID: <20130928213433.GB45256@Johns-MacBook-Pro.local>

How is the language represented in a biblatex file?  Or is it an
external parameter that comes from the locale or something else?

+++ Nick Bart [Sep 28 13 13:37 ]:
> unTitlecase: Brilliant.
> Forgot one important detail, though: unTitlecase should only be applied
> if the language is English, i.e., if an entry's biblatex hyphenation
> field is one of american, british, canadian, english, australian,
> newzealand, USenglish or UKenglish.
> From the biblatex manual (v 2.7a), "4.6.4 Miscellaneous Commands", item
> "\MakeSentenceCase":
>
> > By default, converting to sentence case is enabled for the following
> > language identifiers: american, british, canadian, english, australian,
> > newzealand as well as the aliases USenglish and UKenglish.
>
> On Saturday, September 28, 2013 7:34:09 PM UTC, fiddlosopher wrote:
>
> > +++ Nick Bart [Sep 28 13 03:52 ]:
> > A few additional ideas/issues:
> >
> > Title vs. sentence case
> > =======================
> > bibtex and biblatex expect titles in title case in the database,
> > converting these to sentence case if required by a particular style,
> > except for strings protected by {}.
> > CSL does the opposite: It expects titles in sentence case, converting
> > these to title case if required, except for certain stop words.
> > I think it makes sense for bibtex2pandoc to convert bibtex/biblatex
> > titles to sentence case for further consumption by CSL; i.e., to lower
> > case, except strings protected by {}.
> > - ex.:
> >   title = {{JFK}: The {CIA}, {Vietnam}, and the Plot to Assassinate
> >   {John F. Kennedy}}
> >   becomes
> >   title: 'JFK: The CIA, Vietnam, and the plot to assassinate John F.
> >   Kennedy'
> > - to be discussed: should strings inside commands be converted, too?
> >   i.e., should
> >   title = {An Analysis of \textit{For Whom the Bell Tolls}}
> >   become
> >   title: 'An Analysis of for whom the bell tolls'
> >   and only extra protection
> >   title = {An Analysis of {\textit{For Whom the Bell Tolls}}}
> >   would yield
> >   title: 'An Analysis of For Whom the Bell Tolls'
>
> I've implemented the unTitlecase transformation.  (This depends on a
> recent patch to pandoc that causes the LaTeX reader to insert Span
> when we have a bare group {Like This}.)
> I don't know about the "to be discussed" -- currently strings inside
> emphasis, etc., will retain their case.  I think that's probably
> right.
>
> > Corporate authors
> > =================
> > Example: author = {{National Aeronautics and Space Administration}},
> > Current bibtex2pandoc output:
> > - author:
> >   - family: Aeronautics
> >     given:
> >     - National
> >   - family: Administration
> >     given:
> >     - Space
> > Expected:
> > - author:
> >   - literal: 'National Aeronautics and Space Administration'
>
> I've added support for this.
>
> > Literal "and" in institution, organization, publisher, location, etc.
> > =====================================================================
> > If code includes
> >
> >   getLiteralList "publisher" ==> setList "publisher"
> >
> > publisher = {Holt, Rinehart {and} Winston}
> > Current bibtex2pandoc output:
> >   publisher:
> >   - 'Holt, Rinehart'
> >   - Winston
> > Expected:
> >   publisher:
> >   - 'Holt, Rinehart and Winston'
> > Also noticed, however, that pandoc-citeproc does not seem to like it
> > when actual multiple publishers occur:
> >   publisher:
> >   - 'Univ. of Toronto Press'
> >   - Routledge
> > throws an error:
> >   pandoc-citeproc: Error parsing references: when expecting a String,
> >   encountered Array instead
> > This needs to be checked, but if CSL does not allow multiple
> > publishers, the code above would need to be changed back to
> >   getField "publisher" ==> setField "publisher"
> > I see that multiple publishers (and locations) are being discussed at
> > various Zotero/CSL/citeproc-js forums, but don't get a clear picture
> > yet.
>
> The literal {and} will not cause breaking of author lists or literal
> lists.
> I'll leave the multiple publishers question open for now.  Either
> bibtex2pandoc or pandoc-citeproc should be changed, not sure which.
>
> > Names
> > =====
> > Can bibtex2pandoc be expected to distinguish all five name components
> > (family, given, suffix, non-dropping-particle, dropping-particle)?
> > biblatex provides one useful additional bit of info for name parsing
> > via the "useprefix" switch inside the "options" field:
> > From the biblatex manual (v 2.7a), "3.1.3 Entry Options"
> > > Whether the name prefix (von, van, of, da, de, della, etc.) is
> > > considered when printing the last name in citations. This also affects
> > > the sorting and formatting of the bibliography as well as the
> > > generation of certain types of labels. If this option is enabled,
> > > biblatex always precedes the last name with the prefix.
> > > For example, Ludwig van Beethoven would be cited as Beethoven and
> > > alphabetized as Beethoven, Ludwig van by default. If this option is
> > > enabled, he is cited as van Beethoven and alphabetized as Van
> > > Beethoven, Ludwig instead. With Biber, this option is also settable
> > > on a per-type basis.
>
> I used a rough-and-ready algorithm, probably not the same one biber
> or biblatex uses.  Basically, the only time my code gives you a
> non-dropping particle is when you have a name in format
>   B A, D E
> then B is the particle and A the family name.
> I don't have any support now for dropping-particle or suffix.
> This needs to be improved, but I'd need to know better what the
> algorithm is.
>
> > Inline formatting
> > =================
> > It probably doesn't play a role for pandoc, but in the interest of
> > portability, shouldn't inline formatting in a CSL-YAML file or metadata
> > section rather use
> >   <i> and </i> for italics
> >   <b> and </b> for bold
> >   <sub> and </sub> for subscript
> >   <sup> and </sup> for superscript
> >   <span style="font-variant:small-caps;"> and </span> for smallcaps
> > instead of markdown formatting?
> > (see [1]https://github.com/jgm/pandoc/issues/931,
> > [2]https://www.zotero.org/support/kb/rich_text_bibliography)
>
> No, because the YAML file is standard pandoc metadata, formatted in
> markdown.  When it is read by pandoc-citeproc, this will be converted
> where possible to CSL metadata.
>
> > Small Caps
> > ==========
> > Speaking of small caps: Any plans to add some markdownish formatting
> > commands for small caps to pandoc?
>
> No idea what a natural syntax would be.
>
> --
> You received this message because you are subscribed to the Google
> Groups "pandoc-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> To view this discussion on the web visit
> [3]https://groups.google.com/d/msgid/pandoc-discuss/a86067ec-6b4a-437b-b2f0-3ee666209dae%40googlegroups.com.
> For more options, visit [4]https://groups.google.com/groups/opt_out.
>
> References
>
> 1. https://github.com/jgm/pandoc/issues/931
> 2. https://www.zotero.org/support/kb/rich_text_bibliography
> 3. https://groups.google.com/d/msgid/pandoc-discuss/a86067ec-6b4a-437b-b2f0-3ee666209dae%40googlegroups.com
> 4. https://groups.google.com/groups/opt_out
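
For illustration, here is a rough sketch of the unTitlecase rule
discussed in this thread, written in Python rather than the Haskell
used by bibtex2pandoc.  It is not the actual implementation.  The names
ENGLISH_IDS, untitlecase, sentence_case_title and the default_english
parameter are made up for the example, and keeping the capital on the
first word after a colon is an assumption inferred from the JFK example
quoted above.  Text inside {} (including command arguments such as
\textit{...}) keeps its case; braces and commands are left in place for
a later LaTeX conversion step.

```python
# Language identifiers for which biblatex enables sentence-case
# conversion by default (list quoted from the biblatex manual above).
ENGLISH_IDS = {
    "american", "british", "canadian", "english",
    "australian", "newzealand", "USenglish", "UKenglish",
}


def untitlecase(title):
    """Lowercase words outside {} groups (hypothetical helper).

    The first word of the title and the first word after a colon keep
    their case; that rule is an assumption, not documented behaviour.
    """
    out = []
    depth = 0          # current {} nesting depth
    keep_case = True   # True at the start and right after a colon
    i, n = 0, len(title)
    while i < n:
        ch = title[i]
        if ch == "\\":
            # Copy a LaTeX command name (or one escaped char) verbatim.
            j = i + 1
            while j < n and title[j].isalpha():
                j += 1
            if j == i + 1 and j < n:
                j += 1
            out.append(title[i:j])
            i = j
        elif ch == "{":
            depth += 1
            out.append(ch)
            i += 1
        elif ch == "}":
            depth = max(depth - 1, 0)
            out.append(ch)
            i += 1
        elif ch.isalpha():
            # Collect one word and lowercase it only if unprotected.
            j = i
            while j < n and title[j].isalpha():
                j += 1
            word = title[i:j]
            if depth == 0 and not keep_case:
                word = word.lower()
            out.append(word)
            keep_case = False
            i = j
        else:
            if ch == ":":
                keep_case = True
            out.append(ch)
            i += 1
    return "".join(out)


def sentence_case_title(title, hyphenation=None, default_english=True):
    """Apply untitlecase only when the entry's language is English.

    default_english decides what happens when the hyphenation field is
    missing; that choice is an assumption left to the caller.
    """
    if hyphenation is None:
        english = default_english
    else:
        english = hyphenation.strip() in ENGLISH_IDS
    return untitlecase(title) if english else title


example = ("{JFK}: The {CIA}, {Vietnam}, and the Plot to Assassinate "
           "{John F. Kennedy}")
print(sentence_case_title(example, "american"))
# {JFK}: The {CIA}, {Vietnam}, and the plot to assassinate {John F. Kennedy}
```

How an entry without a hyphenation field should be treated is exactly
the open question at the top of this message; the default_english flag
only makes that choice explicit.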