From: Thomas Newhall
Newsgroups: gmane.text.pandoc
Subject: citeproc hack for multilingual citations?
Date: Mon, 11 Dec 2023 17:48:36 -0800 (PST)
Message-ID: <8fd3ff8c-44e7-4abe-9c65-38fb5debbb3dn@googlegroups.com>
To: pandoc-discuss

Hello everyone,

I'm trying to use pandoc's citeproc to render multilingual citations that are saved in Zotero and automatically exported to CSL JSON. I wrote about this issue on the Zotero forum earlier today, but was told this was "more of a Pandoc issue". I see there are also several similar threads on this forum, but I couldn't determine whether the solutions provided there would work for me.

I know there are some limitations to citeproc (that may be getting addressed with the citeproc-rs project), but I am trying to figure out the best workaround that still maintains a markdown-based workflow. I think I almost have the citations as I need them, but there are still a few issues:

Right now, my HTML output renders like this:

Inline citation: (Ōtani 2016)

Bibliography:
Ōtani, Yūka 大谷由香. 2016. “(Ronbun) Nissōsō Shunjō wo hattan toshita nissōkan ‘Enshū kaitai’ ronsō［論文］入宋僧俊芿を発端とした日宋間「円宗戒体」論争.” *Nihon Bukkyō sōgō kenkyū 日本仏教綜合研究* 14: 105–132.

There are two changes I would like to make to this.

First, I would like to be able to *keep the inline citation as is, while removing the comma after the name "Ōtani" in the bibliography*. This would be possible with Juris-M if I were to simply copy-and-paste citations, but I was hoping to have in-text citations linked to the bibliography (and, ideally, live citations), which seems like it will be difficult to implement if I'm not rendering citations using citeproc.

Second, I would like to *keep the English-language transliteration of the journal title in italics (i.e. Nihon Bukkyō sōgō kenkyū), while rendering the Japanese text of the journal title (i.e. 日本仏教綜合研究) as regular, non-italicized text*. In LaTeX/PDF output this is no problem, since LaTeX ignores italics on Chinese characters, but I am wondering if it is possible with the HTML output. I thought this would be possible by including the Japanese text for the title of the journal as a "note" field, but I couldn't get the "note" field to print at all (see following example). Alternatively, if there were a way to simply tell CSS to ignore italics for Chinese/Japanese fonts (like LaTeX does), that could work.
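To make the second request concrete, here is a minimal Python sketch of the transformation, not an actual pandoc filter. It assumes the journal title is available as a plain string that needs no HTML escaping. The helper `wrap_cjk`, the class name `cjk`, and the chosen Unicode ranges are all hypothetical choices for illustration.

```python
import re

# Assumption: "CJK" here means CJK punctuation, kana, CJK unified ideographs,
# and fullwidth forms; extend the ranges if other scripts are needed.
CJK_RUN = re.compile(
    r"[\u3000-\u303f\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uff00-\uffef]+"
)


def wrap_cjk(text: str, css_class: str = "cjk") -> str:
    """Wrap each run of CJK characters in a <span> so that CSS can target it."""
    return CJK_RUN.sub(
        lambda m: f'<span class="{css_class}">{m.group(0)}</span>', text
    )


title = "Nihon Bukkyō sōgō kenkyū 日本仏教綜合研究"
# The whole title stays inside <em>; only the CJK run gets its own span.
print(f"<em>{wrap_cjk(title)}</em>")
```

With markup of this shape, a stylesheet rule such as `em .cjk { font-style: normal; }` would leave the romanized title italic and set the Japanese text upright.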

Here is the (Better)CSL-JSON for this entry:

```json
{
  "id": "otani-2016",
  "author": [{ "family": "Ōtani", "given": "Yūka 大谷由香" }],
  "citation-key": "otani-2016",
  "container-title": "Nihon Bukkyō sōgō kenkyū 日本仏教綜合研究",
  "DOI": "10.20588/nbs.14.0_105",
  "ISSN": "1348-4850",
  "issued": { "date-parts": [["2016"]] },
  "language": "jpn",
  "note": "cjk-title: 日本仏教綜合研究",
  "page": "105–132",
  "publisher": "日本仏教綜合研究学会",
  "source": "search.library.ucla.edu",
  "title": "(Ronbun) Nissōsō Shunjō wo hattan toshita nissōkan 'Enshū kaitai' ronsō［論文］入宋僧俊芿を発端とした日宋間「円宗戒体」論争",
  "type": "article-journal",
  "volume": "14"
}
```
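The `cjk-title:` prefix in the "note" field above is an ad hoc convention of this entry, not a standard CSL variable. As a rough sketch of how such a field could be read outside of CSL, the following Python assumes the note holds one `key: value` pair per line. The helper `parse_note_fields` is hypothetical, and the entry is trimmed to the fields it uses.

```python
import json


def parse_note_fields(note: str) -> dict[str, str]:
    """Collect `key: value` lines from a CSL-JSON "note" string."""
    fields = {}
    for line in note.splitlines():
        # Split at the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


entry = json.loads("""
{
  "id": "otani-2016",
  "container-title": "Nihon Bukkyō sōgō kenkyū 日本仏教綜合研究",
  "note": "cjk-title: 日本仏教綜合研究"
}
""")

cjk_title = parse_note_fields(entry["note"]).get("cjk-title", "")
# Remove the CJK part from container-title so only the romanized title remains.
romanized = entry["container-title"].replace(cjk_title, "").strip()
print(romanized)
print(cjk_title)
```

Splitting the two parts this way would let the romanized title and the Japanese title be styled separately, whatever tool does the final rendering.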

And here is the portion of the CSL that I think applies to this case:

```
    </macro> <!-- end of the preceding macro in the style file -->
    <macro name="container-title">
        <choose>
            <if type="chapter entry-dictionary entry-encyclopedia paper-conference" match="any">
                <text macro="container-prefix" suffix=" "/>
            </if>
        </choose>
        <choose>
            <if type="webpage">
                <text variable="container-title" text-case="title"/>
            </if>
            <else-if type="legal_case" match="none">
                <group delimiter=" ">
                    <text variable="container-title" text-case="title" font-style="italic"/>
                    <choose>
                        <if variable="note">
                            <text variable="note"/>
                        </if>
                    </choose>
                </group>
            </else-if>
        </choose>
    </macro>
```

If this is impossible to do with a "hack" (i.e. using the note field for the Chinese/Japanese title) in CSL or Zotero, is it possible to write some custom (Lua or Python) filter that either gets rid of the commas or gets rid of the italics (or both)?
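For the comma, the operation such a filter would need is small. The Python sketch below shows only that string step, assuming the rendered bibliography entry and the family name are already available as plain strings. The helper `drop_family_comma` is hypothetical, and hooking it into pandoc is not shown.

```python
def drop_family_comma(entry_text: str, family: str) -> str:
    """Remove the comma that directly follows a leading family name."""
    prefix = f"{family}, "
    if entry_text.startswith(prefix):
        return f"{family} " + entry_text[len(prefix):]
    # Anything that does not start with "Family, " is returned unchanged,
    # which leaves inline citations such as "(Ōtani 2016)" as they are.
    return entry_text


rendered = "Ōtani, Yūka 大谷由香. 2016."
print(drop_family_comma(rendered, "Ōtani"))
```

Because only a leading "family name, " prefix is touched, commas elsewhere in the entry are preserved.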

Thanks in advance,
Tom
