From: Paulo Ney de Souza
Newsgroups: gmane.text.pandoc
Subject: Re: Changing colons to full-stops in titles
Date: Sat, 2 Jul 2022 22:41:44 -0700
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org

I got interested in another aspect of the posting -- the program
"cleanbib.pl" by Benct.

I installed it on Ubuntu and found that it does not process some perfectly
valid TeX code, such as control sequences that end with or contain a space,
and that it processes \c{e} but not the comma-accent on any of the other
vowels...

I prepared the torture test below to show the problems:

@Book{hobbit,
  title     = {Les \oe uf de la serpente},
  address   = {Bla\v zi\'c},
  publisher = {\c{a} \c{e} \c{i} \c{o} \c{u}},
}

and above all, how does this compare to:

https://ctan.org/tex-archive/support/bibtexperllibs/LaTeX-ToUnicode

Paulo Ney

On Sat, Jul 2, 2022 at 1:03 PM BPJ wrote:

> string.gsub() optionally takes the maximum number of substitutions as a
> fourth argument, and you can reinsert capture groups in the replacement,
> so this should be fairly robust:
>
> ``````lua
> string.gsub(title, '%:(%s)', '.%1', 1)
> ``````
>
> On Fri, 1 July 2022 at 18:44, John Carter Wood wrote:
>
>> Ah, of course, biblical references. Religious history is one of my
>> fields, how could I miss that?
>>
>> Looking forward to trying this out!
>>
>> denis...-NSENcxR/0n0@public.gmane.org wrote on Friday, 1 July 2022 at 18:41:02 UTC+2:
>>
>>> A slightly more reliable version:
>>>
>>> ```
>>> local stringify = pandoc.utils.stringify
>>> function Meta(m)
>>>   if m.references ~= nil then
>>>     for _, el in ipairs (m.references) do
>>>       -- print(stringify(el.title))
>>>       el.title = pandoc.Str(string.gsub(stringify(el.title), ': ', '. '))
>>>       -- print(el.title)
>>>     end
>>>   end
>>>   return m
>>> end
>>> ```
>>>
>>> (This won’t replace colons in biblical references, e.g. Gen 1:1)
>>>
>>> You can test with this file:
>>>
>>> ```markdown
>>> ---
>>> references:
>>> - type: book
>>>   id: doe
>>>   author:
>>>   - family: Doe
>>>     given: Jane
>>>   issued:
>>>     date-parts:
>>>     - - 2022
>>>   title: 'A book: with a subtitle and a reference to Gen 1:1, but that is not a problem'
>>>   publisher: 'Whatever press'
>>>   lang: de-De
>>> ...
>>>
>>> test [@doe]
>>> ```
>>>
>>> The filter itself does not cover capitalization. For some reason,
>>> pandoc or citeproc applies title-case transformation here. I don’t
>>> think it should though.
>>>
>>> *From:* pandoc-...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org *On behalf of* John Carter Wood
>>> *Sent:* Friday, 1 July 2022 18:24
>>> *To:* pandoc-discuss
>>> *Subject:* Re: Changing colons to full-stops in titles
>>>
>>> That's very interesting, thanks! I'll try it out when I get a chance in
>>> the coming days.
>>>
>>> I have thought about this issue of false positives while thinking about
>>> the option of some kind of filter. But...I think they would be very
>>> rare. I have a hard time thinking of a title with a colon in it that
>>> shouldn't -- in this case -- be turned into a dot. At least, I don't
>>> have anything in my 1,200 references where I can see that that wouldn't
>>> apply.
>>>
>>> Although, of course, I'm sure there are some out there...
>>>
>>> Just a question: would this also ensure that the first word after the
>>> dot is capitalised? Or does that open a new series of problems? :-)
>>>
>>> denis...-NSENcxR/0n0@public.gmane.org wrote on Friday, 1 July 2022 at 18:17:02 UTC+2:
>>>
>>> Here’s a very simple and absolutely unreliable version of a filter.
>>> This will replace every colon in a title with a period.
>>>
>>> ```lua
>>> local stringify = pandoc.utils.stringify
>>> function Meta(m)
>>>   if m.references ~= nil then
>>>     for _, el in ipairs (m.references) do
>>>       print(stringify(el.title))
>>>       el.title = pandoc.Str(string.gsub(stringify(el.title), ':', '.'))
>>>       print(el.title)
>>>     end
>>>   end
>>>   return m
>>> end
>>> ```
>>>
>>> Question is how this can be made robust enough to avoid false positives.
>>>
>>> *From:* pandoc-...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org *On behalf of* John Carter Wood
>>> *Sent:* Friday, 1 July 2022 17:52
>>> *To:* pandoc-discuss
>>> *Subject:* Re: Changing colons to full-stops in titles
>>>
>>> Thanks for the suggestions, a couple of which are kind of stretching my
>>> knowledge of these things, but I see where they're going.
>>>
>>> As to JGM's question: I am using a CSL json bibliography, so my titles
>>> are in a single field. ("title":"Science and religion: new perspectives
>>> on the dialogue")
>>>
>>> The issue is that *most* of the journals / publishers I publish in use,
>>> as here, the colon. *Some* (mainly German) styles want the period. If I
>>> were solely interested in either one, I could choose and just enter the
>>> relevant punctuation in the title field. However, I want to continue
>>> saving my bibliographic entries with a colon (because that's the most
>>> standard one for me), but have the option of automatically converting
>>> them to a period for those cases where I need to. If that makes sense.
>>>
>>> Thus: going through denis's options:
>>>
>>> 1. I have switched to json bibliographies from bibtex/biblatex as they
>>> seemed to offer more flexibility (I was running into issues with the
>>> strange archival references I have to make in my field, and JSON seemed
>>> to work better in that regard). So this seems to not apply.
>>>
>>> 2. Seems to not apply, as I have a single title field.
>>>
>>> 3. Sounds really interesting, and I use BBT, though it also sounds like
>>> I would here have to create a separate bibliography file from my Zotero
>>> database for those publishers/styles that require the dot. This is not
>>> *too* onerous, as it would at least be automated.
>>>
>>> 4. Having a filter that I could simply apply (as part of a pandoc
>>> command, say) or not apply as relevant seems like the most flexible /
>>> efficient solution. I don't know Lua, but if this is one possible way,
>>> then I could use it as a (hopefully fairly simple?) way into learning it.
>>>
>>> Does this help to clarify my situation?
>>>
>>> denis...-NSENcxR/0n0@public.gmane.org wrote on Friday, 1 July 2022 at 17:34:55 UTC+2:
>>>
>>> Yes, that’s a known issue...
>>>
>>> There are a couple of possible solutions:
>>>
>>> 1. Use biblatex databases and patch pandoc so it will concatenate the
>>> title and subtitle fields using periods. (line 667
>>> https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Citeproc/BibTeX.hs
>>> )
>>>
>>> 2. I think pandoc’s citeproc will just treat every unknown variable as a
>>> string variable (see
>>> https://github.com/jgm/citeproc/blob/3f94424db469c804cf2dac2d22dc7a18b614f43e/src/Citeproc/Types.hs#L1054
>>> and
>>> https://github.com/jgm/citeproc/blob/3f94424db469c804cf2dac2d22dc7a18b614f43e/src/Citeproc/Types.hs#L901),
>>> so you should be able to use «subtitle» in styles. (This will give you
>>> warnings when using the style with Zotero and it won’t work reliably
>>> across implementations, but anyway ...)
>>>
>>> 3. If you’re using Zotero, you can leverage Zotero BBT’s postscript
>>> feature to manipulate the JSON after exporting.
>>>
>>> E.g., this one:
>>>
>>> if (Translator.BetterCSL && item.title) {
>>>   reference.title = reference.title.replace(/ : /g, '. ')
>>> }
>>>
>>> Not bullet-proof, but simple. You will want to choose a better
>>> separator, maybe a double-bar or so.
>>>
>>> 4. Doing this with Lua should also be possible...
>>>
>>> The question is: do you have the subtitle in a distinct field or is it
>>> just in the title field?
>>>
>>> *From:* pandoc-...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org *On behalf of* John Carter Wood
>>> *Sent:* Friday, 1 July 2022 16:39
>>> *To:* pandoc-discuss
>>> *Subject:* Changing colons to full-stops in titles
>>>
>>> I have one final (for now...) issue in setting up a CSL file (which I
>>> use with pandoc/citeproc and references in a json file).
>>>
>>> I'm not sure whether this is a CSL issue or whether it's an issue that
>>> can be solved by using a filter (or some other solution) in pandoc, but
>>> I thought there might be some people here who might have faced a similar
>>> issue.
>>>
>>> The house style here (German-based publisher) wants a
>>> *full-stop/period* between main title and subtitle in citations /
>>> bibliographies; US/UK standard is a *colon* between main title and
>>> subtitle. And reference managers like Zotero -- IIUC -- save titles as
>>> single fields (at least they are in my version of Zotero). So it doesn't
>>> seem like it is possible to control what delimiter is used between them
>>> via CSL.
>>>
>>> I have found various discussions of relevant title/subtitle division
>>> issues -- some going back quite a few years -- in forums on Zotero:
>>>
>>> https://forums.zotero.org/discussion/8077/separate-fields-for-title-and-subtitle/
>>>
>>> ...and CSL:
>>>
>>> https://discourse.citationstyles.org/t/handling-main-sub-title-splits-citeproc-js/1563/11
>>>
>>> However, these were in part discussions among developers about
>>> *possible* changes, and I'm not sure of the current status of this
>>> issue or whether there is a way to handle it.
>>>
>>> Would it be possible to automate turning colons in titles into
>>> full-stops by using a filter? If so, is there such a filter already
>>> around? Can this be done via CSL?
>>>
>>> Or is this, as of now, impossible?
>>>
>>> (Or is there a really simple solution that I have, as usual,
>>> overlooked...)
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "pandoc-discuss" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to pandoc-discus...-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/pandoc-discuss/78df697a-50f5-46d0-b0b8-29a2cbc9509an%40googlegroups.com
>>> .

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CAFVhNZMyj_GZ%3DAo_1qR2rwnAAYAaQ%3DMaf880cGLRv7yD_ianpQ%40mail.gmail.com.
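For readers following the thread, here is a minimal sketch that combines the two ideas discussed in it: BPJ's suggestion to limit string.gsub() to a single substitution, and the question of capitalising the first word after the new full stop. The helper name `colon_to_period` is hypothetical (it is not part of pandoc), and the sketch assumes, as in the filters quoted in the thread, that the references are available in the document metadata under `references`. Only ASCII letters are upper-cased, since the standard Lua string library is not Unicode-aware.

```lua
-- Hypothetical helper (not part of pandoc): replace only the first
-- "colon + whitespace" in a title with "period + whitespace" and
-- upper-case the ASCII letter that starts the subtitle.
local function colon_to_period(title)
  -- gsub returns (string, count); the outer parentheses keep only the string.
  return (title:gsub(':(%s+)(%a?)', function(space, first)
    return '.' .. space .. first:upper()
  end, 1))
end

-- Filter entry point, modelled on the Meta filter quoted in the thread.
function Meta(m)
  if m.references ~= nil then
    local stringify = pandoc.utils.stringify
    for _, el in ipairs(m.references) do
      if el.title ~= nil then
        -- stringify flattens any inline formatting in the title to plain text.
        el.title = pandoc.Str(colon_to_period(stringify(el.title)))
      end
    end
  end
  return m
end
```

Because the pattern requires whitespace after the colon, a reference such as "Gen 1:1" is left alone, and because of the limit of one substitution, a second "colon + space" later in the title is kept. Whether pandoc or citeproc re-cases the title afterwards, as noted in the thread, is outside what this sketch controls.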