From: BPJ
Newsgroups: gmane.text.pandoc
Subject: Re: A New Feature for Pandoc's Markdown Extension -- No Space with Newline
Date: Tue, 14 Apr 2020 07:17:59 +0200
To: pandoc-discuss

A Perl filter which removes Space and SoftBreak elements sandwiched between two Str elements, where the first ends and the second starts with a character from a CJK script (for example, Unicode script property Han), is certainly doable. Will that be OK?
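
A rough sketch of that rule, written in Python rather than Perl so that it stays self-contained. It assumes the inline list uses pandoc's current JSON shape ({"t": "Str", "c": "..."}, {"t": "Space"}, {"t": "SoftBreak"}). The function names are hypothetical, and the code-point ranges are only a crude stand-in for a real Unicode script-property test, so this is an illustration and not the filter itself.

```python
# Hypothetical stand-in for a Unicode script test: a few common CJK blocks.
CJK_RANGES = (
    (0x3000, 0x303F),  # CJK Symbols and Punctuation
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0x3400, 0x4DBF),  # CJK Unified Ideographs Extension A
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0xFF00, 0xFFEF),  # Halfwidth and Fullwidth Forms
)


def is_cjk(ch):
    """Return True if the character falls in one of the ranges above."""
    return any(lo <= ord(ch) <= hi for lo, hi in CJK_RANGES)


def is_str(el):
    """Return True for a non-empty Str element."""
    return isinstance(el, dict) and el.get("t") == "Str" and bool(el.get("c"))


def is_gap(el):
    """Return True for a Space or SoftBreak element."""
    return isinstance(el, dict) and el.get("t") in ("Space", "SoftBreak")


def zap_cjk_spaces(inlines):
    """Drop Space/SoftBreak elements that sit between two CJK-edged Str elements."""
    out = []
    for i, el in enumerate(inlines):
        if is_gap(el) and 0 < i < len(inlines) - 1:
            prev, nxt = inlines[i - 1], inlines[i + 1]
            if (is_str(prev) and is_str(nxt)
                    and is_cjk(prev["c"][-1]) and is_cjk(nxt["c"][0])):
                continue  # both neighbours are CJK: remove the gap
        out.append(el)
    return out
```

With this rule a space between two Chinese words disappears, while a space next to an English word or a number is kept.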

/BPJ


On Tue, 14 Apr 2020 at 02:39, J <lixichen-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
Thank you very much for your efforts! I wonder if the script can keep the spaces between English words, digits, and punctuation, since my files also contain short groups of English words and numbers written with digits?

On Tuesday, April 14, 2020 at 3:16:40 AM UTC+8, BP wrote:
Wow, that script is really ancient! I'll try to port it to a Lua filter tomorrow. It's 9 PM here now and I have been coding or writing for twelve hours, so I'm quite exhausted.

Just to be clear, the old script removes all spaces which are next to a "string" element, i.e. all "words", digits and punctuation alike, and not just CJK characters. If you are OK with that behavior, porting it to a Lua filter will be trivial, and Lua is built into Pandoc. Otherwise I'll have to look into rewriting the Perl script, which may not be quite as trivial.

/BPJ

On Mon, 13 Apr 2020 at 20:45, J <lixi...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
Could you help to update zapspace.pl to work with pandoc 2.9.2.1? I have Chinese markdown files that use spaces to separate groups of words, and would like to ignore spaces between Chinese characters before converting to Word.
Many thanks!

On Tuesday, July 16, 2013 at 11:34:32 PM UTC+8, BP Jonsson wrote:
2013-07-15 19:51, John MacFarlane wrote:
> +++ Bill Chen (CHEN, Zhechuan) [Jul 15 13 17:16 ]:
>>     Have found a way to make this feature done.
>>     Just add "\n" at the end of the line
>
> This would violate the general rule that backslashes before letters in
> markdown are just literal backslashes.
>
> I think that a better approach would be to provide a markdown
> extension like the current 'hard_line_breaks':  perhaps
> 'ignore_line_breaks'.  'hard_line_breaks' causes line
> breaks in a paragraph to be interpreted as hard breaks;
> 'ignore_line_breaks' would cause them to be ignored entirely.
> (One of these would have to be designated as taking precedence
> if both were selected.)
>
> John
>

The attached perl script, when used as a filter on pandoc's
json output, should enable Bill to get what he wants.  I have
used an earlier version on Tibetan text with satisfactory
results. Someone who knows Haskell could probably write
something shorter which interacts with pandoc in a more
elegant way, but this script works.

The description inside the file reads as follows:

        FILE: zapspace.pl

       USAGE: pandoc -w json some.markdown | zapspace.pl | pandoc -r json

 DESCRIPTION: Takes as input a document in pandoc's json format and
              removes all "Space" elements inside any list which also
              contains any {"Str":"..."} element, and outputs a
              modified json document, which when given as input to
              pandoc will produce output suitable for languages which
              don't put spaces between words or sentences, with no spaces
              inside paragraphs -- unless you insert non-breaking spaces,
              see below! --, and notably spaces caused by linebreaks
              in the markdown paragraph will be removed.
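
The list-level rule described above might look roughly like this in Python. This is a sketch and not the Perl script's code: it assumes the JSON shape of current pandoc versions ({"t": "Str", "c": "..."}) instead of the {"Str":"..."} shape quoted in the description, the name zap_spaces is hypothetical, and including "SoftBreak" in the default is an assumption based on newer pandoc representing line breaks inside a paragraph that way.

```python
def zap_spaces(inlines, drop=("Space", "SoftBreak")):
    """Remove whitespace elements from a list if that list also holds a Str."""

    def tag(el):
        # Elements are dicts with a "t" key; anything else has no tag.
        return el.get("t") if isinstance(el, dict) else None

    if not any(tag(el) == "Str" for el in inlines):
        return list(inlines)  # no Str in this list: leave it untouched
    return [el for el in inlines if tag(el) not in drop]
```

Passing drop=("Space",) restricts the sketch to the behavior stated in the description.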

              Additionally it does two things which allow you to
              insert whitespace inside paragraph-like elements:

              1)  It replaces any non-breaking space (U+00A0) inside a
                  "Str" element with an ordinary space (U+0020)
                  *if* the "Str" element also contains characters other
                  than non-breaking spaces.

                  This allows you to insert spaces into your markdown
                  paragraphs as non-breaking spaces (in pandoc notation
                  a backslash followed by an ordinary space "like\ this")
                  and get ordinary spaces in your output.

              2)  It preserves any "Str" element which contains only one
                  or more non-breaking spaces as is.

                  This allows you to put non-breaking spaces between
                  words by inserting ordinary whitespace -- which will
                  be removed -- on either side of the non-breaking
                  spaces "like \  this".
                              ^  ^
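
The two non-breaking-space rules can be sketched as a small function applied to the text of each "Str" element. This is an illustration in Python with a hypothetical name (fix_nbsp), not an excerpt from the Perl script.

```python
NBSP = "\u00a0"  # non-breaking space, U+00A0


def fix_nbsp(text):
    """Apply the two non-breaking-space rules to the text of one Str element."""
    if text and all(ch == NBSP for ch in text):
        return text  # rule 2: nothing but non-breaking spaces, keep as is
    return text.replace(NBSP, " ")  # rule 1: mixed content, use ordinary spaces
```

So a "Str" holding "like", U+00A0, "this" comes out with an ordinary space in the middle, while a "Str" consisting solely of U+00A0 passes through unchanged.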

              N.B. that this is *not* done by scanning the JSON text
              with regular expressions!  The JSON is loaded into a
              perl data structure which is modified and then converted
              back into JSON. Precautions are taken not to modify the
              structure such that the output will be rejected by
              pandoc, nor to modify code elements, but I can't guarantee
              that this will remain true with future versions of pandoc,
              or that it is true for any input.
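
The parse, modify and re-serialise approach can be sketched as follows. This is a Python illustration under stated assumptions and not the Perl implementation: it uses the JSON shape of current pandoc versions, the names walk and zap_document are hypothetical, and the SKIP set is a guess at which element types count as "code elements".

```python
import json

# Assumed set of element types that must be passed through untouched.
SKIP = {"Code", "CodeBlock", "RawInline", "RawBlock"}


def has_tag(el, name):
    """Return True if el is an element dict with the given "t" tag."""
    return isinstance(el, dict) and el.get("t") == name


def walk(node):
    """Rebuild a decoded pandoc JSON value, dropping Space elements
    from every list that also contains a Str element."""
    if isinstance(node, list):
        items = [walk(child) for child in node]
        if any(has_tag(i, "Str") for i in items):
            items = [i for i in items if not has_tag(i, "Space")]
        return items
    if isinstance(node, dict):
        if node.get("t") in SKIP:
            return node  # never touch code or raw content
        return {key: walk(value) for key, value in node.items()}
    return node  # strings, numbers, booleans, null


def zap_document(json_text):
    """Decode, transform and encode; the JSON text is never pattern-matched."""
    return json.dumps(walk(json.loads(json_text)), ensure_ascii=False)
```

Wired to standard input and output, a function like zap_document could sit in the middle of the pipeline shown under USAGE.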

     OPTIONS: ---
REQUIREMENTS: *   A reasonably recent version of perl.
              *   The following CPAN modules:

                  -   [JSON::Any](https://metacpan.org/module/JSON::Any)
                      +   A JSON 'backend' module like JSON or JSON::XS.
                  -   [List::MoreUtils](https://metacpan.org/module/List::MoreUtils)
                  -   [autovivification](https://metacpan.org/module/autovivification)



--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org.
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/35356bdb-9f45-4f0c-bb49-3fb4e2db98a0%40googlegroups.com.
