From: Julien Dutant
Newsgroups: gmane.text.pandoc
Subject: Re: Lua filter to fix incorrectly nested lists?
Date: Fri, 3 Mar 2023 07:51:23 -0800 (PST)
Message-ID: <81f58aeb-ac35-45ab-a8f7-79da06619047n@googlegroups.com>
To: pandoc-discuss

Might be worth trying what happens if you return pandoc.SoftBreak() instead
of pandoc.Space():
https://pandoc.org/lua-filters.html#pandoc.softbreak
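
A minimal sketch of the SoftBreak suggestion, not tested against the org writer. The file name softbreak.lua is just a placeholder; note the capital B in both LineBreak and SoftBreak, since pandoc matches filter functions by exact element name:

```lua
-- softbreak.lua (hypothetical file name)
-- Sketch only: replaces every hard line break (what <br> is parsed into)
-- with a soft break, instead of the Space used in the earlier filter.
function LineBreak(elem)
  return pandoc.SoftBreak()
end
```

It could be run with something like `pandoc -f html -t org --wrap=preserve --lua-filter=softbreak.lua`. As far as I know, how a SoftBreak is rendered depends on the --wrap setting, so without --wrap=preserve it may still come out as a plain space.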

On Thursday, March 2, 2023 at 9:59:58 PM UTC JDTS wrote:

> Update: I find that any number of \n's is swallowed, but \r is converted
> to \n on org output.
>
> On Thursday, March 2, 2023 at 4:24:06 PM UTC-5 JDTS wrote:
>
>> Thanks again, something like this should work. I'd prefer to turn it into
>> a regular newline, but haven't figured out how to do that.
>> pandoc.str('\n') doesn't seem to result in any output.
>>
>> On Tuesday, February 28, 2023 at 9:13:14 AM UTC-5 Julien Dutant wrote:
>>
>>> pandoc -f html -t native shows that <br> is turned into a LineBreak
>>> element:
>>>
>>> pandoc -f html -t native
>>> test <br/>
>>> [ Plain [ Str "test" , LineBreak ] ]
>>>
>>> So I'd use a filter that converts LineBreaks to Space. Save as
>>> removeLinebreak.lua:

>>> function LineBreak (elem)
>>>   return pandoc.Space()
>>> end

>>> Could be added to the previous one with
>>>
>>> return {{
>>>   OrderedList = fixList,
>>>   BulletList = fixList,
>>>   LineBreak = replaceBySpace
>>> }}

>>> I think replacing it with a space is the safest. To remove it entirely,
>>> you couldn't return nil, as pandoc treats this as "leave unmodified".
>>> You'd have to return an empty list instead, I think:

>>> function LineBreak (elem)
>>>   return pandoc.List:new()
>>> end

>>> Best,
>>> J
>>>
>>> On Tuesday, February 28, 2023 at 12:28:55 AM UTC JDTS wrote:
>>>
>>> One other quick question: pandoc parses <br> as linebreak, and
>>> translates that into org as double-backslash \\. Any way to disable this?