From mboxrd@z Thu Jan 1 00:00:00 1970
From: "balaj...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org"
Newsgroups: gmane.text.pandoc
Subject: Re: Pointers on modifying Plain objects(?)
Date: Sun, 25 Dec 2022 01:15:12 -0800 (PST)
Message-ID: <5257c49e-968d-40bf-a398-ae104a53c5c8n@googlegroups.com>
References: <8af6876b-72cc-448e-9f5e-7d12ccdf2ad8n@googlegroups.com> <878riz8wf4.fsf@zeitkraut.de> <8f0e8d81-7f0b-49a7-b9b5-d78b19a0b1ban@googlegroups.com>
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To: pandoc-discuss

Hi Albert,

Merry Christmas to you!

Your suggestion seemed really promising, but when I try to create a filter from the two bits of code you'd suggested, I get an error. Here's the filter that I came up with:

function Plain (plain)

  local done_marker = pandoc.List{pandoc.Str '[X]', pandoc.Space()}
  local prefix = pandoc.List{plain.content[1], plain.content[2]}

  if prefix == done_marker then
    plain.content:remove(2) -- remove space
    plain.content:remove(1) -- remove checkbox
    pandoc.Strikeout(plain.content)
    plain.content = done_marker .. pandoc.Strikeout(plain.content)
  end

  return plain
end
Unfortunately, when I try to run pandoc with this filter I get the following error:

Error running filter C:\Users\Balaji\AppData\Roaming\pandoc\filters\strikethrough.lua:
...\Balaji\AppData\Roaming\pandoc\filters\strikethrough.lua:10: bad argument #2 to 'concat' (table expected, got Inline)


As you might be able to guess from the fact that I switched to Python when trying this earlier, I really have no expertise in Lua. I tried to use table.insert in place of concat, and now the filter does not throw an error, but it does not seem to do anything either.

Any suggestions?
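
A possible fix, offered only as a minimal sketch: the error message says that `..` on a `pandoc.List` expects a table on its right-hand side, while `pandoc.Strikeout(...)` returns a single Inline. Wrapping that element in a one-item `pandoc.List` should let the two lists be joined. The marker string '[X]' is kept from the filter above as an assumption about the input; Albert's earlier message used '☒', which is what the `task_list` extension produces. The stray `pandoc.Strikeout(plain.content)` statement, whose result was discarded, is dropped.

```lua
-- strikethrough.lua: strike out the text of completed task-list items.
-- Sketch only; the marker string below is an assumption about the input.
function Plain (plain)
  -- Use '☒' instead of '[X]' if the input is read with the task_list extension.
  local done_marker = pandoc.List{pandoc.Str '[X]', pandoc.Space()}
  local prefix = pandoc.List{plain.content[1], plain.content[2]}

  if prefix == done_marker then
    plain.content:remove(2) -- remove space
    plain.content:remove(1) -- remove checkbox
    -- Put the single Strikeout element into a list so `..` joins two lists.
    local struck = pandoc.List{pandoc.Strikeout(plain.content)}
    plain.content = done_marker .. struck
  end

  return plain
end
```

The filter would be passed to pandoc with the `--lua-filter` option, for example `--lua-filter strikethrough.lua`.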

On Saturday, 24 December 2022 at 18:27:04 UTC+8 Albert Krewinkel wrote:
We could do this by passing the full content to the Strikeout constructor. We'd remove, then re-add the checkbox later:

plain.content:remove(2) -- remove space
plain.content:remove(1) -- remove checkbox
plain.content = done_marker ..
  pandoc.Strikeout(plain.content)



On 24 December 2022 at 09:37:58 CET, "balaj...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org" <balaj...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
Thanks for the pointers, Albert! They did help me get started. Unfortunately, when I started looping through the Plain object, I realized that the individual strings were represented as separate elements, so there did not seem to be an easy way to apply strikethrough formatting to the entire sentence. The best I could do was apply the strikethrough word by word, but with that approach the final HTML did not look very pleasing.

In the end, I wound up writing a small Python script that would modify a file in pandoc's native format directly (outside of pandoc) and then feed the modified native file back into pandoc. After a couple of false starts with the regex, and then with the native output becoming invalid, I've got it working fairly well for my purposes.

On Thursday, 22 December 2022 at 20:21:19 UTC+8 Albert Krewinkel wrote:
"balaj...@gmail.com" <balaj...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> The specific scenario I'm looking at is a Markdown file such as this:
>
> ### Todo
> - [ ] Foo
> - [X] Quux Qux

This is an interesting case because it is more complex than it seems.
The reason is pandoc's `task_list` extension that causes pandoc to
handle these checkboxes specially, converting them to [Str "☐", Space]
and [Str "☒", Space]. So we'll have to match on that in our filter.

A good approach would be to write a filter for Plain, like so:

``` lua
function Plain (plain)
  -- modify the object here
  return plain
end
```

Pandoc will then do all necessary document traversals automatically;
the function is applied to all `Plain` elements in the document.

To check for the prefix, we'd do something like

``` lua
local done_marker = pandoc.List{pandoc.Str '☒', pandoc.Space()}
local prefix = pandoc.List{plain.content[1], plain.content[2]}
if prefix == done_marker then
  -- modify content
end
```

I hope that's enough to get you started. Happy hacking!


-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124
