From: JDTS
Newsgroups: gmane.text.pandoc
Subject: Re: Lua filter to fix incorrectly nested lists?
Date: Mon, 27 Feb 2023 16:14:24 -0800 (PST)
Message-ID: <8208c36c-dd86-49f6-9b77-32cc5f48299dn@googlegroups.com>
To: pandoc-discuss

This works perfectly (including in targeting org, my use case). Thanks so
much!

On Monday, February 27, 2023 at 3:11:13 PM UTC-5 Julien Dutant wrote:

> Well, couldn't help but give it a shot. Here's a short filter that does
> the trick. Will work at arbitrary depth.
>
> https://gist.github.com/jdutant/549ef06074d3ae00b78ca6ec8ed2cfe1
>
> function fixList(elem)
>   local changed = false
>   local newList = pandoc.List:new()
>
>   -- a list item is a "sublist item" if its only block is a list
>   local function isSubList(list)
>     return #list == 1
>       and (list[1].t == 'BulletList' or list[1].t == 'OrderedList')
>   end
>
>   for _,item in ipairs(elem.c) do
>
>     if #newList > 0 and isSubList(item) then
>       -- append item's sublist to the last item of newList
>       changed = true
>       newList[#newList]:insert(item[1])
>     else
>       -- otherwise append item to newList
>       newList:insert(item)
>     end
>
>   end
>
>   if changed then
>     elem.c = newList
>   end
>
>   return changed and elem or nil
> end
>
> return {{
>   OrderedList = fixList,
>   BulletList = fixList,
> }}
>
> On Monday, February 27, 2023 at 12:33:54 AM UTC JDTS wrote:
>
>> Thanks, I'll investigate this. The HTML structure is generated and
>> therefore quite uniform, so it may be possible to do the munging there.
>>
>> On Sunday, February 26, 2023 at 10:47:36 AM UTC-5 Julien Dutant wrote:
>>
>>> From my labelled-lists filter (
>>> https://github.com/dialoa/dialectica-filters/blob/main/labelled-lists/labelled-lists.lua),
>>> here is a filter + function that checks whether every item in a bullet
>>> list starts with a Span element.
>>>
>>> ```lua
>>>
>>> --- is_custom_labelled_list: Look for custom labels markup
>>> -- Custom label markup requires each item starting with a span
>>> -- containing the label
>>> -- @param element pandoc BulletList element
>>> function is_custom_labelled_list (element)
>>>   local is_cl_list = true
>>>
>>>   -- the content of BulletList is a List of List of Blocks
>>>   for _,blocks in ipairs(element.c) do
>>>     -- check that the first element of the first block is Span
>>>     if not( blocks[1].c[1].t == 'Span' ) then
>>>       is_cl_list = false
>>>       break
>>>     end
>>>   end
>>>
>>>   return is_cl_list
>>>
>>> end
>>>
>>> return {{
>>>   BulletList = function(element)
>>>     if is_custom_labelled_list(element) then
>>>       -- replace the whole list with a marker paragraph
>>>       return pandoc.Para(pandoc.Str('Was a list of the required kind!'))
>>>     end
>>>   end,
>>> }}
>>>
>>> ```
>>>
>>> The difficulty with manipulating lists is to follow their intricate
>>> structure: a BulletList element has a content (element.c) that is a
>>> pandoc List. Each item in it (element.c[1], element.c[2]) is of Blocks
>>> type, i.e. a pandoc.List where each element is a block. In your case you
>>> should check that the list item contains only one block:
>>>
>>> if #elem.c[i] == 1 then list_item_contains_one_block_only = true end
>>>
>>> and check that this block is of type OrderedList:
>>>
>>> if #elem.c[i] == 1 and elem.c[i][1].t == 'OrderedList' then ...
>>>
>>> You should then add that block to the previous item, and remove the
>>> current item.
>>>
>>> Hope this helps,
>>>
>>> J
>>>
>>> On Saturday, February 25, 2023 at 10:06:45 PM UTC JDTS wrote:
>>>
>>>> Thanks. Any pointers to lua filters that do something similar?
>>>>
>>>> On Saturday, February 25, 2023 at 10:01:08 AM UTC-5 Julien Dutant wrote:
>>>>
>>>>> Looks feasible. Pandoc converts the first html to:
>>>>>
>>>>> [ BulletList
>>>>>     [ [ Plain
>>>>>           [ ... Inlines ]
>>>>>       ]
>>>>>     , [ BulletList
>>>>>           [ [ Plain
>>>>>                 [ ... Inlines ]
>>>>>             ]
>>>>>           , [ Plain
>>>>>                 [ ... Inlines ]
>>>>>             ]
>>>>>           ]
>>>>>       ]
>>>>>     , [ Plain
>>>>>           [ Inlines ]
>>>>>       ]
>>>>>     ]
>>>>> ]
>>>>>
>>>>> I.e., the sublist is converted to its own list item. So the filter
>>>>> should pick up lists, check whether any item within them consists of a
>>>>> lone sublist, and if so, move it to the previous item. (And best, apply
>>>>> the filter recursively to that sublist itself.)
>>>>>
>>>>> On Saturday, February 25, 2023 at 2:26:04 PM UTC JDTS wrote:
>>>>>
>>>>>> The Apple Notes app produces (via AppleScript) HTML for notes with
>>>>>> nested lists structured like:
>>>>>>
>>>>>> <ul>
>>>>>> <li>Level 1 element 1</li>
>>>>>> <ul>
>>>>>> <li>Level 2 element 1</li>
>>>>>> <li>Level 2 element 2</li>
>>>>>> </ul>
>>>>>> <li>Level 1 element 2</li>
>>>>>> </ul>
>>>>>>
>>>>>> As you can see, the sublist is incorrectly positioned. It should be
>>>>>> positioned *within* the <li>Level 1 element 1 item, ala:
>>>>>>
>>>>>> <ul>
>>>>>> <li>Level 1 element 1
>>>>>>     <ul>
>>>>>>     <li>Level 2 element 1</li>
>>>>>>     <li>Level 2 element 2</li>
>>>>>>     </ul>
>>>>>> </li>
>>>>>> <li>Level 1 element 2</li>
>>>>>> </ul>
>>>>>>
>>>>>> Is there a straightforward way with Lua filters to fix this at the
>>>>>> AST level, for arbitrary-depth sublist nesting?

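
For anyone who wants to try the idea from this thread without pandoc installed, here is a minimal sketch in plain Lua. It is not the filter from the gist. The table layout (an `items` array of items, each item an array of blocks, plus a `text` field on `Plain`) and the names `is_list`, `is_lone_sublist` and `fix_nesting` are assumptions made purely for illustration and are not part of pandoc's API. It also recurses explicitly, whereas the pandoc filter relies on pandoc visiting every list in the document.

```lua
-- Plain-Lua sketch, no pandoc required.
-- Assumed (hypothetical) shapes:
--   list  = { t = 'BulletList' | 'OrderedList', items = { item, ... } }
--   item  = { block, ... }
--   block = a list as above, or { t = 'Plain', text = '...' }

local function is_list(block)
  return type(block) == 'table'
    and (block.t == 'BulletList' or block.t == 'OrderedList')
end

-- An item is a "lone sublist" if its only block is itself a list.
local function is_lone_sublist(item)
  return #item == 1 and is_list(item[1])
end

-- Move every lone-sublist item into the item that precedes it.
-- Modifies the list in place and returns it.
local function fix_nesting(list)
  local fixed = {}
  for _, item in ipairs(list.items) do
    -- Repair nested lists first, so any depth is handled.
    for _, block in ipairs(item) do
      if is_list(block) then fix_nesting(block) end
    end
    if #fixed > 0 and is_lone_sublist(item) then
      -- Append the stray sublist to the previous item's blocks.
      table.insert(fixed[#fixed], item[1])
    else
      table.insert(fixed, item)
    end
  end
  list.items = fixed
  return list
end

-- The mis-nested structure from the original question.
local doc = {
  t = 'BulletList',
  items = {
    { { t = 'Plain', text = 'Level 1 element 1' } },
    { { t = 'BulletList', items = {
        { { t = 'Plain', text = 'Level 2 element 1' } },
        { { t = 'Plain', text = 'Level 2 element 2' } },
    } } },
    { { t = 'Plain', text = 'Level 1 element 2' } },
  },
}

fix_nesting(doc)
print(#doc.items)          -- number of top-level items after the fix
print(doc.items[1][2].t)   -- type of the block moved into the first item
```

As in the filter quoted above, a sublist that appears as the very first item has no previous item to attach to, so it is left where it is.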