From: Julien Dutant
Newsgroups: gmane.text.pandoc
Subject: Re: Lua filter to fix incorrectly nested lists?
Date: Sun, 26 Feb 2023 07:47:36 -0800 (PST)
Message-ID: <80183457-60c8-4fc3-aa16-13d2f93104f1n@googlegroups.com>
References: <163effbf-b672-4501-9171-8c4681034a96n@googlegroups.com>
To: pandoc-discuss

From my labelled-lists filter
(https://github.com/dialoa/dialectica-filters/blob/main/labelled-lists/labelled-lists.lua),
here is a filter + function that checks whether every item in a bullet list
starts with a Span element.

```lua
--- is_custom_labelled_list: Look for custom labels markup
-- Custom label markup requires each item starting with a span
-- containing the label
-- @param element pandoc BulletList element
function is_custom_labelled_list (element)
  local is_cl_list = true
  -- the content of BulletList is a List of List of Blocks
  for _,blocks in ipairs(element.c) do
    -- check that the first element of the first block is Span
    if not( blocks[1].c[1].t == 'Span' ) then
      is_cl_list = false
      break
    end
  end
  return is_cl_list
end

return {{
  BulletList = function(element)
    if is_custom_labelled_list(element) then
      -- replace the whole list with a marker paragraph
      return pandoc.Para(pandoc.Str('Was a list of the required kind!'))
    end
  end,
}}
```

The difficulty with manipulating lists is following their intricate
structure: a BulletList element has a content (element.c) that is a pandoc
List. Each item in it (element.c[1], element.c[2]) is of Blocks type, i.e.
a pandoc.List where each element is a block. In your case you should
check that the list item contains only one block:

if #elem.c[i] == 1 then list_item_contains_one_block_only = true end

and check that this block is itself a list (BulletList in your example; use
'OrderedList' for numbered lists):

if #elem.c[i] == 1 and elem.c[i][1].t == 'BulletList' then ...

You should then add that block to the previous item, and remove the current
item.

Hope this helps,

J

On Saturday, February 25, 2023 at 10:06:45 PM UTC JDTS wrote:

> Thanks. Any pointers to lua filters that do something similar?
>
> On Saturday, February 25, 2023 at 10:01:08 AM UTC-5 Julien Dutant wrote:
>
>> Looks feasible. Pandoc converts the first html to:
>>
>> [ BulletList
>>     [ [ Plain
>>           [ ... Inlines ]
>>       ]
>>     , [ BulletList
>>           [ [ Plain
>>                 [ ... Inlines ]
>>             ]
>>           , [ Plain
>>                 [ ... Inlines ]
>>             ]
>>           ]
>>       ]
>>     , [ Plain
>>           [ Inlines ]
>>       ]
>>     ]
>> ]
>>
>> I.e., the sublist is converted to its own list item. So the filter should
>> pick up lists, check whether any item within them consists of a lone
>> sublist, and if so, move it to the previous item. (And best, apply the
>> filter recursively to that sublist itself.)
>>
>> On Saturday, February 25, 2023 at 2:26:04 PM UTC JDTS wrote:
>>
>>> The Apple Notes app produces (via AppleScript) HTML for notes with
>>> nested lists structured like:
>>>
>>> <ul>
>>> <li>Level 1 element 1</li>
>>> <ul>
>>> <li>Level 2 element 1</li>
>>> <li>Level 2 element 2</li>
>>> </ul>
>>> <li>Level 1 element 2</li>
>>> </ul>
>>>
>>> As you can see, the sublist is incorrectly positioned. It should be
>>> positioned *within* the <li>Level 1 element 1 item, à la:
>>>
>>> <ul>
>>> <li>Level 1 element 1
>>>     <ul>
>>>     <li>Level 2 element 1</li>
>>>     <li>Level 2 element 2</li>
>>>     </ul>
>>> </li>
>>> <li>Level 1 element 2</li>
>>> </ul>
>>>
>>> Is there a straightforward way with Lua filters to fix this at the AST
>>> level, for arbitrary-depth sublist nesting?
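The merge step discussed in this thread (find a list item that consists of a lone sublist, append that sublist to the previous item, and drop the now-empty item) might be sketched as follows. This is only an illustration under stated assumptions, not code anyone in the thread posted: the names `is_lone_sublist`, `merge_lone_sublists` and `fix_list` are made up here, it treats both BulletList and OrderedList as sublists, and it uses the documented `content` field rather than the `.c` shorthand seen in the reply. It also relies on pandoc collecting top-level functions named after element types when the script returns no value, and on pandoc calling those functions for every list in the document, which is what would take care of arbitrary nesting depth.

```lua
-- Block tags treated as "sublists" (assumption: both list kinds count).
local list_tags = { BulletList = true, OrderedList = true }

--- True if a list item holds exactly one block and that block is a list.
-- @param item a list item, i.e. a list of blocks
local function is_lone_sublist(item)
  return #item == 1 and list_tags[item[1].t] == true
end

--- Build a new array of items in which each lone-sublist item has been
-- appended to the item before it. A lone sublist in first position has
-- no previous item to join, so it is kept as it is.
-- @param items the content of a BulletList or OrderedList
local function merge_lone_sublists(items)
  local fixed = {}
  for _, item in ipairs(items) do
    if #fixed > 0 and is_lone_sublist(item) then
      -- move the sublist block to the end of the previous item
      table.insert(fixed[#fixed], item[1])
    else
      table.insert(fixed, item)
    end
  end
  return fixed
end

--- Filter function shared by both list types.
local function fix_list(el)
  el.content = merge_lone_sublists(el.content)
  return el
end

-- No value is returned by the script, so pandoc is expected to pick up
-- these global functions by their element names.
BulletList = fix_list
OrderedList = fix_list
```

Saved under a hypothetical name such as fix-nested-lists.lua, it would be run with something like `pandoc --lua-filter=fix-nested-lists.lua -f html -t native notes.html`. One caveat: an item that legitimately contains nothing but a sublist would be merged as well, so the sketch is only suitable for input known to have the Apple Notes problem.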