From: JDTS <jdtsmith-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: Lua filter to fix incorrectly nested lists?
Date: Sun, 26 Feb 2023 16:33:54 -0800 (PST)
Message-ID: <8c2cd1be-52b9-467b-a747-a88fc062209bn@googlegroups.com>
In-Reply-To: <80183457-60c8-4fc3-aa16-13d2f93104f1n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>




Thanks, I'll investigate this.  The HTML structure is generated and 
therefore quite uniform, so it may be possible to do the munging there. 
On Sunday, February 26, 2023 at 10:47:36 AM UTC-5 Julien Dutant wrote:

> From my labelled-lists filter (
> https://github.com/dialoa/dialectica-filters/blob/main/labelled-lists/labelled-lists.lua), 
> here is a filter + function that checks whether every item in a bullet list 
> starts with a Span element. 
>
> ```lua 
>
> --- is_custom_labelled_list: Look for custom labels markup 
> -- Custom label markup requires each item starting with a span 
> -- containing the label 
> -- @param element pandoc BulletList element 
> function is_custom_labelled_list (element) 
>    local is_cl_list = true 
>
>    -- the content of BulletList is a List of List of Blocks 
>    for _,blocks in ipairs(element.c) do 
>       -- check that the first element of the first block is Span 
>       if not( blocks[1].c[1].t == 'Span' ) then 
>          is_cl_list = false  
>          break   
>       end
>    end
>    return is_cl_list 
>
> end
>
> return {{
>    BulletList = function(element)
>       if is_custom_labelled_list(element) then
>          -- replace the whole list with a marker paragraph
>          return pandoc.Para(pandoc.Str('Was a list of the required kind!'))
>       end
>    end,
> }}
>
> ```
>
> The difficulty with manipulating lists is to follow their intricate 
> structure: a BulletList element has a content (element.c) that is a pandoc 
> List. Each item in it (element.c[1], element.c[2]) is of Blocks type, i.e. 
> a pandoc.List where each element is a block. In your case you should 
> check that the list item contains only one block:
>
> if #elem.c[i] == 1 then list_item_contains_one_block_only = true end
>
> and check that this block is itself a list (a BulletList in your example, 
> or an OrderedList). Note the extra [1] to reach the block inside the item:
> if #elem.c[i] == 1 and (elem.c[i][1].t == 'BulletList' or elem.c[i][1].t == 'OrderedList') then ...
>
> you should then add that block to the previous item, and remove the 
> current item.
>
> Hope this helps,
>
> J
>
> On Saturday, February 25, 2023 at 10:06:45 PM UTC JDTS wrote:
>
>> Thanks.  Any pointers to lua filters that do something similar?
>>
>> On Saturday, February 25, 2023 at 10:01:08 AM UTC-5 Julien Dutant wrote:
>>
>>> Looks feasible. Pandoc converts the first html to:
>>>
>>> [ BulletList
>>>     [ [ Plain
>>>           [ ... Inlines ]
>>>       ]
>>>     , [ BulletList
>>>           [ [ Plain
>>>                 [ ... Inlines ]
>>>             ]
>>>           , [ Plain
>>>                 [ ... Inlines  ]
>>>             ]
>>>           ]
>>>       ]
>>>     , [ Plain
>>>           [ Inlines ]
>>>       ]
>>>     ]
>>> ]
>>>
>>> I.e., the sublist is converted to its own list item. So the filter 
>>> should pick up lists, check whether any item within them consists of a 
>>> lone sublist, and if so, move that sublist to the previous item. (Ideally, 
>>> also apply the filter recursively to the sublist itself.)
>>>
>>> On Saturday, February 25, 2023 at 2:26:04 PM UTC JDTS wrote:
>>>
>>>> The Apple Notes app produces (via AppleScript) HTML for notes with 
>>>> nested lists structured like:
>>>>
>>>> <ul>
>>>>
>>>> <li>Level 1 element 1</li>
>>>>
>>>> <ul>
>>>>
>>>> <li>Level 2 element 1</li>
>>>>
>>>> <li>Level 2 element 2</li>
>>>>
>>>> </ul>
>>>>
>>>> <li>Level 1 element 2</li>
>>>>
>>>> </ul>
>>>>
>>>> As you can see, the sublist is incorrectly positioned.  It should be 
>>>> positioned *within* the <li> Level 1 element 1 item, like so:
>>>>
>>>> <ul>
>>>>
>>>> <li>Level 1 element 1
>>>>
>>>>     <ul>
>>>>
>>>>     <li>Level 2 element 1</li>
>>>>
>>>>     <li>Level 2 element 2</li>
>>>>
>>>>     </ul>
>>>>
>>>> </li>
>>>>
>>>> <li>Level 1 element 2</li>
>>>>
>>>> </ul>
>>>>
>>>> Is there a straightforward way with Lua filters to fix this at the AST 
>>>> level, for arbitrary-depth sublist nesting?
>>>>
>>>
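
For reference, the approach described in the quoted messages (find a list item that consists of a lone sublist and move that sublist into the previous item) could be sketched roughly as below. This is only a sketch under assumptions, not tested against real Apple Notes output: the helper names `is_lone_sublist` and `merge_lone_sublists` are made up for illustration, and it assumes pandoc accepts a plain Lua table when assigning to `list.content`. It relies on pandoc collecting top-level functions named after element types when the script returns nothing, and on pandoc applying them to every list in the document, nested ones included, so no explicit recursion should be needed.

```lua
--- is_lone_sublist: true if a list item holds exactly one block
-- and that block is itself a list (hypothetical helper).
function is_lone_sublist(item)
   if #item ~= 1 then
      return false
   end
   local tag = item[1].t
   return tag == 'BulletList' or tag == 'OrderedList'
end

--- merge_lone_sublists: build a new list of items in which every
-- lone-sublist item has been appended to the item kept before it.
-- A lone sublist with no preceding item is left where it is.
function merge_lone_sublists(items)
   local merged = {}
   for _, item in ipairs(items) do
      local previous = merged[#merged]
      if previous and is_lone_sublist(item) then
         -- move the sublist block to the end of the previous item
         table.insert(previous, item[1])
      else
         table.insert(merged, item)
      end
   end
   return merged
end

-- Top-level functions named after element types act as the filter.
function BulletList(list)
   list.content = merge_lone_sublists(list.content)
   return list
end

function OrderedList(list)
   list.content = merge_lone_sublists(list.content)
   return list
end
```

Saved as, say, fix-lists.lua (a hypothetical file name), it would be run with something like `pandoc --lua-filter fix-lists.lua -f html -t markdown notes.html`. The new item list is assigned back to `list.content` rather than edited in place, since in recent pandoc versions reading `content` may hand back a copy.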

