From: JDTS <jdtsmith-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: Lua filter to fix incorrectly nested lists?
Date: Mon, 27 Feb 2023 16:28:55 -0800 (PST)
Message-ID: <fb8d262d-bddc-4b79-8aca-703c1dffea36n@googlegroups.com>
In-Reply-To: <8208c36c-dd86-49f6-9b77-32cc5f48299dn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
One other quick question: pandoc parses <br> as a line break and translates
it into Org as a double backslash (\\). Is there any way to disable this?
On Monday, February 27, 2023 at 7:14:24 PM UTC-5 JDTS wrote:
> This works perfectly (including in targeting org, my use case). Thanks so
> much!
>
> On Monday, February 27, 2023 at 3:11:13 PM UTC-5 Julien Dutant wrote:
>
>> Well, couldn't help but give it a shot. Here's a short filter that does
>> the trick. Will work at arbitrary depth.
>>
>> https://gist.github.com/jdutant/549ef06074d3ae00b78ca6ec8ed2cfe1
>>
>>
>> function fixList(elem)
>>   local changed = false
>>   local newList = pandoc.List:new()
>>
>>   local function isSubList(list)
>>     return #list == 1
>>       and (list[1].t == 'BulletList' or list[1].t == 'OrderedList')
>>   end
>>
>>   for _,item in ipairs(elem.c) do
>>
>>     if #newList > 0 and isSubList(item) then
>>       -- append item's sublist to the last item of newList
>>       changed = true
>>       newList[#newList]:insert(item[1])
>>     else
>>       -- otherwise append item to newList
>>       newList:insert(item)
>>     end
>>
>>   end
>>
>>   if changed then
>>     elem.c = newList
>>   end
>>
>>   return changed and elem or nil
>> end
>>
>> return {{
>>   OrderedList = fixList,
>>   BulletList = fixList,
>> }}
>>
>> On Monday, February 27, 2023 at 12:33:54 AM UTC JDTS wrote:
>>
>>>
>>> Thanks, I'll investigate this. The HTML structure is generated and
>>> therefore quite uniform, so it may be possible to do the munging there.
>>> On Sunday, February 26, 2023 at 10:47:36 AM UTC-5 Julien Dutant wrote:
>>>
>>>> From my labelled-lists filter (
>>>> https://github.com/dialoa/dialectica-filters/blob/main/labelled-lists/labelled-lists.lua),
>>>> here is a filter + function that checks whether every item in a bullet list
>>>> starts with a Span element.
>>>>
>>>> ```lua
>>>>
>>>> --- is_custom_labelled_list: Look for custom labels markup
>>>> -- Custom label markup requires each item starting with a span
>>>> -- containing the label
>>>> -- @param element pandoc BulletList element
>>>> function is_custom_labelled_list (element)
>>>>   local is_cl_list = true
>>>>
>>>>   -- the content of BulletList is a List of List of Blocks
>>>>   for _,blocks in ipairs(element.c) do
>>>>     -- check that the first element of the first block is Span
>>>>     if not( blocks[1].c[1].t == 'Span' ) then
>>>>       is_cl_list = false
>>>>       break
>>>>     end
>>>>   end
>>>>   return is_cl_list
>>>>
>>>> end
>>>>
>>>> return {{
>>>>   BulletList = function(element)
>>>>     if is_custom_labelled_list(element) then
>>>>       return pandoc.Para(pandoc.Str('Was a list of the required kind!'))
>>>>     end
>>>>   end, }}
>>>>
>>>> ```
>>>>
>>>> The difficulty with manipulating lists is following their intricate
>>>> structure: a BulletList element has a content (element.c) that is a pandoc
>>>> List. Each item in it (element.c[1], element.c[2]) is of Blocks type, i.e.
>>>> a pandoc.List in which each element is a block. In your case you should
>>>> check that the list item contains only one block:
>>>>
>>>> if #elem.c[i] == 1 then list_item_contains_one_block_only = true end
>>>>
>>>> and check that this block is of type OrderedList:
>>>> if #elem.c[i] == 1 and elem.c[i][1].t == 'OrderedList' then ...
>>>>
>>>> You should then add that block to the previous item and remove the
>>>> current item.
>>>>
>>>> Hope this helps,
>>>>
>>>> J
>>>>
>>>> On Saturday, February 25, 2023 at 10:06:45 PM UTC JDTS wrote:
>>>>
>>>>> Thanks. Any pointers to lua filters that do something similar?
>>>>>
>>>>> On Saturday, February 25, 2023 at 10:01:08 AM UTC-5 Julien Dutant
>>>>> wrote:
>>>>>
>>>>>> Looks feasible. Pandoc converts the first html to:
>>>>>>
>>>>>> [ BulletList
>>>>>>     [ [ Plain
>>>>>>           [ ... Inlines ]
>>>>>>       ]
>>>>>>     , [ BulletList
>>>>>>           [ [ Plain
>>>>>>                 [ ... Inlines ]
>>>>>>             ]
>>>>>>           , [ Plain
>>>>>>                 [ ... Inlines ]
>>>>>>             ]
>>>>>>           ]
>>>>>>       ]
>>>>>>     , [ Plain
>>>>>>           [ Inlines ]
>>>>>>       ]
>>>>>>     ]
>>>>>> ]
>>>>>>
>>>>>> I.e., the sublist is converted to its own list item. So the filter
>>>>>> should pick up lists, check whether any item within them consists of a
>>>>>> lone sublist, and if so, move it to the previous item. (And, ideally,
>>>>>> apply the filter recursively to that sublist itself.)
>>>>>>
>>>>>> On Saturday, February 25, 2023 at 2:26:04 PM UTC JDTS wrote:
>>>>>>
>>>>>>> The Apple Notes app produces (via AppleScript) HTML for notes with
>>>>>>> nested lists structured like:
>>>>>>>
>>>>>>> <ul>
>>>>>>>
>>>>>>> <li>Level 1 element 1</li>
>>>>>>>
>>>>>>> <ul>
>>>>>>>
>>>>>>> <li>Level 2 element 1</li>
>>>>>>>
>>>>>>> <li>Level 2 element 2</li>
>>>>>>>
>>>>>>> </ul>
>>>>>>>
>>>>>>> <li>Level 1 element 2</li>
>>>>>>>
>>>>>>> </ul>
>>>>>>>
>>>>>>> As you can see, the sublist is incorrectly positioned. It should be
>>>>>>> positioned *within* the <li> Level 1 element 1 item, à la:
>>>>>>>
>>>>>>> <ul>
>>>>>>>
>>>>>>> <li>Level 1 element 1
>>>>>>>
>>>>>>> <ul>
>>>>>>>
>>>>>>> <li>Level 2 element 1</li>
>>>>>>>
>>>>>>> <li>Level 2 element 2</li>
>>>>>>>
>>>>>>> </ul>
>>>>>>>
>>>>>>> </li>
>>>>>>>
>>>>>>> <li>Level 1 element 2</li>
>>>>>>>
>>>>>>> </ul>
>>>>>>>
>>>>>>> Is there a straightforward way with Lua filters to fix this at the
>>>>>>> AST level, for arbitrary-depth sublist nesting?
>>>>>>>
>>>>>>
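
The merge step discussed in the quoted messages (move an item that consists of a lone sublist into the item before it) can be tried outside pandoc. The following is only a rough sketch under stated assumptions: `Plain`, `BulletList`, `is_lone_sublist` and `fix_items` are hypothetical plain-table stand-ins written for illustration, not pandoc's own types or API, and the recursion is done by hand rather than by pandoc's filter traversal.

```lua
-- Plain-table stand-ins for pandoc AST nodes (assumption: NOT the real pandoc types).
local function Plain(text) return { t = 'Plain', text = text } end
local function BulletList(items) return { t = 'BulletList', c = items } end

local function is_list(block)
  return block.t == 'BulletList' or block.t == 'OrderedList'
end

-- True when a list item holds exactly one block and that block is a list.
local function is_lone_sublist(item)
  return #item == 1 and is_list(item[1])
end

-- Return a new array of items in which every lone-sublist item has been
-- appended to the item before it. Nested lists are fixed first, so the
-- repair applies at any depth.
local function fix_items(items)
  local fixed = {}
  for _, item in ipairs(items) do
    for _, block in ipairs(item) do
      if is_list(block) then
        block.c = fix_items(block.c)
      end
    end
    if #fixed > 0 and is_lone_sublist(item) then
      -- move the sublist into the previous item, dropping the current item
      table.insert(fixed[#fixed], item[1])
    else
      table.insert(fixed, item)
    end
  end
  return fixed
end

-- The Apple Notes shape from the thread: the sublist sits in its own item.
local broken = BulletList({
  { Plain('Level 1 element 1') },
  { BulletList({
      { Plain('Level 2 element 1') },
      { Plain('Level 2 element 2') },
  }) },
  { Plain('Level 1 element 2') },
})

local fixed = fix_items(broken.c)
print(#fixed)         --> 2
print(fixed[1][2].t)  --> BulletList
```

As in the quoted filter, a lone sublist that appears as the very first item has no previous item to join and is left where it is.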