public inbox archive for pandoc-discuss@googlegroups.com
* Filter functions unexpectedly run in separate passes
@ 2019-02-09 10:52 Axel Rauschmayer
       [not found] ` <275ab994-2195-48f3-89b2-703621fc4c80-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Axel Rauschmayer @ 2019-02-09 10:52 UTC (permalink / raw)
  To: pandoc-discuss


I’m setting up my filter as follows:

return { {Header = TrackHeaderId, RawInline = TheRawInline} }

I’d expect invocations of TrackHeaderId and TheRawInline to be interleaved. 
Instead, I’m first seeing all invocations of TheRawInline and then all 
invocations of TrackHeaderId.

Is there a way to change this? I’d like TrackHeaderId to track the ID of 
the current Header and I’d like TheRawInline to use that ID.

Thanks!

Axel

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/275ab994-2195-48f3-89b2-703621fc4c80%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: Filter functions unexpectedly run in separate passes
       [not found] ` <275ab994-2195-48f3-89b2-703621fc4c80-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2019-02-09 21:29   ` Albert Krewinkel
       [not found]     ` <87bm3kirb8.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Albert Krewinkel @ 2019-02-09 21:29 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Axel Rauschmayer writes:

> I’m setting up my filter as follows:
>
> return { {Header = TrackHeaderId, RawInline = TheRawInline} }
>
> I’d expect invocations of TrackHeaderId and TheRawInline to be interleaved.
> Instead, I’m first seeing all invocations of TheRawInline and then all
> invocations of TrackHeaderId.

This is a known problem: we currently filter elements in the order
Inlines → Blocks → Meta → Pandoc. Fixing that is on my TODO list.
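
One way to observe this ordering is a filter that only logs each call. This is a minimal sketch, not part of the original filter: the file name `order.lua` and the functions `log_header` and `log_raw` are made up for illustration.

```lua
-- order.lua: log every call so the traversal order becomes visible.
-- Both functions return nothing, which leaves the elements unchanged.
local function log_header (header)
  io.stderr:write('Header: ' .. header.identifier .. '\n')
end

local function log_raw (raw)
  io.stderr:write('RawInline: ' .. raw.text .. '\n')
end

-- Same shape as the filter in the question: one pass, two functions.
return { { Header = log_header, RawInline = log_raw } }
```

Run it with `pandoc --lua-filter order.lua input.md`. With the traversal order described above, all `RawInline` lines should appear on stderr before the first `Header` line.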

> Is there a way to change this? I’d like TrackHeaderId to track the ID of
> the current Header and I’d like TheRawInline to use that ID.

I can think of two ways, both of which are a bit involved: One way would be
to use `hierarchicalize` from module pandoc.utils. This will turn a list
of blocks into hierarchical elements. You could then traverse these
elements and use `walk_block` to modify the contents – use a helper Div
to get a single Block element from a list of blocks. After you are done,
convert the hierarchical elements back into a list:

    local List = require 'pandoc.List'
    local utils = require 'pandoc.utils'

    -- Turn hierarchical elements back into a flat list of blocks.
    local function flatten (elements)
      local result = List:new {}
      for _, elmnt in ipairs(elements) do
        if elmnt.t == 'Sec' then
          -- Rebuild the section's header, then append its flattened contents.
          local header = pandoc.Header(elmnt.level, elmnt.label, elmnt.attr)
          table.insert(result, header)
          result:extend(flatten(elmnt.contents))
        else
          table.insert(result, elmnt)
        end
      end
      return result
    end

    function Pandoc (doc)
      local elements = utils.hierarchicalize(doc.blocks)
      -- modify elements
      return pandoc.Pandoc(flatten(elements), doc.meta)
    end

The second method would be to do three filter passes:

  1. Add the header ID as a special Span to the header's content – now
     the information is in an Inline element.
  2. Run your inline filter, which has access to the ID, and is run in
     the correct order.
  3. Delete the special Span from headers again.
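
The three passes above could be sketched roughly as follows. This is an illustration under assumptions, not a tested implementation: the marker class `header-id-marker`, the `<current-id/>` placeholder and all function names are hypothetical, and `use_id` merely stands in for the real `TheRawInline`.

```lua
-- three-pass.lua: sketch of the three filter passes listed above.
local MARKER_CLASS = 'header-id-marker'  -- hypothetical class name
local current_id = nil

-- Pass 1: copy each header's ID into an empty marker Span placed at the
-- start of the header's content, so inline filters can see the ID.
local function add_marker (header)
  local marker = pandoc.Span({}, pandoc.Attr(header.identifier, {MARKER_CLASS}))
  local content = header.content
  table.insert(content, 1, marker)
  header.content = content  -- assign back so the change is not lost
  return header
end

-- Pass 2: remember the ID of the most recent marker Span ...
local function track_marker (span)
  if span.classes:includes(MARKER_CLASS) then
    current_id = span.identifier
  end
end

-- ... and use it in the raw inlines that follow (hypothetical handler).
local function use_id (raw)
  if raw.format == 'html' and raw.text == '<current-id/>' then
    return pandoc.Str(current_id or '')
  end
end

-- Pass 3: returning an empty list deletes the marker Span again.
local function remove_marker (span)
  if span.classes:includes(MARKER_CLASS) then
    return {}
  end
end

-- Each inner table is one pass; passes run in the order listed.
return {
  { Header = add_marker },
  { Span = track_marker, RawInline = use_id },
  { Span = remove_marker },
}
```

Passes 2 and 3 could probably be merged by having `track_marker` return `{}` itself; they are kept separate here to mirror the list.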

--
Albert


* Re: Filter functions unexpectedly run in separate passes
       [not found]     ` <87bm3kirb8.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
@ 2019-02-10 10:23       ` Albert Krewinkel
       [not found]         ` <878syohrgm.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Albert Krewinkel @ 2019-02-10 10:23 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Somehow I missed the simplest method: matching on all Blocks and
applying your inline filter with `walk_block`:

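    local current_id

    -- Remember the ID of the most recently visited header.
    function Header (header)
      current_id = header.identifier
    end

    -- Apply the inline filter inside every other block. The result of
    -- walk_block must be returned, or the changes are discarded.
    function Block (block)
      return pandoc.walk_block(block, { RawInline = TheRawInline })
    end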



--
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124


* Re: Filter functions unexpectedly run in separate passes
       [not found]         ` <878syohrgm.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
@ 2019-02-10 13:52           ` Axel Rauschmayer
  2019-02-17 21:15           ` Axel Rauschmayer
  1 sibling, 0 replies; 6+ messages in thread
From: Axel Rauschmayer @ 2019-02-10 13:52 UTC (permalink / raw)
  To: pandoc-discuss


Thanks, very helpful!


* Re: Filter functions unexpectedly run in separate passes
       [not found]         ` <878syohrgm.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
  2019-02-10 13:52           ` Axel Rauschmayer
@ 2019-02-17 21:15           ` Axel Rauschmayer
       [not found]             ` <69474255-b0f8-498a-9987-f6977bab4619-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  1 sibling, 1 reply; 6+ messages in thread
From: Axel Rauschmayer @ 2019-02-17 21:15 UTC (permalink / raw)
  To: pandoc-discuss


Caveat: If implemented naively, you visit the same RawInline multiple 
times. How would you prevent that from happening? I’m considering adding a 
marker (a field) to the inline.


* Re: Filter functions unexpectedly run in separate passes
       [not found]             ` <69474255-b0f8-498a-9987-f6977bab4619-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2019-02-17 21:42               ` Axel Rauschmayer
  0 siblings, 0 replies; 6+ messages in thread
From: Axel Rauschmayer @ 2019-02-17 21:42 UTC (permalink / raw)
  To: pandoc-discuss


Never mind – I didn’t implement your approach properly (I forgot to return 
the result of pandoc.walk_block()). Sorry for the noise.
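
For reference, a complete filter along these lines, with the return in place, might look like the sketch below. The body of `TheRawInline` and the `<current-id/>` placeholder are hypothetical and only stand in for whatever the real function does.

```lua
-- track-id.lua: track the current header ID and use it in raw inlines.
local current_id = nil

-- Hypothetical inline handler: replace a placeholder with the current ID.
local function TheRawInline (raw)
  if raw.format == 'html' and raw.text == '<current-id/>' then
    return pandoc.Str(current_id or '')
  end
end

function Header (header)
  current_id = header.identifier
  -- Headers are handled here rather than by Block, so walk them as well.
  return pandoc.walk_block(header, { RawInline = TheRawInline })
end

function Block (block)
  -- walk_block returns a modified copy; without the return statement
  -- the original, unmodified block stays in the document.
  return pandoc.walk_block(block, { RawInline = TheRawInline })
end
```

One thing to keep in mind: a block nested in a container (say, a paragraph inside a block quote) is walked once on its own and again when the container is walked, so the inline handler should be safe to apply twice. The handler above is, since it replaces the raw inline on the first visit.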


