Pandoc Lua script to filter specific markdown sub sections during PDF generation

public inbox archive for pandoc-discuss@googlegroups.com

* Pandoc Lua script to filter specific markdown sub sections during PDF generation
@ 2020-09-01  8:01 Henrik Klang
       [not found] ` <4e07ae0a-dcb1-4f89-8b91-ba787b7bea1cn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Henrik Klang @ 2020-09-01  8:01 UTC (permalink / raw)
  To: pandoc-discuss


I have markdown source and want to generate PDF using Pandoc.

I want to remove subsections at a specific heading level from the
generated document, i.e. filter them out of the source markdown.

Would this be possible with Lua or would it be better to do prefiltering 
using some other tools?

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/4e07ae0a-dcb1-4f89-8b91-ba787b7bea1cn%40googlegroups.com.


* Re: Pandoc Lua script to filter specific markdown sub sections during PDF generation
       [not found] ` <4e07ae0a-dcb1-4f89-8b91-ba787b7bea1cn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2020-09-01  8:28   ` Denis Maier
       [not found]     ` <92ac8df6-b88a-b047-b9f0-1fe62873b710-cl+VPiYnx/1AfugRpC6u6w@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Denis Maier @ 2020-09-01  8:28 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Do you mean removing just the subsection header, or the whole subsection
up to the start of the next section?
Both should be possible with a Lua filter. The first is trivial, the
second more complex (at least for me ;-), but I doubt it would be
impossible.
The tricky part is that pandoc's AST is flat and does not reveal the
hierarchy here:

[Header 1 ("header-1",[],[]) [Str "Header",Space,Str "1"]
,Para [Str "Text"]
,Header 2 ("header-2",[],[]) [Str "Header",Space,Str "2"]
,Para [Str "Text"]
,Header 2 ("header-2-1",[],[]) [Str "Header",Space,Str "2"]
,Para [Str "Text"]
,Header 1 ("header-1-1",[],[]) [Str "Header",Space,Str "1"]
,Para [Str "Text"]]

So, I imagine you have to walk over the AST and, when you encounter a
"Header 2" element, delete it and everything after it until you find
the next "Header 1" element.

You could also just wrap the relevant parts in divs in the source and
then remove them from the output. That's easier, but not as elegant.
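
A minimal sketch of the div-wrapping approach, under the assumption that
the removable parts are marked in the source with a fenced div carrying a
class. The class name `internal` is hypothetical and chosen only for
illustration:

```lua
-- Hypothetical class name marking content that should be dropped.
local drop_class = 'internal'

-- True if the list of class names contains `name`.
local function has_class(classes, name)
  for _, c in ipairs(classes) do
    if c == name then
      return true
    end
  end
  return false
end

function Div (el)
  if has_class(el.classes, drop_class) then
    -- Returning an empty list deletes the div and its contents.
    return {}
  end
  -- Returning nil leaves all other divs unchanged.
end
```

In the markdown source, such a part would be opened with a line reading
`::: internal` and closed with a line reading `:::`.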



* Re: Pandoc Lua script to filter specific markdown sub sections during PDF generation
       [not found]     ` <92ac8df6-b88a-b047-b9f0-1fe62873b710-cl+VPiYnx/1AfugRpC6u6w@public.gmane.org>
@ 2020-09-01  9:35       ` Albert Krewinkel
       [not found]         ` <87d035yh4n.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Albert Krewinkel @ 2020-09-01  9:35 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Denis Maier writes:

> So, I imagine you have to walk over the AST and when you encounter a "Header 2"
> element, you delete this and everything else until you find the next "Header 1"
> element.

That's a good approach. Untested Lua:

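    -- Set while we are inside a subsection that is being removed.
    local keep_deleting = false

    function Block (b)
      if b.t == 'Header' and b.level >= 3 then
        -- Header at level 3 or deeper: drop it and start deleting.
        keep_deleting = true
        return {}
      elseif b.t == 'Header' then
        -- Header at level 1 or 2: keep it and stop deleting.
        keep_deleting = false
      elseif keep_deleting then
        -- Non-header block inside a removed subsection.
        return {}
      end
      -- Returning nil keeps the block unchanged.
    end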
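
A possible variant of the same idea, sketched under a few assumptions: it
handles only the top-level block list in a single pass, so it needs no
global state and does not touch blocks nested inside other elements. The
name `min_level` and the helper `drop_subsections` are hypothetical:

```lua
-- Hypothetical threshold: headers at this level or deeper are removed,
-- together with everything up to the next shallower header.
local min_level = 3

-- Return a new list without the unwanted subsections. Works on any
-- sequence of tables with a `t` field (and `level` for headers).
local function drop_subsections (blocks, level)
  local kept = {}
  local deleting = false
  for _, b in ipairs(blocks) do
    if b.t == 'Header' then
      -- Every header decides anew whether we are deleting.
      deleting = b.level >= level
    end
    if not deleting then
      kept[#kept + 1] = b
    end
  end
  return kept
end

function Pandoc (doc)
  doc.blocks = drop_subsections(doc.blocks, min_level)
  return doc
end
```

Saved under a hypothetical name such as `drop-subsections.lua`, it could
be run with `pandoc input.md --lua-filter=drop-subsections.lua -o output.pdf`.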

The alternative is to let pandoc wrap all sections in divs and to delete
those that are undesired. See
https://gist.github.com/tarleb/a0f41adfa7b0e5a9be441e945f843299
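
A rough sketch of that alternative, not necessarily what the linked gist
does. It assumes a pandoc version that provides
`pandoc.utils.make_sections` (2.8 or later); the names `min_level` and
`drop_deep_section` are hypothetical:

```lua
-- Hypothetical threshold: sections whose header is at this level or
-- deeper are removed.
local min_level = 3

-- True if the list of class names contains `name`.
local function has_class (classes, name)
  for _, c in ipairs(classes) do
    if c == name then return true end
  end
  return false
end

-- Delete section divs that start with a too-deep header.
local function drop_deep_section (div)
  local first = div.content[1]
  if has_class(div.classes, 'section')
     and first and first.t == 'Header'
     and first.level >= min_level then
    return {}
  end
end

function Pandoc (doc)
  -- make_sections(number_sections, base_level, blocks) nests each
  -- section in a Div with class "section".
  local sections = pandoc.utils.make_sections(false, nil, doc.blocks)
  -- Wrap in a temporary Div so that walk_block visits all nested divs.
  local wrapper = pandoc.walk_block(
    pandoc.Div(sections), {Div = drop_deep_section})
  doc.blocks = wrapper.content
  return doc
end
```

Note that the remaining sections stay wrapped in divs, which may slightly
change the generated output compared to the flat block list.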


--
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124



* Re: Pandoc Lua script to filter specific markdown sub sections during PDF generation
       [not found]         ` <87d035yh4n.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
@ 2020-09-01 10:05           ` Henrik Klang
       [not found]             ` <667487b4-f213-45c7-94fd-c0237d8b6592n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Henrik Klang @ 2020-09-01 10:05 UTC (permalink / raw)
  To: pandoc-discuss


I meant the entire section, both heading and body.

Thanks for the suggestions guys, I will try them out!

/ Henrik


* Re: Pandoc Lua script to filter specific markdown sub sections during PDF generation
       [not found]             ` <667487b4-f213-45c7-94fd-c0237d8b6592n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2020-09-09 11:24               ` Henrik Klang
       [not found]                 ` <dea98b71-e65b-401e-b3df-338cd80ae369n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Henrik Klang @ 2020-09-09 11:24 UTC (permalink / raw)
  To: pandoc-discuss


Update: your suggestions worked. Thank you for the excellent support.


* Re: Pandoc Lua script to filter specific markdown sub sections during PDF generation
       [not found]                 ` <dea98b71-e65b-401e-b3df-338cd80ae369n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2020-09-09 12:41                   ` Albert Krewinkel
  0 siblings, 0 replies; 6+ messages in thread
From: Albert Krewinkel @ 2020-09-09 12:41 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Henrik Klang writes:

> Update: your suggestions worked. Thank you for the excellent support.

Glad to help, and thanks for letting us know. Cheers!

