* pandoc.read and pandoc.write using unspecified extensions. Adding captions to code blocks.
@ 2022-04-04  9:34 Bart
       [not found] ` <1302c732-b7fe-4339-8234-99b761f47296n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Bart @ 2022-04-04  9:34 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1.1: Type: text/plain, Size: 1738 bytes --]

Hello,

I'm trying to write a Lua filter that adds captions to code blocks
(markdown -> html). The captions are specified as attributes.

At first I added the following:
`{pandoc.Para(codeBlock.attributes.caption)}`
which worked, but we found that the caption is a raw string, and is not
compiled from markdown to html the way table captions are. So a caption
like `**bold**` did not appear in bold.

After looking around for a solution, I came up with the following, which
seemed very simple:

```lua
{pandoc.Plain(pandoc.RawInline(
  'html',
  pandoc.write(
    pandoc.read(
      el.attributes.caption,
      'markdown',
      PANDOC_READER_OPTIONS
    ),
    'html',
    PANDOC_WRITER_OPTIONS
  )
))}
```

It certainly seems like that is supposed to read the caption as a markdown
document and then write it as html using the same options as the original
pandoc run.

However, we have disabled the raw_html extension for the main document. It
is not listed in either PANDOC_READER_OPTIONS.extensions or
PANDOC_WRITER_OPTIONS.extensions. But html tags in the caption are not
encoded, even though they are encoded everywhere else.

Am I doing something wrong here? Is there a way to disable the extension?
Is there a simpler way to add bold, italics, strikethrough and inline code
to these captions?

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/1302c732-b7fe-4339-8234-99b761f47296n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 2484 bytes --]
[parent not found: <1302c732-b7fe-4339-8234-99b761f47296n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>]
* Re: pandoc.read and pandoc.write using unspecified extensions. Adding captions to code blocks.
       [not found] ` <1302c732-b7fe-4339-8234-99b761f47296n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2022-04-04  9:55   ` Albert Krewinkel
       [not found]     ` <8735it6u1y.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Albert Krewinkel @ 2022-04-04  9:55 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Bart <bart.hijmans-df9nrXIP1oRmR6Xm/wNWPw@public.gmane.org> writes:

> I'm trying to write a lua filter that adds captions to code blocks
> (markdown -> html). They are specified as attributes.
>
> [...]
>
> After looking around for a solution, I came up with the following that
> seemed very simple:
>
> ```lua
> {pandoc.Plain(pandoc.RawInline(
>   'html',
>   pandoc.write(
>     pandoc.read(
>       el.attributes.caption,
>       'markdown',
>       PANDOC_READER_OPTIONS
>     ),
>     'html',
>     PANDOC_WRITER_OPTIONS
>   )
> ))}
> ```

That's a good solution, in general. Note that you can omit the
PANDOC_READER_OPTIONS and PANDOC_WRITER_OPTIONS, in which case the
default options will be used.

Even simpler would be to replace the full snippet above with

```lua
pandoc.Plain(
  pandoc.utils.blocks_to_inlines(
    pandoc.read(el.attributes.caption, 'markdown').blocks
  )
)
```

This should give you the result you need and has the advantage of
working with any output format.

Does this solve the problem? If not, could you add a little example for
us to look at?

Cheers,
Albert

-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124
[parent not found: <8735it6u1y.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>]
* Re: pandoc.read and pandoc.write using unspecified extensions. Adding captions to code blocks.
       [not found]     ` <8735it6u1y.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
@ 2022-04-04 14:26       ` Bart Hijmans
       [not found]         ` <CANx_DfPG-_R4hVrjxnBegGDwEE=zYxFB3EASZAYjCYo_975Q_A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Bart Hijmans @ 2022-04-04 14:26 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1.1: Type: text/plain, Size: 2823 bytes --]

That may be better, but it doesn't solve the problem.

I have attached an input and output file, as well as my expected output.
The difference is that in the caption, HTML tags are passed through
unescaped. But with the "raw_html" extension disabled, they should be
encoded. You can see that HTML tags in a paragraph are encoded, but in
the caption they are not.

It seems to me like pandoc.read is using the "raw_html" extension. But I'm
telling it to use PANDOC_READER_OPTIONS, which does not include that
extension, as evidenced by the correct encoding of HTML in the paragraph.

-- 
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CANx_DfPG-_R4hVrjxnBegGDwEE%3DzYxFB3EASZAYjCYo_975Q_A%40mail.gmail.com.

[-- Attachment #1.2: Type: text/html, Size: 4185 bytes --]

[-- Attachment #2: expected.txt --]
[-- Type: text/plain, Size: 258 bytes --]

<div class="listing">
<div class="sourceCode"><pre class="sourceCode"><code class="sourceCode">code</code></pre></div>
<p>&lt;strong&gt;strong tags should be encoded&lt;/strong&gt;</p>
</div>
<p>&lt;strong&gt;strong tags should be encoded&lt;/strong&gt;</p>

[-- Attachment #3: input.txt --]
[-- Type: text/plain, Size: 119 bytes --]

```{caption="<strong>strong tags should be encoded</strong>"}
code
```

<strong>strong tags should be encoded</strong>

[-- Attachment #4: output.txt --]
[-- Type: text/plain, Size: 246 bytes --]

<div class="listing">
<div class="sourceCode"><pre class="sourceCode"><code class="sourceCode">code</code></pre></div>
<p><strong>strong tags should be encoded</strong></p>
</div>
<p>&lt;strong&gt;strong tags should be encoded&lt;/strong&gt;</p>
[parent not found: <CANx_DfPG-_R4hVrjxnBegGDwEE=zYxFB3EASZAYjCYo_975Q_A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: pandoc.read and pandoc.write using unspecified extensions. Adding captions to code blocks.
       [not found]         ` <CANx_DfPG-_R4hVrjxnBegGDwEE=zYxFB3EASZAYjCYo_975Q_A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2022-04-04 16:07           ` Albert Krewinkel
       [not found]             ` <87tub86d62.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
  0 siblings, 1 reply; 5+ messages in thread
From: Albert Krewinkel @ 2022-04-04 16:07 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

I'm still not sure if I understand completely. Maybe you'll find it
helpful to know that you can selectively enable and disable extensions
when using `pandoc.read` and `pandoc.write` by modifying the format
string, just as you'd do on the command line:

```lua
function CodeBlock (cb)
  -- Extensions are toggled in the format string: +ext enables, -ext disables.
  local caption = cb.attributes.caption
    and pandoc.read(cb.attributes.caption, 'markdown+raw_html').blocks
    or pandoc.Blocks{}
  -- Put the parsed caption blocks after the code block.
  local listing = {cb} .. caption
  -- Drop the attribute so it does not show up in the output.
  cb.attributes.caption = nil
  return pandoc.Div(listing, {class="listing"})
end
```

-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124
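
The format-string mechanism described above can also be used in the other
direction, which appears to be what the original question needs: reading
the caption with raw_html switched off. The following is a minimal sketch,
not taken from the thread. It assumes that `pandoc.read` takes its
extensions from the format string rather than from the options passed to
it (this matches the behaviour reported earlier in the thread), and it
reuses the `caption` attribute and the `listing` class from the examples
above. `CAPTION_FORMAT` is a name introduced here for illustration.

```lua
-- Sketch only: read the caption as Markdown with raw_html disabled, so that
-- tags such as <strong> are parsed as literal text and escaped by the writer.
-- CAPTION_FORMAT is a hypothetical name, not a pandoc variable.
local CAPTION_FORMAT = 'markdown-raw_html'

function CodeBlock (cb)
  local text = cb.attributes.caption
  if not text then
    return nil  -- no caption: leave the code block unchanged
  end
  -- Extensions are set through the format string; PANDOC_READER_OPTIONS is
  -- deliberately not passed here.
  local doc = pandoc.read(text, CAPTION_FORMAT)
  local caption = pandoc.Para(pandoc.utils.blocks_to_inlines(doc.blocks))
  -- Remove the attribute so it is not repeated in the output.
  cb.attributes.caption = nil
  return pandoc.Div({cb, caption}, pandoc.Attr('', {'listing'}))
end
```

Saved as a filter file and passed to pandoc with `--lua-filter`, this should
wrap each captioned code block in a `listing` div, with Markdown emphasis in
the caption converted and HTML tags left as text.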
[parent not found: <87tub86d62.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>]
* Re: pandoc.read and pandoc.write using unspecified extensions. Adding captions to code blocks.
       [not found]             ` <87tub86d62.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
@ 2022-04-04 18:11               ` Bart Hijmans
  0 siblings, 0 replies; 5+ messages in thread
From: Bart Hijmans @ 2022-04-04 18:11 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 5354 bytes --]

That helps a lot, thank you!

I expected that passing PANDOC_READER_OPTIONS would enable exactly the
extensions in PANDOC_READER_OPTIONS.extensions and no others.
Is there a way to disable all extensions, and then enable just the ones I
want?

-- 
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CANx_DfO1La-86iGeiWk8ZU0_hZDUgCOzD1fj5yN%3DRUcGpMoKLQ%40mail.gmail.com.

[-- Attachment #2: Type: text/html, Size: 8052 bytes --]
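
The closing question is not answered in the thread. One possible approach,
offered here as an unconfirmed sketch, is to start from a base format that
enables few extensions by default and then add or remove extensions by name
in the format string. The example assumes `markdown_strict` as that base
(it enables raw_html by default, so that extension is removed explicitly)
and adds `strikeout` for strikethrough. `format_spec`, `CAPTION_FORMAT` and
`read_caption` are hypothetical helper names, not part of pandoc.

```lua
-- Hypothetical helper (not part of pandoc): builds a format string such as
-- 'markdown_strict+strikeout-raw_html' from a base format and two lists of
-- extension names.
local function format_spec (base, enable, disable)
  local parts = {base}
  for _, ext in ipairs(enable or {}) do
    parts[#parts + 1] = '+' .. ext
  end
  for _, ext in ipairs(disable or {}) do
    parts[#parts + 1] = '-' .. ext
  end
  return table.concat(parts)
end

-- Assumption: markdown_strict is a suitable minimal base for captions.
local CAPTION_FORMAT = format_spec('markdown_strict', {'strikeout'}, {'raw_html'})

-- Parses a caption string and returns its content as a list of inlines.
local function read_caption (text)
  local doc = pandoc.read(text, CAPTION_FORMAT)
  return pandoc.utils.blocks_to_inlines(doc.blocks)
end
```

Keeping the extension lists in one place makes it easy to see which syntax
a caption may use; whether this base format covers every case in a given
document would need to be checked against the pandoc manual.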
end of thread, other threads:[~2022-04-04 18:11 UTC | newest]

Thread overview: 5+ messages
2022-04-04  9:34 pandoc.read and pandoc.write using unspecified extensions. Adding captions to code blocks Bart
     [not found] ` <1302c732-b7fe-4339-8234-99b761f47296n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2022-04-04  9:55   ` Albert Krewinkel
     [not found]     ` <8735it6u1y.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
2022-04-04 14:26       ` Bart Hijmans
     [not found]         ` <CANx_DfPG-_R4hVrjxnBegGDwEE=zYxFB3EASZAYjCYo_975Q_A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2022-04-04 16:07           ` Albert Krewinkel
     [not found]             ` <87tub86d62.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
2022-04-04 18:11               ` Bart Hijmans