public inbox archive for pandoc-discuss@googlegroups.com
* pandoc.read and pandoc.write using unspecified extensions. Adding captions to code blocks.
@ 2022-04-04  9:34 Bart
From: Bart @ 2022-04-04  9:34 UTC (permalink / raw)
  To: pandoc-discuss


Hello,

I'm trying to write a Lua filter that adds captions to code blocks
(markdown -> html). The captions are specified as attributes. At first I
added the following:

`{pandoc.Para(codeBlock.attributes.caption)}`

which worked, but we found that the caption is inserted as a plain string
and is not parsed as markdown the way table captions are. So a caption like
`**bold**` did not appear in bold.

After looking around for a solution, I came up with the following that 
seemed very simple:

```lua
{pandoc.Plain(pandoc.RawInline(
    'html',
    pandoc.write(
        pandoc.read(
            el.attributes.caption,
            'markdown',
            PANDOC_READER_OPTIONS
        ),
        'html',
        PANDOC_WRITER_OPTIONS
     )
))}
```
As far as I can tell, that should read the caption as a markdown document
and then write it as html using the same options as the original pandoc run.
However, we have disabled the raw_html extension for the main document. It
is not listed in either PANDOC_READER_OPTIONS.extensions or
PANDOC_WRITER_OPTIONS.extensions. Yet html tags in the caption are not
encoded, even though they are everywhere else.

Am I doing something wrong here? Is there a way to disable the extension? 
Is there a simpler way to add bold, italics, strikethrough and inline code 
to these captions?

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/1302c732-b7fe-4339-8234-99b761f47296n%40googlegroups.com.


* Re: pandoc.read and pandoc.write using unspecified extensions. Adding captions to code blocks.
@ 2022-04-04  9:55   ` Albert Krewinkel
From: Albert Krewinkel @ 2022-04-04  9:55 UTC (permalink / raw)
  To: pandoc-discuss


Bart <bart.hijmans-df9nrXIP1oRmR6Xm/wNWPw@public.gmane.org> writes:

> I'm trying to write a lua filter that adds captions to code blocks
> (markdown -> html). They are specified as attributes.
>
> [...]
>
> After looking around for a solution, I came up with the following that
> seemed very simple:
>
> ```lua
> {pandoc.Plain(pandoc.RawInline(
>     'html',
>     pandoc.write(
>         pandoc.read(
>             el.attributes.caption,
>             'markdown',
>             PANDOC_READER_OPTIONS
>         ),
>         'html',
>         PANDOC_WRITER_OPTIONS
>      )
> ))}
> ```

That's a good solution, in general. Note that you can omit
PANDOC_READER_OPTIONS and PANDOC_WRITER_OPTIONS, in which case the
default options will be used.

Even simpler would be to replace the full snippet above with

```lua
pandoc.Plain(
  pandoc.utils.blocks_to_inlines(
    pandoc.read(el.attributes.caption, 'markdown').blocks
  )
)
```

This should give you the result you need and has the advantage of
working with any output format.

Does this solve the problem? If not, could you add a little example for
us to look at?

Cheers,
Albert

-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124



* Re: pandoc.read and pandoc.write using unspecified extensions. Adding captions to code blocks.
@ 2022-04-04 14:26       ` Bart Hijmans
From: Bart Hijmans @ 2022-04-04 14:26 UTC (permalink / raw)
  To: pandoc-discuss


That may be better, but it doesn't solve the problem.

I have attached an input and output file, as well as my expected output.
The difference is that in the caption, HTML tags are passed through
unescaped. But with the "raw_html" extension disabled, they should be
encoded. You can see that in a paragraph HTML tags are encoded, but in the
caption they are not.

It seems to me like pandoc.read is using the "raw_html" extension. But I'm
telling it to use PANDOC_READER_OPTIONS, which does not include that
extension, as evidenced by the correct encoding of HTML in the paragraph.

[-- Attachment #2: expected.txt --]
[-- Type: text/plain, Size: 258 bytes --]

<div class="listing">
<div class="sourceCode"><pre class="sourceCode"><code class="sourceCode">code</code></pre></div>
<p>&lt;strong&gt;strong tags should be encoded&lt;/strong&gt;</p>
</div>
<p>&lt;strong&gt;strong tags should be encoded&lt;/strong&gt;</p>

[-- Attachment #3: input.txt --]
[-- Type: text/plain, Size: 119 bytes --]

```{caption="<strong>strong tags should be encoded</strong>"}
code
```

<strong>strong tags should be encoded</strong>

[-- Attachment #4: output.txt --]
[-- Type: text/plain, Size: 246 bytes --]

<div class="listing">
<div class="sourceCode"><pre class="sourceCode"><code class="sourceCode">code</code></pre></div>
<p><strong>strong tags should be encoded</strong></p>
</div>
<p>&lt;strong&gt;strong tags should be encoded&lt;/strong&gt;</p>


* Re: pandoc.read and pandoc.write using unspecified extensions. Adding captions to code blocks.
@ 2022-04-04 16:07           ` Albert Krewinkel
From: Albert Krewinkel @ 2022-04-04 16:07 UTC (permalink / raw)
  To: pandoc-discuss

I'm still not sure if I understand completely. Maybe you'll find it
helpful to know that you can selectively enable and disable extensions
when using `pandoc.read` and `pandoc.write` by modifying the format
string, just as you'd do on the command line:

``` lua
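function CodeBlock (cb)
  -- Parse the caption attribute as Markdown. Extensions are toggled in the
  -- format string: '+raw_html' enables raw HTML, '-raw_html' disables it.
  local caption = cb.attributes.caption
    and pandoc.read(cb.attributes.caption, 'markdown+raw_html').blocks
    or pandoc.Blocks{}
  -- Put the code block and its caption into one list of blocks.
  local listing = {cb} .. caption
  -- Drop the attribute so it is not written to the output.
  cb.attributes.caption = nil
  return pandoc.Div(listing, {class="listing"})
end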
```
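
For the case discussed in this thread, the same mechanism can be used in
the other direction. The following is only a minimal sketch, assuming the
goal is for HTML tags in the caption to be escaped like in the rest of the
document; it has not been checked against the attached files. The `listing`
class is taken from the example above, the rest is an illustrative choice.

```lua
-- Sketch: read the caption with raw_html switched off, so that tags such
-- as <strong> are parsed as literal text instead of raw HTML.
function CodeBlock (cb)
  local caption_md = cb.attributes.caption
  if not caption_md then
    return nil  -- no caption attribute: leave the code block unchanged
  end
  -- '-raw_html' disables the extension for this pandoc.read call only.
  local caption = pandoc.read(caption_md, 'markdown-raw_html').blocks
  -- Remove the attribute so it does not show up in the output.
  cb.attributes.caption = nil
  -- Collect the code block followed by the caption blocks.
  local content = pandoc.List{cb}
  content:extend(caption)
  return pandoc.Div(content, pandoc.Attr('', {'listing'}))
end
```

Saved under a hypothetical name such as `caption.lua`, this could be run
with `pandoc --from markdown-raw_html --to html --lua-filter caption.lua`.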

-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124



* Re: pandoc.read and pandoc.write using unspecified extensions. Adding captions to code blocks.
@ 2022-04-04 18:11               ` Bart Hijmans
From: Bart Hijmans @ 2022-04-04 18:11 UTC (permalink / raw)
  To: pandoc-discuss

That helps a lot, thank you!

I expected that passing PANDOC_READER_OPTIONS would enable exactly the
extensions in PANDOC_READER_OPTIONS.extensions and no others. Is there a
way to disable all extensions, and then enable just the ones I want?
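
The thread ends without an answer to this question. One possible approach,
offered only as a sketch: since the extension set seems to follow the
format string rather than the options argument, start from a variant with
few default extensions and toggle extensions explicitly. `format_spec`
below is a hypothetical helper, not part of pandoc, and both
`markdown_strict` and the extensions chosen are assumptions about what the
captions need.

```lua
-- Hypothetical helper (not a pandoc function): build a format string from
-- a base format and a table that maps extension names to true (enable)
-- or false (disable).
function format_spec (base, extensions)
  local names = {}
  for name in pairs(extensions) do
    names[#names + 1] = name
  end
  table.sort(names)  -- sort so the result does not depend on table order
  local parts = {base}
  for _, name in ipairs(names) do
    parts[#parts + 1] = (extensions[name] and '+' or '-') .. name
  end
  return table.concat(parts)
end

-- Assumed needs of a caption: strikethrough on, raw HTML off.
caption_format = format_spec('markdown_strict', {
  strikeout = true,
  raw_html = false,
})
-- caption_format is now 'markdown_strict-raw_html+strikeout'
```

The resulting string could then be passed as the format argument, as in
`pandoc.read(cb.attributes.caption, caption_format)`.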
