public inbox archive for pandoc-discuss@googlegroups.com
* Bold figure caption prefix in docx output with filter
@ 2023-06-13 13:16 Stephan Boltzmann
From: Stephan Boltzmann @ 2023-06-13 13:16 UTC (permalink / raw)
  To: pandoc-discuss



Hello everybody out there using Pandoc,

The following Lua filter (used with RMarkdown in RStudio) should put
"*Figure n.*" in bold at the beginning of every figure caption, but it
doesn't change my output:

function Image (img)
  if FORMAT:match 'docx' then
    caption = pandoc.utils.stringify(img.caption)
    if (string.find(caption, 'Fig') ~= nil) then
      img.caption[1] = pandoc.Strong(img.caption[1])
      img.caption[3] = pandoc.Str(string.gsub(img.caption[3].text, ":", "."))
      img.caption[3] = pandoc.Strong(img.caption[3])
      fig_num_string = string.sub(pandoc.utils.stringify(img.caption[3]),1,2)
      fig_num = math.floor(tonumber(fig_num_string))
      if (fig_num > 6) then
        img.caption[3] = pandoc.Strong("S" .. tostring(8-fig_num) .. '.')
      end
      img.caption.long = pandoc.Strong('A')
      img.caption = pandoc.Strong('A')
    end
  end
  print(pandoc.utils.stringify(img.caption.long))
  return img
end

By adding print statements, I can partially verify that the filter
operates on the correct elements, but the output does not change.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/534b2214-42e6-4be9-8b0e-537509f5be3an%40googlegroups.com.


* Re: Bold figure caption prefix in docx output with filter
@ 2023-06-13 13:56   ` 'William Lupton' via pandoc-discuss
From: 'William Lupton' via pandoc-discuss @ 2023-06-13 13:56 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


I think that the main thing here is that you need to operate on the Figure
rather than the Image. Also note that image and figure captions are
different:

   - An image caption is an Inlines list;
   https://pandoc.org/lua-filters.html#type-image
   - A figure caption is a Caption object, which has a long (Blocks list)
   caption with an optional short (Inlines list) caption;
   https://pandoc.org/lua-filters.html#type-figure

I'm not quite sure when or if you should use the short figure caption, but
am pretty sure that you do need to set the long figure caption.
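
To make the difference concrete, here is a minimal sketch of a helper that
returns the inline list behind either kind of caption, plus a Figure filter
that uses it. The name caption_inlines is made up for illustration, and the
sketch assumes that the first block of the long caption is a Plain or Para
and that the list can be modified in place:

```lua
-- Hypothetical helper: return the Inlines that hold the caption text.
-- For an Image, `caption` is already a list of inlines.
-- For a Figure, `caption.long` is a list of blocks; the text sits in the
-- `content` of the first block (assumed here to be a Plain or Para).
local function caption_inlines(elem)
  local caption = elem.caption
  if caption.long ~= nil then
    local first = caption.long[1]
    return first and first.content or nil
  end
  return caption
end

-- Bold the first inline (usually the first word) of every figure caption.
function Figure(fig)
  local inlines = caption_inlines(fig)
  if inlines and inlines[1] then
    inlines[1] = pandoc.Strong(inlines[1])
  end
  return fig
end
```

The helper only reads fields, so it can be tried on plain Lua tables before
running it inside pandoc.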

Finally, a plug for the https://github.com/pandoc-ext/logging module, which
can help to shed light on the AST structure. With this document (I guessed
your input format):

![Figure 1: Cat](Cat.png)

...and with this Lua filter (derived from yours):

local logging = require 'logging'

function Figure(fig)
    logging.temp('figure', fig)
end

function Image(img)
    logging.temp('image', img)
    local caption = pandoc.utils.stringify(img.caption)
    if (string.find(caption, 'Fig') ~= nil) then
        img.caption[1] = pandoc.Strong(img.caption[1])
        img.caption[3] = pandoc.Str(string.gsub(img.caption[3].text, ":", "."))
        img.caption[3] = pandoc.Strong(img.caption[3])
        local fig_num_string = string.sub(
            pandoc.utils.stringify(img.caption[3]),1,2)
        local fig_num = math.floor(tonumber(fig_num_string))
        if (fig_num > 6) then
            img.caption[3] = pandoc.Strong("S" .. tostring(8-fig_num) .. '.')
        end
        img.caption.long = pandoc.Strong('A')
        img.caption = pandoc.Strong('A')
    end
    logging.temp('->', img)
    return img
end

...you get this output:

% pandoc figure.md -L figure.lua
(#) image Image {
  attr: Attr {
    attributes: AttributeList {}
    classes: List {}
    identifier: ""
  }
  caption: Inlines[5] {
    [1] Str "Figure"
    [2] Space
    [3] Str "1:"
    [4] Space
    [5] Str "Cat"
  }
  src: "Cat.png"
  title: ""
}
(#) -> Image {
  attr: Attr {
    attributes: AttributeList {}
    classes: List {}
    identifier: ""
  }
  caption: Inlines[1] {
    [1] Strong {
      content: Inlines[1] {
        [1] Str "A"
      }
    }
  }
  src: "Cat.png"
  title: ""
}
(#) figure Figure {
  attr: Attr {
    attributes: AttributeList {}
    classes: List {}
    identifier: ""
  }
  caption: {
    long: Blocks[1] {
      [1] Plain {
        content: Inlines[5] {
          [1] Str "Figure"
          [2] Space
          [3] Str "1:"
          [4] Space
          [5] Str "Cat"
        }
      }
    }
  }
  content: Blocks[1] {
    [1] Plain {
      content: Inlines[1] {
        [1] Image {
          attr: Attr {
            attributes: AttributeList {}
            classes: List {}
            identifier: ""
          }
          caption: Inlines[1] {
            [1] Strong {
              content: Inlines[1] {
                [1] Str "A"
              }
            }
          }
          src: "Cat.png"
          title: ""
        }
      }
    }
  }
}
<figure>
<img src="Cat.png" alt="A" />
<figcaption>Figure 1: Cat</figcaption>
</figure>
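
Note what the HTML shows: the Image filter changed only the alt text, while
the figcaption still reads "Figure 1: Cat", because that text comes from the
Figure's own caption.

If the logging module is not at hand, a rough equivalent is to print pandoc's
native representation of an element. This is only a sketch; dump_native is a
made-up name, and the output is less readable than what logging.temp prints:

```lua
-- Print the native representation of a single block to stderr.
local function dump_native(label, block)
  -- Wrap the block in a document so that it can be passed to a writer.
  local doc = pandoc.Pandoc({ block })
  io.stderr:write(label, ': ', pandoc.write(doc, 'native'), '\n')
end

function Figure(fig)
  dump_native('figure', fig)
  -- Returning nothing leaves the element unchanged.
end
```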


* Re: Bold figure caption prefix in docx output with filter
@ 2023-06-14 12:35       ` Stephan Boltzmann
From: Stephan Boltzmann @ 2023-06-14 12:35 UTC (permalink / raw)
  To: pandoc-discuss



Thanks a lot for not only suggesting an answer, but also explaining how to 
debug Lua filters as well as providing the logging script, which helped me 
a lot.

My final solution looks like this, and I hope it helps anyone who arrives
at this question via an internet search:

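function Figure (fig)  -- works on Windows
  if FORMAT:match 'docx' then
    local caption = pandoc.utils.stringify(fig.caption.long)
    if (string.find(caption, 'Fig') ~= nil) then
      -- inlines of the first caption block: Str "Figure", Space, Str "1:", ...
      local cap = fig.caption.long[1].content
      -- bold the word "Figure"
      fig.caption.long[1].content[1] = pandoc.Strong(cap[1])
      -- replace the colon after the number with a period and bold it
      local num_suffix = string.gsub(cap[3].text,':','.')
      fig.caption.long[1].content[3] = pandoc.Strong(num_suffix)
      -- only the first character is read, so single-digit numbers are assumed
      local fig_num_string = string.sub(pandoc.utils.stringify(cap[3]), 1, 1)
      local fig_num = math.floor(tonumber(fig_num_string))
      if (fig_num > 6) then
        -- relabel figures above 6, e.g. figure 7 becomes "S1."
        local num_str = "S" .. tostring(8-fig_num) .. '.'
        fig.caption.long[1].content[3] = pandoc.Strong(num_str)
      end
    end
  end
  return fig
end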
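
For captions with multi-digit figure numbers, a variant along the following
lines might be easier to adapt. It is a sketch under assumptions, not a
drop-in replacement: figure_label and its main_count parameter are made up,
and it assumes that figures after the first main_count ones should be
labelled S1, S2, and so on (the filter above uses 8-fig_num, which yields
S1 for figure 7):

```lua
-- Hypothetical helper: turn a number token such as "3:" into a label.
-- `main_count` is an assumed parameter: figures up to this number keep
-- their number, later ones are relabelled S1, S2, ...
local function figure_label(token, main_count)
  local digits = string.match(token, '^(%d+)')
  if not digits then
    return nil  -- not a number token; leave the caption alone
  end
  local n = tonumber(digits)
  if n > main_count then
    return 'S' .. tostring(n - main_count) .. '.'
  end
  return tostring(n) .. '.'
end

function Figure(fig)
  if not FORMAT:match 'docx' then return fig end
  local first = fig.caption.long[1]
  local cap = first and first.content
  -- Expect: Str "Figure", Space, Str "<n>:", ...
  if not cap or #cap < 3 or cap[3].t ~= 'Str' then return fig end
  local label = figure_label(cap[3].text, 6)
  if label and cap[1].t == 'Str' and cap[1].text:match('^Fig') then
    cap[1] = pandoc.Strong(cap[1])
    cap[3] = pandoc.Strong(label)
  end
  return fig
end
```

Keeping the label logic in a plain function means it can be checked with an
ordinary Lua interpreter, without running pandoc.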

On when to use short captions: if you generate a list of figures, analogous
to a table of contents, short captions are very helpful there, whereas the
longer captions may be needed to explain in more detail what a figure shows.
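
As an illustration, a filter could derive a short caption from the long one
when none is given. This is a sketch only: truncate_words and the limit of
eight words are made up, and I have not checked which writers make use of the
short caption, so treat that part as an assumption:

```lua
-- Hypothetical helper: keep at most `max_words` words of `text`.
local function truncate_words(text, max_words)
  local words = {}
  for word in string.gmatch(text, '%S+') do
    words[#words + 1] = word
  end
  if #words <= max_words then
    return text
  end
  return table.concat(words, ' ', 1, max_words) .. ' ...'
end

function Figure(fig)
  -- Leave figures alone that already have a short caption.
  if fig.caption.short then return fig end
  local long_text = pandoc.utils.stringify(fig.caption.long)
  fig.caption = {
    long = fig.caption.long,
    short = pandoc.Inlines(truncate_words(long_text, 8)),
  }
  return fig
end
```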

