public inbox archive for pandoc-discuss@googlegroups.com
From: "'William Lupton' via pandoc-discuss" <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Subject: Re: Lua filter to change macro for figure caption md -> latex
Date: Mon, 11 Dec 2023 11:30:14 +0000
Message-ID: <CAEe_xxikobOS_G9x71nxtz0dr99VVhgBV8in=xKzXgh7JMaRcw@mail.gmail.com>
In-Reply-To: <32dfe8eb-98ac-40ee-92d7-162528add367n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>

Luke,

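Filter functions take a single argument (the element in question), so you
need the second form: Figure = function (fig). The first form also fails to
parse, because a named function definition is not allowed inside a table
constructor; that is what "'(' expected near 'Figure'" is telling you.

I think the crash is because the caption is in fig.caption (not in
fig.content[1].caption). fig.content[1] is the Plain block that holds the
image, and it has no caption field, hence the "attempt to index a nil value".
See https://pandoc.org/lua-filters.html#type-figure.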
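
For what it's worth, here is a minimal sketch of how the second form could look once the caption is read from fig.caption. It assumes pandoc 3.x and a custom writer file like the test-writer.lua from the quoted message. The helper names sidecaption_figure and to_latex are made up for illustration, and rendering the pieces through a temporary pandoc.Pandoc document is just one possible approach:

```lua
-- Hypothetical helper (not part of pandoc): assemble the LaTeX for one figure
-- from already-rendered LaTeX snippets for the image and the caption.
local function sidecaption_figure (image_tex, caption_tex)
  return '\\begin{figure}\n'
    .. image_tex .. '\n'
    .. '\\sidecaption{' .. caption_tex .. '}\n'
    .. '\\end{figure}'
end

-- Hypothetical helper: render a list of blocks to LaTeX and trim
-- leading and trailing whitespace.
local function to_latex (blocks)
  local tex = pandoc.write(pandoc.Pandoc(blocks), 'latex')
  return (tex:gsub('^%s+', ''):gsub('%s+$', ''))
end

function Writer (doc, opts)
  local filter = {
    -- Filter functions receive exactly one argument: the element itself.
    Figure = function (fig)
      local image_tex = to_latex(fig.content)         -- Plain block holding the Image
      local caption_tex = to_latex(fig.caption.long)  -- the caption lives on the Figure
      return pandoc.RawBlock('latex', sidecaption_figure(image_tex, caption_tex))
    end
  }
  return pandoc.write(doc:walk(filter), 'latex', opts)
end
```

Note that this sketch drops the figure's identifier (so no \label is emitted) and ignores fig.caption.short, so it would need extending before it could replace the short-caption filter from the first message.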

Perhaps I can put in a plug for https://github.com/pandoc-ext/logging. I
think you might find this helpful for gaining insight into the element
structure. See example below.

Hope this helps.

Cheers,
William

--------

With this filter in rep.lua:

local logging = require 'logging'

function Pandoc(pandoc)
    if logging.loglevel > 0 then
        logging.temp('meta', pandoc.meta)
    end
    logging.temp('blocks', pandoc.blocks)
end

...your input gives this:

% pandoc luke.md -L rep.lua
(#) blocks Blocks[3] {
  [1] Header {
    attr: Attr {
      attributes: AttributeList {}
      classes: List {}
      identifier: "headline"
    }
    content: Inlines[1] {
      [1] Str "Headline"
    }
    level: 1
  }
  [2] Para {
    content: Inlines[3] {
      [1] Str "Some"
      [2] Space
      [3] Str "text"
    }
  }
  [3] Figure {
    attr: Attr {
      attributes: AttributeList {}
      classes: List {}
      identifier: ""
    }
    caption: {
      long: Blocks[1] {
        [1] Plain {
          content: Inlines[7] {
            [1] Str "caption"
            [2] Space
            [3] Str "to"
            [4] Space
            [5] Str "an"
            [6] Space
            [7] Str "image"
          }
        }
      }
    }
    content: Blocks[1] {
      [1] Plain {
        content: Inlines[1] {
          [1] Image {
            attr: Attr {
              attributes: AttributeList {}
              classes: List {}
              identifier: ""
            }
            caption: Inlines[7] {
              [1] Str "caption"
              [2] Space
              [3] Str "to"
              [4] Space
              [5] Str "an"
              [6] Space
              [7] Str "image"
            }
            src: "counter_plot_new_periods.png"
            title: ""
          }
        }
      }
    }
  }
}

<h1 id="headline">Headline</h1>
<p>Some text</p>
<figure>
<img src="counter_plot_new_periods.png" alt="caption to an image" />
<figcaption aria-hidden="true">caption to an image</figcaption>
</figure>
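
The dump above shows where each piece lives: the caption blocks are in fig.caption.long, and the Image is at fig.content[1].content[1]. As a rough sketch (the helper name figure_image is made up, and the Figure > Plain > Image shape is only what this particular input produces), a filter could check that shape before indexing into it:

```lua
-- Hypothetical helper: return the Image inside a simple figure, or nil when
-- the figure does not have the Figure > Plain > Image shape seen in the dump.
local function figure_image (fig)
  local block = fig.content and fig.content[1]
  local inline = block and block.content and block.content[1]
  if inline and inline.t == 'Image' then
    return inline
  end
  return nil
end

function Figure (fig)
  local img = figure_image(fig)
  if img then
    -- Plain Lua logging to stderr, so stdout still carries pandoc's output.
    io.stderr:write('figure image: ' .. img.src .. '\n')
  end
  -- Returning nothing leaves the element unchanged.
end
```

Guarding each step this way avoids the "attempt to index a nil value" error when a figure contains something other than a single image.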

On Mon, 11 Dec 2023 at 10:36, 'lukeflo' via pandoc-discuss <
pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org> wrote:

> PS: I know I'm using two very different approaches to calling the Figure
> function with arguments. That's because I'm not sure which way
> is the right one...
>
> lukeflo schrieb am Montag, 11. Dezember 2023 um 11:33:21 UTC+1:
>
>> So far, I took the markdown writer example from the pandoc docs
>> <https://pandoc.org/custom-writers.html#example-modified-markdown-writer>
>> to try out the general function of writers. It works and I think that I
>> understand the general usage.
>>
>> But figures especially (in the latex writer and presumably in general) are
>> relatively complex. Here are two things I tried out so far, but I always got
>> an error:
>>
>> ``` lua
>> function Writer (doc, opts)
>>    local filter = {
>>       function Figure (caption, image, attr)
>>  local figcap = '\sidecaption{' .. caption .. '}'
>>  return '\\begin{figure}\n' .. image .. '\n' .. figcap .. '\n'
>> '\\end{figure}\n'
>>       end
>>    }
>>    return pandoc.write(doc:walk(filter), 'latex', opts)
>> end
>> ```
>> If I run this writer with my custom template from the CLI using *pandoc
>> --template=../custom.template -t test-writer.lua ast-test.txt -o
>> ast-test.tex* I get
>> *Error running Lua:test-writer.lua:27: '(' expected near 'Figure'*.
>>
>> Furthermore, I tried running the following code just to understand how
>> those writers work. Here I just wanted to replace {figure} with the starred
>> version {figure*} (not sidecaption):
>>
>> ``` lua
>> function Writer (doc, opts)
>>   local filter = {
>>      Figure = function (fig)
>> local tester = '\\begin{figure*}\n' ..
>> fig.content[1].caption[1].attributes[1] .. '\\end{figure*}\n'
>> return pandoc.RawBlock('latex', tester)
>>     end
>>   }
>>   return pandoc.write(doc:walk(filter), 'latex', opts)
>> end
>> ```
>> But I also got an error:
>>
>> Error running Lua:
>> test-writer.lua:28: attempt to index a nil value (field 'caption')
>> stack traceback:
>>  [C]: in ?
>>  [C]: in method 'walk'
>>  test-writer.lua:32: in function 'Writer'
>> stack traceback:
>>  test-writer.lua:32: in function 'Writer'
>>
>> I'm aware that I might be missing something very basic and maybe even
>> very simple. But I'm kind of getting lost a little bit inside all
>> functions, modules etc. as well as the general framework of such writers.
>>
>> Thus, any help explaining my errors and maybe suggesting some better code
>> is much appreciated!
>>
>> The test file in both cases is very simple:
>>
>> ``` markdown
>> ---
>> title: A title
>> ---
>>
>> # Headline
>>
>> Some text
>>
>> ![caption to an image](counter_plot_new_periods.png)
>> ```
>>
>> Thanks in advance!
>> lukeflo schrieb am Freitag, 8. Dezember 2023 um 08:35:48 UTC+1:
>>
>>> Hello Julien,
>>>
>>> thanks for the reply. Unfortunately, as mentioned in the stackoverflow
>>> post, your suggested LaTeX code won't work.
>>>
>>> The \caption macro is very complex in the backend and cannot be copied
>>> on the fly via \let, \NewCommandCopy or something similar. Even after
>>> doing so with e.g. \NewCommandCopy{\oldcaption}{\caption} and then
>>> setting \RenewDocumentCommand{\caption}{o m}{\sidecaption[#1]{#2}}
>>> nothing changes and the definition of \caption, checked with \meaning
>>> or something similar, stays the same as before (even
>>> \DeclareDocumentCommand doesn't work).
>>>
>>> In the end, it might be possible to somehow change the \caption macro
>>> itself. But the effort might not be worth the result (and it's more of a
>>> question for TeX.SE).
>>>
>>> Using a custom writer for building LaTeX figures and replacing the
>>> \caption string inside would be a great solution. I read through the writer
>>> manual, but didn't really understand how the AST works and which values
>>> have to be used in such a writer. Furthermore, I'm using a custom LaTeX
>>> template for exporting (based on the default.template.latex) which has to
>>> be integrated with such a writer.
>>>
>>> Therefore, I really would appreciate a Lua framework to understand which
>>> functions have to be edited etc. to accomplish the substitution.
>>>
>>> Best
>>>
>>> Julien Dutant schrieb am Dienstag, 5. Dezember 2023 um 17:09:19 UTC+1:
>>>
>>>> Lua filters only change Pandoc's AST representation of your document,
>>>> i.e. before it is then converted to LaTeX. A Raw block filter will not act
>>>> on Pandoc's LaTeX output, but only on Raw LaTeX blocks that are in the
>>>> markdown itself.
>>>>
>>>> A Pandoc solution would be to write a custom Lua *writer*
>>>> <https://pandoc.org/custom-writers.html>. The writer would use
>>>> pandoc.write to generate Pandoc's own LaTeX output (body only) and modify
>>>> it with regular expressions or Lua patterns. To replace just a command name
>>>> this is fairly easy, though longer than the third solution below.
>>>>
>>>> A LaTeX solution is to redefine \caption as \sidecaption:
>>>> \renewcommand{\caption}{\sidecaption}
>>>>
>>>> You can keep this enclosed in groups ({...}) to ensure that the
>>>> redefinition only applies locally.
>>>>
>>>> A hybrid Pandoc/LaTeX solution is a Lua filter that inserts LaTeX code
>>>> to redefine \caption around figures:
>>>>
>>>> ``` lua
>>>> if FORMAT:match 'latex' then
>>>>   function Figure (elem)
>>>>     return {
>>>>       pandoc.RawBlock('latex', '{\\renewcommand{\\caption}{\\sidecaption}'),
>>>>       elem,
>>>>       pandoc.RawBlock('latex', '}')
>>>>     }
>>>>   end
>>>> end
>>>> ```
>>>>
>>>> This replaces any 'Figure' block element by a list (succession) of
>>>> three blocks: a raw LaTeX block, the figure itself, and another raw
>>>> LaTeX block. The output should look like:
>>>> {\renewcommand{\caption}{\sidecaption}
>>>> ... Pandoc's LaTeX for the figure ...
>>>> }
>>>>
>>>> Reposted from
>>>> https://stackoverflow.com/questions/77504584/pandoc-md-latex-write-lua-filter-to-change-latex-macro-used-for-caption/77607636#77607636
>>>>
>>>> On Monday, November 20, 2023 at 7:06:57 AM UTC+11 lukeflo wrote:
>>>>
>>>>> Hi everybody,
>>>>>
>>>>> I have written a custom latex `.cls' file to establish a typesetting
>>>>> workflow for the scientific journals of my research institute. The
>>>>> texts should be written in Markdown and then be processed with
>>>>> `pandoc' to LaTeX.
>>>>>
>>>>> I already have an elaborate pandoc template to produce the LaTeX
>>>>> preamble etc. So far it's working great.
>>>>>
>>>>> But for the figures I need the caption from the Markdown file to be set
>>>>> with `\sidecaption' instead of `\caption' in LaTeX, as well as with an
>>>>> optional argument (short-caption) for the image attribution in the list
>>>>> of figures.
>>>>>
>>>>> To get the latter working I use the following template from a GitHub
>>>>> discussion in the [pandoc repo]:
>>>>>
>>>>> ┌────
>>>>> │ PANDOC_VERSION:must_be_at_least '3.1'
>>>>> │
>>>>> │ if FORMAT:match 'latex' then
>>>>> │   function Figure(f)
>>>>> │     local short = f.content[1].content[1].attributes['short-caption']
>>>>> │     if short and not f.caption.short then
>>>>> │       f.caption.short = pandoc.Inlines(short)
>>>>> │     end
>>>>> │     return f
>>>>> │   end
>>>>> │ end
>>>>> └────
>>>>>
>>>>> That works without any flaws.
>>>>>
>>>>> But now I need to figure out how to change the LaTeX macro used for the
>>>>> caption. The older [approach of pre pandoc version 3.0 posted] by
>>>>> tarleb is really intuitive and I could have easily adapted it to my
>>>>> needs. But since pandoc 3.0 there is the new [/complex figures/]
>>>>> approach and, so far, I couldn't figure out how to change the LaTeX
>>>>> macro used for the captions with this new behaviour.
>>>>>
>>>>> I tried something like this (adapted from [here]):
>>>>>
>>>>> ┌────
>>>>> │ if FORMAT:match 'latex' then
>>>>> │   function RawBlock (raw)
>>>>> │     local caption = raw.text:match('\\caption')
>>>>> │     if caption then
>>>>> │        raw:gsub('\\caption', '\\sidecaption')
>>>>> │     end
>>>>> │     return raw
>>>>> │   end
>>>>> │ end
>>>>> └────
>>>>>
>>>>> But nothing happened.
>>>>>
>>>>> The main challenge for me is my more-or-less non-existent Lua skills.
>>>>> I just never had to use it for my daily tasks. I thought about using
>>>>> `awk' or `sed' to edit the `.tex' file itself using a
>>>>> regex-substitution, but that should remain an absolute stopgap, since
>>>>> it makes the whole workflow less portable.
>>>>>
>>>>> Thus, I'm hoping for a hint/a solution in the form of a pandoc-lua
>>>>> script which 1. helps me to achieve the goal, and 2. improves my
>>>>> understanding of Lua and the /complex figures/ approach for similar
>>>>> future tasks.
>>>>>
>>>>> I appreciate any tip!
>>>>>
>>>>> Best,
>>>>> Lukeflo
>>>>>
>>>>> This question is also posted on StackOverFlow:
>>>>> https://stackoverflow.com/q/77504584/19647155
>>>>>
>>>>> [pandoc repo]
>>>>> <https://github.com/jgm/pandoc/issues/7915#issuecomment-1427113349>
>>>>>
>>>>> [approach of pre pandoc version 3.0 posted]
>>>>> <https://github.com/jgm/pandoc/issues/7915#issuecomment-1039370851>
>>>>>
>>>>> [/complex figures/] <https://github.com/jgm/pandoc/releases?page=2>
>>>>>
>>>>> [here] <https://stackoverflow.com/a/71296595/19647155>
>>>>>


