public inbox archive for pandoc-discuss@googlegroups.com
* Skipping commands in LaTeX document
@ 2021-12-04  0:34 'Greg Shuflin' via pandoc-discuss
  0 siblings, 0 replies; 12+ messages in thread
From: 'Greg Shuflin' via pandoc-discuss @ 2021-12-04  0:34 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 1429 bytes --]

I have a minimal test latex file `test.tex`:

\documentclass{article}

\usepackage{fontspec}

\newfontfamily\IPAFont{Doulos SIL}
\DeclareTextFontCommand{\IPA}{\IPAFont}

\begin{document}

\section{Test}
Hello \IPA{some IPA}

\end{document}

This builds fine with xelatex and produces the PDF I expect. When I try to convert this to an HTML document with `pandoc --pdf-engine=xelatex --verbose test.tex -o test.html`, I see these messages:

[INFO] Could not load include file fontspec.sty at test.tex line 3 column 22
[INFO] Skipped '\newfontfamily' at test.tex line 5 column 15
[INFO] Skipped '\IPAFont{Doulos SIL}' at test.tex line 5 column 35
[INFO] Skipped '\DeclareTextFontCommand{\IPA}{\IPAFont}' at test.tex line 6 column 40
[INFO] Skipped '\IPA{some IPA}' at test.tex line 11 column 21

And the text within the custom \IPA command is skipped. How can I make pandoc not skip these?


-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/KC6nxdEBAKvH0ltVrHWAJQUnIQa7_hpBYQeGXj5MKUR1g43Wtnbnbf_P33yQ_A5CS5MvFt4HT2-f0Y4hJgNVCN_SZUt2EKf0wKdvvWlfKJY%3D%40protonmail.com.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Skipping commands in LaTeX document
       [not found]                                 ` <4f3956c3-e028-473c-b622-dae2f0b72dedn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2021-12-09 16:18                                   ` John MacFarlane
  0 siblings, 0 replies; 12+ messages in thread
From: John MacFarlane @ 2021-12-09 16:18 UTC (permalink / raw)
  To: Greg S, pandoc-discuss

Greg S <elorian.mestec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Redefining \IPA and \makecell in latex fixes the problem as far as pandoc 
> is concerned. But I am using these commands to generate the latex output 
> that I want, so I don't want the redefinitions permanently within the latex 
> source file. Is there a way to configure pandoc to only insert these when 
> pandoc is processing the .tex file, so I have correct tables and no 
> strikethrough when xelatex is processing the file?

You could automate this in a variety of ways. For example,
you could add \include{extras} in your latex file, and
then have extras_for_pandoc.tex containing the special definitions,
and extras_for_latex.tex not containing them (maybe empty).
Before running latex on the file, symlink extras.tex to
extras_for_latex.tex.  Before running pandoc, symlink it
to extras_for_pandoc.tex.  Put this all in a Makefile so
you don't have to think about it.

Or you could use a conditional in your latex file that is
sensitive to something you can set on the command line.
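The symlink approach described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names `select_extras` and `build_command` are hypothetical, the file names follow the ones suggested above, and `test.tex` / `test.html` are taken from the original question.

```python
from pathlib import Path


def select_extras(directory, variant):
    """Point extras.tex at extras_for_<variant>.tex using a symlink.

    `variant` is expected to be "pandoc" or "latex" (an assumption that
    mirrors the two file names suggested in this thread).
    """
    directory = Path(directory)
    target = directory / f"extras_for_{variant}.tex"
    if not target.exists():
        raise FileNotFoundError(target)
    link = directory / "extras.tex"
    # Remove a previous link (even a dangling one) before re-creating it.
    if link.is_symlink() or link.exists():
        link.unlink()
    # A relative link keeps working if the whole directory is moved.
    link.symlink_to(target.name)
    return link


def build_command(tool, source="test.tex", output="test.html"):
    """Return the command line to run after the symlink has been switched."""
    if tool == "pandoc":
        return ["pandoc", source, "-o", output]
    if tool == "latex":
        return ["xelatex", source]
    raise ValueError(f"unknown tool: {tool}")
```

A build step would then call `select_extras(".", "pandoc")` and pass `build_command("pandoc")` to `subprocess.run`, and do the same with `"latex"` for the PDF build. A Makefile with two targets would do the same job.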


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Skipping commands in LaTeX document
       [not found]                             ` <CADAJKhCC9xm6HX0aF5SzJr9vG3xZR1eiQxxCpA6QNRi1BRE-7g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2021-12-09  3:03                               ` Greg S
       [not found]                                 ` <4f3956c3-e028-473c-b622-dae2f0b72dedn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Greg S @ 2021-12-09  3:03 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 13137 bytes --]

> What do you get if you run pandoc with -f latex+raw_tex -t native without
> any filter? My guess is that it is one of these:
>
> 1.  The whole tabular ends up inside a huge RawBlock.
>
> 2.  The \makecell command ends up inside a RawInline or RawBlock and
> doesn't get rendered in the output.
>
> 3. #2 + the regex doesn't see the \IPA command because it is inside the
> \makecell command.

It looks like it's #3: everything within the \makecell command is 
completely missing from the document.

Redefining \IPA and \makecell in latex fixes the problem as far as pandoc 
is concerned. But I am using these commands to generate the latex output 
that I want, so I don't want the redefinitions permanently within the latex 
source file. Is there a way to configure pandoc to only insert these when 
pandoc is processing the .tex file, so I have correct tables and no 
strikethrough when xelatex is processing the file?

On Monday, December 6, 2021 at 2:34:27 AM UTC-8 BPJ wrote:

> What do you get if you run pandoc with -f latex+raw_tex -t native without 
> any filter? My guess is that it is one of these:
>
> 1.  The whole tabular ends up inside a huge RawBlock.
>
> 2.  The \makecell command ends up inside a RawInline or RawBlock and 
> doesn't get rendered in the output.
>
> 3. #2 + the regex doesn't see the \IPA command because it is inside the 
> \makecell command.
>
> Also you may need a non-greedy regex: "\\IPA{(.*?)}" — and you may need 
> the regex module for that to work.
>
> Please try putting these definitions at the top of your document body[^0]:
>
> ``````latex
> \usepackage[normalem]{ulem}
>
> \renewcommand{\makecell}[1]{#1}
>
> \renewcommand{\IPA}[1]{\sout{#1}}
> ``````
>
> Then save the Lua code below to a file sout2ipa.lua in the current 
> directory and run pandoc with -f latex -t html -L sout2ipa.lua
>
> ``````lua
> function Strikeout (elem)
>   return pandoc.Span(elem.content, { class = 'IPA' })
> end
> ``````
>
> Now you should get all your IPA nicely inside spans with class "IPA".
>
> There is a gotcha: this trick requires that you don't have any actual 
> strikeout text in your document.
>
> @jgm there really should be an extension which makes the LaTeX reader 
> recognise a pseudocommand `\PandocSpan{attrA=value, attrB={long 
> value}}{content}` so that one could do redefinitions like those below and 
> get native spans in the Pandoc AST.
>
> ``````latex
> \renewcommand{\IPA}[1]{\PandocSpan{class=IPA}{#1}}
>
>
> \renewcommand{\TakesTwo}[2]{\PandocSpan{class=foo}{\PandocSpan{data-foo=1}{#1}\PandocSpan{data-foo=2}{#2}}}
>
> \renewcommand{\TakesKeyVals}[2][]{\PandocSpan{#1, class=bar}{#2}}
> ``````
>
> where the reader will convert any keyval-style content in the first argument 
> to span attributes, with later ones overriding.
>
> (And possibly an analogous PandocDiv command, working somewhat like the 
> `\NewEnviron` command of the LaTeX environ package[^1] with a 
> pseudo-command `\BODY` (or `\DIV` so as to not clash with environ!) which 
> gets replaced with the content of the div.)
>
> Even if such a structure isn't usable on its own it would be much easier 
> to modify it with filters.
>
> /bpj
>
> [^0]: I'm not sure that \renewcommand works but since I am on my phone ATM 
> I can't check. If not comment out original \newcommand and/or \usepackage 
> commands and define substitute commands with \newcommand as appropriate.
>
> [^1]: https://ctan.org/pkg/environ
>
>
> On Mon, 6 Dec 2021 at 05:26, Greg S <elorian...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:
>
>> Okay I've written a filter:
>>
>> ```
>> #!/usr/bin/python
>> import logging
>> import re
>> from pandocfilters import toJSONFilter, Emph, Para, RawInline
>>
>> # Raw string avoids the invalid "\I" escape; matches a literal \IPA{...}.
>> ipa_regex = re.compile(r"\\IPA\{(.*)\}")
>>
>> def handle(key, value, format, meta):
>>     logging.warning(f"KEY {key} VALUE {value} format {format} META {meta}")
>>     if key == "RawInline":
>>         if m := ipa_regex.match(value[1]):
>>             return RawInline('html', f"{m.group(1)}")
>>
>> if __name__ == "__main__":
>>     toJSONFilter(handle) 
>> ```
>>
>> and with the `-f latex+raw_tex` option passed to pandoc it looks like 
>> this is correctly capturing the text in the IPA macro.
>>
>> However, I noticed that the filter completely skips over text in the \IPA 
>> macro if that macro occurs within a latex table defined with 
>> \begin{tabular}. I'm using the 
>> makecell latex package and wrapping the cells with the \makecell command 
>> (i.e. `\makecell { \IPA{ some text } }`, but I tried removing the \makecell 
>> and the IPA macro still gets skipped in this context.
>>
>>
>> On Sunday, December 5, 2021 at 12:12:44 PM UTC-8 John MacFarlane wrote:
>>
>>>
>>> I should have mentioned before that you'll need to enable 
>>> the `raw_tex` extension as shown above, to allow inclusion 
>>> of RawBlock or RawInline. 
>>>
>>> % pandoc -t native -f latex+raw_tex 
>>> \IPA{hi} there 
>>> ^D 
>>> [ Para 
>>> [ RawInline (Format "latex") "\\IPA{hi}" 
>>> , Space 
>>> , Str "there" 
>>> ] 
>>> ] 
>>>
>>>
>>> Greg S <elorian...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes: 
>>>
>>> > How can I write a filter that matches RawInline elements if the filter 
>>> > applies after the unknown latex macros have been applied in the 
>>> parsing 
>>> > stage? I'm not seeing the text within the \IPA macro at all in the 
>>> logging 
>>> > from the test filter I wrote - is there something I need to do to make 
>>> that 
>>> > filter apply earlier? 
>>> > 
>>> > On Sunday, December 5, 2021 at 10:56:51 AM UTC-8 John MacFarlane 
>>> wrote: 
>>> > 
>>> >> 
>>> >> You can't insert the macro with a filter, because the filter 
>>> >> is applied after parsing, and the macro would be resolved in 
>>> >> the parsing phase. 
>>> >> 
>>> >> However, you could have a filter that matches RawInline 
>>> >> elements that are "\IPA" commands, extracts their textual 
>>> >> content, and returns a Str element. 
>>> >> 
>>> >> Greg S <elorian...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes: 
>>> >> 
>>> >> > Is there a way I can tell pandoc to insert a new Latex macro before 
>>> >> > processing that doesn't exist in the document? Using 
>>> >> > \renewcommand{\IPA}[1]{#1} makes the text appear in the output of 
>>> the 
>>> >> latex 
>>> >> > -> html conversion, but it breaks the formatting I care about in 
>>> the pdf 
>>> >> > version so I don't want to have that line permanently in the latex 
>>> source 
>>> >> > file. 
>>> >> > 
>>> >> > I think I'd ultimately like to use a filter to intercept the raw 
>>> latex 
>>> >> from 
>>> >> > \IPA{...} and do something specific with it in HTML (probably put 
>>> it 
>>> >> within 
>>> >> > a <span class="IPA"> tag). I also have some other latex macros from 
>>> >> > specific packages that pandoc doesn't seem to understand, that I'd 
>>> like 
>>> >> to 
>>> >> > handle in a custom way. I tried creating a simple logging Python 
>>> filter 
>>> >> > just to understand how they work. 
>>> >> > 
>>> >> > ``` 
>>> >> > #!/usr/bin/python 
>>> >> > import logging 
>>> >> > from pandocfilters import toJSONFilter, Emph, Para 
>>> >> > 
>>> >> > def handle(key, value, format, meta):
>>> >> >     logging.warning(f"KEY {key} VALUE {value} format {format} META {meta}")
>>> >> >
>>> >> > if __name__ == "__main__":
>>> >> >     toJSONFilter(handle)
>>> >> > ``` 
>>> >> > And then running `pandoc --pdf-engine=xelatex --verbose test.tex -o 
>>> >> > test.html --filter filter.py`. 
>>> >> > 
>>> >> > But it seems like latex macros that pandoc doesn't understand are 
>>> getting 
>>> >> > skipped before the filter is applied, so the `handle` function 
>>> never gets 
>>> >> > called with the text contents of my \IPA macro. 
>>> >> > 
>>> >> > On Saturday, December 4, 2021 at 9:37:16 AM UTC-8 John MacFarlane 
>>> wrote: 
>>> >> > 
>>> >> >> 
>>> >> >> Pandoc doesn't understand everything, especially outside of 
>>> >> >> core LaTeX. In particular, it doesn't understand 
>>> >> >> 
>>> >> >> \DeclareTextFontCommand 
>>> >> >> 
>>> >> >> from fontspec, so the \IPA macro isn't understood. 
>>> >> >> 
>>> >> >> You can work around this by adding your own macro 
>>> >> >> definition before you convert with pandoc: 
>>> >> >> 
>>> >> >> \renewcommand{\IPA}[1]{#1} 
>>> >> >> 
>>> >> >> and then the contents of \IPA will just be passed 
>>> >> >> through. 
>>> >> >> 
>>> >> >> I suppose you could alternatively redefine 
>>> >> >> 
>>> >> >> \renewcommand{\DeclareTextFontCommand}[2]{\newcommand{#1}[1]{##1}} 
>>> >> >> 
>>> >> >> before your fontspec stuff (untested and may not work). 
>>> >> >> 
>>> >> >> Another option is to use a filter and intercept the raw 
>>> >> >> LaTeX inline produced from \IPA{some text}, changing it 
>>> >> >> into textual content, but I think the first approach above 
>>> >> >> is the simplest. 
>>> >> >> 
>>> >> >> 
>>> >> >> 
>>> >> >> Greg S <elorian...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes: 
>>> >> >> 
>>> >> >> > I have a minimal test latex file `test.tex`: 
>>> >> >> > 
>>> >> >> > 
>>> >> >> > \documentclass{article} 
>>> >> >> > 
>>> >> >> > \usepackage{fontspec} 
>>> >> >> > 
>>> >> >> > \newfontfamily\IPAFont{Doulos SIL} 
>>> >> >> > \DeclareTextFontCommand{\IPA}{\IPAFont} 
>>> >> >> > 
>>> >> >> > \begin{document} 
>>> >> >> > 
>>> >> >> > \section{Test} 
>>> >> >> > Hello \IPA{some IPA} 
>>> >> >> > 
>>> >> >> > \end{document} 
>>> >> >> > 
>>> >> >> > 
>>> >> >> > This builds fine with xelatex and produces a pdf I expect. When 
>>> i try 
>>> >> to 
>>> >> >> > convert this to an html document with `pandoc 
>>> --pdf-engine=xelatex 
>>> >> >> > --verbose test.tex -o test.html`, I see the warnings: 
>>> >> >> > 
>>> >> >> > [INFO] Could not load include file fontspec.sty at test.tex line 
>>> 3 
>>> >> >> column 22 
>>> >> >> > [INFO] Skipped '\newfontfamily' at test.tex line 5 column 15 
>>> >> >> > [INFO] Skipped '\IPAFont{Doulos SIL}' at test.tex line 5 column 
>>> 35 
>>> >> >> > [INFO] Skipped '\DeclareTextFontCommand{\IPA}{\IPAFont}' at 
>>> test.tex 
>>> >> >> line 6 
>>> >> >> > column 40 
>>> >> >> > [INFO] Skipped '\IPA{some IPA}' at test.tex line 11 column 21 
>>> >> >> > 
>>> >> >> > And the text within the custom \IPA command is skipped. How can 
>>> I make 
>>> >> >> > pandoc not skip these? 
>>> >> >> > 
>>> >> >> > 
>>> >> >> 
>>> >> > 
>>> >> 
>>> > 
>>>
>>>
>>
>



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Skipping commands in LaTeX document
       [not found]                         ` <c648fb98-d892-4f1e-b3aa-0da071d8de4bn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2021-12-06 10:34                           ` BPJ
@ 2021-12-06 17:57                           ` John MacFarlane
  1 sibling, 0 replies; 12+ messages in thread
From: John MacFarlane @ 2021-12-06 17:57 UTC (permalink / raw)
  To: Greg S, pandoc-discuss


Is that because the whole tabular is being parsed as a raw TeX
chunk?  Or is it treated as a table, but the contents of cells are
parsed differently than outside the table?

Greg S <elorian.mestec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Okay I've written a filter:
>
> ```
> #!/usr/bin/python
> import logging
> import re
> from pandocfilters import toJSONFilter, Emph, Para, RawInline
>
> # Raw string avoids the invalid "\I" escape; matches a literal \IPA{...}.
> ipa_regex = re.compile(r"\\IPA\{(.*)\}")
>
> def handle(key, value, format, meta):
>     logging.warning(f"KEY {key} VALUE {value} format {format} META {meta}")
>     if key == "RawInline":
>         if m := ipa_regex.match(value[1]):
>             return RawInline('html', f"{m.group(1)}")
>
> if __name__ == "__main__":
>     toJSONFilter(handle) 
> ```
>
> and with the `-f latex+raw_tex` option passed to pandoc it looks like this 
> is correctly capturing the text in the IPA macro.
>
> However, I noticed that the filter completely skips over text in the \IPA 
> macro if that macro occurs within a latex table defined with 
> \begin{tabular}. I'm using the 
> makecell latex package and wrapping the cells with the \makecell command 
> (i.e. `\makecell { \IPA{ some text } }`, but I tried removing the \makecell 
> and the IPA macro still gets skipped in this context.
>
>
> On Sunday, December 5, 2021 at 12:12:44 PM UTC-8 John MacFarlane wrote:
>
>>
>> I should have mentioned before that you'll need to enable
>> the `raw_tex` extension as shown above, to allow inclusion
>> of RawBlock or RawInline.
>>
>> % pandoc -t native -f latex+raw_tex 
>> \IPA{hi} there
>> ^D
>> [ Para
>> [ RawInline (Format "latex") "\\IPA{hi}"
>> , Space
>> , Str "there"
>> ]
>> ]
>>
>>
>> Greg S <elorian...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>
>> > How can I write a filter that matches RawInline elements if the filter 
>> > applies after the unknown latex macros have been applied in the parsing 
>> > stage? I'm not seeing the text within the \IPA macro at all in the 
>> logging 
>> > from the test filter I wrote - is there something I need to do to make 
>> that 
>> > filter apply earlier?
>> >
>> > On Sunday, December 5, 2021 at 10:56:51 AM UTC-8 John MacFarlane wrote:
>> >
>> >>
>> >> You can't insert the macro with a filter, because the filter
>> >> is applied after parsing, and the macro would be resolved in
>> >> the parsing phase.
>> >>
>> >> However, you could have a filter that matches RawInline
>> >> elements that are "\IPA" commands, extracts their textual
>> >> content, and returns a Str element.
>> >>
>> >> Greg S <elorian...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>> >>
>> >> > Is there a way I can tell pandoc to insert a new Latex macro before
>> >> > processing that doesn't exist in the document? Using
>> >> > \renewcommand{\IPA}[1]{#1} makes the text appear in the output of the 
>> >> latex
>> >> > -> html conversion, but it breaks the formatting I care about in the 
>> pdf
>> >> > version so I don't want to have that line permanently in the latex 
>> source
>> >> > file.
>> >> >
>> >> > I think I'd ultimately like to use a filter to intercept the raw 
>> latex 
>> >> from
>> >> > \IPA{...} and do something specific with it in HTML (probably put it 
>> >> within
>> >> > a <span class="IPA"> tag). I also have some other latex macros from
>> >> > specific packages that pandoc doesn't seem to understand, that I'd 
>> like 
>> >> to
>> >> > handle in a custom way. I tried creating a simple logging Python 
>> filter
>> >> > just to understand how they work.
>> >> >
>> >> > ```
>> >> > #!/usr/bin/python
>> >> > import logging
>> >> > from pandocfilters import toJSONFilter, Emph, Para
>> >> >
>> >> > def handle(key, value, format, meta):
>> >> >     logging.warning(f"KEY {key} VALUE {value} format {format} META {meta}")
>> >> >
>> >> > if __name__ == "__main__":
>> >> >     toJSONFilter(handle)
>> >> > ```
>> >> > And then running `pandoc --pdf-engine=xelatex --verbose test.tex -o
>> >> > test.html --filter filter.py`.
>> >> >
>> >> > But it seems like latex macros that pandoc doesn't understand are 
>> getting
>> >> > skipped before the filter is applied, so the `handle` function never 
>> gets
>> >> > called with the text contents of my \IPA macro.
>> >> >
>> >> > On Saturday, December 4, 2021 at 9:37:16 AM UTC-8 John MacFarlane 
>> wrote:
>> >> >
>> >> >>
>> >> >> Pandoc doesn't understand everything, especially outside of
>> >> >> core LaTeX. In particular, it doesn't understand
>> >> >>
>> >> >> \DeclareTextFontCommand
>> >> >>
>> >> >> from fontspec, so the \IPA macro isn't understood.
>> >> >>
>> >> >> You can work around this by adding your own macro
>> >> >> definition before you convert with pandoc:
>> >> >>
>> >> >> \renewcommand{\IPA}[1]{#1}
>> >> >>
>> >> >> and then the contents of \IPA will just be passed
>> >> >> through.
>> >> >>
>> >> >> I suppose you could alternatively redefine
>> >> >>
>> >> >> \renewcommand{\DeclareTextFontCommand}[2]{\newcommand{#1}[1]{##1}}
>> >> >>
>> >> >> before your fontspec stuff (untested and may not work).
>> >> >>
>> >> >> Another option is to use a filter and intercept the raw
>> >> >> LaTeX inline produced from \IPA{some text}, changing it
>> >> >> into textual content, but I think the first approach above
>> >> >> is the simplest.
>> >> >>
>> >> >>
>> >> >>
>> >> >> Greg S <elorian...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>> >> >>
>> >> >> > I have a minimal test latex file `test.tex`:
>> >> >> >
>> >> >> >
>> >> >> > \documentclass{article}
>> >> >> >
>> >> >> > \usepackage{fontspec}
>> >> >> >
>> >> >> > \newfontfamily\IPAFont{Doulos SIL}
>> >> >> > \DeclareTextFontCommand{\IPA}{\IPAFont}
>> >> >> >
>> >> >> > \begin{document}
>> >> >> >
>> >> >> > \section{Test}
>> >> >> > Hello \IPA{some IPA}
>> >> >> >
>> >> >> > \end{document}
>> >> >> >
>> >> >> >
>> >> >> > This builds fine with xelatex and produces a pdf I expect. When i 
>> try 
>> >> to
>> >> >> > convert this to an html document with `pandoc --pdf-engine=xelatex
>> >> >> > --verbose test.tex -o test.html`, I see the warnings:
>> >> >> >
>> >> >> > [INFO] Could not load include file fontspec.sty at test.tex line 3
>> >> >> column 22
>> >> >> > [INFO] Skipped '\newfontfamily' at test.tex line 5 column 15
>> >> >> > [INFO] Skipped '\IPAFont{Doulos SIL}' at test.tex line 5 column 35
>> >> >> > [INFO] Skipped '\DeclareTextFontCommand{\IPA}{\IPAFont}' at 
>> test.tex
>> >> >> line 6
>> >> >> > column 40
>> >> >> > [INFO] Skipped '\IPA{some IPA}' at test.tex line 11 column 21
>> >> >> >
>> >> >> > And the text within the custom \IPA command is skipped. How can I 
>> make
>> >> >> > pandoc not skip these?
>> >> >> >
>> >> >> >
>> >> >>
>> >> >
>> >>
>> >
>>
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Skipping commands in LaTeX document
       [not found]                         ` <c648fb98-d892-4f1e-b3aa-0da071d8de4bn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2021-12-06 10:34                           ` BPJ
       [not found]                             ` <CADAJKhCC9xm6HX0aF5SzJr9vG3xZR1eiQxxCpA6QNRi1BRE-7g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2021-12-06 17:57                           ` John MacFarlane
  1 sibling, 1 reply; 12+ messages in thread
From: BPJ @ 2021-12-06 10:34 UTC (permalink / raw)
  To: pandoc-discuss

[-- Attachment #1: Type: text/plain, Size: 11545 bytes --]

What do you get if you run pandoc with -f latex+raw_tex -t native without
any filter? My guess is that it is one of these:

1.  The whole tabular ends up inside a huge RawBlock.

2.  The \makecell command ends up inside a RawInline or RawBlock and
doesn't get rendered in the output.

3. #2 + the regex doesn't see the \IPA command because it is inside the
\makecell command.
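To tell these three cases apart, it may help to list every raw element in pandoc's JSON output instead of reading the native output by eye. The sketch below uses only the standard library; the function name `find_raw` is hypothetical, and the sample document is hand-written to mirror the `\IPA{hi} there` example quoted in this thread (its `pandoc-api-version` value is a placeholder, not taken from a real run).

```python
def find_raw(node, found=None):
    """Collect (type, format, text) for each RawInline/RawBlock in a pandoc JSON AST."""
    if found is None:
        found = []
    if isinstance(node, dict):
        if node.get("t") in ("RawInline", "RawBlock"):
            # Both raw element types carry [format, text] as their content.
            fmt, text = node["c"]
            found.append((node["t"], fmt, text))
        else:
            for value in node.values():
                find_raw(value, found)
    elif isinstance(node, list):
        for item in node:
            find_raw(item, found)
    return found


# Hand-written stand-in for the output of `pandoc -f latex+raw_tex -t json`.
sample = {
    "pandoc-api-version": [1, 22],  # placeholder value
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [
            {"t": "RawInline", "c": ["latex", "\\IPA{hi}"]},
            {"t": "Space"},
            {"t": "Str", "c": "there"},
        ]}
    ],
}

for kind, fmt, text in find_raw(sample):
    print(kind, fmt, text)
```

If the whole tabular shows up as a single RawBlock, that points to case 1. If `\makecell{...}` shows up as one raw element with `\IPA` inside its text, that points to cases 2 or 3.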

Also you may need a non-greedy regex: "\\IPA{(.*?)}". Python's standard re
module supports the non-greedy `*?` quantifier, so the separate regex module
should not be needed for that.
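As a quick check of the greedy versus non-greedy behaviour, here is a small sketch using only Python's standard `re` module, which handles the `*?` quantifier. The names `IPA_GREEDY`, `IPA_NONGREEDY` and `ipa_contents` are hypothetical and exist only for this illustration.

```python
import re

# Braces are escaped and raw strings are used so the pattern is unambiguous.
IPA_GREEDY = re.compile(r"\\IPA\{(.*)\}")
IPA_NONGREEDY = re.compile(r"\\IPA\{(.*?)\}")


def ipa_contents(text, pattern=IPA_NONGREEDY):
    """Return the argument of every \\IPA{...} command found in text."""
    return pattern.findall(text)


line = r"\IPA{one} and \IPA{two}"
print(ipa_contents(line))              # one entry per command
print(ipa_contents(line, IPA_GREEDY))  # a single entry spanning both commands
```

Note that the non-greedy pattern stops at the first closing brace, so it truncates arguments that contain nested braces. When a filter receives one RawInline per `\IPA{...}` command, the greedy form applied to that single string may actually be the safer choice.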

Please try putting these definitions at the top of your document body[^0]:

``````latex
\usepackage[normalem]{ulem}

\renewcommand{\makecell}[1]{#1}

\renewcommand{\IPA}[1]{\sout{#1}}
``````

Then save the Lua code below to a file sout2ipa.lua in the current
directory and run pandoc with -f latex -t html -L sout2ipa.lua

``````lua
function Strikeout (elem)
  return pandoc.Span(elem.content, { class = 'IPA' })
end
``````

Now you should get all your IPA nicely inside spans with class "IPA".

There is a gotcha: this trick requires that you don't have any actual
strikeout text in your document.
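For anyone who prefers a JSON filter to a Lua filter, the same Strikeout-to-Span rewrite can be sketched in Python with only the standard library. This assumes pandoc's JSON encoding of `Strikeout` (a list of inlines) and `Span` (an attribute triple followed by a list of inlines); the names `strikeout_to_span` and `main` are hypothetical, and the same gotcha about genuine strikeout text applies.

```python
import json
import sys


def strikeout_to_span(node, css_class="IPA"):
    """Return a copy of a pandoc JSON AST with every Strikeout turned into a Span."""
    if isinstance(node, list):
        return [strikeout_to_span(item, css_class) for item in node]
    if isinstance(node, dict):
        if node.get("t") == "Strikeout":
            inlines = strikeout_to_span(node["c"], css_class)
            # Span content is [attr, inlines]; attr is [id, classes, key-value pairs].
            return {"t": "Span", "c": [["", [css_class], []], inlines]}
        return {key: strikeout_to_span(value, css_class)
                for key, value in node.items()}
    return node


def main():
    """Read a pandoc JSON document on stdin and write the rewritten one to stdout."""
    json.dump(strikeout_to_span(json.load(sys.stdin)), sys.stdout)
```

To use it as a filter, call `main()` at the end of the script, make the script executable, and pass it to pandoc with `--filter` in place of the `-L` option.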

@jgm there really should be an extension which makes the LaTeX reader
recognise a pseudocommand `\PandocSpan{attrA=value, attrB={long
value}}{content}` so that one could do redefinitions like those below and
get native spans in the Pandoc AST.

``````latex
\renewcommand{\IPA}[1]{\PandocSpan{class=IPA}{#1}}

\renewcommand{\TakesTwo}[2]{\PandocSpan{class=foo}{\PandocSpan{data-foo=1}{#1}\PandocSpan{data-foo=2}{#2}}}

\renewcommand{\TakesKeyVals}[2][]{\PandocSpan{#1, class=bar}{#2}}
``````

where the reader will convert any keyval-style content in the first argument
to span attributes, with later ones overriding.

(And possibly an analogous PandocDiv command, working somewhat like the
`\NewEnviron` command of the LaTeX environ package[^1] with a
pseudo-command `\BODY` (or `\DIV` so as to not clash with environ!) which
gets replaced with the content of the div.)

Even if such a structure isn't usable on its own it would be much easier to
modify it with filters.

/bpj

[^0]: I'm not sure that \renewcommand works, but since I am on my phone ATM
I can't check. If not, comment out the original \newcommand and/or \usepackage
commands and define substitute commands with \newcommand as appropriate.

[^1]: https://ctan.org/pkg/environ


On Mon, 6 Dec 2021 at 05:26, Greg S <elorian.mestec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote:

> Okay I've written a filter:
>
> ```
> #!/usr/bin/python
> import logging
> import re
> from pandocfilters import toJSONFilter, Emph, Para, RawInline
>
> # Raw string avoids the invalid "\I" escape; matches a literal \IPA{...}.
> ipa_regex = re.compile(r"\\IPA\{(.*)\}")
>
> def handle(key, value, format, meta):
>     logging.warning(f"KEY {key} VALUE {value} format {format} META {meta}")
>     if key == "RawInline":
>         if m := ipa_regex.match(value[1]):
>             return RawInline('html', f"{m.group(1)}")
>
> if __name__ == "__main__":
>     toJSONFilter(handle)
> ```
>
> and with the `-f latex+raw_tex` option passed to pandoc it looks like this
> is correctly capturing the text in the IPA macro.
>
> However, I noticed that the filter completely skips over text in the \IPA
> macro if that macro occurs within a latex table defined with
> \begin{tabular}. I'm using the
> makecell latex package and wrapping the cells with the \makecell command
> (i.e. `\makecell { \IPA{ some text } }`, but I tried removing the \makecell
> and the IPA macro still gets skipped in this context.
>
>


* Re: Skipping commands in LaTeX document
       [not found]                     ` <m2fsr6hkvl.fsf-d8241O7hbXoP5tpWdHSM3tPlBySK3R6THiGdP5j34PU@public.gmane.org>
@ 2021-12-06  4:25                       ` Greg S
       [not found]                         ` <c648fb98-d892-4f1e-b3aa-0da071d8de4bn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Greg S @ 2021-12-06  4:25 UTC (permalink / raw)
  To: pandoc-discuss



Okay I've written a filter:

```
#!/usr/bin/python
import logging
import re
from pandocfilters import toJSONFilter, RawInline

# Raw string avoids the invalid "\I" escape; the braces are escaped explicitly.
ipa_regex = re.compile(r"\\IPA\{(.*)\}")

def handle(key, value, format, meta):
    logging.warning(f"KEY {key} VALUE {value} format {format} META {meta}")
    # For a RawInline, value is [format, text].
    if key == "RawInline":
        if m := ipa_regex.match(value[1]):
            # Emit the macro's argument as raw HTML.
            return RawInline('html', m.group(1))

if __name__ == "__main__":
    toJSONFilter(handle)
```

and with the `-f latex+raw_tex` option passed to pandoc, it looks like this
correctly captures the text in the \IPA macro.

However, I noticed that the filter never sees the text in the \IPA macro
if that macro occurs within a LaTeX table defined with \begin{tabular}.
I'm using the makecell LaTeX package and wrapping the cells with the
\makecell command (i.e. `\makecell { \IPA{ some text } }`), but I tried
removing the \makecell and the \IPA macro still gets skipped in this context.
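
For the `<span class="IPA">` output mentioned elsewhere in this thread, one possible variation is to return a Span element rather than raw HTML. The following is only a sketch under stated assumptions: it builds the pandoc JSON AST by hand as plain dicts (the same shape the pandocfilters constructors are understood to produce), the names `IPA_RE` and `ipa_to_span` are made up for illustration, and it does nothing about the tabular case described above.

```python
import re

# Hypothetical names; the dict layout follows pandoc's JSON AST.
IPA_RE = re.compile(r"\\IPA\{(.*)\}", re.DOTALL)

def ipa_to_span(key, value, format=None, meta=None):
    """Action with the (key, value, format, meta) signature used by toJSONFilter."""
    if key != "RawInline":
        return None  # None means "leave this element unchanged"
    raw_format, text = value
    m = IPA_RE.fullmatch(text.strip())
    if raw_format != "latex" or m is None:
        return None
    # Split on whitespace so the result is a list of Str/Space inlines.
    inlines = []
    for i, word in enumerate(m.group(1).split()):
        if i > 0:
            inlines.append({"t": "Space"})
        inlines.append({"t": "Str", "c": word})
    # Attr is [identifier, classes, key-value pairs].
    return {"t": "Span", "c": [["", ["IPA"], []], inlines]}
```

Passing `ipa_to_span` to `toJSONFilter` in place of `handle` should then let the HTML writer produce the span itself, though this has not been tried against the original document.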




* Re: Skipping commands in LaTeX document
       [not found]                 ` <84e207d9-eaed-4b24-8b6b-62ea07bb2b5bn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2021-12-05 20:11                   ` John MacFarlane
       [not found]                     ` <m2fsr6hkvl.fsf-d8241O7hbXoP5tpWdHSM3tPlBySK3R6THiGdP5j34PU@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: John MacFarlane @ 2021-12-05 20:11 UTC (permalink / raw)
  To: Greg S, pandoc-discuss


I should have mentioned before that you'll need to enable
the `raw_tex` extension, as shown below, to allow inclusion
of RawBlock or RawInline.

% pandoc -t native -f latex+raw_tex  
\IPA{hi} there
^D
[ Para
    [ RawInline (Format "latex") "\\IPA{hi}"
    , Space
    , Str "there"
    ]
]
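
In the JSON form that a `--filter` program receives, the native output above corresponds roughly to the structure below. This is an illustrative sketch: the dicts are written by hand rather than produced by pandoc, and `inline_pairs` is a hypothetical helper that only mimics, for the inlines of a paragraph, the (key, value) pairs an action function would be called with. It uses `None` for elements without content; what a real filter library passes in that case may differ.

```python
# Hand-written JSON-AST equivalent of the native output shown above.
blocks = [
    {"t": "Para", "c": [
        {"t": "RawInline", "c": ["latex", "\\IPA{hi}"]},
        {"t": "Space"},
        {"t": "Str", "c": "there"},
    ]}
]

def inline_pairs(blocks):
    """Collect (key, value) for each inline directly inside a Para."""
    pairs = []
    for block in blocks:
        if block["t"] == "Para":
            for inline in block["c"]:
                # Elements without content (e.g. Space) have no "c" field here.
                pairs.append((inline["t"], inline.get("c")))
    return pairs

for key, value in inline_pairs(blocks):
    print(key, value)
```

The point for a filter is that the `\IPA{hi}` text arrives as the second item of the RawInline value, with the format name as the first.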




* Re: Skipping commands in LaTeX document
       [not found]             ` <m2r1aqhod6.fsf-d8241O7hbXoP5tpWdHSM3tPlBySK3R6THiGdP5j34PU@public.gmane.org>
@ 2021-12-05 19:53               ` Greg S
       [not found]                 ` <84e207d9-eaed-4b24-8b6b-62ea07bb2b5bn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Greg S @ 2021-12-05 19:53 UTC (permalink / raw)
  To: pandoc-discuss



How can I write a filter that matches RawInline elements if the filter 
applies after the unknown latex macros have been applied in the parsing 
stage? I'm not seeing the text within the \IPA macro at all in the logging 
from the test filter I wrote - is there something I need to do to make that 
filter apply earlier?



* Re: Skipping commands in LaTeX document
       [not found]         ` <bac7947b-259e-4774-b993-33f69fffc05fn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2021-12-05 18:56           ` John MacFarlane
       [not found]             ` <m2r1aqhod6.fsf-d8241O7hbXoP5tpWdHSM3tPlBySK3R6THiGdP5j34PU@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: John MacFarlane @ 2021-12-05 18:56 UTC (permalink / raw)
  To: Greg S, pandoc-discuss


You can't insert the macro with a filter, because the filter
is applied after parsing, and the macro would be resolved in
the parsing phase.

However, you could have a filter that matches RawInline
elements that are "\IPA" commands, extracts their textual
content, and returns a Str element.
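
A minimal sketch of such a filter action, under assumptions: the Str element is written as a plain dict in pandoc's JSON AST shape instead of going through a library constructor, the names `IPA_COMMAND` and `ipa_to_str` are hypothetical, and the pattern expects the whole raw inline to be a single \IPA{...} command.

```python
import re

# Hypothetical pattern: the raw inline must consist of exactly one \IPA{...}.
IPA_COMMAND = re.compile(r"\\IPA\{(.*)\}", re.DOTALL)

def ipa_to_str(key, value, format=None, meta=None):
    """Replace a raw \\IPA{...} inline with a Str holding its argument."""
    if key == "RawInline":
        raw_format, text = value
        match = IPA_COMMAND.fullmatch(text)
        if raw_format in ("latex", "tex") and match:
            return {"t": "Str", "c": match.group(1)}
    return None  # leave every other element unchanged
```

The function has the (key, value, format, meta) shape that pandocfilters' `toJSONFilter` calls. As noted elsewhere in the thread, the LaTeX reader only produces these RawInline elements when the `raw_tex` extension is enabled.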

Greg S <elorian.mestec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> Is there a way I can tell  pandoc to insert a new Latex macro before
> processing that doesn't exist in the document? Using
> \renewcommand{\IPA}[1]{#1} makes the text appear in the output of the latex
> -> html conversion, but it breaks the formatting I care about in the pdf
> version so I don't want to have that line permanently in the latex source
> file.
>
> I think I'd ultimately like to use a filter to intercept the raw latex from
> \IPA{...} and do something specific with it in HTML (probably put it within
> a <span class="IPA"> tag). I also have some other latex macros from
> specific packages that pandoc doesn't seem to understand, that I'd like to
> handle in a custom way. I tried creating a simple logging Python filter
> just to understand how they work.
>
> ```
> #!/usr/bin/python
> import logging
> from pandocfilters import toJSONFilter, Emph, Para
>
> def handle(key, value, format, meta):
>     logging.warn(f"KEY {key} VALUE {value} format {format} META {meta}")
>
> if __name__ == "__main__":
>    toJSONFilter(handle)
> ```
> And then running `pandoc --pdf-engine=xelatex --verbose test.tex -o
> test.html --filter filter.py`.
>
> But it seems like latex macros that pandoc doesn't understand are getting
> skipped before the filter is applied, so the `handle` function never gets
> called with the text contents of my \IPA macro.
>
> On Saturday, December 4, 2021 at 9:37:16 AM UTC-8 John MacFarlane wrote:
>
>>
>> Pandoc doesn't understand everything, especially outside of
>> core LaTeX. In particular, it doesn't understand
>>
>> \DeclareTextFontCommand
>>
>> from fontspec, so the \IPA macro isn't understood.
>>
>> You can work around this by adding your own macro
>> definition before you convert with pandoc:
>>
>> \renewcommand{\IPA}[1]{#1}
>>
>> and then the contents of \IPA will just be passed
>> through.
>>
>> I suppose you could alternatively redefine
>>
>> \renewcommand{\DeclareTextFontCommand}[2]{\newcommand{#1}[1]{##1}}
>>
>> before your fontspec stuff (untested and may not work).
>>
>> Another option is to use a filter and intercept the raw
>> LaTeX inline produced from \IPA{some text}, changing it
>> into textual content, but I think the first approach above
>> is the simplest.
>>
>>
>>
>> Greg S <elorian...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>>
>> > I have a minimal test latex file `test.tex`:
>> >
>> >
>> > \documentclass{article}
>> >
>> > \usepackage{fontspec}
>> >
>> > \newfontfamily\IPAFont{Doulos SIL}
>> > \DeclareTextFontCommand{\IPA}{\IPAFont}
>> >
>> > \begin{document}
>> >
>> > \section{Test}
>> > Hello \IPA{some IPA}
>> >
>> > \end{document}
>> >
>> >
>> > This builds fine with xelatex and produces a pdf I expect. When i try to
>> > convert this to an html document with `pandoc --pdf-engine=xelatex
>> > --verbose test.tex -o test.html`, I see the warnings:
>> >
>> > [INFO] Could not load include file fontspec.sty at test.tex line 3 column 22
>> > [INFO] Skipped '\newfontfamily' at test.tex line 5 column 15
>> > [INFO] Skipped '\IPAFont{Doulos SIL}' at test.tex line 5 column 35
>> > [INFO] Skipped '\DeclareTextFontCommand{\IPA}{\IPAFont}' at test.tex line 6 column 40
>> > [INFO] Skipped '\IPA{some IPA}' at test.tex line 11 column 21
>> >
>> > And the text within the custom \IPA command is skipped. How can I make
>> > pandoc not skip these?
>> >
>> >
>>
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Skipping commands in LaTeX document
       [not found]     ` <m2zgpgi869.fsf-jF64zX8BO0+FqBokazbCQ6OPv3vYUT2dxr7GGTnW70NeoWH0uzbU5w@public.gmane.org>
@ 2021-12-05  2:50       ` Greg S
       [not found]         ` <bac7947b-259e-4774-b993-33f69fffc05fn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Greg S @ 2021-12-05  2:50 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 4102 bytes --]

Is there a way to tell pandoc to insert a new LaTeX macro before
processing, one that doesn't exist in the document? Using
\renewcommand{\IPA}[1]{#1} makes the text appear in the output of the
LaTeX -> HTML conversion, but it breaks the formatting I care about in the
PDF version, so I don't want that line permanently in the LaTeX source
file.
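
One way this could be done without editing test.tex is to prepend the macro definitions to the source at conversion time and feed the result to pandoc on standard input. The following is only a minimal sketch: the helper names `with_extra_macros` and `convert_with_macros` are made up for illustration, and it assumes that pandoc's LaTeX reader honors a definition placed before \documentclass.

```python
import subprocess
from pathlib import Path

# Definition suggested earlier in this thread; pandoc should then pass the
# argument of \IPA through as ordinary text.
IPA_PASSTHROUGH = r"\renewcommand{\IPA}[1]{#1}"


def with_extra_macros(latex_source, macro_definitions=IPA_PASSTHROUGH):
    """Return the LaTeX source with the macro definitions placed in front."""
    return macro_definitions.rstrip("\n") + "\n\n" + latex_source


def convert_with_macros(tex_path, out_path, macro_definitions=IPA_PASSTHROUGH):
    """Convert tex_path with pandoc, leaving the file on disk untouched."""
    source = with_extra_macros(
        Path(tex_path).read_text(encoding="utf-8"), macro_definitions
    )
    # With no input file named, pandoc reads the document from standard input.
    subprocess.run(
        ["pandoc", "-f", "latex", "-o", str(out_path)],
        input=source,
        encoding="utf-8",
        check=True,
    )
```

Calling `convert_with_macros("test.tex", "test.html")` would leave test.tex unchanged, so the xelatex build keeps its fontspec formatting.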

I think I'd ultimately like to use a filter to intercept the raw LaTeX from
\IPA{...} and do something specific with it in HTML (probably wrap it in
a <span class="IPA"> tag). I also have some other LaTeX macros from
specific packages that pandoc doesn't seem to understand, which I'd like to
handle in a custom way. I tried creating a simple logging Python filter
just to understand how filters work.

```
#!/usr/bin/env python3
import logging

from pandocfilters import toJSONFilter


def handle(key, value, format, meta):
    # Log every AST node the filter sees; returning None leaves it unchanged.
    logging.warning(f"KEY {key} VALUE {value} format {format} META {meta}")


if __name__ == "__main__":
    toJSONFilter(handle)
```
I then ran `pandoc --pdf-engine=xelatex --verbose test.tex -o
test.html --filter filter.py`.

But it seems that LaTeX macros that pandoc doesn't understand are
skipped before the filter is applied, so the `handle` function is never
called with the text contents of my \IPA macro.
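
A possible explanation is that pandoc's LaTeX reader only keeps unknown commands as raw LaTeX inlines when the `raw_tex` extension is enabled (`-f latex+raw_tex`); without it they are dropped before any filter runs. Assuming that is the case here, a filter along the following lines could turn \IPA{...} into a span with class "IPA". This is a standard-library-only sketch that works on the JSON AST directly; the names `ipa_to_span`, `walk`, and `run_filter` are illustrative, and the regular expression does not handle nested braces.

```python
import json
import re

# Matches a raw LaTeX inline of the form \IPA{...}; nested braces are not handled.
IPA_MACRO = re.compile(r"^\\IPA\{(.*)\}$", re.DOTALL)


def ipa_to_span(node):
    """Return a Span with class "IPA" for a raw IPA macro, else the node itself."""
    if isinstance(node, dict) and node.get("t") == "RawInline":
        fmt, text = node["c"]
        match = IPA_MACRO.match(text.strip())
        if fmt in ("latex", "tex") and match:
            attr = ["", ["IPA"], []]  # [identifier, classes, key-value pairs]
            return {"t": "Span", "c": [attr, [{"t": "Str", "c": match.group(1)}]]}
    return node


def walk(node):
    """Recursively apply ipa_to_span to every element of the pandoc JSON AST."""
    if isinstance(node, list):
        return [walk(item) for item in node]
    if isinstance(node, dict):
        node = ipa_to_span(node)
        return {key: walk(value) for key, value in node.items()}
    return node


def run_filter(in_stream, out_stream):
    """Read a pandoc JSON document, rewrite it, and write it back out."""
    json.dump(walk(json.load(in_stream)), out_stream)
```

A script that calls `run_filter(sys.stdin, sys.stdout)` could then be passed to pandoc with `--filter`, together with `-f latex+raw_tex`, so that the HTML writer emits `<span class="IPA">` around the macro's text.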

On Saturday, December 4, 2021 at 9:37:16 AM UTC-8 John MacFarlane wrote:

>
> Pandoc doesn't understand everything, especially outside of
> core LaTeX. In particular, it doesn't understand
>
> \DeclareTextFontCommand
>
> from fontspec, so the \IPA macro isn't understood.
>
> You can work around this by adding your own macro
> definition before you convert with pandoc:
>
> \renewcommand{\IPA}[1]{#1}
>
> and then the contents of \IPA will just be passed
> through.
>
> I suppose you could alternatively redefine
>
> \renewcommand{\DeclareTextFontCommand}[2]{\newcommand{#1}[1]{##1}}
>
> before your fontspec stuff (untested and may not work).
>
> Another option is to use a filter and intercept the raw
> LaTeX inline produced from \IPA{some text}, changing it
> into textual content, but I think the first approach above
> is the simplest.
>
>
>
> Greg S <elorian...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>
> > I have a minimal test latex file `test.tex`:
> >
> >
> > \documentclass{article}
> >
> > \usepackage{fontspec}
> >
> > \newfontfamily\IPAFont{Doulos SIL}
> > \DeclareTextFontCommand{\IPA}{\IPAFont}
> >
> > \begin{document}
> >
> > \section{Test}
> > Hello \IPA{some IPA}
> >
> > \end{document}
> >
> >
> > This builds fine with xelatex and produces the PDF I expect. When I try to 
> > convert this to an HTML document with `pandoc --pdf-engine=xelatex 
> > --verbose test.tex -o test.html`, I see the warnings:
> >
> > [INFO] Could not load include file fontspec.sty at test.tex line 3 column 22
> > [INFO] Skipped '\newfontfamily' at test.tex line 5 column 15
> > [INFO] Skipped '\IPAFont{Doulos SIL}' at test.tex line 5 column 35
> > [INFO] Skipped '\DeclareTextFontCommand{\IPA}{\IPAFont}' at test.tex line 6 column 40
> > [INFO] Skipped '\IPA{some IPA}' at test.tex line 11 column 21
> >
> > And the text within the custom \IPA command is skipped. How can I make 
> > pandoc not skip these?
> >
> >
>


[-- Attachment #1.2: Type: text/html, Size: 5501 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Skipping commands in LaTeX document
       [not found] ` <0462fc42-ae24-4c52-b267-1126ed5834edn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2021-12-04 17:36   ` John MacFarlane
       [not found]     ` <m2zgpgi869.fsf-jF64zX8BO0+FqBokazbCQ6OPv3vYUT2dxr7GGTnW70NeoWH0uzbU5w@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: John MacFarlane @ 2021-12-04 17:36 UTC (permalink / raw)
  To: Greg S, pandoc-discuss


Pandoc doesn't understand everything, especially outside of
core LaTeX.  In particular, it doesn't understand

\DeclareTextFontCommand

from fontspec, so the \IPA macro isn't understood.

You can work around this by adding your own macro
definition before you convert with pandoc:

\renewcommand{\IPA}[1]{#1}

and then the contents of \IPA will just be passed
through.

I suppose you could alternatively redefine

\renewcommand{\DeclareTextFontCommand}[2]{\newcommand{#1}[1]{##1}}

before your fontspec stuff (untested and may not work).

Another option is to use a filter and intercept the raw
LaTeX inline produced from \IPA{some text}, changing it
into textual content, but I think the first approach above
is the simplest.



Greg S <elorian.mestec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:

> I have a minimal test latex file `test.tex`:
>
>
> \documentclass{article}
>
> \usepackage{fontspec}
>
> \newfontfamily\IPAFont{Doulos SIL}
> \DeclareTextFontCommand{\IPA}{\IPAFont}
>
> \begin{document}
>
> \section{Test}
> Hello \IPA{some IPA}
>
> \end{document}
>
>
> This builds fine with xelatex and produces the PDF I expect. When I try to 
> convert this to an HTML document with `pandoc --pdf-engine=xelatex 
> --verbose test.tex -o test.html`, I see the warnings:
>
> [INFO] Could not load include file fontspec.sty at test.tex line 3 column 22
> [INFO] Skipped '\newfontfamily' at test.tex line 5 column 15
> [INFO] Skipped '\IPAFont{Doulos SIL}' at test.tex line 5 column 35
> [INFO] Skipped '\DeclareTextFontCommand{\IPA}{\IPAFont}' at test.tex line 6 column 40
> [INFO] Skipped '\IPA{some IPA}' at test.tex line 11 column 21
>
> And the text within the custom \IPA command is skipped. How can I make 
> pandoc not skip these?
>
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Skipping commands in LaTeX document
@ 2021-12-04  1:16 Greg S
       [not found] ` <0462fc42-ae24-4c52-b267-1126ed5834edn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 12+ messages in thread
From: Greg S @ 2021-12-04  1:16 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 1303 bytes --]

I have a minimal test latex file `test.tex`:


\documentclass{article}

\usepackage{fontspec}

\newfontfamily\IPAFont{Doulos SIL}
\DeclareTextFontCommand{\IPA}{\IPAFont}

\begin{document}

\section{Test}
Hello \IPA{some IPA}

\end{document}


This builds fine with xelatex and produces the PDF I expect. When I try to 
convert this to an HTML document with `pandoc --pdf-engine=xelatex 
--verbose test.tex -o test.html`, I see the warnings:

[INFO] Could not load include file fontspec.sty at test.tex line 3 column 22
[INFO] Skipped '\newfontfamily' at test.tex line 5 column 15
[INFO] Skipped '\IPAFont{Doulos SIL}' at test.tex line 5 column 35
[INFO] Skipped '\DeclareTextFontCommand{\IPA}{\IPAFont}' at test.tex line 6 column 40
[INFO] Skipped '\IPA{some IPA}' at test.tex line 11 column 21

And the text within the custom \IPA command is skipped. How can I make 
pandoc not skip these?



[-- Attachment #1.2: Type: text/html, Size: 1992 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-12-09 16:18 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-04  0:34 Skipping commands in LaTeX document 'Greg Shuflin' via pandoc-discuss
2021-12-04  1:16 Greg S
     [not found] ` <0462fc42-ae24-4c52-b267-1126ed5834edn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2021-12-04 17:36   ` John MacFarlane
     [not found]     ` <m2zgpgi869.fsf-jF64zX8BO0+FqBokazbCQ6OPv3vYUT2dxr7GGTnW70NeoWH0uzbU5w@public.gmane.org>
2021-12-05  2:50       ` Greg S
     [not found]         ` <bac7947b-259e-4774-b993-33f69fffc05fn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2021-12-05 18:56           ` John MacFarlane
     [not found]             ` <m2r1aqhod6.fsf-d8241O7hbXoP5tpWdHSM3tPlBySK3R6THiGdP5j34PU@public.gmane.org>
2021-12-05 19:53               ` Greg S
     [not found]                 ` <84e207d9-eaed-4b24-8b6b-62ea07bb2b5bn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2021-12-05 20:11                   ` John MacFarlane
     [not found]                     ` <m2fsr6hkvl.fsf-d8241O7hbXoP5tpWdHSM3tPlBySK3R6THiGdP5j34PU@public.gmane.org>
2021-12-06  4:25                       ` Greg S
     [not found]                         ` <c648fb98-d892-4f1e-b3aa-0da071d8de4bn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2021-12-06 10:34                           ` BPJ
     [not found]                             ` <CADAJKhCC9xm6HX0aF5SzJr9vG3xZR1eiQxxCpA6QNRi1BRE-7g-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2021-12-09  3:03                               ` Greg S
     [not found]                                 ` <4f3956c3-e028-473c-b622-dae2f0b72dedn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2021-12-09 16:18                                   ` John MacFarlane
2021-12-06 17:57                           ` John MacFarlane

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).