public inbox archive for pandoc-discuss@googlegroups.com
* Help with lua filter for docx to latex conversion
@ 2022-08-26 17:08 Sandra Martin
       [not found] ` <0df70b72-8e13-4e1c-986f-6a54ef352f6cn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Sandra Martin @ 2022-08-26 17:08 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 1289 bytes --]

Hello all, 

I have trouble writing the correct lua filter for my pandoc conversion of 
docx to latex. 

In short, I have citations in the format "\cite{reference}" (csl style from 
Better Bibtex) in my docx file, which I would like to preserve and keep 
unchanged during pandoc conversion.

When calling "pandoc --to=native test.docx", I see that pandoc reads these 
entries as strings and I've tried writing filters with pandoc.RawInline to 
preserve these strings. However, using for instance this function keeps the 
reference keys but gets rid of all the latex formatting (the backslash and 
the curly brackets):
function Str(el)
  local citekey = el.text:match("\\cite[{](%w+)[}]")
  if citekey then
    return pandoc.RawInline('latex', citekey)
  end
end

How do I keep my latex-styled reference strings as they are during pandoc 
conversion? 

Thanks in advance!
Sandra

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/0df70b72-8e13-4e1c-986f-6a54ef352f6cn%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 1777 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Help with lua filter for docx to latex conversion
       [not found] ` <0df70b72-8e13-4e1c-986f-6a54ef352f6cn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2022-08-26 17:40   ` Bastien DUMONT
  2022-08-26 19:23     ` Sandra Martin
  0 siblings, 1 reply; 6+ messages in thread
From: Bastien DUMONT @ 2022-08-26 17:40 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

In `"\\cite[{](%w+)[}]"`, `(%w+)` is a capture. When the pattern in string.match() specifies a capture, it is returned instead of the whole match, so `citekey` has the value of the *content* of `\cite{...}` instead of the whole macro. I guess that it should work if you remove the parentheses.
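
The difference can be checked in plain Lua, outside pandoc. This is only a minimal sketch; the sample string is made up for illustration.

```lua
-- Sample text as it might appear in a Str element (hypothetical input).
local text = "\\cite{reference}"

-- With a capture, string.match returns only the captured part.
local with_capture = text:match("\\cite[{](%w+)[}]")

-- Without a capture, string.match returns the whole match.
local whole_match = text:match("\\cite[{]%w+[}]")

print(with_capture)  -- reference
print(whole_match)   -- \cite{reference}
```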



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Help with lua filter for docx to latex conversion
  2022-08-26 17:40   ` Bastien DUMONT
@ 2022-08-26 19:23     ` Sandra Martin
       [not found]       ` <7fc77e34-86e4-48a2-8642-e226d1ae08ben-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Sandra Martin @ 2022-08-26 19:23 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 3003 bytes --]

Hi Bastien, 

Thank you for your quick reply! I got it to work with: "\\cite{%w+}" 
The only issue left is that this matches a single reference but doesn't work when 
there are multiple, for example "\cite{reference, reference2, reference3}"

I've been playing around with "\\cite{%w+%p?%s?%w+}" but I can't get it to 
work with the space after the comma. Would you know a solution? 

Thanks a lot!!

[-- Attachment #1.2: Type: text/html, Size: 4836 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Help with lua filter for docx to latex conversion
       [not found]       ` <7fc77e34-86e4-48a2-8642-e226d1ae08ben-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2022-08-26 20:09         ` Albert Krewinkel
       [not found]           ` <36B1ABBF-804C-4785-BB14-E29AEE6423E4-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Albert Krewinkel @ 2022-08-26 20:09 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

[-- Attachment #1: Type: text/plain, Size: 3881 bytes --]

Lua patterns have a nice feature for this, try '\\cite%b{}'

This should match everything between the starting and closing curly braces.
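
A small sketch of how `%b{}` behaves on plain Lua strings, outside pandoc. The sample strings are made up, and whether a complete `\cite{...}` reaches a filter as one string depends on how pandoc splits the text into inlines.

```lua
-- %b{} matches a balanced pair of braces and everything between them.
local pattern = "\\cite%b{}"

-- Hypothetical sample inputs.
local single = "\\cite{reference}"
local multiple = "See \\cite{reference, reference2, reference3} for details."

print(single:match(pattern))    -- \cite{reference}
print(multiple:match(pattern))  -- \cite{reference, reference2, reference3}
```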


[-- Attachment #2: Type: text/html, Size: 4991 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Help with lua filter for docx to latex conversion
       [not found]           ` <36B1ABBF-804C-4785-BB14-E29AEE6423E4-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
@ 2022-08-26 21:37             ` Bastien DUMONT
  2022-08-27  8:49               ` Sandra Martin
  0 siblings, 1 reply; 6+ messages in thread
From: Bastien DUMONT @ 2022-08-26 21:37 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

Yes, but the problem is that if the argument of \cite contains spaces, they are converted into Space objects in the AST... For instance (simulating docx input with markdown with TeX extension disabled):

`pandoc -t native -f markdown-raw_tex <<< '\cite{reference, reference2, reference3}'`

outputs:

```
[ Para
    [ Str "\\cite{reference,"
    , Space
    , Str "reference2,"
    , Space
    , Str "reference3}"
    ]
]
```

I can see two ways. The easiest one is to remove all spaces from the argument of `\cite` in the DOCX file (it should not be too difficult with a regexp search-and-replace or even a macro).

Otherwise, using the Inlines function in a filter, you could iterate over each list of inlines from the end to the beginning. When you encounter a Str ending with `}`, you store it in a table and keep collecting the preceding inlines until you reach the Str beginning with `\cite`. Then you can stringify your table and replace all of those inlines with a single RawInline containing the resulting string. Obviously, this solution requires that you never use curly braces for other purposes. It would also be possible to iterate from the beginning to the end, but you would have to continuously update the current index and the total number of inlines.
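
A rough sketch of the second approach, not a tested solution. Instead of iterating backwards, it walks forward and builds a new list, which avoids updating indices in place. It assumes a pandoc version recent enough to support the `Inlines` filter function and the `pandoc.Inlines` constructor; the helper names `merge_cites` and `make_raw` are made up for illustration.

```lua
-- Merge the inlines that make up one \cite{...} into a single element.
-- `make_raw` turns the collected string into the replacement element.
-- Limitation: any text in the same Str before \cite or after the closing
-- brace ends up in the raw element too.
local function merge_cites(inlines, make_raw)
  local out = {}
  local i = 1
  while i <= #inlines do
    local el = inlines[i]
    local merged = nil
    local j = i
    if el.t == 'Str' and el.text:find('\\cite{', 1, true) then
      local buf = {}
      while j <= #inlines do
        local cur = inlines[j]
        if cur.t == 'Str' then
          buf[#buf + 1] = cur.text
          if cur.text:find('}', 1, true) then
            merged = table.concat(buf)  -- closing brace found
            break
          end
        elseif cur.t == 'Space' or cur.t == 'SoftBreak' then
          buf[#buf + 1] = ' '
        else
          break  -- unexpected element: give up on this \cite
        end
        j = j + 1
      end
    end
    if merged then
      out[#out + 1] = make_raw(merged)
      i = j  -- skip the inlines that were merged
    else
      out[#out + 1] = el
    end
    i = i + 1
  end
  return out
end

-- Only define the filter when running inside pandoc.
if pandoc then
  function Inlines(inlines)
    local merged = merge_cites(inlines, function(s)
      return pandoc.RawInline('latex', s)
    end)
    return pandoc.Inlines(merged)
  end
end
```

Like the backward variant, this only works if curly braces are not used for anything else in the text.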




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Help with lua filter for docx to latex conversion
  2022-08-26 21:37             ` Bastien DUMONT
@ 2022-08-27  8:49               ` Sandra Martin
  0 siblings, 0 replies; 6+ messages in thread
From: Sandra Martin @ 2022-08-27  8:49 UTC (permalink / raw)
  To: pandoc-discuss


[-- Attachment #1.1: Type: text/plain, Size: 7945 bytes --]

Hi Bastien, 

Yes, that's correct. I noticed the special handling of spaces as well. 
Thank you for your suggestions! I came up with a different solution which 
works well for my case. It simply detects all possible occurrences 
and treats them as citations, since the reference keys have a unique 
combination of letters and numbers. I'm posting it here in case it helps people in 
the future.

function Str(elem)
  local citekey = elem.text:match("\\cite{%w+}.")
  local citekey2 = elem.text:match("\\cite{%w+,")
  local citekey3 = elem.text:match("%a+%d+,")
  local citekey4 = elem.text:match("%w+}.")         
  if citekey then
    return pandoc.RawInline('latex', citekey)
  elseif citekey2 then
    return pandoc.RawInline('latex', citekey2)
  elseif citekey3 then
    return pandoc.RawInline('latex', citekey3)
  elseif citekey4 then
    return pandoc.RawInline('latex', citekey4)
  end
end
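
What the four patterns match can be checked in plain Lua, outside pandoc. This is a minimal sketch with made-up sample strings; `first_match` is a hypothetical helper that mirrors the order of the if/elseif chain. Note that `.` in a Lua pattern matches any single character and requires one to be present, so a citation with nothing after the closing brace is not matched.

```lua
-- The four patterns from the filter, in the order they are tried.
local patterns = {
  "\\cite{%w+}.",
  "\\cite{%w+,",
  "%a+%d+,",
  "%w+}.",
}

-- Return the first match among the four patterns, or nil if none matches.
local function first_match(text)
  for _, pattern in ipairs(patterns) do
    local found = text:match(pattern)
    if found then return found end
  end
  return nil
end

print(first_match("\\cite{reference,"))  -- \cite{reference,
print(first_match("reference2,"))        -- reference2,
print(first_match("reference3}."))       -- reference3}.
print(first_match("\\cite{reference}"))  -- nil
```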

Best, 
Sandra

[-- Attachment #1.2: Type: text/html, Size: 14551 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-08-27  8:49 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-26 17:08 Help with lua filter for docx to latex conversion Sandra Martin
     [not found] ` <0df70b72-8e13-4e1c-986f-6a54ef352f6cn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2022-08-26 17:40   ` Bastien DUMONT
2022-08-26 19:23     ` Sandra Martin
     [not found]       ` <7fc77e34-86e4-48a2-8642-e226d1ae08ben-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
2022-08-26 20:09         ` Albert Krewinkel
     [not found]           ` <36B1ABBF-804C-4785-BB14-E29AEE6423E4-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
2022-08-26 21:37             ` Bastien DUMONT
2022-08-27  8:49               ` Sandra Martin
