* Mixed Rawblock / Plain as Para in JATS output
From: Julien Dutant @ 2022-04-18 13:10 UTC
To: pandoc-discuss

Hi all,

I'm writing a new Lua filter to handle theorems and the like in various formats, including JATS (https://github.com/jdutant/statement). Given the following markdown and a suitable bibliography:

```markdown
::: theorem
(from @article) Some very interesting fact holds.
:::
```

the filter tries to generate a JATS statement (https://jats.nlm.nih.gov/archiving/tag-library/1.1/element/statement.html):

```xml
<statement>
<label>Theorem 1.1</label>
<title>from Doe, J (2003)</title>
<p>Some very interesting fact holds.</p>
</statement>
```

The problem I encounter is that the JATS writer converts pandoc.Plain to <p> blocks. So if I build a list of inlines:

    inlines = { pandoc.RawInline('jats','<title>'), ... more inlines ..., pandoc.RawInline('jats','</title>') }

and try to insert it in the document with:

    blocks:insert( pandoc.Plain(inlines) )

I get the unintended output:

```xml
<p><title>from Doe, J (2003)</title></p>
```

Now I could of course stringify the title inlines first, add the title tags and insert the result in a RawBlock. But that's bad too, because the title inlines may contain things that shouldn't be stringified yet, such as a citation.

Is it necessary for the JATS writer to turn Plain elements into <p> ones? The HTML writer doesn't, after all.

At the moment my best approach is to use `pandoc.write` (thanks so much Albert for giving us this!) to convert the inlines on the spot. There are some issues with this, though.

* If something in the inlines needs to be handled by another filter, it won't be handled properly. For instance, a pandoc-crossref cross-reference will be turned into plain text and become unretrievable. AFAIK my filter can't tell which other filters are run, and can't pass them as WriterOptions anyway.
* If I don't pass PANDOC_WRITER_OPTIONS, pandoc.write won't know which citation_mode to use, and may lack other settings relevant to writing the inlines.
* If I pass all of PANDOC_WRITER_OPTIONS, pandoc.write uses standalone mode whenever the document is in standalone mode, so I get a whole preamble within my label. Yet there is no standalone setting in WriterOptions, and I can't unset it (I've tried to remove `template` but it didn't work).
* So my best guess so far is to create a new WriterOptions by copying only those fields of PANDOC_WRITER_OPTIONS that might be useful to format the inlines.

All in all, it's pretty heavy-handed just to handle a label of inlines. Is there a better approach? Any chance that the JATS writer could convert Plain blocks to plain content without <p> tags?

Best,
J

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/8589f717-fe73-4813-9126-6beba194a3f1n%40googlegroups.com.
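The workaround in the last bullet of the 13:10 message (copying only selected fields of PANDOC_WRITER_OPTIONS into a fresh WriterOptions) could be sketched roughly as follows. This is a minimal sketch, not the statement filter's actual code: `pick_fields`, `strip_para_wrapper` and `inlines_to_jats` are hypothetical helper names, the choice of fields to copy (`cite_method`, `columns`, `wrap_text`) is an assumption, and it presumes a pandoc version that provides `pandoc.write`, `pandoc.WriterOptions` and `PANDOC_WRITER_OPTIONS`. Because the new options carry no template, the writer should emit a fragment rather than a standalone document.

```lua
-- Sketch only: all three helpers below are hypothetical names,
-- not part of pandoc or of the statement filter.

-- Copy the named fields from `source` into a fresh table, skipping nil ones.
local function pick_fields(source, names)
  local picked = {}
  for _, name in ipairs(names) do
    if source[name] ~= nil then
      picked[name] = source[name]
    end
  end
  return picked
end

-- Remove one enclosing <p ...>...</p> pair and surrounding whitespace.
-- Assumes the string holds a single paragraph; anything else is
-- returned unchanged.
local function strip_para_wrapper(xml)
  local inner = xml:match('^%s*<p%f[%s>][^>]*>(.*)</p>%s*$')
  return inner or xml
end

-- Write a list of inlines as a JATS fragment. Only meaningful inside a
-- pandoc Lua filter, where `pandoc` and PANDOC_WRITER_OPTIONS exist.
local function inlines_to_jats(inlines)
  -- Assumption: these are the fields worth carrying over; `template`
  -- is deliberately left out so no preamble is produced.
  local opts = pandoc.WriterOptions(
    pick_fields(PANDOC_WRITER_OPTIONS, { 'cite_method', 'columns', 'wrap_text' })
  )
  local doc = pandoc.Pandoc({ pandoc.Plain(inlines) })
  -- The JATS writer wraps Plain in <p>, so strip that wrapper again.
  return strip_para_wrapper(pandoc.write(doc, 'jats', opts))
end
```

Inside a filter, the result could then be wrapped by hand, for example `pandoc.RawBlock('jats', '<title>' .. inlines_to_jats(title_inlines) .. '</title>')`, where `title_inlines` stands for whatever inlines the filter collected. The limitation from the first bullet still applies: anything another filter was meant to process later is already serialized at this point.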
* Re: Mixed Rawblock / Plain as Para in JATS output
From: Julien Dutant @ 2022-04-18 13:15 UTC
To: pandoc-discuss

And related question, shouldn't the WriterOptions Lua constructor include a `standalone` field?
* Re: Mixed Rawblock / Plain as Para in JATS output
From: Julien Dutant @ 2022-04-18 14:02 UTC
To: pandoc-discuss

Just noticed that Bastien Dumont faced the same issue when writing a filter for ODT output: https://github.com/jgm/pandoc/issues/7262 .
* Re: Mixed Rawblock / Plain as Para in JATS output
From: Bastien DUMONT @ 2022-04-18 14:03 UTC
To: pandoc-discuss

Hi! You may be interested in the following discussion: https://github.com/jgm/pandoc/issues/7262, which was originally about DOCX but has become more general. (Sadly, it is not even possible to use pandoc.write to achieve the desired result for DOCX.)
* Re: Mixed Rawblock / Plain as Para in JATS output
From: Julien Dutant @ 2022-04-18 15:56 UTC
To: pandoc-discuss

Thanks Bastien - indeed, I posted there now.