public inbox archive for pandoc-discuss@googlegroups.com
* Mixed Rawblock / Plain as Para in JATS output
From: Julien Dutant @ 2022-04-18 13:10 UTC
  To: pandoc-discuss



Hi all,

I'm writing a new Lua filter to handle theorems and the like in various 
formats, including JATS (https://github.com/jdutant/statement). Given the 
following markdown and a suitable bibliography:

```markdown
::: theorem
(from @article) Some very interesting fact holds.
:::
```

The filter is trying to generate a JATS statement ( 
https://jats.nlm.nih.gov/archiving/tag-library/1.1/element/statement.html ):
```xml
<statement>
<label>Theorem 1.1</label>
<title>from Doe, J (2003)</title>
<p>Some very interesting fact holds.</p>
</statement>
```

The problem I encounter is that the JATS writer converts pandoc.Plain to 
<p> blocks. So if I build a list of inlines:
inlines = { pandoc.RawInline('jats','<title>'), ... more inlines ..., 
pandoc.RawInline('jats','</title>')}

and try to insert it in the document with:
blocks:insert( pandoc.Plain(inlines) )

I get the unintended output:
```xml
<p><title>from Doe, J (2003)</title></p>
```

Now I could of course stringify the title inlines first, add the title tags 
and insert the result in a RawBlock. But that's bad too, because the title 
inlines may contain things that shouldn't be stringified yet, such as a 
citation.
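
For illustration, the stringify fallback described above might look like the sketch below. The helper names `escape_xml` and `title_block` are made up for this example and are not taken from the filter. It also shows the drawback: anything in the inlines that is not plain text yet, such as an unprocessed citation, is flattened.

```lua
-- Sketch only: flatten the title inlines to text and emit the whole
-- <title> element as a single raw JATS block.
-- `escape_xml` and `title_block` are hypothetical helper names.

-- Escape the characters that are special in XML text content.
local function escape_xml(text)
  local escaped = text:gsub('&', '&amp;'):gsub('<', '&lt;'):gsub('>', '&gt;')
  return escaped
end

-- Build a raw <title> block from a list of inlines.
-- Wrapping the inlines in a Plain lets stringify treat them as one element.
local function title_block(inlines)
  local text = pandoc.utils.stringify(pandoc.Plain(inlines))
  return pandoc.RawBlock('jats', '<title>' .. escape_xml(text) .. '</title>')
end
```

Inside the filter this would be used as `blocks:insert(title_block(inlines))`, at the cost of losing any structure in the title.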

Is it necessary for the JATS writer to turn Plain elements into <p> ones? 
The HTML one doesn't, after all.

At the moment my best approach is to use `pandoc.write` (thanks so much 
Albert for giving us this!) to convert the inlines on the spot. There are 
some issues with this though.

* If something in the inlines needs to be handled by another filter, it 
won't be handled properly. For instance, a pandoc-crossref cross-reference 
will be turned into plain text and become unretrievable. AFAIK my filter 
can't tell which other filters are run, and can't pass them as 
WriterOptions anyway.
* If I don't pass PANDOC_WRITER_OPTIONS, pandoc.write won't know which 
citation_mode to use, and may lack other relevant settings to write the 
inlines.
* If I pass all of PANDOC_WRITER_OPTIONS, pandoc.write uses standalone mode 
if the document is in standalone mode, so I get a whole preamble within my 
label. Yet there is no standalone setting in WriterOptions, so I can't unset 
it (I've tried removing `template`, but it didn't work).
* So my best guess so far is to create a new WriterOptions by copying only 
those fields of  PANDOC_WRITER_OPTIONS that might be useful to format the 
inlines.
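
A minimal sketch of that last idea, assuming a pandoc version recent enough to provide `pandoc.write` and `pandoc.WriterOptions`. The names `USEFUL_FIELDS`, `copy_fields` and `write_inlines` are hypothetical, and the list of fields is only a guess at what might matter; as far as I know the citation setting is exposed as `cite_method`.

```lua
-- Sketch only: render inlines with pandoc.write, copying a few fields
-- from PANDOC_WRITER_OPTIONS instead of passing the whole object.

-- Hypothetical selection of fields; adjust to what the inlines need.
-- `template` is deliberately left out.
local USEFUL_FIELDS = { 'cite_method', 'columns', 'wrap_text' }

-- Copy only the named fields from `source` into a fresh table.
local function copy_fields(source, names)
  local copy = {}
  for _, name in ipairs(names) do
    copy[name] = source[name]
  end
  return copy
end

-- Write a list of inlines to `format` (e.g. 'jats') and return the string.
local function write_inlines(inlines, format)
  local opts = pandoc.WriterOptions(
    copy_fields(PANDOC_WRITER_OPTIONS, USEFUL_FIELDS))
  local doc = pandoc.Pandoc({ pandoc.Plain(inlines) })
  return pandoc.write(doc, format, opts)
end
```

Note that the inlines still pass through the writer as a Plain block, so for JATS the returned string would presumably still be wrapped in <p> tags that have to be stripped before adding the <title> tags. Whether leaving out `template` is enough to avoid standalone output is an assumption here, not something this sketch confirms.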

All in all, it's pretty heavy-handed just to handle a label of inlines. Is 
there a better approach? Any chance that the JATS writer could emit Plain 
blocks without <p> tags?

Best,
J

