public inbox archive for pandoc-discuss@googlegroups.com
From: BPJ <melroch-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Denis Maier <denis.maier.lists-cl+VPiYnx/1AfugRpC6u6w@public.gmane.org>
Cc: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: Set language attribute directly on
Date: Wed, 23 Sep 2020 09:45:07 +0200
Message-ID: <CADAJKhD9Cvd1dY6svqCro=QpZEtn13T3xx8H2HAi78Z7CMq7qQ@mail.gmail.com>
In-Reply-To: <63355b45-399b-023c-cc49-ca91a0347ab9-cl+VPiYnx/1AfugRpC6u6w@public.gmane.org>

On Sun, 20 Sep 2020 at 11:04, Denis Maier <denis.maier.lists-cl+VPiYnx/1AfugRpC6u6w@public.gmane.org> wrote:

> Yes, that's probably right.
>
> I'm currently in the process of figuring out the workflow. I was hoping I
> could get away with not having too many tools involved. But maybe using an XML
> processing library with Python or some XSLT could be an easier solution
> than using pandoc filters for this. We'll see...
>

I don't know what, if anything, the JATS writer makes of Pandoc spans and
divs. You may well need both: one or more Pandoc filters to replace Pandoc
spans and divs with raw markup that is more palatable to JATS *and* an XML
processing library.
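
For the XML side, here is a minimal sketch of what such a postprocessing
step could look like with Python's standard `xml.etree.ElementTree`. It
assumes the shape shown in the question quoted further down (a `boxed-text`
carrying `xml:lang` and wrapping exactly one `disp-quote`); the function
name `push_lang_down` is made up for illustration, and anything else is
left untouched.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def push_lang_down(root):
    """Unwrap <boxed-text xml:lang=...> around a single <disp-quote>
    and copy the language onto the quote's <p> children."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    for box in list(root.iter("boxed-text")):
        lang = box.get(XML_LANG)
        children = list(box)
        parent = parents.get(box)
        if parent is None or lang is None:
            continue  # root element, or no language to move
        if len(children) != 1 or children[0].tag != "disp-quote":
            continue  # not the simple wrapper shape assumed here
        quote = children[0]
        for p in quote.findall("p"):
            if p.get(XML_LANG) is None:  # keep an explicit language if set
                p.set(XML_LANG, lang)
        # Put the quote where the box was, keeping trailing whitespace.
        index = list(parent).index(box)
        quote.tail = box.tail
        parent.remove(box)
        parent.insert(index, quote)
    return root
```

Usage would be along the lines of `tree = ET.parse(path)`, then
`push_lang_down(tree.getroot())`, then `tree.write(...)`. ElementTree does
not keep the DOCTYPE declaration, so a real JATS file may need it restored
afterwards or a different XML library altogether.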

A benefit of using a Python XML library is that you can read in
configuration from a YAML file, though obviously you get the same benefit
if you write your filters in Python instead of Lua. Also, as I said
earlier, if the format is just one `KEY: VALUE` per line it is trivial to
open the file in Lua, match each line against the pattern
`^%s*([_%w]+)%s*:%s*(.*)` (note that `%w` does not include the underscore
in Lua!) and populate a table with the captures. Note also that you can put
Lua libraries both in the Lua search path and in `~/.pandoc` and load them
from a Lua filter!
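
For comparison, the same one-`KEY: VALUE`-per-line parse can be sketched in
Python, in case the filters or the postprocessing end up there rather than
in Lua. The function name `parse_config` is only illustrative; lines that
do not match (blank lines, comments) are silently skipped, which is a
design choice of this sketch rather than anything the format requires.

```python
import re

# Counterpart of the Lua pattern: optional whitespace, a key made of
# word characters, a colon, and the rest of the line as the value.
# Unlike Lua's %w, Python's \w already includes the underscore.
LINE = re.compile(r"^\s*(\w+)\s*:\s*(.*)$")


def parse_config(text):
    """Return a dict built from the `KEY: VALUE` lines in `text`."""
    config = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            key, value = match.groups()
            config[key] = value.rstrip()
    return config
```

Only the first colon on a line separates key from value, so a value such
as a URL survives intact.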

Anyway, you can probably hide most of the ugliness in Pandoc defaults
files, Makefiles and/or wrapper scripts. I use some Pandoc filters,
pp <https://github.com/CDSoft/pp> macros and custom LaTeX files injected
into the preamble, plus a similar set of tools for generating HTML from the
same Markdown sources. (pp can also convert CSV files to Markdown tables,
fill in Mustache templates and inject the output of Python/Lua/shell code;
I can even run Perl code through "shell" code with a suitable shebang
line!) Everything is administered through one Pandoc defaults file per
output format and a Makefile for generating sets of files or single files,
with the source and target specified as Makefile variables on the command
line. Thus I rarely need to type long command lines: typically I give just
the make target and possibly a variable or two, and everything just
happens, including removal of temporary files if any, although I can
usually just pipe everything from pp into pandoc.

I usually have one directory per output format for the related files
(`pdf-config`, `html-config`), with the files in each directory called
`pandoc.yaml`, `header.ltx`, `macros.pp.md`, so the paths can be
constructed by injecting a variable for the output format. The exceptions
are CSS files, which go into a `css` directory and are linked from each
HTML file, and filters used for both output formats (Lua and Perl in my
case), which go into a `filters` directory, or live in
`~/.pandoc/filters/` or `~/bin/` if they are useful for multiple projects.
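
To make the path construction concrete, here is a small sketch, written in
Python rather than as a Makefile, of how the per-format file names could be
derived from a single format variable. The function names are hypothetical,
and the pp invocation assumes that pp processes the files given on its
command line in order and writes to standard output; `--defaults` and
`--output` are regular pandoc options. The commands are only built, not
run.

```python
from pathlib import PurePosixPath


def config_paths(fmt):
    """Per-format config files for output format `fmt`, e.g. "pdf"."""
    cfg = PurePosixPath(f"{fmt}-config")
    return {
        "defaults": cfg / "pandoc.yaml",
        "header": cfg / "header.ltx",
        "macros": cfg / "macros.pp.md",
    }


def pipeline(fmt, source, target):
    """Build the two commands of a `pp ... | pandoc ...` pipeline."""
    paths = config_paths(fmt)
    # Assumption: pp reads the macro file first, then the source,
    # and prints the preprocessed Markdown to stdout.
    pp_cmd = ["pp", str(paths["macros"]), source]
    # pandoc then reads stdin, with all other settings in the defaults file.
    pandoc_cmd = [
        "pandoc",
        "--defaults", str(paths["defaults"]),
        "--output", target,
    ]
    return pp_cmd, pandoc_cmd
```

In a Makefile the same idea amounts to one variable for the format that is
interpolated into the config directory name.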

Well I guess most of this is obvious, but perhaps something in my workflow
is useful.


> On 19.09.2020 at 21:05, BPJ wrote:
>
> On Sat, 19 Sep 2020 at 13:14, Denis Maier <denis.maier.lists-cl+VPiYnx/1AfugRpC6u6w@public.gmane.org>
> wrote:
>
>> Hi
>>
>> Is it possible to set the language attribute directly on a paragraph or
>> quotation?
>>
>> The documentation gives this example:
>>
>> ::: {lang=fr-CA}
>> > Cette citation est écrite en français canadien.
>> :::
>>
>> But, in JATS, this ends up as
>>
>> <boxed-text xml:lang="fr-CA">
>>   <disp-quote>
>>     <p>Cette citation est écrite en français canadien.</p>
>>   </disp-quote>
>> </boxed-text>
>>
>> Is there a simple way to set the attribute directly on a paragraph or a
>> quotation? To get this:
>>
>>     <disp-quote>
>>       <p xml:lang="fr-CA">Cette citation est écrite en français canadien.</p>
>>     </disp-quote>
>>
>
> Judging from this and your other recent posts you might benefit from
> postprocessing your JATS with an XML processing library in Perl/Python.
> Maybe XSLT. That would be like a Pandoc filter but you filter the XML
> output by pandoc rather than the Pandoc AST before the XML is generated.
>
> Just a thought.
>
> /bpj
>
>
>
>
>> Best,
>> Denis
>> --
>> You received this message because you are subscribed to the Google Groups
>> "pandoc-discuss" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/pandoc-discuss/efcd2673-1f02-fd29-c2e1-19b90f7d0ba2%40mailbox.org
>> <https://groups.google.com/d/msgid/pandoc-discuss/efcd2673-1f02-fd29-c2e1-19b90f7d0ba2%40mailbox.org?utm_medium=email&utm_source=footer>
>> .
>>


Thread overview: 5+ messages
2020-09-19 11:13 Denis Maier
     [not found] ` <efcd2673-1f02-fd29-c2e1-19b90f7d0ba2-cl+VPiYnx/1AfugRpC6u6w@public.gmane.org>
2020-09-19 17:41   ` John MacFarlane
2020-09-19 19:05   ` BPJ
     [not found]     ` <CADAJKhDsuZQVxqNP3bQAFqPHvgysG40WkgAPUtebadM27zvduA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2020-09-20  9:04       ` Denis Maier
     [not found]         ` <63355b45-399b-023c-cc49-ca91a0347ab9-cl+VPiYnx/1AfugRpC6u6w@public.gmane.org>
2020-09-23  7:45           ` BPJ [this message]
