From: BP Jonsson <bpjonsson-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org,
Sean Winslow <mrspot-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
Subject: Re: Pandoc selectively transfers glyphs from LuaLaTeX to DOCX
Date: Mon, 24 Jul 2017 21:27:28 +0200
Message-ID: <07b8ac66-75e5-04b8-b39c-d60157171baf@gmail.com>
In-Reply-To: <261e84b1-9891-465a-a21e-80a61b9e98c0-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
On 2017-07-24 at 17:01, Sean Winslow wrote:
> BPJ,
>
> I know nothing about filters in pandoc--what would you suggest as
> a starting place to learn more? Would these potentially help me
> with any of the issues above?
>
Actually the main problem with your LaTeX is that you are using
the legacy LaTeX accent commands instead of actual Unicode characters.
For one thing you shouldn't do that, because the main reason for
using XeTeX or LuaTeX is that they handle Unicode natively.
Secondly, it is exactly the legacy accent commands which trip up
Pandoc in your MWE. Once I had converted the legacy commands to
their Unicode equivalents, Pandoc converted your MWE to DOCX
just fine. (As I don't have Word I checked it in LibreOffice,
where it looks OK.)

Luckily you don't need to convert all those legacy commands by
hand. There is a Perl module, LaTeX::Decode, which does that for
you. Unfortunately there is a bug in the command line script that
comes with the module, but I have written my own CLI script which
doesn't have that bug. :-)
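
To make concrete what "converting the legacy commands to their Unicode equivalents" means, here is a minimal Python sketch. It is not the ltx2utf8.pl script and not how LaTeX::Decode is implemented. The names `COMBINING`, `decode_accents` and `_compose` are made up for this illustration, and only a handful of accent commands are covered. The idea is to map each accent command to a Unicode combining character, append it to the base letter, and let NFC normalization produce the precomposed character.

```python
import re
import unicodedata

# Legacy LaTeX accent command -> Unicode combining character.
# Illustrative subset only; a real converter covers many more commands.
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis
    "~": "\u0303",  # tilde
    "=": "\u0304",  # macron
    ".": "\u0307",  # dot above
    "c": "\u0327",  # cedilla
    "v": "\u030C",  # caron
    "u": "\u0306",  # breve
    "H": "\u030B",  # double acute
}

# Symbol accents may be followed directly by the letter: \'e or \'{e}.
SYMBOL_ACCENT = re.compile(
    r"""\\(['`^"~=.])\s*(?:\{(\\[ij]|[A-Za-z])\}|([A-Za-z]))"""
)
# Letter accents need braces or whitespace, so \cite, \vspace etc. are left alone.
LETTER_ACCENT = re.compile(
    r"\\([cvuH])(?:\{(\\[ij]|[A-Za-z])\}|\s+([A-Za-z]))"
)


def _compose(match):
    accent = match.group(1)
    base = match.group(2) or match.group(3)
    base = base.lstrip("\\")  # dotless \i and \j become plain i and j
    # NFC turns base + combining mark into one precomposed character where one exists.
    return unicodedata.normalize("NFC", base + COMBINING[accent])


def decode_accents(text):
    """Replace the supported legacy accent commands with Unicode characters."""
    text = SYMBOL_ACCENT.sub(_compose, text)
    return LETTER_ACCENT.sub(_compose, text)


if __name__ == "__main__":
    print(decode_accents(r"Erd\H{o}s and fa\c{c}ade, see \cite{key}"))
```

The sketch ignores verbatim environments, math mode and outer grouping braces such as `{\'e}`, which is part of why a dedicated module is the better tool for a real manuscript.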
Since you are on a Mac you should have a new enough version of
perl installed already. All you should need to do is to download
my script from <https://git.io/v7to6>, unpack the contents into the
same directory (aka folder) as your original LaTeX file and run
the following commands:

    cpan App::cpanminus
    cpanm LaTeX::Decode Encode Unicode::Normalize Getopt::Long Pod::Usage
    perl ltx2utf8.pl nameofyourlatexfile.tex | pandoc -r latex -o nameofyourdocxfile.docx

That will at least take care of the diacritics. Other fancy things
you have used, like TikZ, will need to be addressed separately. I
have a somewhat working script which extracts the tikzpictures from
a LaTeX file, compiles each to a PDF and prints out the LaTeX file
with each `\begin{tikzpicture}...\end{tikzpicture}` replaced with
an `\includegraphics{...}` pointing to the right PDF file. I just
tried converting a LaTeX file processed that way to DOCX. It worked,
but for some reason the fonts were lost in the DOCX. If I'm not
mistaken, your publisher will want the image files separately
anyway. This latter script lacks some necessary documentation,
which I have no time to write today. Let me know if you are
interested.
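
The extract-and-replace step described above could be sketched roughly as follows in Python. This is not the script mentioned in the message. The function names `extract_tikz` and `standalone_source`, and the `tikz-figure-N` file naming, are assumptions made for the illustration. The sketch only rewrites the LaTeX and collects the picture sources. Compiling each one to PDF (for example with lualatex) is left out, and nested tikzpictures are not handled.

```python
import re

# Non-greedy match of one tikzpicture environment, newlines included.
TIKZ_RE = re.compile(
    r"\\begin\{tikzpicture\}.*?\\end\{tikzpicture\}", re.DOTALL
)


def extract_tikz(latex, stem="tikz-figure"):
    """Return (rewritten_latex, pictures).

    pictures maps a base file name to the TikZ source it replaced;
    the rewritten LaTeX refers to <name>.pdf via \\includegraphics.
    """
    pictures = {}

    def replace(match):
        name = f"{stem}-{len(pictures) + 1}"
        pictures[name] = match.group(0)
        return f"\\includegraphics{{{name}.pdf}}"

    return TIKZ_RE.sub(replace, latex), pictures


def standalone_source(tikz):
    """Wrap one picture in a standalone document that can be compiled on its own.

    Any TikZ libraries or font setup from the original preamble
    would have to be added here by hand.
    """
    return (
        "\\documentclass[tikz]{standalone}\n"
        "\\begin{document}\n"
        f"{tikz}\n"
        "\\end{document}\n"
    )
```

Writing each `standalone_source(...)` result to `<name>.tex` and compiling it would give the PDF files that the rewritten `\includegraphics` lines point to.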
/bpj