public inbox archive for pandoc-discuss@googlegroups.com
From: Sean Winslow <mrspot-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: Pandoc selectively transfers glyphs from LuaLaTeX to DOCX
Date: Mon, 24 Jul 2017 08:01:06 -0700 (PDT)
Message-ID: <261e84b1-9891-465a-a21e-80a61b9e98c0@googlegroups.com>
In-Reply-To: <b4abf81b-74e7-490a-8cb9-f6a313c651e0-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>


John,

Thank you for the quick response, and for adding that! I currently have the 
release version of pandoc installed from Homebrew, but I will remove it and 
compile from the master branch late this evening to test the fix.

> Can you say more about what you're doing?  Are you 
> converting this latex to some other format?  If so, 
> which? 


I wrote a dissertation on Ethiopian scribal practices which uses a lot of 
LaTeX features (fig, subfig, pdfparcols, tikz, datatool, special 
diacritics, font-switching for Ethiopic, Arabic, and Greek). It has been 
accepted (with revisions) for publication, but I need to get the file into 
docx for the publisher, so that it fits their InDesign workflow. Luckily, 
they do not want the images included, so I am going to write a macro that 
replaces each figure with its file name and caption. I realize that all the 
parcolumns/tikz/datatool material is probably a complete loss and will need 
to be redone by hand. However, there is so much Ethiopic text and 
transcribed Ethiopic that replacing it all by hand would be a nightmare, so 
I am very keen to transfer that over automatically. After I recompile from 
the master branch, I will still be trying to figure out these issues:

1. As written, the text is also heavily cross-referenced, but labels do not 
seem to be transferring over. Is there a procedure for making \label and 
\ref work, or do I need to fix every one by hand?

2. In LaTeX, I have a 
\renewcommand{\includegraphics}[2][]{%
    {(((\url{#2})))}% print file name in a small box with triple parens
}
which lists the name of the file and the caption. In the pandoc-created 
docx, the caption and the optional table of figures caption print twice, 
without the filename. Is there something wrong with the syntax of my 
renewcommand?

3. The Ethiopic text transfers over correctly, but since my main font 
(Brill) does not contain Ethiopic glyphs, I have a
\newfontfamily\ethiopicfont[Script=Ethiopic]{Abyssinica SIL}
set up. In the docx, I see blocks (missing-glyph boxes), which render 
correctly once I change the font by hand to Abyssinica SIL. What option do I 
need to pass to pandoc to get it to set the Ethiopic text in a different 
font?
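
To make issue 3 concrete, here is a minimal Python sketch of the first step 
any automatic fix would need: finding the Ethiopic runs in a string by code 
point. The four ranges are the Unicode Ethiopic blocks; the names 
is_ethiopic and split_runs are hypothetical, and nothing here calls pandoc. 
Whether each run can then be tagged for a second font in the docx (for 
example through a character style) is an assumption that would have to be 
checked against the installed pandoc version.

```python
from itertools import groupby

# Unicode blocks covering Ethiopic script (inclusive code point ranges).
ETHIOPIC_BLOCKS = (
    (0x1200, 0x137F),  # Ethiopic
    (0x1380, 0x139F),  # Ethiopic Supplement
    (0x2D80, 0x2DDF),  # Ethiopic Extended
    (0xAB00, 0xAB2F),  # Ethiopic Extended-A
)


def is_ethiopic(ch):
    """Return True if the character lies in one of the Ethiopic blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ETHIOPIC_BLOCKS)


def split_runs(text):
    """Split text into (is_ethiopic, run) pairs, preserving order."""
    return [
        (flag, "".join(chars))
        for flag, chars in groupby(text, key=is_ethiopic)
    ]


if __name__ == "__main__":
    # Hypothetical sample: Latin text around three Ethiopic letters.
    sample = "The word \u130d\u12d5\u12dd appears here."
    for flag, run in split_runs(sample):
        print("ETHIOPIC" if flag else "OTHER", ascii(run))
```

Spaces and Latin punctuation between Ethiopic words come out as separate 
non-Ethiopic runs, which is harmless if the main font covers them.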

BPJ,

I know nothing about filters in pandoc--what would you suggest as a 
starting place to learn more? Would these potentially help me with any of 
the issues above?
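
For context, a pandoc filter is a program that receives the parsed document 
as JSON, changes it, and hands it back to pandoc. Below is a minimal Python 
sketch of the transformation such a filter could apply for issue 2, 
replacing each image with its file name in triple parentheses. It assumes 
the JSON layout in which every element is an object with a "t" (type) key 
and an optional "c" (contents) key, and in which an Image's contents are 
[attributes, caption inlines, [url, title]]. The names walk and 
image_to_filename are hypothetical, and the sample document is hand-built 
rather than real pandoc output.

```python
import json


def walk(node, fn):
    """Return a copy of node, replacing any element for which fn returns a value."""
    if isinstance(node, list):
        return [walk(item, fn) for item in node]
    if isinstance(node, dict):
        if "t" in node:
            replaced = fn(node)
            if replaced is not None:
                return replaced
        return {key: walk(value, fn) for key, value in node.items()}
    return node


def image_to_filename(elem):
    """Turn an Image element into a Str holding the file name in triple parens."""
    if elem.get("t") == "Image":
        _attr, _caption, target = elem["c"]  # target is [url, title]
        return {"t": "Str", "c": "(((" + target[0] + ")))"}
    return None  # leave every other element unchanged


if __name__ == "__main__":
    # Hand-built sample document (assumed layout, not captured from pandoc).
    doc = {
        "meta": {},
        "blocks": [
            {"t": "Para", "c": [
                {"t": "Str", "c": "See"},
                {"t": "Space"},
                {"t": "Image", "c": [
                    ["", [], []],
                    [{"t": "Str", "c": "caption"}],
                    ["fig1.png", "fig:"],
                ]},
            ]},
        ],
    }
    print(json.dumps(walk(doc, image_to_filename)))
```

A real filter would read the document with json.load(sys.stdin), write the 
result with json.dump to sys.stdout, and be passed to pandoc with the 
--filter option. This sketch drops the caption; it sits in the second slot 
of the Image contents and could be kept alongside the file name.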

Thanks,

-Sean

On Saturday, July 22, 2017 at 12:37:26 PM UTC-4, Sean Winslow wrote:
>
> I am trying to convert a dissertation from LaTeX to Word, in order to 
> comply with publisher requirements. Part of why I used LaTeX is my need for 
> complicated diacritics in transcriptions, which XeLaTeX/LuaLaTeX and the 
> dblaccent package made easy. Now, when I use pandoc to output to docx, 
> certain glyphs are missing. See, for example, \b{q} in Maqala and \v{\d{C}} 
> in Chelaqwot:
>
>
> LuaLaTeX (or XeLaTeX) produces this:
>
>
> <https://lh3.googleusercontent.com/-Nto3OG7FE8c/WXN9zv4_J3I/AAAAAAAAB3k/D5NTrxzrD2cjsdAQ9cWQUUkFTZzQYoYuwCLcBGAs/s1600/screenshot_latex.png>
>
> But this is what I see in Word:
>
>
> <https://lh3.googleusercontent.com/-Rjn4Lnkaxx8/WXN9_oO77FI/AAAAAAAAB3o/aa-KxIii0jwM3zt0OO1PqI9RkIDu1TfQgCLcBGAs/s1600/screenshot_word.png>
>
>
> Here is my MWE: 
>
> %!TEX TS-program = lualatex
> %!TEX encoding = UTF-8 Unicode
>
> \documentclass[a4]{memoir}
>
> %packages
> \usepackage{fontspec}
> \usepackage{dblaccnt}
>
> \usepackage{savesym}
> \savesymbol{U}
> \savesymbol{T}
> \usepackage{semtrans}
>
> %newcommands
> \newcommand{\schwa}{ǝ}
> \newcommand{\mekele}{M\"{a}\b{q}\"{a}l\"{a}}
> \newcommand{\chelekot}{\d{\v{C}}el\={a}qwot S\schwa{}lasse}
>
> \defaultfontfeatures{Mapping=tex-text}
> \setromanfont[Mapping=tex-text]{Brill}
>
> \begin{document}
>
> The two research locations visited were \mekele{} and \chelekot{}.\par
>
> \end{document}
>
> and the pandoc command I am using to convert it:
>
> pandoc test.tex \
>     --from=latex \
>     --to=docx \
>     --output=test.docx \
>     --latex-engine=lualatex \
>     --reference-docx=test_ref.docx \
>     -S \
>     -R
>
> The reference-docx is just the output, but changed to use Brill as the 
> font.
>
> Is there any way to have pandoc pass along the special diacritics I need? 
> Re-doing all of them by hand will be a nightmare, and is a lot of the 
> reason I am learning pandoc.
>
