From: Sean Winslow
Newsgroups: gmane.text.pandoc
To: pandoc-discuss
Subject: Re: Pandoc selectively transfers glyphs from LuaLaTeX to DOCX
Date: Mon, 24 Jul 2017 08:01:06 -0700 (PDT)
Message-ID: <261e84b1-9891-465a-a21e-80a61b9e98c0@googlegroups.com>

John,

Thank you for the quick response, and for adding that! I currently have the release branch of pandoc installed from Homebrew, but I will remove it and compile from the master branch late this evening to test out the solution.

> Can you say more about what you're doing? Are you converting this latex to some other format? If so, which?

I wrote a dissertation on Ethiopian scribal practices which uses a lot of LaTeX features (fig, subfig, pdfparcols, tikz, datatool, special diacritics, font-switching for Ethiopic, Arabic, Greek). It has been accepted (with revisions) for publication, but I need to get the file into docx for the publisher so that it fits their InDesign workflow. Luckily, they do not want the images included, so I am going to write a macro that replaces each figure with the figure name and caption. I realize that all the parcolumns/tikz/datatool material is probably a complete loss and will need to be redone by hand. There is so much Ethiopic text and transcribed Ethiopic, however, that replacing it all by hand would be a nightmare, so I am very keen to transfer that over automatically. After I recompile from the master branch, I will still be trying to figure out these issues:

1. As written, the text is also heavily cross-referenced, but labels do not seem to be transferring over. Is there a procedure for making \label and \ref work, or do I need to fix every one by hand?

2. In LaTeX, I have a

\renewcommand{\includegraphics}[2][]{%
    {(((\url{#2})))}% print file name in a small box with triple parens
}

which lists the name of the file and the caption. In the pandoc-created docx, the caption and the optional table-of-figures caption print twice, without the file name. Is there something wrong with the syntax of my renewcommand?

3. The Ethiopic text transfers over correctly, but since my main font (Brill) does not contain Ethiopic glyphs, I have a

\newfontfamily\ethiopicfont[Script=Ethiopic]{Abyssinica SIL}

set up. In the docx, I see blocks, which render correctly when I change the font by hand to Abyssinica. What option do I need to pass to pandoc to get it to set the \ethiopicfont text in a different font?
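
On the second issue above, one alternative to redefining \includegraphics in the preamble might be to rewrite the source before pandoc reads it. What follows is only a sketch under assumptions: the function name `stub_includegraphics` and the sample file name are made up for illustration, and the regular expression only handles the simple `\includegraphics[options]{file}` form, with no nested braces or brackets in either argument.

```python
import re

# Matches \includegraphics, an optional [..] argument, and a {file} argument.
# Assumption: neither argument contains nested braces or brackets.
INCLUDEGRAPHICS = re.compile(
    r"\\includegraphics\s*(?:\[[^\]]*\])?\s*\{([^{}]*)\}"
)


def stub_includegraphics(tex: str) -> str:
    """Replace each \\includegraphics call with (((\\url{file}))), as in the macro."""
    # A function is used as the replacement so that backslashes in the
    # output are not interpreted as regex escapes.
    return INCLUDEGRAPHICS.sub(
        lambda m: "(((\\url{" + m.group(1) + "})))", tex
    )


if __name__ == "__main__":
    # Hypothetical figure, for illustration only.
    sample = (
        r"\begin{figure}"
        r"\includegraphics[width=\textwidth]{img/scribe.png}"
        r"\caption{A scribe}"
        r"\end{figure}"
    )
    print(stub_includegraphics(sample))
```

The rewritten file could then be passed to pandoc in place of the original. The figure environment and \caption are left untouched, so this does not by itself address the doubled caption.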

BPJ,

I know nothing about filters in pandoc--what would you suggest as a starting place to learn more? Would these potentially help me with any of the issues above?
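
To make the question concrete, here is a rough sketch of what a filter for the third issue might look like, written against pandoc's JSON representation of a document using only the Python standard library. It rests on several assumptions: the style name "Ethiopic" is hypothetical and would have to be a character style set to Abyssinica SIL in the reference docx, the docx writer has to honor the `custom-style` attribute on spans, and the function names are made up for illustration.

```python
import json

# Unicode blocks that hold Ethiopic script.
ETHIOPIC_RANGES = [
    (0x1200, 0x137F),  # Ethiopic
    (0x1380, 0x139F),  # Ethiopic Supplement
    (0x2D80, 0x2DDF),  # Ethiopic Extended
    (0xAB00, 0xAB2F),  # Ethiopic Extended-A
]

STYLE_NAME = "Ethiopic"  # hypothetical character style in the reference docx


def has_ethiopic(text):
    """Return True if any character of text falls in an Ethiopic block."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in ETHIOPIC_RANGES)


def wrap(node):
    """Copy a pandoc JSON AST, wrapping Str nodes that contain Ethiopic in a Span."""
    if isinstance(node, list):
        return [wrap(item) for item in node]
    if isinstance(node, dict):
        if node.get("t") == "Str" and has_ethiopic(node["c"]):
            # Span content is [attr, inlines]; attr is [id, classes, key-value pairs].
            attr = ["", [], [["custom-style", STYLE_NAME]]]
            return {"t": "Span", "c": [attr, [node]]}
        return {key: wrap(value) for key, value in node.items()}
    return node


# Small inline fragment instead of a full document, for illustration only.
fragment = {
    "t": "Para",
    "c": [
        {"t": "Str", "c": "\u1230\u120b\u121d"},
        {"t": "Space"},
        {"t": "Str", "c": "hello"},
    ],
}
print(json.dumps(wrap(fragment)))
```

Used as an actual filter, the script would read the whole JSON document from standard input, apply `wrap`, and write the result to standard output. A word that mixes Ethiopic and Latin characters is wrapped as a whole, which is a simplification.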

Thanks,

-Sean

On Saturday, July 22, 2017 at 12:37:26 PM UTC-4, Sean Winslow wrote:

> I am trying to convert a dissertation from LaTeX to Word in order to comply with publisher requirements. Part of why I used LaTeX is my need for complicated diacritics in transcriptions, which XeLaTeX/LuaLaTeX and the dblaccnt package made easy. Now, when I use pandoc to output to docx, certain glyphs are missing. See, for example, \b{q} in Maqala and \v{\d{C}} in Chelaqwot:
>
> LuaLaTeX (or XeLaTeX) produces this:
>
> But this is what I see in Word:
>
> Here is my MWE:
>
> %!TEX TS-program = lualatex
> %!TEX encoding = UTF-8 Unicode
>
> \documentclass[a4]{memoir}
>
> %packages
> \usepackage{fontspec}
> \usepackage{dblaccnt}
>
> \usepackage{savesym}
> \savesymbol{U}
> \savesymbol{T}
> \usepackage{semtrans}
>
> %newcommands
> \newcommand{\schwa}{ǝ}
> \newcommand{\mekele}{M\"{a}\b{q}\"{a}l\"{a}}
> \newcommand{\chelekot}{\d{\v{C}}el\={a}qwot S\schwa{}lasse}
>
> \defaultfontfeatures{Mapping=tex-text}
> \setromanfont[Mapping=tex-text]{Brill}
>
> \begin{document}
>
> The two research locations visited were \mekele{} and \chelekot{}.\par
>
> \end{document}
>
> and the pandoc command I am using to convert it:
>
> pandoc test.tex \
>     --from=latex \
>     --to=docx \
>     --output=test.docx \
>     --latex-engine=lualatex \
>     --reference-docx=test_ref.docx \
>     -S \
>     -R
>
> The reference-docx is just the output, but changed to use Brill as the font.
>
> Is there any way to have pandoc pass along the special diacritics I need? Redoing all of them by hand will be a nightmare, and avoiding that is a lot of the reason I am learning pandoc.
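
For checking the docx output, it may help to know which Unicode sequences the accent macros in the example above correspond to. The sketch below is not how pandoc handles them; it only uses Python's standard `unicodedata` module to build the expected strings. The `COMBINING` table and the `accent` helper are names made up for illustration, and the table covers only the five accent commands that appear in the MWE.

```python
import unicodedata

# LaTeX accent command -> Unicode combining mark (only those used in the MWE).
COMBINING = {
    '"': "\u0308",  # \"  combining diaeresis
    "=": "\u0304",  # \=  combining macron
    "b": "\u0331",  # \b  combining macron below (bar under)
    "d": "\u0323",  # \d  combining dot below
    "v": "\u030C",  # \v  combining caron
}


def accent(base, *commands):
    """Attach the marks for the given accent commands to base, NFC-normalized."""
    marks = "".join(COMBINING[c] for c in commands)
    return unicodedata.normalize("NFC", base + marks)


# \mekele: M\"{a}\b{q}\"{a}l\"{a}
mekele = (
    "M" + accent("a", '"') + accent("q", "b")
    + accent("a", '"') + "l" + accent("a", '"')
)
# First letter of \chelekot: \d{\v{C}}
chelekot_initial = accent("C", "v", "d")

# Print code points rather than glyphs so the output does not depend on fonts.
print([f"U+{ord(ch):04X}" for ch in mekele])
print([f"U+{ord(ch):04X}" for ch in chelekot_initial])
```

There is no precomposed character for q with a bar below, so that letter always remains a base letter followed by a combining mark, and whether it displays correctly then depends on the font and on the application's support for combining marks.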
