From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.io/gmane.text.pandoc/17846 Path: news.gmane.org!.POSTED!not-for-mail From: Alan Storm Newsgroups: gmane.text.pandoc Subject: Re: Pandoc, XeLaTex, and Hebrew Characters Date: Tue, 20 Jun 2017 15:21:44 -0700 (PDT) Message-ID: References: <3d7a8fd7-346d-4c74-ad52-e137c9509719@googlegroups.com> Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org NNTP-Posting-Host: blaine.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_1949_1972603551.1497997305220" X-Trace: blaine.gmane.org 1497997307 30665 195.159.176.226 (20 Jun 2017 22:21:47 GMT) X-Complaints-To: usenet@blaine.gmane.org NNTP-Posting-Date: Tue, 20 Jun 2017 22:21:47 +0000 (UTC) To: pandoc-discuss Original-X-From: pandoc-discuss+bncBCUNT6E62AJBB6N7U3FAKGQEJ4W2MTY-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Wed Jun 21 00:21:42 2017 Return-path: Envelope-to: gtp-pandoc-discuss@m.gmane.org Original-Received: from mail-oi0-f58.google.com ([209.85.218.58]) by blaine.gmane.org with esmtp (Exim 4.84_2) (envelope-from ) id 1dNRWn-0007cz-5d for gtp-pandoc-discuss@m.gmane.org; Wed, 21 Jun 2017 00:21:41 +0200 Original-Received: by mail-oi0-f58.google.com with SMTP id v74sf16757280oie.1 for ; Tue, 20 Jun 2017 15:21:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlegroups.com; s=20161025; h=sender:date:from:to:message-id:in-reply-to:references:subject :mime-version:x-original-sender:reply-to:precedence:mailing-list :list-id:list-post:list-help:list-archive:list-subscribe :list-unsubscribe; bh=B1jfpsSoG6VkadsyYmwhvBOFiMYpCtO7fqURYWWB9p8=; b=mOo+gLa3+53CJ2BMzVMR170lBkKmPAMqLv/V27Sf4T4StrZR8CCMZ7UuNGTIuJLT9P FDPQnemnSxZ+zLUWrLJPZaEwsdl26jv9ErBQB8b/X7DQJEQQ4bGdZ0DHWulcWxGHxOAe uFYhuG5MJlV8lXzoJBDrPXF88lAPFwgZhs9uQRZWKao1ywCkskC+DfMJ1z7T0125owZD f04Vuvaq7ymOUK23CgXHq44Qz0ptzxLAe5snstcTQ8LeZOtwB8cx4+wjz6Ks2bmtx1TH yXyHAeDRvgG0vpqg2JuV6DGV+XattXi6j4q5ejaXWOxBezIwwdZ25Mki2KhEv/0LweVU YJMg== DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:to:message-id:in-reply-to:references:subject:mime-version :x-original-sender:reply-to:precedence:mailing-list:list-id :list-post:list-help:list-archive:list-subscribe:list-unsubscribe; bh=B1jfpsSoG6VkadsyYmwhvBOFiMYpCtO7fqURYWWB9p8=; b=aUt+q5L8Xa5hiaHix0rb1qEDXIQNj3St7C5QSm8bsdGhwDS8J1QJ0Xd7IWwZgsxP01 byufyhrN6lnqHapcThdL1I35rTLlEhUbuDw92DwumCRKVzjaTMlA7fh+ORQo+hbq/Btx 2KdY3zAxenRvZrnDh9GWMA18Dkik9iCiN4ERcAX3EHDQzt5Nu36pMYyORcqjCP8Txtl/ 4WH12hee2nhe5v4tnX1YvM1n+vIubRpBECfRJDwmLp16uAoJUwrvc0HnL2z0EGtAzR5+ CBVWu8AYh+THY1XBeS9DmgsXiuKdQlbvkjOr4IapIB9X6qSFZZ3J1G2hwug8kjEQIQjR 5vkQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=sender:x-gm-message-state:date:from:to:message-id:in-reply-to :references:subject:mime-version:x-original-sender:reply-to :precedence:mailing-list:list-id:x-spam-checked-in-group:list-post :list-help:list-archive:list-subscribe:list-unsubscribe; bh=B1jfpsSoG6VkadsyYmwhvBOFiMYpCtO7fqURYWWB9p8=; b=GN01zv95qb3ZLN0TDebhmDOfExtEetArBZMKTREUnj7wGtLF+DYBAkrEHWPy7ldeTW vJ7hAfhl87xAX3b31F4C5fLcYxSlBdETh6b2MwciOYGe6QjZLmpnwMYnWHUjhAgdysQa rFIITkQ22lemWccKLhnnfq8x/7wG2mTGGwfRVOwwzxRxNOhTGsFAhmbXljfuFI8ZXg/F coQHBAx/gZHHJKC42JUM7M80LTuNkUr2Ao29zE8aRJEAXB7ruz+9nMpqQEyl0NYXsQpJ eGPikfMU/xVXkWhADpcpp6KWfGq1YQ2hZxhvuhp1qO3D5VyF9LmfNfm7d3OqDo4mJLS5 JA8A== Original-Sender: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org X-Gm-Message-State: AKS2vOy0OvTo7scPzwj+gvshlAQEPnQWnLZp7Ph1MOwFQR2c8MZ1JMUD OPQlBZVyJxDGRg== X-Received: by 10.157.5.178 with SMTP id 47mr48156otd.19.1497997306157; Tue, 20 Jun 2017 15:21:46 -0700 (PDT) X-BeenThere: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Original-Received: by 10.157.80.167 with SMTP id b39ls1180339oth.37.gmail; Tue, 20 Jun 2017 15:21:45 -0700 (PDT) X-Received: by 10.157.82.166 with SMTP id f38mr1005944oth.8.1497997305639; Tue, 20 Jun 2017 15:21:45 -0700 (PDT) In-Reply-To: 
<3d7a8fd7-346d-4c74-ad52-e137c9509719-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org> X-Original-Sender: alan.storm-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org Precedence: list Mailing-list: list pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org; contact pandoc-discuss+owners-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org List-ID: X-Google-Group-Id: 1007024079513 List-Post: , List-Help: , List-Archive: , List-Unsubscribe: , Xref: news.gmane.org gmane.text.pandoc:17846 Archived-At: ------=_Part_1949_1972603551.1497997305220 Content-Type: multipart/alternative; boundary="----=_Part_1950_342050202.1497997305220" ------=_Part_1950_342050202.1497997305220 Content-Type: text/plain; charset="UTF-8"

Thanks all for the help. To close the loop on this for anyone coming along later:

1. Despite hearing it from several people, it took me a while to realize that pandoc, even when converting from tex to PDF, will still perform its "munge this doc into an internal representation" sub-routines. This means most of what you're doing in LaTeX gets lost, unless pandoc happens to understand it.

2. Pandoc's default font doesn't support Hebrew: https://github.com/jgm/pandoc/issues/3742

3. You can specify a font that does support Hebrew via `-V mainfont:` : https://tex.stackexchange.com/a/375544/4689

4. If you're dealing with a document that has mixed English/Hebrew you may be out of luck with pure pandoc, as you need to mark up which sections contain Hebrew and which contain English: https://tex.stackexchange.com/a/375498/4689

Thanks again for the help, and best wishes!

On Saturday, June 17, 2017 at 9:35:05 AM UTC-7, Alan Storm wrote:
>
> New to the group, and only (so far!) the most casual of pandoc users. I'm
> having some trouble getting pandoc to do things that the xelatex command
> can do. I'm looking for someone (or someones) that can help get me to a
> solution **and** explain what pandoc is doing behind the scenes so I can
> debug these sorts of problems in the future.
>
> In short -- I have a tex document with some Hebrew characters. If I use
> the xelatex command directly to convert this document, the Hebrew
> characters are rendered correctly in the PDF
>
>     $ xelatex simple.tex
>
>
> However, if I attempt to do the conversion with pandoc using the
> xelatex engine
>
>     $ pandoc --latex-engine=xelatex simple.tex -o from-pandoc.pdf
>
>
> pandoc will render the PDF **without** the Hebrew characters. There's
> just blank white space where the Hebrew should be.
>
> So, first question if anyone knows: How do I make pandoc render the Hebrew
> into a PDF?
>
> If there's no clear path to that answer, how do I debug the
> pandoc/xelatex interaction in order to understand why pandoc's use of the
> engine produces different results, and how might I be able to change that
> invocation? Regarding this -- my understanding of pandoc is very limited,
> so baby words/steps that let me become less of a baby are appreciated :)
>
>
> Also, I have some specific examples posted over on the tex StackExchange
> with Hebrew examples
>
>
> https://tex.stackexchange.com/questions/375380/getting-hebrew-support-working-with-pandoc/375443
>
>
> https://tex.stackexchange.com/questions/375380/getting-hebrew-support-working-with-pandoc
>
> Finally -- my ultimate goal here is to convert HTML documents with Hebrew
> into PDFs. As that also doesn't work, I'm focused on understanding the
> [tex] to [pdf] conversion. If there's some extra wrinkle involved with
> [html] to [pdf] (which I presume -- correctly? -- involves xelatex in
> the middle) feel free to chime in there as well.
>
--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/ab8c9513-f33d-4965-8c3c-a10c855c9a3d%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
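
For anyone landing here later, a rough sketch of what point 3 of the reply (choosing a font via `-V mainfont`) might look like on the command line. The font name "DejaVu Sans" is only an assumption for illustration: it is not named in this thread, and any installed font that has Hebrew glyphs should do. The file names are the ones from the original question.

```shell
#!/bin/sh
# Hypothetical font name: swap in any installed font that has Hebrew glyphs.
FONT="DejaVu Sans"

# --latex-engine is the flag used in the original post; pandoc 2.0 and
# later call it --pdf-engine instead.
ENGINE_FLAG="--latex-engine=xelatex"

# Print the command first so it can be checked before running it.
CMD="pandoc $ENGINE_FLAG -V mainfont=\"$FONT\" simple.tex -o from-pandoc.pdf"
echo "$CMD"

# Uncomment to actually run the conversion:
# pandoc "$ENGINE_FLAG" -V mainfont="$FONT" simple.tex -o from-pandoc.pdf

# To see what survives pandoc's internal representation (point 1), writing
# the intermediate LaTeX out instead of a PDF can help:
# pandoc -s simple.tex -o from-pandoc.tex
```

Comparing from-pandoc.tex with simple.tex is probably the quickest way to spot which LaTeX commands pandoc dropped before xelatex ever saw them.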
------=_Part_1950_342050202.1497997305220-- ------=_Part_1949_1972603551.1497997305220--