From: Chris Jones
Newsgroups: gmane.text.pandoc
Subject: Re: pandoc correctly translates U+2024 thin space to '\,' but the spaces in PDF created by Xelatex are full-width
Date: Sun, 2 Feb 2020 14:36:35 -0800 (PST)
Message-ID: <158fd0ac-89bc-4fb1-9920-386bf325dad6@googlegroups.com>
References: <818817e7-17c7-4bf4-b9fb-e300f6faaf37@googlegroups.com>
To: pandoc-discuss

I replaced these lines in the intermediate LaTeX file generated by pandoc:

`\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
  \usepackage[shorthands=off,main=french]{babel}
\else
  % load polyglossia as late as possible as it *could* call bidi if RTL lang (e.g. Hebrew or Arabic)
  \usepackage{polyglossia}
  \setmainlanguage[]{french}
\fi`

by…

`\usepackage[french]{babel}`

… and that took care of the problem.

So it looks like it's either a bug in the polyglossia package or misuse thereof by pandoc?

Not sure where to go from there since I don't understand what the if/else conditionals are meant to do.
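
For what it's worth, the mechanics of the test itself seem to be as follows. This is only an annotated sketch based on how `\ifnum` reads digits; the comments are mine, and why pandoc prefers polyglossia for xelatex is something I can only guess at:

```latex
% Annotated copy of the block from pandoc's output; only the comments are new.
% \ifxetex and \ifluatex come from the ifxetex and ifluatex packages loaded earlier.
% Each one contributes the digit 1 only when the file is compiled with that engine:
%   pdflatex -> \ifnum 0=0   (true:  the babel branch is used)
%   xelatex  -> \ifnum 01=0  (false: the polyglossia branch is used)
%   lualatex -> \ifnum 01=0  (false: the polyglossia branch is used)
\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
  \usepackage[shorthands=off,main=french]{babel}
\else
  \usepackage{polyglossia}
  \setmainlanguage[]{french}
\fi
```

If that reading is right, the babel line is never reached under xelatex, which would be consistent with the output changing once babel is loaded unconditionally.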

Thanks,

CJ

On Saturday, February 1, 2020 at 2:18:30 PM UTC-5, Chris Jones wrote:
Searched online for similar cases and didn't find anything relevant.

The context is that I was recently made aware that the French insist that a thin space be inserted immediately before some punctuation characters ',:!?»%' etc.… So in dialogue, for instance, the .md source has: « · bonjour mademoiselle · » where each middle dot represents a single U+202F narrow no-break space.

When I look at the intermediate .tex file that pandoc generates, the thin spaces are correctly converted to '\,', which I believe is the LaTeX way of coding thin spaces. But when I run xelatex on that file and look at the resulting PDF, I can see that the thin spaces have become regular-width spaces.
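
One way to check this without eyeballing the PDF might be to have TeX report the widths in the log. The following is a minimal sketch, assuming polyglossia with French support is installed; the article class and the `\testwidth` length are names I picked for illustration, and the word "pochoir" is borrowed from my sample text:

```latex
% Width check: compile with xelatex and read the three values in the log/terminal.
% Assumption: polyglossia with French support is installed.
\documentclass{article}
\usepackage{polyglossia}
\setmainlanguage{french}
\newlength{\testwidth} % hypothetical helper length, not part of pandoc's output
\begin{document}
% Same word followed by a colon, with three different kinds of spacing before it.
\settowidth{\testwidth}{pochoir\,:}
\typeout{thin space = \the\testwidth}
\settowidth{\testwidth}{pochoir :}
\typeout{normal space = \the\testwidth}
\settowidth{\testwidth}{pochoir:}
\typeout{no space = \the\testwidth}
\end{document}
```

If the first two values come out the same, the thin space is being widened at typesetting time rather than by the PDF viewer.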

I compared the PDF output to another PDF I had created using plain LaTeX rather than pandoc, and the U+202F's that I typed in my .tex source clearly materialize as thin spaces in the PDF.

What I suspect at this point is that one of the LaTeX packages that pandoc sticks in the generated LaTeX file (or the way it is invoked? perhaps a combination of packages? …?) is causing this.

As to an MWE… I'm not sure it's really appropriate in this particular case…

Just in case… here's what I get from a minimal .md input file:

`\PassOptionsToPackage{unicode=true}{hyperref} % options for packages loaded elsewhere
\PassOptionsToPackage{hyphens}{url}
%
\documentclass[oneside,10pt,french,]{extbook} % cjns1989 - 27112019 - added the oneside option: so that the text doesn't jump left & right when reading on a tablet/ereader
\usepackage{lmodern}
\usepackage{amssymb,amsmath}
\usepackage{ifxetex,ifluatex}
\usepackage{fixltx2e} % provides \textsubscript
\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
  \usepackage{textcomp} % provides euro and other symbols
\else % if luatex or xelatex
  \usepackage{unicode-math}
  \defaultfontfeatures{Ligatures=TeX,Scale=MatchLowercase}
%   \setmainfont[]{EBGaramond-Regular}
    \setmainfont[Numbers={OldStyle,Proportional}]{EBGaramond-Regular}      % cjns1989 - 20191129 - old style numbers
\fi
% use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
% use microtype if available
\IfFileExists{microtype.sty}{%
\usepackage[]{microtype}
\UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\usepackage{hyperref}
\hypersetup{
            pdftitle={WME},
            pdfborder={0 0 0},
            breaklinks=true}
\urlstyle{same}  % don't use monospace font for urls
\usepackage[papersize={3.75 in, 6.0 in},left=.3 in,right=.3 in]{geometry}
\setlength{\emergencystretch}{3em}  % prevent overfull lines
\providecommand{\tightlist}{%
  \setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\setcounter{secnumdepth}{0}
% Redefines (sub)paragraphs to behave more like sections
\ifx\paragraph\undefined\else
\let\oldparagraph\paragraph
\renewcommand{\paragraph}[1]{\oldparagraph{#1}\mbox{}}
\fi
\ifx\subparagraph\undefined\else
\let\oldsubparagraph\subparagraph
\renewcommand{\subparagraph}[1]{\oldsubparagraph{#1}\mbox{}}
\fi
% set default figure placement to htbp
\makeatletter
\def\fps@figure{htbp}
\makeatother

\ifnum 0\ifxetex 1\fi\ifluatex 1\fi=0 % if pdftex
  \usepackage[shorthands=off,main=french]{babel}
\else
  % load polyglossia as late as possible as it *could* call bidi if RTL lang (e.g. Hebrew or Arabic)
  \usepackage{polyglossia}
  \setmainlanguage[]{french}
\fi

\title{WME}
\date{}

\begin{document}
\maketitle

\$ ECM

\hypertarget{wme-title}{%
\chapter{WME (title)}\label{wme-title}}

en lettres capitales, soigneusement imprimées au pochoir\,:

--- «\,Crétins\,!\,» murmura-t-il.

\end{document}`

Customization is minimal: old style numbers (proportional) and one-sided, since the document is not destined for hard-copy printing…

What I have in mind at this point, to try and figure out what is happening, is to work with a one-line .md source that has some U+202F's and remove default packages until the problem goes away. But before I do this I thought maybe someone has run into something similar, or might suggest a better approach than plain trial and error to help determine the cause of the problem.
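
The stripped-down file I would start from looks roughly like this. It is only a sketch: the article class, the default font (instead of EB Garamond) and the choice of which packages count as "candidates" are my assumptions, and the two text lines are taken from the output above:

```latex
% Bisection skeleton: compile with xelatex, commenting out one candidate at a time.
% Assumptions: article class and default font instead of extbook and EB Garamond.
\documentclass{article}
\usepackage{unicode-math}
\defaultfontfeatures{Ligatures=TeX,Scale=MatchLowercase}
\IfFileExists{microtype.sty}{\usepackage[]{microtype}}{} % candidate 1
\usepackage{hyperref}                                     % candidate 2
\usepackage{polyglossia}                                  % candidate 3
\setmainlanguage[]{french}                                % belongs to candidate 3
\begin{document}
en lettres capitales, soigneusement imprimées au pochoir\,:

--- «\,Crétins\,!\,» murmura-t-il.
\end{document}
```

The first candidate whose removal brings the thin spaces back would be the one to look at more closely, alone or in combination with the others.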

Thoughts?

Thanks,

CJ
