* U+200B and LaTeX
From: nopria @ 2019-09-18 10:05 UTC
To: pandoc-discuss
While converting from DocBook to LaTeX, I came across what may be incorrect
handling of U+200B (zero-width space).
The following DocBook MWE (the simple string "...abc")
<?xml version="1.0" encoding="UTF-8"?>
<?asciidoc-toc?>
<?asciidoc-numbered?>
<article xmlns="http://docbook.org/ns/docbook" xmlns:xl=
"http://www.w3.org/1999/xlink" version="5.0" xml:lang="en">
<simpara>…​abc</simpara>
</article>
is converted to LaTeX
\ldotsabc
with a zero-width space (invisible, but detectable in the actual output)
between "\ldots" and "abc".
I think that the correct LaTeX output should be
\ldots abc
with a standard space after `\ldots`, because if you try to produce a PDF
you get
[WARNING] Missing character: There is no ​ (U+200B) in font [lmroman10-regular]:mapping=tex-text;!
because of the zero-width space. With a standard space, by contrast, the PDF
shows the correct output ("...abc", not "... abc") and there are no warnings,
since TeX ignores a space that directly follows a control word like \ldots.
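Because the character is invisible, it can help to check for it explicitly before building the PDF. The following is a minimal Python sketch, not part of pandoc; the helper names `find_zwsp` and `show_invisible` are hypothetical, and the sample string only mimics the LaTeX output described above.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def find_zwsp(text: str) -> list[int]:
    """Return the index of every U+200B character in text."""
    return [i for i, ch in enumerate(text) if ch == ZWSP]


def show_invisible(text: str) -> str:
    """Rewrite each U+200B as the visible marker <U+200B>."""
    return text.replace(ZWSP, "<U+200B>")


# Sample string standing in for the generated LaTeX fragment.
latex = "\\ldots" + ZWSP + "abc"

print(find_zwsp(latex))       # [6]
print(show_invisible(latex))  # \ldots<U+200B>abc
```

Running the same check over a generated .tex file (read as UTF-8) would show whether any zero-width spaces survived the conversion.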
--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/45688658-4762-4910-b8d1-a28a23efd91c%40googlegroups.com.
* Re: U+200B and LaTeX
From: John MacFarlane @ 2019-09-18 16:24 UTC
To: nopria, pandoc-discuss
The question is how we should render U+200B (zero-width space)
in LaTeX. Currently we just output the Unicode character as is
(which should work okay with xelatex anyway).
Is there a better way?
We could just output {}, for example.
It's probably worth putting an issue on the tracker.
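To illustrate the {} idea, here is a minimal Python sketch that post-processes LaTeX text. It is only an assumption about how such a substitution could look, not pandoc's implementation; the function name `replace_zwsp` and its `replacement` parameter are hypothetical.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def replace_zwsp(latex: str, replacement: str = "{}") -> str:
    """Replace every U+200B with an ASCII-only LaTeX construct.

    The default, an empty group, also terminates a preceding control
    word, so \\ldots is not run together with the letters after it.
    """
    return latex.replace(ZWSP, replacement)


print(replace_zwsp("\\ldots" + ZWSP + "abc"))  # \ldots{}abc
```

Note that an empty group does not offer a line-break opportunity the way a zero-width space does. If that matters, a different replacement string such as `\hspace{0pt}` could be passed instead; whether that is the better choice is exactly the open question here.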
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/m25zlp4lon.fsf%40johnmacfarlane.net.
* Re: U+200B and LaTeX
From: nopria @ 2019-09-18 18:21 UTC
To: pandoc-discuss
I opened an issue on the tracker: https://github.com/jgm/pandoc/issues/5756
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/e205a1a5-ac61-41e7-a990-eb563a7e5a9d%40googlegroups.com.