* U+200B and LaTeX
From: nopria @ 2019-09-18 10:05 UTC
To: pandoc-discuss

Converting from DocBook to LaTeX, I came across what may be incorrect
handling of U+200B (zero-width space) in the LaTeX output.
The following DocBook MWE (the simple string "...abc")

    <?xml version="1.0" encoding="UTF-8"?>
    <?asciidoc-toc?>
    <?asciidoc-numbered?>
    <article xmlns="http://docbook.org/ns/docbook" xmlns:xl="http://www.w3.org/1999/xlink" version="5.0" xml:lang="en">
    <simpara>…​abc</simpara>

is converted to LaTeX

    \ldots​abc

with an (invisible, but detectable in the real output) zero-width space
between "\ldots" and "abc".

I think that the correct LaTeX output should be

    \ldots abc

with a standard space after `\ldots`, because if you try to produce a PDF
you get

    [WARNING] Missing character: There is no ​ (U+200B) in font [lmroman10-regular]:mapping=tex-text;!

because of the zero-width space, whereas with the standard space you get
the correct output ("...abc" and not "... abc") in the PDF, and no
warnings.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/45688658-4762-4910-b8d1-a28a23efd91c%40googlegroups.com.
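Since the zero-width space is invisible in most editors, it can help to locate it programmatically before deciding how to handle it. The following is only a minimal sketch, not part of pandoc: the helper name `find_zero_width_spaces` is hypothetical, and the sample string stands in for the LaTeX that pandoc produced from the MWE above.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def find_zero_width_spaces(text):
    """Return 1-based (line, column) positions of every U+200B in text.

    Hypothetical helper for inspecting converter output; it only reports
    positions and does not modify the text.
    """
    positions = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch == ZWSP:
                positions.append((line_no, col_no))
    return positions


# Stand-in for the reported output: "\ldots", then U+200B, then "abc".
latex = "\\ldots\u200babc"
print(find_zero_width_spaces(latex))
```

Running the same check on a real `.tex` file (read with UTF-8 encoding) would show whether the character survived the conversion.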
* Re: U+200B and LaTeX
From: John MacFarlane @ 2019-09-18 16:24 UTC
To: nopria, pandoc-discuss

The question is how we should render U+200B (zero-width space)
in LaTeX. Currently we are just outputting the Unicode character
(which should work okay with xelatex anyway).

Is there a better way?

We could just output {}, for example.

It's probably worth putting an issue on the tracker.
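The `{}` idea above can be tried as a post-processing step on the generated LaTeX while the question is open. This is a rough sketch under stated assumptions, not how pandoc's LaTeX writer works: the function name `replace_zwsp` and its `replacement` parameter are hypothetical, and a blanket replacement like this would also touch U+200B inside verbatim or code environments, where `{}` would be printed literally.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE


def replace_zwsp(latex, replacement="{}"):
    """Replace every U+200B in a LaTeX string with `replacement`.

    The default "{}" is an empty group: it ends a preceding control word
    such as \\ldots without adding visible space. Hypothetical helper,
    applied to the whole string with no awareness of verbatim contexts.
    """
    return latex.replace(ZWSP, replacement)


# Stand-in for the reported output: "\ldots", then U+200B, then "abc".
fixed = replace_zwsp("\\ldots\u200babc")
print(fixed)
```

Passing a different `replacement` makes it easy to compare candidate renderings in the resulting PDF.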
* Re: U+200B and LaTeX
From: nopria @ 2019-09-18 18:21 UTC
To: pandoc-discuss

I opened the issue https://github.com/jgm/pandoc/issues/5756

On Wednesday, 18 September 2019 at 18:24:40 UTC+2, John MacFarlane wrote:
> It's probably worth putting an issue on the tracker.