* Issues with Quotation Marks in Pandoc When Mixing Japanese and English Texts
@ 2023-04-09 23:53 Shigeru Kobayashi
  0 siblings, 1 reply; 9+ messages in thread
From: Shigeru Kobayashi @ 2023-04-09 23:53 UTC
To: pandoc-discuss

Dear Pandoc community,

I have encountered two issues regarding Pandoc's handling of quotation marks in cases where Japanese and English texts are mixed.

I am using Pandoc version 3.1.2 on macOS 12.6.3, and I can reproduce these issues. If these are indeed bugs, I am planning to submit them as issues on GitHub. However, I would appreciate any guidance if these issues arise from my incorrect usage.

*Issue 1: Conversion of English phrases within Japanese text*

I have observed the following issue. "input.md" is the input file, and "input.tex" is the output file.

    $ pandoc input.md -o input.tex

input.md:

    その人は"Hello, world!"と言いました。

input.tex:

    その人は''Hello, world!{}``と言いました。

However, the conversion is correct when spaces are added before and after the double quotation marks.

input.md:

    その人は "Hello, world!" と言いました。

input.tex:

    その人は ``Hello, world!'' と言いました。

*Issue 2: The quotation marks are treated as Japanese text*

When converting with Pandoc, the quotation marks are treated as Japanese text, resulting in an unnaturally wide gap. I have confirmed this using two files, "preamble.tex" and "input.md", and the following command:

    $ pandoc input.md -o input.pdf --pdf-engine=xelatex -H preamble.tex

preamble.tex:

    \usepackage{fontspec}

    \setmainfont{Georgia}
    \setjamainfont{BIZ UDMincho Medium}

input.md:

    ---
    documentclass: bxjsarticle
    classoption: pandoc
    papersize: a4
    fontsize: 10pt
    ---

    # はじめに

    その人は "Hello, world!" と言いました。

    That person said, "Hello, world!"

[image: pandoc test 2023-04-10 8.47.43.png]

In contrast, when I write the content directly in TeX and compile it with "$ xelatex test.tex", the quotation marks are treated as English text, and the expected output is obtained.

test.tex:

    \documentclass[a4paper,xelatex,ja=standard]{bxjsarticle}

    \usepackage{fontspec}
    \setmainfont{Georgia}
    \setjamainfont{BIZ UDMincho Medium}

    \title{テスト}
    \begin{document}
    \maketitle

    \section{はじめに}

    その人は ``Hello, world!'' と言いました。

    That person said, ``Hello, world!''

    \end{document}

[image: xelatex test 2023-04-10 8.46.44.png]

Shigeru Kobayashi

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/4a0eafdc-b4a2-4a6a-9488-d2a1c9ef8351n%40googlegroups.com.

* Re: Issues with Quotation Marks in Pandoc When Mixing Japanese and English Texts
@ 2023-04-10 7:40 Bastien DUMONT
From: Bastien DUMONT @ 2023-04-10 7:40 UTC
To: pandoc-discuss

Does marking the English text as such (with ["Hello world"]{lang=en}) solve your problem?

* Re: Issues with Quotation Marks in Pandoc When Mixing Japanese and English Texts
@ 2023-04-10 17:34 John MacFarlane
From: John MacFarlane @ 2023-04-10 17:34 UTC
To: pandoc-discuss

I would recommend using unicode curly quotes in the markdown when you're working in a language without interword spacing. We rely on interword spacing for heuristics about smart quotes.
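
To illustrate why missing interword spaces make straight quotes hard to classify, here is a minimal Python sketch of a whitespace-only rule. It is not pandoc's actual smart-quote algorithm; the function name classify_quotes and its three labels are hypothetical and exist only for this illustration.

```python
def classify_quotes(text):
    """Label each straight double quote as 'open', 'close', or 'ambiguous'.

    Hypothetical whitespace-only rule (an assumption for illustration,
    not pandoc's real implementation):
      - space before, non-space after  -> opening quote
      - non-space before, space after  -> closing quote
      - anything else                  -> cannot be decided from spacing
    """
    labels = []
    for i, ch in enumerate(text):
        if ch != '"':
            continue
        # Treat the start and end of the string like whitespace.
        prev_ch = text[i - 1] if i > 0 else " "
        next_ch = text[i + 1] if i + 1 < len(text) else " "
        if prev_ch.isspace() and not next_ch.isspace():
            labels.append("open")
        elif next_ch.isspace() and not prev_ch.isspace():
            labels.append("close")
        else:
            labels.append("ambiguous")
    return labels


spaced = 'その人は "Hello, world!" と言いました。'
unspaced = 'その人は"Hello, world!"と言いました。'
print(classify_quotes(spaced))    # ['open', 'close']
print(classify_quotes(unspaced))  # ['ambiguous', 'ambiguous']
```

With the spaced sentence from issue 1 both quotes can be decided, while in the unspaced sentence neither can be decided from spacing alone. Typing Unicode curly quotes directly avoids the guess entirely.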

* Re: Issues with Quotation Marks in Pandoc When Mixing Japanese and English Texts
@ 2023-04-10 23:30 Shigeru Kobayashi
From: Shigeru Kobayashi @ 2023-04-10 23:30 UTC
To: pandoc-discuss

Dear Bastien DUMONT and John MacFarlane,

Thank you very much for your replies.

Regarding issue 1, I will use Unicode curly quotes instead of straight quotes to avoid misinterpretation.

Regarding issue 2, I tried marking the English text as [That person said, "Hello, world!"]{lang=en}, but the result was the same (i.e., the quotes are typeset with the Japanese font instead of the English font). I also tried Unicode curly quotes.

I have confirmed that "pandoc input.md -o input.tex" generates the expected code:

    \foreignlanguage{english}{That person said, ``Hello, world!''}

Therefore, this is puzzling to me...

Best regards,
Shigeru KOBAYASHI
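
For the issue 1 workaround (typing Unicode curly quotes instead of straight quotes), existing text could be converted with a small helper along these lines. This is only a sketch under a strong assumption: straight double quotes in the input are balanced and strictly alternate between opening and closing. The name curl_double_quotes is hypothetical, and the function does not skip code spans or YAML metadata, so it should be applied to prose only.

```python
def curl_double_quotes(text):
    """Replace straight double quotes with Unicode curly quotes.

    Assumption (for illustration only): quotes are balanced and simply
    alternate, so the 1st, 3rd, ... quote opens and the 2nd, 4th, ...
    quote closes. No spacing heuristics are involved.
    """
    out = []
    opening = True
    for ch in text:
        if ch == '"':
            # U+201C is the left (opening) quote, U+201D the right (closing).
            out.append("\u201c" if opening else "\u201d")
            opening = not opening
        else:
            out.append(ch)
    return "".join(out)


print(curl_double_quotes('その人は"Hello, world!"と言いました。'))
# その人は“Hello, world!”と言いました。
```

Because the result no longer depends on surrounding whitespace, the sentence converts the same way with or without spaces around the quotation.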

* Re: Issues with Quotation Marks in Pandoc When Mixing Japanese and English Texts
@ 2023-04-11 5:32 Bastien DUMONT
From: Bastien DUMONT @ 2023-04-11 5:32 UTC
To: pandoc-discuss

And what if you put the English parts in language spans, as I suggested, and add \babelfont[english]{rm}{Georgia} to header-includes?

* Re: Issues with Quotation Marks in Pandoc When Mixing Japanese and English Texts
@ 2023-04-11 5:50 Shigeru Kobayashi
From: Shigeru Kobayashi @ 2023-04-11 5:50 UTC
To: pandoc-discuss

Dear Bastien DUMONT,

Thank you very much for your reply. I tried the following, but the output did not change. I'm sorry if I misunderstood your suggestion.

*preamble.tex:*

    \usepackage{fontspec}
    \setmainfont{Georgia}
    \setjamainfont{BIZ UDMincho Medium}
    \babelfont[english]{rm}{Georgia}

*input.md:*

    ---
    documentclass: bxjsarticle
    classoption: pandoc
    papersize: a4
    ---

    [That person said, "Hello, world!"]{lang=en}

*shell:*

    $ pandoc -V lang=ja input.md -o input.pdf --pdf-engine=xelatex -V pandoc -H preamble.tex

* Re: Issues with Quotation Marks in Pandoc When Mixing Japanese and English Texts
@ 2023-04-11 22:01 BPJ
From: BPJ @ 2023-04-11 22:01 UTC
To: pandoc-discuss

On Tue, 11 Apr 2023 at 07:53, Shigeru Kobayashi <mayfair-+k8b35VvZrR3+QwDJ9on6Q@public.gmane.org> wrote:

> *input.md:*
> ---
> documentclass: bxjsarticle
> classoption: pandoc
> papersize: a4
> ---
>
> [That person said, "Hello, world!"]{lang=en}

And what if you use Unicode curly quotes inside the span?

If all else fails, use Unicode curly quotes throughout and try this in the preamble:

    \newfontfamily{\GeorgiaFont}{Georgia}
    \usepackage{newunicodechar}
    \newunicodechar{‘}{{\GeorgiaFont ‘}}
    \newunicodechar{’}{{\GeorgiaFont ’}}
    \newunicodechar{“}{{\GeorgiaFont “}}
    \newunicodechar{”}{{\GeorgiaFont ”}}

Inspect the LaTeX output produced by pandoc and make sure the curly quotes appear in the LaTeX source. Otherwise, experiment with the +smart/-smart extension on the input and output formats until you get curly quotes in the LaTeX source.
>> >> > >> > References: >> > >> > [1] >> https://groups.google.com/d/msgid/pandoc-discuss/4a0eafdc-b4a2-4a6a-9488-d2a1c9ef8351n%40googlegroups.com >> > [2] mailto:pandoc-discus...-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org >> > [3] >> https://groups.google.com/d/msgid/pandoc-discuss/602edc59-8983-4459-bbbb-85cee5f013b3n%40googlegroups.com?utm_medium=email&utm_source=footer >> >> -- > You received this message because you are subscribed to the Google Groups > "pandoc-discuss" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org > To view this discussion on the web visit > https://groups.google.com/d/msgid/pandoc-discuss/bad8035a-d12b-4ec2-be89-788476948a56n%40googlegroups.com > <https://groups.google.com/d/msgid/pandoc-discuss/bad8035a-d12b-4ec2-be89-788476948a56n%40googlegroups.com?utm_medium=email&utm_source=footer> > . > -- You received this message because you are subscribed to the Google Groups "pandoc-discuss" group. To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CADAJKhC8hggDWF93mdtd2kYhUhjuBGJ7Jk-q1qx7omm%2BdCsMXw%40mail.gmail.com. [-- Attachment #2: Type: text/html, Size: 11993 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
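BPJ's advice to inspect the LaTeX source and confirm that it contains curly quotes can be partly automated. The following is a minimal sketch, not part of the thread: `quote_styles` is a hypothetical helper name, and the two sample strings are taken from the input.tex lines quoted in the original report. It only counts the two quote styles in a LaTeX string; it does not call pandoc.

```python
import re

# TeX ligature quotes (`` and '') versus literal Unicode curly quotes.
TEX_LIGATURES = re.compile(r"``|''")
CURLY_QUOTES = re.compile("[\u2018\u2019\u201c\u201d]")


def quote_styles(latex_source: str) -> dict:
    """Count TeX-ligature quotes and Unicode curly quotes in LaTeX source.

    Hypothetical helper for illustration only.
    """
    return {
        "tex_ligatures": len(TEX_LIGATURES.findall(latex_source)),
        "curly_quotes": len(CURLY_QUOTES.findall(latex_source)),
    }


if __name__ == "__main__":
    with_ligatures = r"その人は ``Hello, world!'' と言いました。"
    with_curly = "その人は “Hello, world!” と言いました。"
    print(quote_styles(with_ligatures))
    print(quote_styles(with_curly))
```

If the count of TeX ligatures is non-zero, the source still relies on `` and '' rather than literal curly quotes. According to the pandoc manual, disabling the smart extension on the LaTeX writer (for example `-t latex-smart`) makes pandoc write Unicode quotation marks instead of the ligatures, which is presumably the kind of experiment BPJ has in mind.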
* Re: Issues with Quotation Marks in Pandoc When Mixing Japanese and English Texts
@ 2023-04-11 22:26 Shigeru Kobayashi
From: Shigeru Kobayashi @ 2023-04-11 22:26 UTC
To: pandoc-discuss

Dear Bastien DUMONT,

Thank you very much for your suggestions.

> And if you use Unicode curly quotes inside the span?

I tried this version, but unfortunately, there was no change in the output PDF (i.e., the Japanese font is still used for the curly quotes).

> If all else fails use Unicode curly quotes throughout and try this in the preamble:
>
> \newfontfamily{\GeorgiaFont}{Georgia}
> \usepackage{newunicodechar}
> \newunicodechar{‘}{{\GeorgiaFont ‘}}
> \newunicodechar{’}{{\GeorgiaFont ’}}
> \newunicodechar{“}{{\GeorgiaFont “}}
> \newunicodechar{”}{{\GeorgiaFont ”}}

I tried this as well, but unfortunately, there was no change in the output PDF.

Initially, I suspected the cause was something in bxjsarticle, which I had set as the documentclass. However, the problem does not occur when I produce the PDF with xelatex from an equivalent hand-written source rather than converting with Pandoc.

I will investigate this a bit further.

Best,
Shigeru
* Re: Issues with Quotation Marks in Pandoc When Mixing Japanese and English Texts
@ 2023-04-12 17:33 BPJ
From: BPJ @ 2023-04-12 17:33 UTC
To: pandoc-discuss

Have you looked at the LaTeX produced by pandoc?
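One way to act on the question of what LaTeX pandoc actually produces is to compare the generated source with the hand-written test.tex that renders correctly. The following is a minimal sketch under stated assumptions: `latex_diff` is a hypothetical helper, and the "generated" string in the demo is a made-up stand-in, not actual pandoc output. Only the `\documentclass` line of the hand-written sample comes from the test.tex quoted in the thread.

```python
import difflib


def latex_diff(generated: str, handwritten: str) -> list:
    """Return unified-diff lines between two LaTeX sources.

    Hypothetical helper for illustration only.
    """
    return list(
        difflib.unified_diff(
            generated.splitlines(),
            handwritten.splitlines(),
            fromfile="input.tex (pandoc)",
            tofile="test.tex (hand-written)",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Placeholder standing in for pandoc's output; the real file will differ.
    generated = "\\documentclass[pandoc]{bxjsarticle}\n\\begin{document}\n\\end{document}\n"
    handwritten = (
        "\\documentclass[a4paper,xelatex,ja=standard]{bxjsarticle}\n"
        "\\begin{document}\n\\end{document}\n"
    )
    for line in latex_diff(generated, handwritten):
        print(line)
```

In practice the two strings would be read from files, for example an input.tex written by `pandoc -s input.md -o input.tex -H preamble.tex` and the test.tex from the original report. Differences in class options or in the packages loaded by the pandoc template would be the first places to look, though the thread does not establish which of them causes the font change.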
End of thread (newest message: 2023-04-12 17:33 UTC)

Thread overview: 9+ messages
2023-04-09 23:53 Issues with Quotation Marks in Pandoc When Mixing Japanese and English Texts, Shigeru Kobayashi
2023-04-10  7:40 Bastien DUMONT
2023-04-10 17:34 John MacFarlane
2023-04-10 23:30 Shigeru Kobayashi
2023-04-11  5:32 Bastien DUMONT
2023-04-11  5:50 Shigeru Kobayashi
2023-04-11 22:01 BPJ
2023-04-11 22:26 Shigeru Kobayashi
2023-04-12 17:33 BPJ