From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: Issues with Quotation Marks in Pandoc When Mixing Japanese and English Texts
Date: Mon, 10 Apr 2023 10:34:32 -0700
References: <4a0eafdc-b4a2-4a6a-9488-d2a1c9ef8351n@googlegroups.com>
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org

I would recommend using Unicode curly quotes in the Markdown when
you're working in a language without interword spacing. We rely on
interword spacing for heuristics about smart quotes.

> On Apr 9, 2023, at 4:53 PM, Shigeru Kobayashi wrote:
>
> Dear Pandoc community,
>
> I have encountered two issues regarding Pandoc's handling of quotation
> marks in cases where Japanese and English texts are mixed.
>
> I am using Pandoc version 3.1.2 on macOS 12.6.3, and I can reproduce
> these issues. If these are indeed bugs, I am planning to submit them as
> issues on GitHub. However, I would appreciate any guidance if these
> issues arise from my incorrect usage.
>
> Issue 1: Conversion of English phrases within Japanese text
>
> I have observed the following issue. "input.md" is the input file, and
> "input.tex" is the output file.
>
> $ pandoc input.md -o input.tex
>
> input.md:
> その人は"Hello, world!"と言いました。
>
> input.tex:
> その人は''Hello, world!{}``と言いました。
>
> However, the conversion is correct when spaces are added before and
> after the double quotation marks.
>
> input.md:
> その人は "Hello, world!" と言いました。
>
> input.tex:
> その人は ``Hello, world!'' と言いました。
>
>
> Issue 2: The quotation marks are treated as Japanese text
>
> When converting with Pandoc, the quotation marks are treated as Japanese
> text, resulting in an unnaturally wide gap. I have confirmed this using
> two files, "preamble.tex" and "input.md," and specifying as follows:
>
> $ pandoc input.md -o input.pdf --pdf-engine=xelatex -H preamble.tex
>
> preamble.tex:
> \usepackage{fontspec}
>
> \setmainfont{Georgia}
> \setjamainfont{BIZ UDMincho Medium}
>
>
> input.md:
> ---
> documentclass: bxjsarticle
> classoption: pandoc
> papersize: a4
> fontsize: 10pt
> ---
>
> # はじめに
>
> その人は "Hello, world!" と言いました。
>
> That person said, "Hello, world!"
>
> In contrast, when I directly write the content in TeX and output it
> using $ xelatex test.tex, the quotation marks are treated as English
> text, and the expected output is obtained.
>
> test.tex:
> \documentclass[a4paper,xelatex,ja=standard]{bxjsarticle}
>
> \usepackage{fontspec}
> \setmainfont{Georgia}
> \setjamainfont{BIZ UDMincho Medium}
>
> \title{テスト}
> \begin{document}
> \maketitle
>
> \section{はじめに}
>
> その人は ``Hello, world!'' と言いました。
>
> That person said, ``Hello, world!''
>
> \end{document}
>
> Shigeru Kobayashi
>
> --
> You received this message because you are subscribed to the Google
> Groups "pandoc-discuss" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/pandoc-discuss/4a0eafdc-b4a2-4a6a-9488-d2a1c9ef8351n%40googlegroups.com.

--
To view this discussion on the web visit
https://groups.google.com/d/msgid/pandoc-discuss/D44375EB-4058-4C5F-AF39-461B38B30EE7%40gmail.com.
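
The recommendation at the top of this message (write Unicode curly quotes directly in the Markdown source when the language has no interword spacing) could be applied to an existing file with a small preprocessing step. The following is only a rough sketch and is not part of pandoc: the function name curl_double_quotes is hypothetical, and it assumes that straight double quotes in the text are balanced and strictly alternate between opening and closing. It does not handle single quotes or apostrophes, and it does not skip code spans.

```python
def curl_double_quotes(text: str) -> str:
    """Replace straight double quotes with curly ones by simple alternation.

    Assumption (not from pandoc): quotes are balanced, so the 1st, 3rd, ...
    straight quote opens a quotation and the 2nd, 4th, ... closes it.
    """
    out = []
    opening = True  # the next straight quote is treated as an opening quote
    for ch in text:
        if ch == '"':
            # U+201C is the left curly quote, U+201D the right curly quote
            out.append("\u201c" if opening else "\u201d")
            opening = not opening
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # The sentence from Issue 1, with no spaces around the quotes
    print(curl_double_quotes('その人は"Hello, world!"と言いました。'))
```

Because the quotes are then explicit in the source, the result no longer depends on spacing-based heuristics for deciding which quote opens and which closes.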