From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: controlling smart typography with org or markdown
Date: Mon, 27 Sep 2021 16:35:08 -0700 (PDT)
Message-ID: <4331fc4f-e662-4ea0-89c0-c62b0c0228e1n@googlegroups.com>
To: pandoc-discuss

I'm not sure what's going on with org (2), but I can illuminate the other mysteries.

1) It's not a typo. markdown_strict is the name of the format (really shorthand for markdown with certain extension settings).

3) Why the reverse effect with `-t markdown-smart`?

Think of it this way. You're writing markdown for consumption by a markdown processor which does not have "smart" support. How, then, should we render Str "\8211"? Not as "--", because the target processor doesn't support smart typography; it will interpret "--" as two literal hyphens. Instead, as a Unicode en-dash.
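
For readers unfamiliar with the notation: the native output uses Haskell string escapes, which are decimal, so "\8211" is code point 8211. A quick Python check (purely an illustration, not part of pandoc) confirms which character that is:

```python
import unicodedata

# Haskell string escapes are decimal, so "\8211" means code point 8211.
en_dash = chr(8211)

print(f"U+{ord(en_dash):04X}")      # code point in the usual hex notation
print(unicodedata.name(en_dash))    # official Unicode character name
print(en_dash.encode("utf-8"))      # UTF-8 bytes written to a markdown file
```

This is the character a writer without `smart` has to emit literally, since the target processor has no ASCII spelling for it.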

Another way to think of it is this: 'pandoc -f markdown+smart -t markdown+smart' should ideally be an identity, at least up to the semantics. (There might be changes in indentation, bullet markers, and things like that, but we'd hope that the source and target would correspond to the same AST.) So, if '-f markdown+smart' changes "--" to "\8211", then "-t markdown+smart" had better change "\8211" back to "--".

4) Hopefully it makes more sense now in light of the above. I don't really understand how you're using "smart markdown" here. The way we use it, "smart markdown" is a markdown dialect in which "--" gets parsed as Str "\8211", and Str "\8211" gets rendered as "--". That is, a dialect in which the markdown string "--" *means* Str "\8211".
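
To make the dialect idea concrete, here is a minimal toy model in Python. It is a sketch under stated assumptions, not pandoc's implementation: `read_markdown` and `write_markdown` are hypothetical names, the "AST" is reduced to a plain string, only en-dashes are handled (no quotes, em-dashes, or ellipses), and the backslash escaping is my guess at why a smart writer emits `\--` for a literal "--".

```python
EN_DASH = "\u2013"  # shown as "\8211" in pandoc's native output


def read_markdown(text: str, smart: bool) -> str:
    """Toy reader: return the string content the text *means*."""
    placeholder = "\0"
    # An escaped hyphen is always a literal hyphen; protect it first.
    text = text.replace("\\-", placeholder)
    if smart:
        # In the smart dialect, "--" means an en-dash.
        text = text.replace("--", EN_DASH)
    return text.replace(placeholder, "-")


def write_markdown(content: str, smart: bool) -> str:
    """Toy writer: spell the content in the target dialect."""
    if not smart:
        # The target reads "--" as two hyphens, so both literal hyphens
        # and the en-dash character are written as they are.
        return content
    # Smart target: escape literal "--" so it is not re-read as a dash
    # (assumed reason for the backslash), then spell the en-dash as "--".
    content = content.replace("--", "\\--")
    return content.replace(EN_DASH, "--")


# The default `markdown` format has smart enabled, so `-f markdown`
# corresponds to smart=True and `-f markdown-smart` to smart=False.
cases = {
    "A": (True, True),    # -f markdown        -t markdown
    "B": (True, False),   # -f markdown        -t markdown-smart
    "C": (False, True),   # -f markdown-smart  -t markdown
    "D": (False, False),  # -f markdown-smart  -t markdown-smart
}
for label, (reader_smart, writer_smart) in cases.items():
    content = read_markdown("--version", reader_smart)
    print(label, write_markdown(content, writer_smart))
```

Under these assumptions the model gives the same results as the four reader/writer combinations in the quoted message below, and reading the output of the smart writer with the smart reader returns the original content, which is the identity property described above.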



On Sunday, September 26, 2021 at 12:41:00 PM UTC-7 Simon Michael wrote:
G'day all!

As someone documenting command line software, I never want `smart`
typography. Still, Pandoc lets me control it, right ? In the end, yes..
but I must share some notes. Perhaps I'm getting confused between
readers and writers ? Any comments welcome.


1. A small correction
---------------------

> https://pandoc.org/MANUAL.html#extensions: "For example, --from
markdown_strict+footnotes is..."

The underscore is a typo I think.


2. With the org reader, smart can not be disabled
-------------------------------------------------

> https://pandoc.org/MANUAL.html#extension-smart: "Interpret straight
quotes as curly quotes, --- as em-dashes, -- as en-dashes, and ... as
ellipses. "

This suggests smart is disabled by default for org (reader, or so I think):

$ pandoc --version
pandoc 2.14.2
Compiled with pandoc-types 1.22, texmath 0.12.3.1, skylighting 0.11,
citeproc 0.5, ipynb 0.1.0.1
...
$ pandoc --list-extensions=org
-ascii_identifiers
+auto_identifiers
+citations
-east_asian_line_breaks
-gfm_auto_identifiers
-smart

But this shows it enabled by default:

$ echo '--version' | pandoc -f org -t native
[Para [Str "\8211version"]]

And enabling/disabling the extension has no effect:

$ echo '--version' | pandoc -f org-smart -t native
[Para [Str "\8211version"]]
$ echo '--version' | pandoc -f org+smart -t native
[Para [Str "\8211version"]]


3. With the markdown writer, smart is selected oppositely
---------------------------------------------------------

Since I am converting to markdown, maybe I could control it there. I was
already disabling smart in the markdown writer I thought, but it was not
working:

$ echo '--version' | pandoc -f org -t markdown-smart
–version

Then I found this:

> "Note: If you are writing Markdown, then the smart extension has the
reverse effect: what would have been curly quotes comes out straight."

Which indeed achieves my goal:

$ echo '--version' | pandoc -f org -t markdown+smart
--version

But.. why the reverse effect ? There must be a reason, but I found this
non-intuitive.


4. More
---------------------------------------------------------

So it seems I have been *enabling* smart in my markdown web docs (with
-t markdown-smart) for years. But I haven't been seeing smart
quotes/dashes; apparently they were still being suppressed by other
means. I investigated:

With the markdown reader, smart is enabled by default, and disabled with
-smart as one would expect (unlike the markdown writer):

$ echo '--version' | pandoc -f markdown -t native
[Para [Str "\8211version"]]
$ echo '--version' | pandoc -f markdown-smart -t native
[Para [Str "--version"]]

Here are the combinations of the markdown reader and markdown writer, as
I understand them.

A. from smart markdown to non-smart markdown:

$ echo '--version' | pandoc -f markdown -t markdown
--version

B. from smart markdown to smart markdown:

$ echo '--version' | pandoc -f markdown -t markdown-smart
–version

C. from non-smart markdown to non-smart markdown (why the backslash ?):

$ echo '--version' | pandoc -f markdown-smart -t markdown
\--version

D. from non-smart markdown to smart markdown (why no en-dash here ?)

$ echo '--version' | pandoc -f markdown-smart -t markdown-smart
--version


And now I must go for a lie down.
