From: Jason Miller
Newsgroups: gmane.text.pandoc
Subject: Re: Character escapes in asciidoc
Date: Thu, 29 Apr 2021 13:45:29 -0700 (PDT)
Message-ID: <19506eb9-b617-4e81-91ed-a72d9c8b7978n@googlegroups.com>
References: <5c3d5ae9-c427-402e-90bd-41705fdbcb1an@googlegroups.com>
Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
To: pandoc-discuss

Not sure how I missed that. I've left a comment. I still think escaping
with passthrough only when special characters are encountered is the best
solution, but I will add to the discussion. I'm using a patch locally for
now, so I have a workaround that I'm happy to share if issue 2337 makes
any progress.

On Thursday, April 29, 2021 at 10:07:27 AM UTC-7 John MacFarlane wrote:

> Have you seen https://github.com/jgm/pandoc/issues/2337
>
> The escaping "rules" seem a complete mess on asciidoc's side,
> unless things have been improved since I investigated.
>
> The "passthrough" method seems like too big a hammer.
> You avoid the escape problems but you'll get very
> ugly asciidoc.
>
> Anyway, it might make sense to comment on that issue.
>
> Jason Miller writes:
>
> > ~, *, +, and _ must be escaped in some cases for asciidoc (and all cases
> > when doubled). The rules for when they must be escaped are fairly
> > confusing, and backslash-escaping is disallowed when either the escaping
> > is non-mandatory or the characters are doubled.
> >
> > I'm willing to submit a patch, but wanted some thoughts on the approach
> > first.
> >
> > Simple test-case that shows the issue:
> >
> > ---
> > pandoc -f html -t asciidoctor /dev/stdin <<EOF
> > <p>*foo*</p>
> > <p><b>foo</b></p>
> > EOF
> > ---
> >
> > This will yield identical asciidoc (with strong tag) for both paragraphs,
> > resulting in bolded text in both cases.
> >
> > I have found two ways to escape that work in both asciidoc and
> > asciidoctor (with examples using an asterisk):
> >
> > 1. Define an attribute for each character that needs to be escaped, and
> > use that; the definitions can go anywhere in the document before being
> > used, with the header being typical:
> >
> > ---
> > :star: *
> >
> > {star}
> > ---
> >
> > 2. Use a passthrough:
> >
> > ---
> > pass:specialcharacters[*]
> > ---
> >
> > #2 lets you escape arbitrary strings, so escapeString could just render
> > any string with special characters as:
> >
> > pass:specialcharacters[this is some text that has * special + characters]
> >
> > I'm leaning towards implementing #2 as it's easier to implement and #1
> > requires rendering the header correctly. #1 is probably closer to how I
> > would hand-write these escapes though.
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "pandoc-discuss" group.
> > To unsubscribe from this group and stop receiving emails from it, send
> > an email to pandoc-discus...-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> > To view this discussion on the web visit
> > https://groups.google.com/d/msgid/pandoc-discuss/5c3d5ae9-c427-402e-90bd-41705fdbcb1an%40googlegroups.com

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/19506eb9-b617-4e81-91ed-a72d9c8b7978n%40googlegroups.com.
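
The conditional passthrough idea discussed in this thread (wrap a string in pass:specialcharacters[...] only when it contains one of ~, *, +, or _) can be sketched roughly as follows. This is a minimal Python illustration, not pandoc's Haskell escapeString; the names SPECIAL_CHARS, needs_escape, and escape_with_passthrough are hypothetical, and backslash-escaping a literal ] inside the macro is an assumption about how the passthrough macro is parsed.

```python
# Characters the thread lists as needing escapes in asciidoc output.
SPECIAL_CHARS = "~*+_"


def needs_escape(text: str) -> bool:
    """Return True if text contains any character listed in SPECIAL_CHARS."""
    return any(ch in SPECIAL_CHARS for ch in text)


def escape_with_passthrough(text: str) -> str:
    """Wrap text in an inline passthrough only when it has a special character.

    Strings without special characters are returned unchanged.
    """
    if not needs_escape(text):
        return text
    # Assumption: an unescaped ']' would close the macro early,
    # so it is written as '\]' inside the brackets.
    body = text.replace("]", "\\]")
    return f"pass:specialcharacters[{body}]"


if __name__ == "__main__":
    print(escape_with_passthrough("foo"))    # foo
    print(escape_with_passthrough("*foo*"))  # pass:specialcharacters[*foo*]
```

Because plain strings pass through untouched, the passthrough macro only shows up where a special character actually occurs, which keeps the "ugly asciidoc" cost confined to those strings.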