From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.io/gmane.text.pandoc/28242 Path: news.gmane.io!.POSTED.blaine.gmane.org!not-for-mail From: Jason Miller Newsgroups: gmane.text.pandoc Subject: Character escapes in asciidoc Date: Wed, 28 Apr 2021 21:08:55 -0700 (PDT) Message-ID: <5c3d5ae9-c427-402e-90bd-41705fdbcb1an@googlegroups.com> Reply-To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_498_1746774802.1619669335669" Injection-Info: ciao.gmane.io; posting-host="blaine.gmane.org:116.202.254.214"; logging-data="37543"; mail-complaints-to="usenet@ciao.gmane.io" To: pandoc-discuss Original-X-From: pandoc-discuss+bncBCN5VGW52MNRBWHCVCCAMGQEX5TWQWQ-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Thu Apr 29 06:08:59 2021 Return-path: Envelope-to: gtp-pandoc-discuss@m.gmane-mx.org Original-Received: from mail-ot1-f56.google.com ([209.85.210.56]) by ciao.gmane.io with esmtps (TLS1.3:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.92) (envelope-from ) id 1lbxz1-0009cu-2o for gtp-pandoc-discuss@m.gmane-mx.org; Thu, 29 Apr 2021 06:08:59 +0200 Original-Received: by mail-ot1-f56.google.com with SMTP id l10-20020a056830054ab0290241bf5f8c25sf27806943otb.11 for ; Wed, 28 Apr 2021 21:08:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlegroups.com; s=20161025; h=sender:date:from:to:message-id:subject:mime-version :x-original-sender:reply-to:precedence:mailing-list:list-id :list-post:list-help:list-archive:list-subscribe:list-unsubscribe; bh=8+hKip7G4vC53Ma1BXWL882E2qDf39Qk+vcIWVIJThY=; b=PVe4Apzsw6+6wyQ2yyCwGHjKfgOSLRisuzdvs1crMPwm3g00AhAVVfrESMWSrqCQMn Hd/57DiQopRnbAbnobjz2aQ2kaqPDpeq/KJce47wfn1hIkzoyZ6nHOX0k6qBh68WsDUX ZaL2APuHs5HMSo4PsEkkCfR+6T/g4vzHgmc7VjE6T26HCXmONoQsCx0ai2l1JFEfvEJg H+WtiAuKXaKrYU7LbDNiM4sUacp/JIEt45wDsN/VygxrWd9PWuFlx8skP2ct67wUncxR 718pnLtePHStFJsG8b/a3i27yAQjpqlR8vtxnbqDpsHJ3RDApq3SY8ka26s4jZgvXEch zhpQ== DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=gmail.com; s=20161025; h=date:from:to:message-id:subject:mime-version:x-original-sender :reply-to:precedence:mailing-list:list-id:list-post:list-help :list-archive:list-subscribe:list-unsubscribe; bh=8+hKip7G4vC53Ma1BXWL882E2qDf39Qk+vcIWVIJThY=; b=ov9x1sANpy0bYLPPUXC1BeJrNrY+WOWe4bALJ+dD97u2A1Qp6PAyeFVvlALl3Lg805 4hsVz2+Et730CctVm8Qi/SRPrJeFgzMo4Ug2R2wK/WZTz7MckYljgp6ePUBP07AtY/fW QAQ8fIFXjN/TGdVTmO2g41d0Xl53WgXuyBt+lELNkF8Bhh1zJzbrK+y4mMIuQTztHOiB 1QceIYBBySF8hGXQSlBPYKqt7n3ut3WH1Dt9GvlowudvkwMd+sMP8ec+CfhtclGKoIQ4 M+2fWL/URplw+iaAMp24cdm6vuV2gzvK9rHYWwayYGE/bmA6SGR0XbJHBm+Y5NPdI3hr 6N6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=sender:x-gm-message-state:date:from:to:message-id:subject :mime-version:x-original-sender:reply-to:precedence:mailing-list :list-id:x-spam-checked-in-group:list-post:list-help:list-archive :list-subscribe:list-unsubscribe; bh=8+hKip7G4vC53Ma1BXWL882E2qDf39Qk+vcIWVIJThY=; b=EdS5WbysTpfM94lkgFmk0Qf2X8Y41MDFrGdw8g8VdUD9bVi4hxkz/lMxLJBm6Yttt9 x7m2i39JkQzc09U460mh63h4YD3Oxt3C4151IA5diLoCokTTFxG4m85FFbpLZSy6YcVf JjjQKmSPdaUWWdc8uRzs0yguJa6scN1xVXdOBAqLH6S0oLKLOgzmDoykYZ7+17JGrz7o 43Pk5NBps8m2o8GSnbZ5iWDgEoh/Zqv3h9b3/hCr625+LI2QR/WlBIRMdTfts29pL7Gk 5Hfb3s9i3Gp6LATe4i9/iS6AU8MFUypMGN6A5WllK9bep3k/sDoKxcSjOGWVozn34BAt zJww== Original-Sender: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org X-Gm-Message-State: AOAM530qEMe/EkjpTn0VgKctOLfto4Z5BAdfiP7Fy3ah807mynhJmXya 1+jc9s9ccRjLtc38VheVEaY= X-Google-Smtp-Source: ABdhPJzx7aEIaQxGNVtFizj6aZLt2sJ/NYy/B8XxSoeWCiYsn8Cfk7gDsg7YI22Pnu7YkrTduhO0RA== X-Received: by 2002:aca:f245:: with SMTP id q66mr23428496oih.179.1619669338109; Wed, 28 Apr 2021 21:08:58 -0700 (PDT) X-BeenThere: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org Original-Received: by 2002:aca:1e0a:: with SMTP id m10ls535283oic.7.gmail; Wed, 28 Apr 2021 21:08:56 -0700 (PDT) X-Received: by 2002:aca:af16:: with SMTP id y22mr8739954oie.18.1619669336354; Wed, 28 Apr 
2021 21:08:56 -0700 (PDT) X-Original-Sender: jasnmilr-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org Precedence: list Mailing-list: list pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org; contact pandoc-discuss+owners-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org List-ID: X-Google-Group-Id: 1007024079513 List-Post: , List-Help: , List-Archive: , List-Unsubscribe: , Xref: news.gmane.io gmane.text.pandoc:28242 Archived-At: ------=_Part_498_1746774802.1619669335669 Content-Type: multipart/alternative; boundary="----=_Part_499_946041609.1619669335669" ------=_Part_499_946041609.1619669335669 Content-Type: text/plain; charset="UTF-8"

~, *, +, and _ must be escaped in some cases for asciidoc (and in all cases when doubled). The rules for when they must be escaped are fairly confusing, and backslash-escaping is disallowed when either the escaping is non-mandatory or the characters are doubled.

I'm willing to submit a patch, but wanted some thoughts on the approach first.

Simple test case that shows the issue:

---
pandoc -f html -t asciidoctor /dev/stdin <<EOF
<p>*foo*</p>
<p><b>foo</b></p>
EOF
---

This yields identical asciidoc (with strong markup) for both paragraphs, so the text is rendered bold in both cases.

I have found two ways to escape that work in both asciidoc and asciidoctor (with examples using an asterisk):

1. Define an attribute for each character that needs to be escaped, and use that; the definitions can go anywhere in the document before being used, with the header being typical:

---
:star: *

{star}
---

2. Use a passthrough:

---
pass:specialcharacters[*]
---

#2 lets you escape arbitrary strings, so escapeString could just render any string with special characters as:

pass:specialcharacters[this is some text that has * special + characters]

I'm leaning towards implementing #2, as it's easier to implement and #1 requires rendering the header correctly. #1 is probably closer to how I would hand-write these escapes, though.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/5c3d5ae9-c427-402e-90bd-41705fdbcb1an%40googlegroups.com.

------=_Part_499_946041609.1619669335669 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
------=_Part_499_946041609.1619669335669-- ------=_Part_498_1746774802.1619669335669--
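
Approach #2 from the message (wrapping text in a passthrough) could be sketched roughly as below. This is only an illustration in Python, not pandoc's actual writer code: the name escape_string mirrors the escapeString mentioned in the message, SPECIAL_CHARS is a hypothetical constant holding the four characters the message lists, and the handling of a literal "]" is an assumption about how the passthrough macro is terminated. The sketch wraps any string containing one of those characters, rather than working out the exact cases where escaping is mandatory.

```python
# Characters the message lists as needing escapes in AsciiDoc output.
# (Hypothetical constant name, used only for this sketch.)
SPECIAL_CHARS = frozenset("~*+_")


def escape_string(text: str) -> str:
    """Wrap text in a passthrough macro if it contains a special character."""
    if not any(ch in SPECIAL_CHARS for ch in text):
        # Nothing to escape: emit the text unchanged.
        return text
    # Assumption: a literal "]" would close the macro early, so it is
    # backslash-escaped inside the passthrough body.
    body = text.replace("]", "\\]")
    return f"pass:specialcharacters[{body}]"


if __name__ == "__main__":
    print(escape_string("plain text"))
    print(escape_string("*foo*"))
```

With this sketch, "*foo*" from the test case would come out as pass:specialcharacters[*foo*], while text without any of the listed characters is passed through untouched. Wrapping the whole string keeps the logic simple at the cost of noisier output than the attribute approach in #1.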