public inbox archive for pandoc-discuss@googlegroups.com
* Character escapes in asciidoc
From: Jason Miller @ 2021-04-29  4:08 UTC
  To: pandoc-discuss


~, *, +, and _ must be escaped in some cases for asciidoc (and in all cases
when doubled).  The rules for when they must be escaped are fairly confusing,
and backslash-escaping is disallowed when either the escaping is
non-mandatory or the characters are doubled.

I'm willing to submit a patch, but wanted some thoughts on the approach
first.

A simple test case that shows the issue:

---
pandoc -f html -t asciidoctor /dev/stdin <<EOF
<p>*foo*</p>
<p><b>foo</b></p>
EOF
---

This yields identical asciidoc (with strong markup) for both paragraphs, so
both render as bold text, even though only the second was bold in the input.

I have found two ways to escape that work in both asciidoc and asciidoctor 
(with examples using an asterisk):

1. Define an attribute for each character that needs to be escaped, and use 
that; the definitions can go anywhere in the document before being used, 
with the header being typical:

---
:star: *

{star}
---

2. Use a passthrough:

---
pass:specialcharacters[*]
---

#2 lets you escape arbitrary strings, so escapeString could just render
any string that contains special characters as:

pass:specialcharacters[this is some text that has * special + characters]
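
As a rough illustration of that idea (not pandoc's actual code, which is written in Haskell), here is a minimal Python sketch. The function name `escape_string` and the character set are assumptions taken from this message, and the handling of a literal `]` is my assumption about how the inline passthrough macro is terminated:

```python
# Characters named in this thread as needing escapes in some contexts.
SPECIAL = frozenset("~*+_")


def escape_string(text: str) -> str:
    """Wrap text in an inline passthrough only if it contains a special character."""
    if not any(ch in SPECIAL for ch in text):
        # Nothing to escape: leave the text readable and unchanged.
        return text
    # Assumption: an unescaped ']' would close the macro early, so escape it.
    body = text.replace("]", "\\]")
    return f"pass:specialcharacters[{body}]"


print(escape_string("*foo*"))
print(escape_string("plain text"))
```

Text without any of the four characters is returned untouched, so the passthrough only appears where it is actually needed.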

I'm leaning towards #2: it's easier to implement, whereas #1 requires
rendering the header correctly.  #1 is probably closer to how I would
hand-write these escapes, though.
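
To show why #1 depends on the header, here is a minimal Python sketch of the attribute approach. It is only an illustration under stated assumptions: apart from `star`, the attribute names are hypothetical, `escape_with_attributes` is a made-up helper, and literal `{`/`}` in the input are not handled:

```python
# Only "star" comes from the example above; the other names are hypothetical.
ATTRIBUTE_NAMES = {
    "*": "star",
    "~": "tildechar",
    "+": "pluschar",
    "_": "underscore",
}


def escape_with_attributes(text):
    """Return (escaped_text, header_lines) for the attribute approach."""
    out = []
    used = []  # special characters seen, in first-use order
    for ch in text:
        name = ATTRIBUTE_NAMES.get(ch)
        if name is None:
            out.append(ch)
            continue
        out.append("{" + name + "}")
        if ch not in used:
            used.append(ch)
    # These definitions must be written before the first reference,
    # which is why the writer has to render the header correctly.
    header = [f":{ATTRIBUTE_NAMES[ch]}: {ch}" for ch in used]
    return "".join(out), header


body, header = escape_with_attributes("a*b*c_d")
print("\n".join(header))
print()
print(body)
```

The escaped body is only valid together with the returned definitions, so the writer has to collect them across the whole document before emitting the header. The passthrough in #2 has no such dependency.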

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/5c3d5ae9-c427-402e-90bd-41705fdbcb1an%40googlegroups.com.


* Re: Character escapes in asciidoc
From: John MacFarlane @ 2021-04-29 17:07 UTC
  To: Jason Miller, pandoc-discuss


Have you seen this issue? https://github.com/jgm/pandoc/issues/2337

The escaping "rules" seem a complete mess on asciidoc's side,
unless things have been improved since I investigated.

The "passthrough" method seems like too big a hammer.
You avoid the escape problems but you'll get very
ugly asciidoc.

Anyway, it might make sense to comment on that issue.


* Re: Character escapes in asciidoc
From: Jason Miller @ 2021-04-29 20:45 UTC
  To: pandoc-discuss


Not sure how I missed that.  I've left a comment.  I still think escaping
with a passthrough only when special characters are encountered is the best
solution, but I will add to the discussion there.  I'm using a local patch
as a workaround for now, and I'm happy to share it if issue 2337 makes any
progress.

