public inbox archive for pandoc-discuss@googlegroups.com
* Links in superscripts
@ 2010-12-09 18:18 Conal Elliott
       [not found] ` <AANLkTi=TZw2uuYDR0AG+C5YxFFHVFGn7bOmfanzqhFMs-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: Conal Elliott @ 2010-12-09 18:18 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Links inside superscripts get confused during conversion from HTML to
Markdown. For instance,

    bash-3.2$ echo 'foo<sup><a href="bar">baz</a></sup>' | pandoc --from html --to markdown && echo
    foo^[baz](bar)^

This Markdown gets interpreted as a footnote:

    bash-3.2$ echo 'foo<sup><a href="bar">baz</a></sup>' | pandoc --from html --to markdown | pandoc --from markdown --to markdown && echo
    foo[^1](bar)\^

    [^1]:
        baz

Oddly, if I manually break the "^[" with a space, I don't get a superscript:

    bash-3.2$ echo "foo^ [baz](bar)^" | pandoc --to markdown
    foo\^ [baz](bar)\^

Is there a workaround? Is link-within-superscript even expressible in
markdown? Perhaps hacked by inserting some sort of invisible content between
"^" and "["?

  - Conal

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To unsubscribe from this group, send email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
For more options, visit this group at http://groups.google.com/group/pandoc-discuss?hl=en.



* Re: Links in superscripts
       [not found] ` <AANLkTi=TZw2uuYDR0AG+C5YxFFHVFGn7bOmfanzqhFMs-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-12-09 19:31   ` John MacFarlane
       [not found]     ` <20101209193146.GA25629-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: John MacFarlane @ 2010-12-09 19:31 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Conal Elliott [Dec 09 10 10:18 ]:
> Links inside superscripts get confused during conversion from HTML to Markdown.
> For instance,
> 
>     bash-3.2$ echo 'foo<sup><a href="bar">baz</a></sup>' | pandoc --from html --to markdown && echo
>     foo^[baz](bar)^
> 
> This Markdown gets interpreted as a footnote:

Yes, because ^[blah blah] is the syntax for an "inline footnote."
I hadn't anticipated this bad interaction when I thought of that.

You could put a thin Unicode space between the ^ and the [. That
should work, though it's hacky.  A regular space doesn't work, because
the superscript parser requires a ^ that is not followed by a space.
But a Unicode space or an escaped space should work:

pandoc
foo^\ [baz](bar)^
<p
>foo<sup
  > <a href="bar"
      >baz</a
          ></sup
            ></p
            >
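
A minimal Python sketch of the opener rule described above, to show why a
plain space fails while an escaped or Unicode space passes. This is a toy
model, not pandoc's actual parser; the function name and the set of
characters treated as "space" are assumptions made for illustration.

```python
# Toy model of the rule "a superscript starts with ^ not followed by space".
# Assumption: only plain ASCII whitespace counts as "space" for this check.
ASCII_SPACE = " \t\n"


def opens_superscript(text: str, i: int) -> bool:
    """Return True if text[i] is a '^' that may open a superscript."""
    # A '^' at the very end of the text has nothing after it to superscript.
    if i >= len(text) - 1 or text[i] != "^":
        return False
    # The character right after '^' must not be a plain space.
    return text[i + 1] not in ASCII_SPACE


if __name__ == "__main__":
    for sample in ["foo^ [baz](bar)^", "foo^\\ [baz](bar)^", "foo^\u2009[baz](bar)^"]:
        print(repr(sample), opens_superscript(sample, 3))
```

In this model the backslash of an escaped space and a thin space (U+2009)
are both "not a plain space", which is why they get past the check.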

John


* Re: Links in superscripts
       [not found]     ` <20101209193146.GA25629-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-12-11 15:25       ` BP Jonsson
       [not found]         ` <4D039806.4010302-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 7+ messages in thread
From: BP Jonsson @ 2010-12-11 15:25 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

2010-12-09 20:31, John MacFarlane wrote:

> You could put a thin unicode space between the ^ and the [ -- that
> should work, though hacky.  A regular space doesn't work, because
> the superscript parser starts with ^ not followed by space.
> But a unicode space or an escaped space should work:

Try U+034F COMBINING GRAPHEME JOINER which is zero width, and if
that doesn't work try U+200A HAIR SPACE which almost is.
You can enter them 'manually' as HTML entities: &#x034F; etc.

Or use any old character which you don't *really* use and delete
it from pandoc's output afterwards.  I always use ¤ which for
traditional reasons is easily available on the Swedish keyboard.
Thus foo^¤[bar](/baz)^ > pandoc > search-replace ¤ with nothing
in the output.
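
The placeholder round trip could be scripted roughly as follows. This is a
sketch only: the function names are hypothetical, the default placeholder is
just an illustrative choice, and the pandoc run in the middle is left out.
Note that it rewrites every "^[", so it would also break genuine inline
footnotes in the same document.

```python
# Assumption: the placeholder never occurs in the real text.
PLACEHOLDER = "\u00a4"  # currency sign, chosen only for illustration


def protect(markdown: str, placeholder: str = PLACEHOLDER) -> str:
    """Insert the placeholder between '^' and '[' so that '^[' never appears."""
    return markdown.replace("^[", "^" + placeholder + "[")


def strip_placeholder(output: str, placeholder: str = PLACEHOLDER) -> str:
    """Delete every placeholder from the converter's output."""
    return output.replace(placeholder, "")


if __name__ == "__main__":
    # Step 1: protect the source before handing it to the converter.
    print(protect("foo^[bar](/baz)^"))
    # Step 2: clean up whatever the converter produced (sample string here).
    print(strip_placeholder('<sup>\u00a4<a href="/baz">bar</a></sup>'))
```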

Another case where a proper comment syntax should do the trick...

/bpj


* Re: Links in superscripts
       [not found]         ` <4D039806.4010302-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2010-12-11 16:36           ` John MacFarlane
       [not found]             ` <20101211163633.GA704-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
  2010-12-13  0:20           ` Conal Elliott
  1 sibling, 1 reply; 7+ messages in thread
From: John MacFarlane @ 2010-12-11 16:36 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ BP Jonsson [Dec 11 10 16:25 ]:
> Another case where a proper comment syntax should do the trick...

You could just do:

hi^<!---->[link](/foo)^

You'd get an HTML comment in the output, but so what?

John


* Re: Links in superscripts
       [not found]         ` <4D039806.4010302-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
  2010-12-11 16:36           ` John MacFarlane
@ 2010-12-13  0:20           ` Conal Elliott
       [not found]             ` <AANLkTikECxcSU-ukJ_6m+oZTfpxqMe=LK79yzy5WuuL1-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  1 sibling, 1 reply; 7+ messages in thread
From: Conal Elliott @ 2010-12-13  0:20 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Thanks for the hack, BP. Works great on this end.

John - what do you think about inserting &#x034F; into the markdown whenever
rendering a superscript that starts with a link? For instance, in
Text.Pandoc.Writers.Markdown, in inlineToMarkdown, before the current
Superscript case, add the following code:

> -- Insert zero-width HTML entity (combining grapheme joiner) for superscripted links.
> -- Otherwise will generate "^[...]^", which will be parsed as an inline footnote
> -- and the start of something else.
> -- Potential problem: if the first inline in the superscript generates no output,
> -- then we'll still output "^[...]^".
> inlineToMarkdown opts (Superscript lst@(Link{}:_)) =
>  inlineToMarkdown opts (Superscript (Str "&#x034F;" : lst))
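
To show the intended effect without building pandoc, here is the same idea
as a self-contained Python toy. It is not pandoc's writer: the tuple
encoding of inlines and the function name are hypothetical, and only three
inline kinds are modelled.

```python
# Toy inline model (an assumption for illustration, not pandoc's AST):
#   ("str", text), ("link", label, url), ("sup", [inlines])
ZERO_WIDTH = "&#x034F;"  # combining grapheme joiner, written as an HTML entity


def to_markdown(inline) -> str:
    """Render one toy inline element as Markdown text."""
    kind = inline[0]
    if kind == "str":
        return inline[1]
    if kind == "link":
        return f"[{inline[1]}]({inline[2]})"
    if kind == "sup":
        children = inline[1]
        # A superscript that starts with a link would render as "^[...",
        # so put the zero-width entity in front of it.
        if children and children[0][0] == "link":
            children = [("str", ZERO_WIDTH)] + children
        return "^" + "".join(to_markdown(c) for c in children) + "^"
    raise ValueError(f"unknown inline: {kind!r}")


if __name__ == "__main__":
    print(to_markdown(("sup", [("link", "baz", "bar")])))
```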

  - Conal




* Re: Links in superscripts
       [not found]             ` <AANLkTikECxcSU-ukJ_6m+oZTfpxqMe=LK79yzy5WuuL1-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2010-12-13  4:32               ` John MacFarlane
  0 siblings, 0 replies; 7+ messages in thread
From: John MacFarlane @ 2010-12-13  4:32 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

I've pushed a less kludgy fix:  I moved the inlineNote parser
after the superscript parser in the markdown reader.  So now
^[link](/foo)^ is a superscripted link, with no need for unicode
spacers.
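
Why the order matters can be seen in a small ordered-choice sketch in
Python. The two regular expressions are crude stand-ins invented for this
example and do not reflect the real inlineNote or superscript parsers; only
the "first parser that matches wins" behaviour is the point.

```python
import re

# Crude stand-ins for the two parsers (assumptions, not pandoc's grammar).
INLINE_NOTE = re.compile(r"\^\[([^\]]*)\]")
# '^', not followed by a space, then non-space text (or escaped spaces), then '^'.
SUPERSCRIPT = re.compile(r"\^(?! )((?:[^\^ ]|\\ )+)\^")


def first_match(text: str, parsers):
    """Ordered choice: name of the first parser that matches at the start."""
    for name, pattern in parsers:
        if pattern.match(text):
            return name
    return None


OLD_ORDER = [("inline_note", INLINE_NOTE), ("superscript", SUPERSCRIPT)]
NEW_ORDER = [("superscript", SUPERSCRIPT), ("inline_note", INLINE_NOTE)]

if __name__ == "__main__":
    for text in ["^[link](/foo)^", "^[a note with spaces]"]:
        print(text, first_match(text, OLD_ORDER), first_match(text, NEW_ORDER))
```

With the superscript alternative tried first, the link case is claimed
before the note parser sees it, while a note with no closing ^ still falls
through to the note parser.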

John


* Re: Links in superscripts
       [not found]             ` <20101211163633.GA704-nFAEphtLEs+AA6luYCgp0U1S2cYJDpTV9nwVQlTi/Pw@public.gmane.org>
@ 2010-12-13 14:53               ` BP Jonsson
  0 siblings, 0 replies; 7+ messages in thread
From: BP Jonsson @ 2010-12-13 14:53 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

2010-12-11 17:36, John MacFarlane wrote:

> You could just do:
>
> hi^<!---->[link](/foo)^
>
> You'd get an HTML comment in the output, but so what?
>
> John
>

Nope, I get:

hi^<!----><p
><a href="/foo"
   >link</a
   >^</p
>

/bpj

