public inbox archive for pandoc-discuss@googlegroups.com
* Pandoc RawInline conversion
@ 2017-01-22 22:12 Mirko Lelansky
       [not found] ` <246eb85b-905d-488c-9302-0513926cc55f-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 3+ messages in thread
From: Mirko Lelansky @ 2017-01-22 22:12 UTC (permalink / raw)
  To: pandoc-discuss



I convert Markdown to HTML and LaTeX. For some use cases I
need special handling for HTML and LaTeX output, so I use
raw LaTeX and raw HTML. The problem is that the
pandoc parser parses the two kinds of raw markup differently.

First an example for better understanding:

<abbr title="Application Programming Interface">API</abbr>
\gls{API}

The pandoc parser converts this to the following JSON:

{"blocks":[{"t":"Para","c":[
  {"t":"RawInline","c":["html","<abbr title=\"Application Programming Interface\">"]},
  {"t":"Str","c":"API"},
  {"t":"RawInline","c":["html","</abbr>"]},
  {"t":"SoftBreak"},
  {"t":"RawInline","c":["tex","\\gls{API}"]}]}],
 "pandoc-api-version":[1,17,0,5],"meta":{}}

Here you can see that the raw LaTeX is parsed correctly as one RawInline
element, but the raw HTML element is parsed as two RawInline elements that
wrap a Str element. Is this behaviour normal? To me the following
seems better:

{"t":"RawInline","c":["html","<abbr title=\"Application Programming Interface\">API</abbr>"]}

It would make output-specific markup easier to use, because the
HTML writer ignores RawInline elements for LaTeX and the LaTeX writer
ignores RawInline elements for HTML, so the content of the Str element
would not be printed in the LaTeX output.
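
The effect described above can be sketched in a few lines of Python. This is
only a simplified model of how a writer treats raw inlines, not pandoc's
actual code: the ACCEPTED_RAW table and the helpers surviving_inlines and
summarize are assumptions made for illustration.

```python
import json

# The inline list of the paragraph from the JSON output quoted in this message.
PARA = json.loads(r'''
[{"t":"RawInline","c":["html","<abbr title=\"Application Programming Interface\">"]},
 {"t":"Str","c":"API"},
 {"t":"RawInline","c":["html","</abbr>"]},
 {"t":"SoftBreak"},
 {"t":"RawInline","c":["tex","\\gls{API}"]}]
''')

# Assumption: a simplified table of the raw formats each writer passes through.
ACCEPTED_RAW = {"html": {"html"}, "latex": {"latex", "tex"}}


def surviving_inlines(inlines, writer):
    """Drop RawInline elements whose format the given writer does not accept."""
    kept = []
    for el in inlines:
        if el["t"] == "RawInline" and el["c"][0] not in ACCEPTED_RAW[writer]:
            continue
        kept.append(el)
    return kept


def summarize(inlines):
    """Return a compact, human-readable view of an inline list."""
    parts = []
    for el in inlines:
        if el["t"] == "RawInline":
            parts.append(el["c"][1])      # the raw text itself
        elif el["t"] == "Str":
            parts.append(el["c"])         # plain string content
        else:
            parts.append("<" + el["t"] + ">")
    return parts


if __name__ == "__main__":
    print("latex:", summarize(surviving_inlines(PARA, "latex")))
    print("html: ", summarize(surviving_inlines(PARA, "html")))
```

In this model the Str element "API" survives for the LaTeX writer alongside
\gls{API}, which is the duplication the question is about.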

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/246eb85b-905d-488c-9302-0513926cc55f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: Pandoc RawInline conversion
       [not found] ` <246eb85b-905d-488c-9302-0513926cc55f-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2017-01-23 10:11   ` John MacFarlane
  2017-01-24  9:54   ` Melroch
  1 sibling, 0 replies; 3+ messages in thread
From: John MacFarlane @ 2017-01-23 10:11 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Mirko Lelansky [Jan 22 17 14:12 ]:
>   Here you see that the raw latex block is parsed correctly as one
>   RawInline element but the raw html element is parsed as two RawInline
>   elements which wraps a String element. Is this behaviour normal? For me

Yes, because in Markdown the contents of inline HTML tags
are interpreted as Markdown.

So, for example in

    <a>*hi*</a>

you have an emphasized "hi" inside the tags.  If we took
the whole thing as one raw HTML string, we'd miss that
meaning.
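
A small Python sketch of the point above. The AST literal is written by hand
in the JSON layout of pandoc API version 1.17 (the version in the original
post), not captured from a pandoc run, and the helper element_types is
hypothetical.

```python
# Hand-written inline list for the paragraph `<a>*hi*</a>` (assumed shape).
PARSED_AS_MARKDOWN = [
    {"t": "RawInline", "c": ["html", "<a>"]},
    {"t": "Emph", "c": [{"t": "Str", "c": "hi"}]},
    {"t": "RawInline", "c": ["html", "</a>"]},
]

# What the same input would look like as one raw HTML string.
ONE_RAW_STRING = [{"t": "RawInline", "c": ["html", "<a>*hi*</a>"]}]


def element_types(inlines):
    """Collect the type tags of all inlines, descending into Emph children."""
    types = []
    for el in inlines:
        types.append(el["t"])
        if el["t"] == "Emph":
            types.extend(element_types(el["c"]))
    return types


if __name__ == "__main__":
    print(element_types(PARSED_AS_MARKDOWN))
    print(element_types(ONE_RAW_STRING))
```

Only the first form contains an Emph element; in the second the asterisks are
just characters inside an opaque HTML string.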



* Re: Pandoc RawInline conversion
       [not found] ` <246eb85b-905d-488c-9302-0513926cc55f-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2017-01-23 10:11   ` John MacFarlane
@ 2017-01-24  9:54   ` Melroch
  1 sibling, 0 replies; 3+ messages in thread
From: Melroch @ 2017-01-24  9:54 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Try this filter which converts Code and CodeBlock elements marked with a
`raw=FORMAT` attribute into RawInline and RawBlock elements respectively:

https://gist.github.com/bpj/e6e53cbe679d3ec77e25
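
The idea behind such a filter can be sketched in Python as follows. This is
not the code from the linked gist, only a minimal illustration. It assumes the
pandoc 1.17 JSON layout, where Code and CodeBlock carry [attr, text] and attr
is [id, classes, key-value pairs]; the names RAW_TARGET, raw_format and
convert are made up for this sketch.

```python
import json

# Which raw element replaces which code element.
RAW_TARGET = {"Code": "RawInline", "CodeBlock": "RawBlock"}


def raw_format(attr):
    """Return the value of a raw=FORMAT attribute, or None if it is absent."""
    _ident, _classes, keyvals = attr
    for key, value in keyvals:
        if key == "raw":
            return value
    return None


def convert(node):
    """Recursively replace Code/CodeBlock marked raw=FORMAT with raw elements."""
    if isinstance(node, list):
        return [convert(item) for item in node]
    if isinstance(node, dict):
        tag = node.get("t")
        if isinstance(tag, str) and tag in RAW_TARGET:
            attr, text = node["c"]
            fmt = raw_format(attr)
            if fmt is not None:
                return {"t": RAW_TARGET[tag], "c": [fmt, text]}
        return {key: convert(value) for key, value in node.items()}
    return node


# Demo document: one Code element marked raw=html, one ordinary Code element.
DOC = {
    "blocks": [{"t": "Para", "c": [
        {"t": "Code", "c": [["", [], [["raw", "html"]]],
                            '<abbr title="Application Programming Interface">API</abbr>']},
        {"t": "Code", "c": [["", [], []], "x"]},
    ]}],
    "pandoc-api-version": [1, 17, 0, 5],
    "meta": {},
}

if __name__ == "__main__":
    print(json.dumps(convert(DOC)))
```

To run it as a real filter one would read the JSON document from stdin and
write the converted document to stdout, then pass the script to pandoc's
--filter option. The whole abbr element then reaches the writers as a single
RawInline, which is the shape asked for in the original post.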


