public inbox archive for pandoc-discuss@googlegroups.com
From: John MacFarlane <fiddlosopher-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Subject: Re: HTML attributes not being stripped off
Date: Sun, 11 Nov 2012 14:36:15 -0800
Message-ID: <20121111223615.GE4399@Johns-MacBook-Air-2.local>
In-Reply-To: <509F89B3.4070403-S0/GAf8tV78@public.gmane.org>

You've got to remember that pandoc converts the input format to an
internal representation of the document (the 'Pandoc' structure), and
then converts that to the output format.

This internal representation (see
http://hackage.haskell.org/packages/archive/pandoc-types/1.9.1/doc/html/Text-Pandoc-Definition.html)
is much less expressive than HTML, and doesn't have a place for the
attributes you want.  That's why they are lost on HTML -> HTML
translation.
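
To make the point concrete, here is a toy sketch in Python. It is not
pandoc's code: the names Str, Emph, ToyReader, read_html and write_html
are all hypothetical, and the "internal representation" is reduced to
two node types. It only illustrates the general idea that when the
intermediate structure has no slot for an attribute, the writer cannot
emit it, even for HTML -> HTML.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Str:
    """Toy plain-text node (hypothetical, not pandoc's type)."""
    text: str


@dataclass
class Emph:
    """Toy emphasis node: it holds text only, with no attribute slot."""
    text: str


class ToyReader(HTMLParser):
    """Hypothetical reader: HTML fragment -> list of toy nodes."""

    def __init__(self):
        super().__init__()
        self.nodes = []
        self.in_em = False

    def handle_starttag(self, tag, attrs):
        # attrs (e.g. [("lang", "la")]) is seen here, but Emph has
        # nowhere to store it, so it is dropped at this point.
        if tag == "em":
            self.in_em = True

    def handle_endtag(self, tag):
        if tag == "em":
            self.in_em = False

    def handle_data(self, data):
        self.nodes.append(Emph(data) if self.in_em else Str(data))


def read_html(source):
    """Parse the input format into the toy internal representation."""
    reader = ToyReader()
    reader.feed(source)
    reader.close()
    return reader.nodes


def write_html(nodes):
    """Render the toy representation back to HTML (escaping omitted)."""
    return "".join(
        f"<em>{n.text}</em>" if isinstance(n, Emph) else n.text
        for n in nodes
    )


result = write_html(read_html('tag <em lang="la">lingua latina</em>.'))
print(result)
```

The writer only ever sees Emph("lingua latina"), so the lang attribute
cannot reappear in the output. Keeping it would require changing the
intermediate structure itself, not just the reader or the writer.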

+++ Pablo Rodríguez [Nov 11 12 12:19 ]:
> Hi John,
> 
> I'm using pandoc mainly to generate ePub files.
> 
> I used textile first as source language, but it isn't fully implemented
> by pandoc and textile itself has issues with multiparagraph elements.
> 
> HTML seems to be a much better source language for pandoc, although
> it means giving up footnotes. There is no way to have it all.
> 
> But pandoc strips almost all attributes from HTML elements.
> 
> A minimal sample:
> 
> <ol start="2" style="list-style-type:lower-latin;">
> <li><p>Well there is no other way to tag <em lang="la">lingua
> latina</em>.</p></li>
> <li><p>Or even classes or ids.</p></li>
> </ol>
> 
> Would it be possible to add an option that keeps the attributes on
> HTML elements instead of stripping them?
> 
> BTW, when converting from HTML to another HTML code, at least id, class
> and lang attributes shouldn't be stripped off by default.
> 
> Many thanks for your help,
> 
> 
> Pablo
> -- 
> http://www.ousia.tk
> 
> -- 
> You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
> To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
> 
> 

