From mboxrd@z Thu Jan 1 00:00:00 1970
From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: HTML attributes not being stripped off
Date: Sun, 11 Nov 2012 14:36:15 -0800
Message-ID: <20121111223615.GE4399@Johns-MacBook-Air-2.local>
References: <509F89B3.4070403@web.de>

You've got to remember that pandoc converts the input format to an
internal representation of the document (the 'Pandoc' structure), and
then converts that to the output format.  This internal representation
(see
http://hackage.haskell.org/packages/archive/pandoc-types/1.9.1/doc/html/Text-Pandoc-Definition.html)
is much less expressive than HTML, and doesn't have a place for the
attributes you want.  That's why they are lost on HTML -> HTML
translation.

+++ Pablo Rodríguez [Nov 11 12 12:19 ]:
> Hi John,
>
> I'm using pandoc mainly to generate ePub files.
>
> I used textile first as source language, but it isn't fully implemented
> by pandoc and textile itself has issues with multiparagraph elements.
>
> It seems HTML is probably a much better option for pandoc as source
> language, although I have to forget footnotes. There is no way to have
> it all.
>
> But pandoc strips almost all attributes from HTML elements.
>
> A minimal sample:
>
> 1. Well there is no other way to tag lingua latina.
> 2. Or even classes or ids.
> 3. .
>
> Would it be possible that there is an option that doesn't strip off
> attributes from HTML code?
>
> BTW, when converting from HTML to another HTML code, at least id, class
> and lang attributes shouldn't be stripped off by default.
>
> Many thanks for your help,
>
>
> Pablo
> --
> http://www.ousia.tk
>
> --
> You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
> To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
> To unsubscribe from this group, send email to pandoc-discuss+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
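
To make the point in the reply at the top of this message concrete, here
is a rough toy sketch in Python. It is not pandoc's code (pandoc is
written in Haskell), and every name in it (ListItem, Document, ToyReader,
read_html, write_html) is made up for illustration, as are the attribute
values in the sample input. It only shows the general shape of the
problem: the reader sees the attributes, but the intermediate structure
has no field to keep them in, so the writer cannot emit them.

```python
import html
from dataclasses import dataclass, field
from html.parser import HTMLParser
from typing import List


@dataclass
class ListItem:
    # Toy node: text only. There is deliberately no attribute field,
    # so id/class/lang have nowhere to be stored.
    text: str = ""


@dataclass
class Document:
    # Toy stand-in for an intermediate document structure.
    items: List[ListItem] = field(default_factory=list)


class ToyReader(HTMLParser):
    """Hypothetical reader: HTML list items -> toy Document."""

    def __init__(self):
        super().__init__()
        self.doc = Document()
        self._in_item = False

    def handle_starttag(self, tag, attrs):
        # attrs is available here, but ListItem cannot hold it.
        if tag == "li":
            self.doc.items.append(ListItem())
            self._in_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self._in_item = False

    def handle_data(self, data):
        if self._in_item:
            self.doc.items[-1].text += data


def read_html(source: str) -> Document:
    reader = ToyReader()
    reader.feed(source)
    reader.close()
    return reader.doc


def write_html(doc: Document) -> str:
    # Hypothetical writer: it can only emit what the Document contains.
    lines = ["<ol>"]
    lines += [f"<li>{html.escape(item.text)}</li>" for item in doc.items]
    lines.append("</ol>")
    return "\n".join(lines)


# Made-up input with lang, class and id attributes.
source = (
    '<ol><li lang="la">lingua latina</li>'
    '<li class="note" id="x1">Or even classes or ids.</li></ol>'
)
roundtrip = write_html(read_html(source))
print(roundtrip)
```

The printed list has plain <li> elements only. Keeping the attributes
would mean changing the intermediate node type itself, not just the
reader or the writer.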