From mboxrd@z Thu Jan 1 00:00:00 1970
From: Albert Krewinkel
Newsgroups: gmane.text.pandoc
Subject: Re: HTML attributes not being stripped off
Date: Mon, 27 Jun 2022 14:14:22 +0200
Message-ID: <87mtdyb7yl.fsf@zeitkraut.de>
In-reply-to: <87r13abaeb.fsf-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org

Albert Krewinkel writes:

> "'guenael Muller' via pandoc-discuss" writes:
>
>> The idea there is to be able to convert both HTML (generated by a rich
>> text editor) and Markdown (or other similar markup language) files
>> through a similar pipeline to a PDF with similar style. Using a
>> different templating engine somewhere in the pipeline means more
>> complexity, so I'm considering the idea of using pandoc templating if
>> the HTML result is okay.
>
> OK, I see. How about the following approach then: use a custom reader
> that passes the input through as raw HTML if any of the files have an
> `.html` extension, but otherwise treats the input as Markdown.

Here is a shorter version. It requires a current development version of
pandoc, but allows mixing .html files and Markdown files in the same
command:

``` lua
-- Custom reader: pandoc calls Reader with the list of input sources.
function Reader (sources, opts)
  local doc = pandoc.Pandoc{}
  for _, source in ipairs(sources) do
    -- Files ending in .htm or .html are passed through as raw HTML;
    -- everything else is parsed as Markdown.
    doc = doc .. (source.name:match '%.htm[l]?$'
      and pandoc.Pandoc{pandoc.RawBlock('html', tostring(source))}
      or pandoc.read(source, 'markdown', opts))
  end
  return doc
end
```

-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124
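
A note on the file-name test used in the reader: the Lua pattern
`%.htm[l]?$` is anchored at the end of the name and is case-sensitive.
The following is only a small sketch in plain Lua (no pandoc needed) to
show which names would take the raw-HTML branch; the helper `is_html`
and the sample file names are made up for illustration and are not part
of pandoc.

``` lua
-- Hypothetical helper: true if the name ends in ".htm" or ".html".
-- Uses the same pattern as the reader.
local function is_html (name)
  return name:match('%.htm[l]?$') ~= nil
end

-- Sample names, invented for this illustration.
local names = {'page.html', 'page.htm', 'notes.md', 'index.html.md', 'PAGE.HTML'}
for _, name in ipairs(names) do
  print(name, is_html(name))
end
```

Only `page.html` and `page.htm` match. A name such as `PAGE.HTML` does
not, so it would be read as Markdown unless the name is lowercased
first. Assuming the reader is saved as `reader.lua` (a file name chosen
here for the example), it should be usable via something like
`pandoc -f reader.lua page.html notes.md -t html`.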