public inbox archive for pandoc-discuss@googlegroups.com
From: Albert Krewinkel <albert+pandoc-9EawChwDxG8hFhg+JK9F0w@public.gmane.org>
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Subject: Re: HTML attributes not being stripped off
Date: Mon, 27 Jun 2022 13:37:43 +0200
Message-ID: <87r13abaeb.fsf@zeitkraut.de>
In-Reply-To: <e1b7f6d6-56c7-469e-b2f1-082718e2cbb2n-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>


"'guenael Muller' via pandoc-discuss" <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org> writes:

> The idea there is to be able to convert both HTML (generated by a rich
> text editor) and Markdown (or another similar markup language) files
> through a similar pipeline to a PDF with a similar style. Using a
> different templating engine somewhere in the pipeline means more
> complexity, so I'm considering using pandoc templating if
> the HTML result is okay.

OK, I see. How about the following approach then: use a custom reader
that passes the input through as raw HTML if any of the files have an
`.html` extension, but otherwise treats the input as Markdown.

``` lua
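function Reader (sources, opts)
  -- Treat the whole input as raw HTML if any source name ends in `.html`.
  local raw_html = false
  for _, source in ipairs(sources) do
    if source.name:match '%.html$' then
      raw_html = true
      break
    end
  end
  if raw_html then
    -- Wrap each source's text in a raw HTML block, bypassing the HTML parser.
    local blocks = sources:map(
      function (source)
        return pandoc.RawBlock('html', tostring(source))
      end
    )
    return pandoc.Pandoc(blocks)
  else
    -- Otherwise parse the input as Markdown.
    return pandoc.read(sources, 'markdown', opts)
  end
end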
```
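
The reader above only recognizes a lowercase `.html` suffix. If files
named `.htm` or `.HTML` should be passed through as well, the check
could be factored into a small helper, roughly as sketched below. The
names `is_html_name` and `any_html` are made up for this example and are
not part of pandoc's API; the sketch is plain Lua and works on a list of
file names, so inside `Reader` one would pass `source.name` instead.

``` lua
-- Hypothetical helper (not a pandoc function): true if the file name
-- ends in `.htm` or `.html`, ignoring case.
local function is_html_name (name)
  return name:lower():match('%.html?$') ~= nil
end

-- Hypothetical helper: true if any name in the list looks like HTML.
-- In a custom reader, the names would come from `source.name`.
local function any_html (names)
  for _, name in ipairs(names) do
    if is_html_name(name) then
      return true
    end
  end
  return false
end
```

Whether `.htm` files should count as HTML is an assumption here; the
original check is the one to keep if only `.html` is wanted.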

See also <https://pandoc.org/custom-readers.html>.

-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124


