The Pandoc HTML reader is, perhaps surprisingly, worse for reading HTML than the Markdown reader, which will generally preserve HTML (because Markdown is defined as a superset of HTML). So if you want to read HTML without erasing stuff, you are generally better off specifying the *Markdown* reader. The results can be kinda ugly, but there's no way around it: there is no 'native' Markdown for a checkbox input, so Pandoc falls back to emitting it as raw HTML.

Example:

    $ echo '<p><input type="checkbox" /></p>' | pandoc -f html -w markdown
    $ echo '<p><input type="checkbox" /></p>' | pandoc -f markdown -w markdown
    ```{=html}
    <p>
    ```
    `<input type="checkbox" />`{=html}
    ```{=html}
    </p>
    ```

The HTML reader can't understand the <input>, so it is silently dropped and the first command shows no output. The Markdown reader treats it as an HTML fragment embedded in Markdown, which is preserved as a literal and passed through.
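
If you want to check this for other elements, something like the following rough sketch may help. It assumes pandoc is on your PATH, and `keeps_input` is just a made-up helper name for illustration, not anything pandoc provides. It runs the same snippet through both readers and reports whether the <input> tag survived:

```shell
#!/bin/sh
# Hypothetical helper (not part of pandoc): succeeds if stdin
# still contains an <input> tag after conversion.
keeps_input() {
    grep -q '<input'
}

html='<p><input type="checkbox" /></p>'

# Only run the comparison when pandoc is actually installed.
if command -v pandoc >/dev/null 2>&1; then
    for reader in html markdown; do
        if printf '%s\n' "$html" | pandoc -f "$reader" -t markdown | keeps_input; then
            echo "$reader reader: <input> preserved"
        else
            echo "$reader reader: <input> dropped"
        fi
    done
fi
```

Swap the pattern in `keeps_input` and the contents of `html` to test whatever tag you care about.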

--
gwern
https://gwern.net
