public inbox archive for pandoc-discuss@googlegroups.com
From: John MacFarlane <fiddlosopher-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org
Subject: Re: html checkbox to markdown
Date: Fri, 5 May 2023 09:04:41 -0700
Message-ID: <3C5955E2-B09A-4805-873C-345300ED17F7@gmail.com>
In-Reply-To: <CAMwO0gwyAFsVjJFyxJBB18p6innDv0ssH1Dx4NBo3Je5BvuoeQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

The proper way to do this is:

% echo '<p><input type="checkbox" /></p>' | pandoc -f html+raw_html -t markdown
`<input type="checkbox">`{=html}`</input>`{=html}

Using the `raw_html` extension with the html reader causes elements the reader doesn't recognize to be included as raw HTML rather than dropped. If you don't want the pandoc 'raw attribute' syntax in the output, you can disable the `raw_attribute` extension in the markdown writer:

% echo '<p><input type="checkbox" /></p>' | pandoc -f html+raw_html -w markdown-raw_attribute
<input type="checkbox"></input>
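
The second command above can be wrapped for reuse. This is only a minimal sketch: it assumes pandoc is on your PATH and recent enough to accept these extensions, and `html_to_md` is a hypothetical helper name, not something pandoc provides.

```shell
#!/bin/sh
# html_to_md: hypothetical helper (not part of pandoc).
# Reads HTML on stdin and writes Markdown on stdout.
html_to_md() {
  # -f html+raw_html: keep elements the HTML reader doesn't recognize as raw HTML
  # -t markdown-raw_attribute: write that raw HTML plainly, without `...`{=html}
  pandoc -f html+raw_html -t markdown-raw_attribute
}

# Try it on the checkbox snippet from this thread, if pandoc is available.
if command -v pandoc >/dev/null 2>&1; then
  printf '%s\n' '<p><input type="checkbox" /></p>' | html_to_md
else
  echo "pandoc not found on PATH; skipping demo" >&2
fi
```

Because the function only reads stdin, a file can be converted with `html_to_md < page.html > page.md` (both file names are placeholders).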


> On May 5, 2023, at 8:27 AM, Gwern Branwen <gwern-v26ZT+9V8bxeoWH0uzbU5w@public.gmane.org> wrote:
> 
> The Pandoc HTML reader is, perhaps surprisingly, worse for reading HTML than the Markdown reader, which will generally preserve HTML (because Markdown allows raw HTML to be embedded in a document). So if you want to read HTML without erasing stuff, you are generally better off specifying the *Markdown* reader. The results can be kinda ugly, but there's no way around it: there is no 'native' Markdown for a checkbox input, so it uses the raw-HTML fallback.
> 
> Example:
> 
>     $ echo '<p><input type="checkbox" /></p>' | pandoc -f html -w markdown
>     $ echo '<p><input type="checkbox" /></p>' | pandoc -f markdown -w markdown
>     ```{=html}
>     <p>
>     ```
>     `<input type="checkbox" />`{=html}
>     ```{=html}
>     </p>
>     ```
> 
> The HTML reader can't understand the <input>, so it is silently dropped. The Markdown reader treats it as an HTML fragment embedded in Markdown, which is preserved as a literal and passed through.
> 
> -- 
> gwern
> https://gwern.net

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/3C5955E2-B09A-4805-873C-345300ED17F7%40gmail.com.


