public inbox archive for pandoc-discuss@googlegroups.com
* Escaping HTML in Markdown
@ 2020-06-24  6:19 Daniil Baturin
       [not found] ` <54f7b27c-24a9-483f-a161-ca9a9fd90cf7o-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  0 siblings, 1 reply; 4+ messages in thread
From: Daniil Baturin @ 2020-06-24  6:19 UTC (permalink / raw)
  To: pandoc-discuss



Hi everyone,

Right now, pandoc makes no attempt to remove or escape HTML tags 
when converting from Markdown to HTML.
I believe this is perfectly sensible behaviour for a tool people normally 
run on data they wrote or at least reviewed themselves.

I'd like to know the maintainers' official position: whether it's part of 
the design that they aren't going to change, or whether they see it as a 
security issue that must be fixed at some point.

If there is a plan to change that behaviour, will backwards compatibility 
considerations be taken into account?

Thanks in advance,
Daniil

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/54f7b27c-24a9-483f-a161-ca9a9fd90cf7o%40googlegroups.com.


* Re: Escaping HTML in Markdown
       [not found] ` <54f7b27c-24a9-483f-a161-ca9a9fd90cf7o-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2020-06-24  8:53   ` Lukas Atkinson
  2020-06-24 10:48   ` Albert Krewinkel
  2020-06-24 16:36   ` John MacFarlane
  2 siblings, 0 replies; 4+ messages in thread
From: Lukas Atkinson @ 2020-06-24  8:53 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


HTML is a core part of the Markdown syntax, but you can turn off most uses
by disabling the `raw_html` and `raw_attribute` extensions. You may also
want to disable Pandoc extensions like `native_spans` and `bracketed_spans`;
otherwise the author could inject code like `[click me]{onclick=alert()}`.
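
As a rough illustration of disabling these extensions, the sketch below builds a pandoc input-format string with the extensions switched off and, optionally, runs pandoc through a subprocess. The helper names (`markdown_format`, `pandoc_command`, `convert`) are hypothetical. `bracketed_spans` is on the list because the `[click me]{...}` example uses that syntax, and `native_divs` is an extra assumption on my part, so adjust the list to your needs.

```python
import subprocess

# Extensions to switch off. raw_html, raw_attribute and native_spans are
# named in this thread; bracketed_spans and native_divs are assumed additions.
DISABLED_EXTENSIONS = [
    "raw_html",
    "raw_attribute",
    "native_spans",
    "native_divs",
    "bracketed_spans",
]


def markdown_format(disabled=DISABLED_EXTENSIONS):
    """Build a pandoc format string such as 'markdown-raw_html-raw_attribute'."""
    return "markdown" + "".join("-" + ext for ext in disabled)


def pandoc_command(disabled=DISABLED_EXTENSIONS):
    """Return the argv list for a Markdown-to-HTML conversion."""
    return ["pandoc", "--from", markdown_format(disabled), "--to", "html"]


def convert(markdown_text, disabled=DISABLED_EXTENSIONS):
    """Run pandoc (which must be on PATH) and return the generated HTML."""
    result = subprocess.run(
        pandoc_command(disabled),
        input=markdown_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `convert("[click me]{onclick=alert()}")` is one way to check how your installed pandoc version treats such input once the extensions are off.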

But that's not a meaningful security barrier.

Conversion via Pandoc is reasonably secure (leaving aside denial of
service), but if you want to show content in a security-sensitive context
(like rendering user-provided content on a web page) then you should strip
away unwanted HTML tags or attributes in a post-processing step.
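
To make the post-processing step concrete, here is a minimal allowlist filter written with only the Python standard library. It is a sketch for illustration, not a substitute for a dedicated, well-tested sanitizer. The allowlists, the `AllowlistSanitizer` class and the `sanitize` function are all assumptions of this example, and they are deliberately strict (relative links are dropped, for instance).

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlists; tune them to the markup you actually want to keep.
ALLOWED_TAGS = {"p", "em", "strong", "code", "pre", "blockquote",
                "ul", "ol", "li", "a", "h1", "h2", "h3"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")
DROP_WITH_CONTENT = {"script", "style"}


class AllowlistSanitizer(HTMLParser):
    """Re-emit only allowlisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text content is still kept
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue  # removes onclick, style, and other attributes
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # removes javascript: and other unlisted schemes
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    """Return html_text with everything outside the allowlists removed."""
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

For example, `sanitize('<p onclick="alert(1)">hi <script>alert(2)</script><em>there</em></p>')` returns `<p>hi <em>there</em></p>`: the event handler and the script element are gone, while the allowlisted markup survives.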

(I'm just another user; this is not an official statement.)

On Wed, 24 Jun 2020 at 06:19, Daniil Baturin <daniil-urxn9axsVrEtq7phqP6ubQ@public.gmane.org> wrote:

> Hi everyone,
>
> Right now, pandoc doesn't make any attempts to remove or escape HTML tags
> when converting from Markdown to HTML.
> I believe this is a perfectly sensible behaviour for a tool people
> normally run on data they wrote or at least reviewed themselves
>
> I'd like to know maintainers' official position: whether it's a part of
> the design they aren't going to change, or they see it as a security issue
> that must be fixed at some point.
>
> If there is a plan to change that behaviour, will backwards compatibility
> considerations be taken into account?
>
> Thanks in advance,
> Daniil


* Re: Escaping HTML in Markdown
       [not found] ` <54f7b27c-24a9-483f-a161-ca9a9fd90cf7o-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2020-06-24  8:53   ` Lukas Atkinson
@ 2020-06-24 10:48   ` Albert Krewinkel
  2020-06-24 16:36   ` John MacFarlane
  2 siblings, 0 replies; 4+ messages in thread
From: Albert Krewinkel @ 2020-06-24 10:48 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw


Daniil Baturin writes:

> Right now, pandoc doesn't make any attempts to remove or escape HTML tags
> when converting from Markdown to HTML. [...]
>
> I'd like to know maintainers' official position: whether it's a part of the
> design they aren't going to change, or they see it as a security issue that
> must be fixed at some point.

I think the following quote describes the official position best, taken
from
https://github.com/jgm/pandoc/blob/master/doc/using-the-pandoc-api.md#notes-on-using-pandoc-in-web-applications

> If pandoc generates HTML from untrusted user input, it is always a
> good idea to filter the generated HTML through a sanitizer (such as
> xss-sanitize) to avoid security problems.

I believe this is dependable behavior and highly unlikely to change.
Even GitHub is, AFAIK, using a separate sanitizer, which is not built
into their cmark fork.

Cheers,
Albert

--
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124



* Re: Escaping HTML in Markdown
       [not found] ` <54f7b27c-24a9-483f-a161-ca9a9fd90cf7o-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
  2020-06-24  8:53   ` Lukas Atkinson
  2020-06-24 10:48   ` Albert Krewinkel
@ 2020-06-24 16:36   ` John MacFarlane
  2 siblings, 0 replies; 4+ messages in thread
From: John MacFarlane @ 2020-06-24 16:36 UTC (permalink / raw)
  To: Daniil Baturin, pandoc-discuss


See note 4 at
https://pandoc.org/MANUAL.html#a-note-on-security

The choice we made was to recommend using an external tool to
sanitize the output of pandoc.  There have not been requests to
build sanitization into pandoc itself, and it's probably safest
to rely on a sanitizer that is dedicated to this purpose.

Daniil Baturin <daniil-urxn9axsVrEtq7phqP6ubQ@public.gmane.org> writes:

> Hi everyone,
>
> Right now, pandoc doesn't make any attempts to remove or escape HTML tags 
> when converting from Markdown to HTML.
> I believe this is a perfectly sensible behaviour for a tool people normally 
> run on data they wrote or at least reviewed themselves
>
> I'd like to know maintainers' official position: whether it's a part of the 
> design they aren't going to change, or they see it as a security issue that 
> must be fixed at some point.
>
> If there is a plan to change that behaviour, will backwards compatibility 
> considerations be taken into account?
>
> Thanks in advance,
> Daniil

