public inbox archive for pandoc-discuss@googlegroups.com
* Using '&' for YouTube embeds in markdown?
@ 2015-09-17 13:36 Joseph Reagle
       [not found] ` <55FAC1F6.4000803-T1oY19WcHSwdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Joseph Reagle @ 2015-09-17 13:36 UTC (permalink / raw)
  To: pandoc-discuss

For my class slides, I've long made use of embedded YouTube videos, sometimes with start and end parameters.


```
## Marshmallow study

<iframe width="640" height="480" src="//www.youtube.com/embed/Wio6Ue2-O_4?start=0&end=79" frameborder="0" allowfullscreen></iframe>

```

The bald '&' may be invalid HTML (and markdown) but I always got away with it. However, it appears pandoc is now rendering this as:

```

<section id="marshmallow-study" class="slide level2">
<h1>Marshmallow study</h1>
<p>&lt;iframe width=“640” height=“480” data-src=“http://www.youtube.com/embed/Wio6Ue2-O_4?start=0&amp;end=79” frameborder=“0” allowfullscreen&gt;</iframe></p>
</section>

```

I'm using pandoc 1.15.0.6 and wonder if something changed or it became more strict? It does work if I use &amp; instead. I'm just curious and wanted to point this out for others.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To post to this group, send email to pandoc-discuss-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/55FAC1F6.4000803%40reagle.org.
For more options, visit https://groups.google.com/d/optout.



* Re: Using '&' for YouTube embeds in markdown?
       [not found] ` <55FAC1F6.4000803-T1oY19WcHSwdnm+yROfE0A@public.gmane.org>
@ 2015-09-17 18:44   ` John MACFARLANE
       [not found]     ` <20150917184454.GC21127-4kKid1p5UN4xFjuZnxJpBp3lxR28IOakuDuwTybUTCk@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: John MACFARLANE @ 2015-09-17 18:44 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

I believe this is due to commit
99fe8594d94573b8ba8ec1d1e47b57444de4e4cb
which closed #227.  This change causes the raw HTML parser
to fail if the HTML5 lexer generates any warnings.
The intent is to ensure that things like
<www.boe.es/buscar/act.php?id=BOE-A-1996-8930#a66>
don't get parsed as HTML tags.  A side-effect is that
your iframe tag doesn't get parsed as an HTML tag either.
Maybe there is a better solution?
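
To make the failure condition concrete, here is a minimal Python sketch. It is not pandoc's code (pandoc is written in Haskell); the name `has_bare_ampersand` and the regular expression are assumptions made up for this illustration. It flags an '&' that does not begin a character reference, which is roughly the condition that seems to trip up the iframe above.

```python
import re

# An '&' that starts a named (&amp;), decimal (&#38;) or hex (&#x26;)
# character reference is fine; any other '&' is treated as "bare".
# Hypothetical pattern for illustration, not taken from pandoc.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")

def has_bare_ampersand(tag: str) -> bool:
    """Return True if the tag text contains an unescaped ampersand."""
    return BARE_AMP.search(tag) is not None

old = '<iframe src="//www.youtube.com/embed/Wio6Ue2-O_4?start=0&end=79"></iframe>'
new = '<iframe src="//www.youtube.com/embed/Wio6Ue2-O_4?start=0&amp;end=79"></iframe>'

print(has_bare_ampersand(old))  # True
print(has_bare_ampersand(new))  # False
```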

+++ Joseph Reagle [Sep 17 15 09:36 ]:
>For my class slides, I've long made use of embedded YouTube videos, sometimes with start and end parameters.
>
>
>```
>## Marshmallow study
>
><iframe width="640" height="480" src="//www.youtube.com/embed/Wio6Ue2-O_4?start=0&end=79" frameborder="0" allowfullscreen></iframe>
>
>```
>
>The bald '&' may be invalid HTML (and markdown) but I always got away with it. However, it appears pandoc is now rendering this as:
>
>```
>
><section id="marshmallow-study" class="slide level2">
><h1>Marshmallow study</h1>
><p>&lt;iframe width=“640” height=“480” data-src=“http://www.youtube.com/embed/Wio6Ue2-O_4?start=0&amp;end=79” frameborder=“0” allowfullscreen&gt;</iframe></p>
></section>
>
>```
>
>I'm using pandoc 1.15.0.6 and wonder if something changed or it became more strict? It does work if I use &amp; instead. I'm just curious and wanted to point this out for others.




* Re: Using '&' for YouTube embeds in markdown?
       [not found]     ` <20150917184454.GC21127-4kKid1p5UN4xFjuZnxJpBp3lxR28IOakuDuwTybUTCk@public.gmane.org>
@ 2015-09-17 19:22       ` Joseph Reagle
       [not found]         ` <55FB12DA.8060606-T1oY19WcHSwdnm+yROfE0A@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: Joseph Reagle @ 2015-09-17 19:22 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 09/17/2015 02:44 PM, John MACFARLANE wrote:
> I believe this is due to commit
> 99fe8594d94573b8ba8ec1d1e47b57444de4e4cb
> which closed #227.  This change causes the raw HTML parser
> to fail if the HTML5 lexer generates any warnings.
> The intent is to ensure that things like
> <www.boe.es/buscar/act.php?id=BOE-A-1996-8930#a66>
> don't get parsed as HTML tags.  A side-effect is that
> your iframe tag doesn't get parsed as an HTML tag either.
> Maybe there is a better solution?

Not sure... I blame YouTube actually, if they are giving us elements to embed in HTML, then it should be well-formed HTML.

Would it be possible/acceptable for pandoc to find such errors and fix them in the output?



* Re: Using '&' for YouTube embeds in markdown?
       [not found]         ` <55FB12DA.8060606-T1oY19WcHSwdnm+yROfE0A@public.gmane.org>
@ 2015-09-17 19:39           ` Daniel Staal
  2015-09-17 21:40             ` John MACFARLANE
  2015-09-17 21:44           ` John MACFARLANE
  1 sibling, 1 reply; 10+ messages in thread
From: Daniel Staal @ 2015-09-17 19:39 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

--As of September 17, 2015 3:22:02 PM -0400, Joseph Reagle is alleged to 
have said:

> On 09/17/2015 02:44 PM, John MACFARLANE wrote:
>> I believe this is due to commit
>> 99fe8594d94573b8ba8ec1d1e47b57444de4e4cb
>> which closed #227.  This change causes the raw HTML parser
>> to fail if the HTML5 lexer generates any warnings.
>> The intent is to ensure that things like
>> <www.boe.es/buscar/act.php?id=BOE-A-1996-8930#a66>
>> don't get parsed as HTML tags.  A side-effect is that
>> your iframe tag doesn't get parsed as an HTML tag either.
>> Maybe there is a better solution?
>
> Not sure... I blame YouTube actually, if they are giving us elements to
> embed in HTML, then it should be well-formed HTML.
>
> Would it be possible/acceptable for pandoc to find such errors and fix
> them in the output?

--As for the rest, it is mine.

That was well-formed HTML...  Although admittedly some of it's not valid 
HTML5.  But even then it's only one attribute.  Try this:

<iframe width="640" height="480" 
src="//www.youtube.com/embed/Wio6Ue2-O_4?start=0&end=79" style="border:0;" 
allowfullscreen></iframe>

`allowfullscreen` is bleeding-edge, so if you still have trouble you can 
try removing that too.

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author.  Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes.  This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------



* Re: Using '&' for YouTube embeds in markdown?
  2015-09-17 19:39           ` Daniel Staal
@ 2015-09-17 21:40             ` John MACFARLANE
       [not found]               ` <20150917214014.GD30437-4kKid1p5UN4xFjuZnxJpBp3lxR28IOakuDuwTybUTCk@public.gmane.org>
  0 siblings, 1 reply; 10+ messages in thread
From: John MACFARLANE @ 2015-09-17 21:40 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Daniel Staal [Sep 17 15 15:39 ]:
>--As of September 17, 2015 3:22:02 PM -0400, Joseph Reagle is alleged 
>to have said:
>
>>On 09/17/2015 02:44 PM, John MACFARLANE wrote:
>>>I believe this is due to commit
>>>99fe8594d94573b8ba8ec1d1e47b57444de4e4cb
>>>which closed #227.  This change causes the raw HTML parser
>>>to fail if the HTML5 lexer generates any warnings.
>>>The intent is to ensure that things like
>>><www.boe.es/buscar/act.php?id=BOE-A-1996-8930#a66>
>>>don't get parsed as HTML tags.  A side-effect is that
>>>your iframe tag doesn't get parsed as an HTML tag either.
>>>Maybe there is a better solution?
>>
>>Not sure... I blame YouTube actually, if they are giving us elements to
>>embed in HTML, then it should be well-formed HTML.
>>
>>Would it be possible/acceptable for pandoc to find such errors and fix
>>them in the output?
>
>--As for the rest, it is mine.
>
>That was well-formed HTML...  Although admittedly some of it's not 
>valid HTML5.  But even then it's only one attribute.  Try this:

The problem is not the attribute, but the unescaped
ampersand.



* Re: Using '&' for YouTube embeds in markdown?
       [not found]         ` <55FB12DA.8060606-T1oY19WcHSwdnm+yROfE0A@public.gmane.org>
  2015-09-17 19:39           ` Daniel Staal
@ 2015-09-17 21:44           ` John MACFARLANE
       [not found]             ` <20150917214450.GE30437-4kKid1p5UN4xFjuZnxJpBp3lxR28IOakuDuwTybUTCk@public.gmane.org>
  1 sibling, 1 reply; 10+ messages in thread
From: John MACFARLANE @ 2015-09-17 21:44 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Joseph Reagle [Sep 17 15 15:22 ]:
>On 09/17/2015 02:44 PM, John MACFARLANE wrote:
>> I believe this is due to commit
>> 99fe8594d94573b8ba8ec1d1e47b57444de4e4cb
>> which closed #227.  This change causes the raw HTML parser
>> to fail if the HTML5 lexer generates any warnings.
>> The intent is to ensure that things like
>> <www.boe.es/buscar/act.php?id=BOE-A-1996-8930#a66>
>> don't get parsed as HTML tags.  A side-effect is that
>> your iframe tag doesn't get parsed as an HTML tag either.
>> Maybe there is a better solution?
>
>Not sure... I blame YouTube actually, if they are giving us elements to embed in HTML, then it should be well-formed HTML.

PS. I just tried their Embed feature, and it did seem to properly escape the & in the query.



* Re: Using '&' for YouTube embeds in markdown?
       [not found]               ` <20150917214014.GD30437-4kKid1p5UN4xFjuZnxJpBp3lxR28IOakuDuwTybUTCk@public.gmane.org>
@ 2015-09-17 21:59                 ` Daniel Staal
  2015-09-17 22:19                   ` Joseph Reagle
  2015-09-18  1:54                   ` John MACFARLANE
  0 siblings, 2 replies; 10+ messages in thread
From: Daniel Staal @ 2015-09-17 21:59 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

--As of September 17, 2015 2:40:14 PM -0700, John MACFARLANE is alleged to 
have said:

> +++ Daniel Staal [Sep 17 15 15:39 ]:
>> --As of September 17, 2015 3:22:02 PM -0400, Joseph Reagle is alleged
>> to have said:
>>
>>> On 09/17/2015 02:44 PM, John MACFARLANE wrote:
>>>> I believe this is due to commit
>>>> 99fe8594d94573b8ba8ec1d1e47b57444de4e4cb
>>>> which closed #227.  This change causes the raw HTML parser
>>>> to fail if the HTML5 lexer generates any warnings.
>>>> The intent is to ensure that things like
>>>> <www.boe.es/buscar/act.php?id=BOE-A-1996-8930#a66>
>>>> don't get parsed as HTML tags.  A side-effect is that
>>>> your iframe tag doesn't get parsed as an HTML tag either.
>>>> Maybe there is a better solution?
>>>
>>> Not sure... I blame YouTube actually, if they are giving us elements to
>>> embed in HTML, then it should be well-formed HTML.
>>>
>>> Would it be possible/acceptable for pandoc to find such errors and fix
>>> them in the output?
>>
>> --As for the rest, it is mine.
>>
>> That was well-formed HTML...  Although admittedly some of it's not
>> valid HTML5.  But even then it's only one attribute.  Try this:
>
> The problem is not the attribute, but the unescaped
> ampersand.

--As for the rest, it is mine.

Ampersands are specifically mentioned in RFC's 3986 and 2396 as delimiter 
characters in URLs, and the URL is quoted specifically so that the standard 
HTML escaping rules don't apply.  (In fact, encoding them should change the 
URL.  If it works with them encoded it's because YouTube is adjusting for 
broken code.)  It should *not* be escaped in this context.

I thought the problem being brought up was that Pandoc was escaping it!

Daniel T. Staal




* Re: Using '&' for YouTube embeds in markdown?
       [not found]             ` <20150917214450.GE30437-4kKid1p5UN4xFjuZnxJpBp3lxR28IOakuDuwTybUTCk@public.gmane.org>
@ 2015-09-17 22:01               ` Joseph Reagle
  0 siblings, 0 replies; 10+ messages in thread
From: Joseph Reagle @ 2015-09-17 22:01 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 09/17/2015 05:44 PM, John MACFARLANE wrote:
> PS. I just tried their Embed feature, and it did seem to properly
> escape the & in the query.

Apparently I ran into this because my embeds are old. There was a time when the embed URLs YouTube gave out didn't have 'https:' at the start of the src attribute, and start/end on embeds worked differently. They don't even seem to offer start/end for embeds in the GUI anymore, so I'll have to add them myself. Accordingly, I'll make sure to add the parameters properly encoded, and I'll update my old embed elements as I encounter them.
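
For updating old embeds in bulk, something like the following Python sketch might do. The function name `escape_iframe_ampersands` and both regular expressions are my own assumptions for illustration; the iframe pattern assumes no '>' appears inside an attribute value. It only touches opening iframe tags and skips ampersands that already begin a character reference, so running it twice does not double-escape.

```python
import re

# Matches an '&' that does not already begin a character reference.
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")
# Matches an opening iframe tag (assumes no '>' inside attribute values).
IFRAME_TAG = re.compile(r"<iframe\b[^>]*>", re.IGNORECASE)

def escape_iframe_ampersands(markdown: str) -> str:
    """Replace bare '&' with '&amp;' inside <iframe ...> tags only."""
    return IFRAME_TAG.sub(lambda m: BARE_AMP.sub("&amp;", m.group(0)), markdown)

slide = (
    "## Marshmallow study\n\n"
    '<iframe width="640" height="480" '
    'src="//www.youtube.com/embed/Wio6Ue2-O_4?start=0&end=79" '
    'frameborder="0" allowfullscreen></iframe>\n'
)
print(escape_iframe_ampersands(slide))
```

Text outside iframe tags is left alone, so an '&' in ordinary prose is not rewritten.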




* Re: Using '&' for YouTube embeds in markdown?
  2015-09-17 21:59                 ` Daniel Staal
@ 2015-09-17 22:19                   ` Joseph Reagle
  2015-09-18  1:54                   ` John MACFARLANE
  1 sibling, 0 replies; 10+ messages in thread
From: Joseph Reagle @ 2015-09-17 22:19 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

On 09/17/2015 05:59 PM, Daniel Staal wrote:
> Ampersands are specifically mentioned in RFC's 3986 and 2396 as
> delimiter characters in URLs, and the URL is quoted specifically so
> that the standard HTML escaping rules don't apply.  (In fact,
> encoding them should change the URL.  If it works with them encoded
> it's because YouTube is adjusting for broken code.)  It should *not*
> be escaped in this context.

Daniel, I'm not following you. This is now moot and digressive, but '&' does need to be escaped in text/html, even when it appears in a quoted attribute value. See this example [1].

[1]: https://validator.w3.org/nu/?showsource=yes&doc=http%3A%2F%2Freagle.org%2Fjoseph%2Ftmp%2Ftest.html



* Re: Using '&' for YouTube embeds in markdown?
  2015-09-17 21:59                 ` Daniel Staal
  2015-09-17 22:19                   ` Joseph Reagle
@ 2015-09-18  1:54                   ` John MACFARLANE
  1 sibling, 0 replies; 10+ messages in thread
From: John MACFARLANE @ 2015-09-18  1:54 UTC (permalink / raw)
  To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw

+++ Daniel Staal [Sep 17 15 17:59 ]:
>
>Ampersands are specifically mentioned in RFC's 3986 and 2396 as 
>delimiter characters in URLs, and the URL is quoted specifically so 
>that the standard HTML escaping rules don't apply.  (In fact, encoding 
>them should change the URL.  If it works with them encoded it's 
>because YouTube is adjusting for broken code.)  It should *not* be 
>escaped in this context.

That's right - ampersands are allowed and have a special
role in URLs.  But when URLs are represented in HTML, the
ampersands must be escaped (because of the rules of HTML,
not of URLs).  See e.g.
http://www.htmlhelp.com/tools/validator/problems.html
under "Ampersands in URLs".
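
A small Python sketch of the distinction, using only the standard library (`html.escape` and `html.parser.HTMLParser`); the class name `SrcCollector` is just an illustrative helper. Escaping changes how the URL is written in the HTML source, not the URL itself: an HTML parser decodes `&amp;` back to `&` when it reads the attribute.

```python
from html import escape
from html.parser import HTMLParser

url = "//www.youtube.com/embed/Wio6Ue2-O_4?start=0&end=79"

# Escape the URL for use inside a double-quoted HTML attribute.
tag = '<iframe src="{}"></iframe>'.format(escape(url, quote=True))
print(tag)  # the src now contains start=0&amp;end=79

class SrcCollector(HTMLParser):
    """Collect the decoded src attribute of every iframe start tag."""

    def __init__(self):
        super().__init__()
        self.srcs = []

    def handle_starttag(self, tagname, attrs):
        if tagname == "iframe":
            self.srcs.extend(v for k, v in attrs if k == "src")

collector = SrcCollector()
collector.feed(tag)
collector.close()

# The parser hands back the original URL, with a plain '&'.
print(collector.srcs[0] == url)  # True
```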



end of thread, other threads:[~2015-09-18  1:54 UTC | newest]

Thread overview: 10+ messages
2015-09-17 13:36 Using '&' for YouTube embeds in markdown? Joseph Reagle
     [not found] ` <55FAC1F6.4000803-T1oY19WcHSwdnm+yROfE0A@public.gmane.org>
2015-09-17 18:44   ` John MACFARLANE
     [not found]     ` <20150917184454.GC21127-4kKid1p5UN4xFjuZnxJpBp3lxR28IOakuDuwTybUTCk@public.gmane.org>
2015-09-17 19:22       ` Joseph Reagle
     [not found]         ` <55FB12DA.8060606-T1oY19WcHSwdnm+yROfE0A@public.gmane.org>
2015-09-17 19:39           ` Daniel Staal
2015-09-17 21:40             ` John MACFARLANE
     [not found]               ` <20150917214014.GD30437-4kKid1p5UN4xFjuZnxJpBp3lxR28IOakuDuwTybUTCk@public.gmane.org>
2015-09-17 21:59                 ` Daniel Staal
2015-09-17 22:19                   ` Joseph Reagle
2015-09-18  1:54                   ` John MACFARLANE
2015-09-17 21:44           ` John MACFARLANE
     [not found]             ` <20150917214450.GE30437-4kKid1p5UN4xFjuZnxJpBp3lxR28IOakuDuwTybUTCk@public.gmane.org>
2015-09-17 22:01               ` Joseph Reagle
