public inbox archive for pandoc-discuss@googlegroups.com
* Links with attributes lost in conversion from HTML to markdown
@ 2021-11-23 10:20 christophe dervieux
From: christophe dervieux @ 2021-11-23 10:20 UTC (permalink / raw)
  To: pandoc-discuss





I tried something like this but was surprised by the result of the conversion:

❯ pandoc -t markdown -f html
<a  href="https://CRAN.R-project.org/banner.shtml#submitting"  target="_blank">https://CRAN.R-project.org/banner.shtml#submitting</a>.
^Z
<https://CRAN.R-project.org/banner.shtml#submitting>.

The target attribute is lost in the conversion.

Is there an option or extension that controls this? Is this expected?

Thank you.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/a953f7f6-ef94-4ffe-bb39-7cb1c283eae5n%40googlegroups.com.


* Re: Links with attributes lost in conversion from HTML to markdown
@ 2021-11-23 10:53   ` Albert Krewinkel
From: Albert Krewinkel @ 2021-11-23 10:53 UTC (permalink / raw)
  To: pandoc-discuss

This was fixed recently. Can you try with pandoc 2.16.2?



-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe  e836 388d c0b2 1f63 1124


* Re: Links with attributes lost in conversion from HTML to markdown
@ 2021-11-24 11:42       ` christophe dervieux
From: christophe dervieux @ 2021-11-24 11:42 UTC (permalink / raw)
  To: pandoc-discuss



I get this behavior with 2.16.2 and with the nightly version too.


* Re: Links with attributes lost in conversion from HTML to markdown
@ 2021-11-24 11:53           ` christophe dervieux
From: christophe dervieux @ 2021-11-24 11:53 UTC (permalink / raw)
  To: pandoc-discuss





I believe the supposed fix was https://github.com/jgm/pandoc/issues/7692

The examples there work, but they go from the Markdown reader to the Markdown writer:

echo "[https://example.com](https://example.com){.clz}" | pandoc --to markdown
[https://example.com](https://example.com){.clz}

This works with HTML too:

echo "<a href='https://example.com' class='clz'>https://example.com</a>" | pandoc --to markdown --from html
[https://example.com](https://example.com){.clz}

The issue here is specific to certain HTML attributes, even valid ones
(like target on <a> tags):

echo "<a href='https://example.com' target='_blank'>https://example.com</a>" | pandoc --to markdown --from html
<https://example.com>

This does not happen from Markdown to Markdown:

echo "[https://example.com](https://example.com){target='_blank'}" | pandoc --to markdown
[https://example.com](https://example.com){target="_blank"}

So I guess this is something in the reader regarding dropping HTML 
attributes 
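
One way to check that guess, as a rough sketch rather than a definitive diagnosis: print pandoc's native AST for the same HTML input. If target is already missing from the Link's attribute list there, it was dropped by the HTML reader before the Markdown writer was involved. The helper name has_target and the html variable are made up for illustration; --to native is pandoc's standard AST output format, and pandoc is assumed to be on the PATH.

```shell
# Classify converted Markdown read from stdin: does a target= attribute
# survive? (has_target is a hypothetical helper name, not part of pandoc.)
has_target() {
  if grep -q 'target='; then
    echo kept
  else
    echo lost
  fi
}

# Sample input taken from the example above.
html="<a href='https://example.com' target='_blank'>https://example.com</a>"

# Only call pandoc when it is installed, so the helper can be sourced anywhere.
if command -v pandoc >/dev/null 2>&1; then
  # Step 1: what the HTML reader produced, shown as pandoc's native AST.
  printf '%s\n' "$html" | pandoc --from html --to native
  # Step 2: whether the Markdown writer's output still carries the attribute.
  printf '%s\n' "$html" | pandoc --from html --to markdown | has_target
fi
```

Comparing step 1 against the same command run with --from markdown on the {target='_blank'} example would show whether the two readers build different Link attributes.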

Does that make sense ? 

I’ll open an issue

* Re: Links with attributes lost in conversion from HTML to markdown
@ 2021-11-24 12:02               ` christophe dervieux
From: christophe dervieux @ 2021-11-24 12:02 UTC (permalink / raw)
  To: pandoc-discuss



In fact, this is probably this issue: https://github.com/jgm/pandoc/issues/6970

So this is not fixed yet.

