* Ignore link attributes and always match a hyperlink or image
@ 2023-10-19  5:35 Kevin Keegan
From: Kevin Keegan @ 2023-10-19  5:35 UTC
To: pandoc-discuss

I am trying to convert some naive HTML snippets to Markdown. Everything works great except for one strange behaviour, and I am curious whether I am missing something in pandoc or need to fix it myself.

Given this HTML snippet:
```
<p>Lorem <a href="#">ipsum</a> dolor <a href="#" class="a">sit</a> amet.</p>
```

With the `link_attributes` extension enabled, pandoc returns:
```
$ printf '<p>Lorem <a href="#">ipsum</a> dolor <a href="#" class="a">sit</a> amet.</p>' | pandoc --from html --to markdown_strict+link_attributes
Lorem [ipsum](#) dolor [sit](#){.a} amet.
```

Without it, pandoc returns:
```
$ printf '<p>Lorem <a href="#">ipsum</a> dolor <a href="#" class="a">sit</a> amet.</p>' | pandoc --from html --to markdown_strict
Lorem [ipsum](#) dolor <a href="#" class="a">sit</a> amet.
```

I was wondering whether there is a way, without the `link_attributes` extension, to still convert a hyperlink that has extra attributes into a Markdown link, ignoring those attributes. The desired result would be:
```
$ printf '<p>Lorem <a href="#">ipsum</a> dolor <a href="#" class="a">sit</a> amet.</p>' | pandoc --from html --to markdown_strict
Lorem [ipsum](#) dolor [sit](#) amet.
```

Thank you.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/1fa1b803-eced-48d5-b96d-153068eacd2bn%40googlegroups.com.

* Re: Ignore link attributes and always match a hyperlink or image
@ 2023-10-19  6:01 John MacFarlane
From: John MacFarlane @ 2023-10-19  6:01 UTC
To: pandoc-discuss

You can try disabling raw_html: -t markdown_strict-raw_html
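
For readers following along, the suggestion above can be tried on the snippet from the original question roughly as follows. This is only a sketch: it assumes pandoc is installed and on `PATH`, and the variable names `html`, `to_format`, and `result` are illustrative, not part of pandoc. Going by the follow-up in this thread, the second link should then come out as a plain Markdown link with its `class` attribute dropped.

```shell
#!/bin/sh
# Sketch: apply the raw_html suggestion to the snippet from the question.
# Assumption: pandoc is installed; variable names here are illustrative only.

html='<p>Lorem <a href="#">ipsum</a> dolor <a href="#" class="a">sit</a> amet.</p>'

# markdown_strict with the raw_html extension switched off ("-raw_html"),
# and without link_attributes, as requested in the question.
to_format='markdown_strict-raw_html'

if command -v pandoc >/dev/null 2>&1; then
    # Convert the HTML snippet and keep the Markdown output for inspection.
    result=$(printf '%s' "$html" | pandoc --from html --to "$to_format")
    printf '%s\n' "$result"
else
    result=''
    echo 'pandoc not found; skipping conversion' >&2
fi
```

Since `raw_html` is a format-wide extension, turning it off presumably also affects any other element that pandoc would otherwise fall back to writing as raw HTML, so it is worth checking the output on more complex input.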

* Re: Ignore link attributes and always match a hyperlink or image
@ 2023-10-19  6:30 Kevin Keegan
From: Kevin Keegan @ 2023-10-19  6:30 UTC
To: pandoc-discuss

Thanks, I didn't expect that from reading the `raw_html` documentation.