* Ignore link attributes and always match a hyperlink or image
@ 2023-10-19 5:35 Kevin Keegan
[not found] ` <1fa1b803-eced-48d5-b96d-153068eacd2bn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
0 siblings, 1 reply; 3+ messages in thread
From: Kevin Keegan @ 2023-10-19 5:35 UTC (permalink / raw)
To: pandoc-discuss
I am trying to convert some naive HTML snippets to Markdown. Everything
works great except for one strange behaviour, and I am curious to know whether
I am missing something in pandoc or need to fix it myself.
Given this HTML snippet:
```
<p>Lorem <a href="#">ipsum</a> dolor <a href="#" class="a">sit</a> amet.</p>
```
With the `link_attributes` extension, pandoc returns:
```
$ printf '<p>Lorem <a href="#">ipsum</a> dolor <a href="#" class="a">sit</a> amet.</p>' | pandoc --from html --to markdown_strict+link_attributes
Lorem [ipsum](#) dolor [sit](#){.a} amet.
```
Without that extension, pandoc returns:
```
$ printf '<p>Lorem <a href="#">ipsum</a> dolor <a href="#" class="a">sit</a> amet.</p>' | pandoc --from html --to markdown_strict
Lorem [ipsum](#) dolor <a href="#" class="a">sit</a> amet.
```
I was wondering whether there is a way, without the `link_attributes`
extension, to still convert a hyperlink that has extra attributes into a
Markdown link, ignoring those attributes. The desired result would be:
```
$ printf '<p>Lorem <a href="#">ipsum</a> dolor <a href="#" class="a">sit</a> amet.</p>' | pandoc --from html --to markdown_strict
Lorem [ipsum](#) dolor [sit](#) amet.
```
Thank you.
--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/1fa1b803-eced-48d5-b96d-153068eacd2bn%40googlegroups.com.
* Re: Ignore link attributes and always match a hyperlink or image
[not found] ` <1fa1b803-eced-48d5-b96d-153068eacd2bn-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
@ 2023-10-19 6:01 ` John MacFarlane
[not found] ` <3BE27726-13AE-4F51-8BB9-E729A21A62B8-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
0 siblings, 1 reply; 3+ messages in thread
From: John MacFarlane @ 2023-10-19 6:01 UTC (permalink / raw)
To: pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw
You can try disabling raw_html: -t markdown_strict-raw_html
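
A minimal sketch of that suggestion applied to the snippet from the original message, assuming pandoc is installed and on the PATH. The variable names `html` and `out` are only for illustration and do not come from the thread; the result should match the desired output quoted in the question.

```shell
# Sample input from the original message.
html='<p>Lorem <a href="#">ipsum</a> dolor <a href="#" class="a">sit</a> amet.</p>'

# Disable the raw_html extension on the writer so that links carrying
# extra attributes are written as ordinary Markdown links.
out=$(printf '%s' "$html" | pandoc --from html --to markdown_strict-raw_html)

printf '%s\n' "$out"
```

Disabling `raw_html` applies to the whole output, so any other element that pandoc would otherwise write as raw HTML is affected as well.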
* Re: Ignore link attributes and always match a hyperlink or image
[not found] ` <3BE27726-13AE-4F51-8BB9-E729A21A62B8-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
@ 2023-10-19 6:30 ` Kevin Keegan
0 siblings, 0 replies; 3+ messages in thread
From: Kevin Keegan @ 2023-10-19 6:30 UTC (permalink / raw)
To: pandoc-discuss
Thanks, I didn't expect that from reading the `raw_html` documentation.
On Thursday, October 19, 2023 at 8:02:08 AM UTC+2 John MacFarlane wrote:
> You can try disabling raw_html: -t markdown_strict-raw_html