* Is there a way to change the way Pandoc parses HTML inside of markdown documents?
From: pompez @ 2021-08-16 21:43 UTC
To: pandoc-discuss

I'm starting out with Lua filters and apologize if this question has already been answered. You can also read this question on StackOverflow
<https://stackoverflow.com/questions/68809527/is-there-a-way-to-change-the-way-pandoc-parses-html-inside-of-markdown-documents>.

I'm using Pandoc to convert markdown to HTML. My markdown files also contain some raw HTML. In the examples, I'll be using `<mark>` and `<u>`.

Let's say I want to change every `<mark>` to a `<u>` tag. We parse the input as HTML and look at the AST.

```
$ echo '<u>foo</u> & <mark>bar</mark>' | pandoc --from=html --to native
[Plain [Underline [Str "foo"],Space,Str "&",Space,Span ("", ["mark"],[]) [Str "bar"]]]
```

On this structure, we can use a simple filter which replaces the `Span` elements representing the `<mark>` tag with `Underline` elements.

```
function Span(elem)
  -- `classes` is a pandoc List; `includes` checks whether "mark" is among them.
  if elem.classes:includes('mark') then
    return pandoc.Underline(elem.content)
  end
end
```

```
[Plain [Underline [Str "foo"],Space,Str "&",Space,Underline [Str "bar"]]]
```

This is good. But if we parse the same input as markdown, we get a much less convenient structure.

```
$ echo '<u>foo</u> & <mark>bar</mark>' | pandoc --from=markdown+raw_html --to native
[Para [RawInline (Format "html") "<u>",Str "foo",RawInline (Format "html") "</u>",Space,Str "&",Space,RawInline (Format "html") "<mark>",Str "bar",RawInline (Format "html") "</mark>"]]
```

And if we had some additional criteria by which to replace `<mark>` with `<u>` (the content, for example), we would have to identify the opening and closing `RawInline` elements.

I'm wondering if there are any good solutions to this problem. Is there a way to parse HTML in markdown just as HTML would be parsed otherwise? Or is there a way to solve this in a Lua filter without writing some parsing code?

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/aae29ca7-60ca-4349-af03-939f0ac503efn%40googlegroups.com.
* Re: Is there a way to change the way Pandoc parses HTML inside of markdown documents?
From: John MacFarlane @ 2021-08-16 22:08 UTC
To: pompez, pandoc-discuss

I'm afraid you'll have to write some parsing code...
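For readers wondering what such parsing code might look like, here is a minimal sketch, not a definitive implementation. It assumes the tags arrive exactly as `<mark>` and `</mark>` (no attributes, no nesting) and that the installed pandoc supports the `Inlines` filter function. The helper name `is_html_tag` and the constants `OPEN` and `CLOSE` are hypothetical; only `pandoc.List`, `pandoc.Underline`, and the `Inlines` hook come from pandoc.

```lua
-- Sketch: pair <mark> ... </mark> RawInline elements and wrap what lies
-- between them in an Underline. Assumes tags without attributes or nesting.
local OPEN, CLOSE = '<mark>', '</mark>'   -- hypothetical constants

-- Hypothetical helper: true if `inline` is the raw HTML tag `text`.
local function is_html_tag(inline, text)
  return inline.t == 'RawInline'
    and inline.format == 'html'
    and inline.text == text
end

function Inlines(inlines)
  local result = pandoc.List:new()
  local buffer = nil   -- collects inlines seen after an opening <mark>

  for _, inline in ipairs(inlines) do
    if buffer == nil and is_html_tag(inline, OPEN) then
      buffer = pandoc.List:new()            -- start collecting
    elseif buffer ~= nil and is_html_tag(inline, CLOSE) then
      result:insert(pandoc.Underline(buffer))
      buffer = nil                          -- pair complete
    elseif buffer ~= nil then
      buffer:insert(inline)                 -- inside a <mark> pair
    else
      result:insert(inline)                 -- outside any pair
    end
  end

  if buffer ~= nil then
    return nil   -- unmatched opening tag: leave the list unchanged
  end
  return result
end
```

Because the content between the tags is available as `buffer` before the `Underline` is built, an additional criterion on the content could be checked at that point.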
* Re: Is there a way to change the way Pandoc parses HTML inside of markdown documents?
From: pompez @ 2021-08-16 22:55 UTC
To: pandoc-discuss

That's okay. Just wanted to know beforehand. Thanks.

On Tuesday, August 17, 2021 at 12:09:15 AM UTC+2 John MacFarlane wrote:

> I'm afraid you'll have to write some parsing code...
* Re: Is there a way to change the way Pandoc parses HTML inside of markdown documents?
From: William Lupton @ 2021-08-17 10:37 UTC
To: pandoc-discuss
Cc: pompez

Could pandoc.read(markup, "html") <https://pandoc.org/lua-filters.html#pandoc.read> help?

On Mon, 16 Aug 2021 at 23:09, John MacFarlane <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org> wrote:

> I'm afraid you'll have to write some parsing code...
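One possible way to apply the `pandoc.read` suggestion is sketched below, under stated assumptions. Since each `RawInline` holds only a single tag, the sketch first serializes the whole paragraph to an HTML string and then re-reads that string with the HTML reader. This relies on `pandoc.write`, which is assumed to be available (it was added in pandoc versions later than the one current when this thread was written). The helper name `reparse_as_html` is hypothetical. The round trip can be lossy for elements that HTML does not represent directly, so treat it as an illustration only.

```lua
-- Sketch: re-parse a paragraph's inlines with the HTML reader.
-- Assumes pandoc.write exists in the installed pandoc version.

-- Hypothetical helper: returns re-parsed inlines, or nil on surprises.
local function reparse_as_html(inlines)
  -- Serialize the inlines (raw HTML tags are passed through verbatim) ...
  local html = pandoc.write(pandoc.Pandoc({ pandoc.Plain(inlines) }), 'html')
  -- ... and parse the resulting string as HTML.
  local doc = pandoc.read(html, 'html')
  local first = doc.blocks[1]
  if #doc.blocks == 1 and (first.t == 'Plain' or first.t == 'Para') then
    return first.content
  end
  return nil   -- unexpected structure: caller keeps the original
end

function Para(el)
  -- Only touch paragraphs that actually contain raw HTML inlines.
  local has_raw = false
  for _, inline in ipairs(el.content) do
    if inline.t == 'RawInline' and inline.format == 'html' then
      has_raw = true
      break
    end
  end
  if not has_raw then
    return nil
  end

  local content = reparse_as_html(el.content)
  if content then
    el.content = content
    return el
  end
  return nil
end
```

After this step the `<mark>` content should appear as a `Span` with class `mark`, as in the `--from=html` output shown in the original question. A `Span` function meant to act on the result would need to run in a later filter pass, for example by returning two filter tables from the Lua file.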
* Re: Is there a way to change the way Pandoc parses HTML inside of markdown documents?
From: Bastien DUMONT @ 2021-08-17 11:24 UTC
To: pandoc-discuss

> On this structure, we can use a simple filter which replaces the `Span`
> elements representing the `<mark>` tag with `Underline` elements.
>
> ```
> function Span(elem)
>   if elem.classes:includes('mark') then
>     return pandoc.Underline(elem.content)
>   end
> end
> ```

To apply the same code to a Markdown input file, you can use inline spans like this: `[foo]{.underline} & [bar]{.mark}`.

On Tuesday 17 August 2021 at 11:37:21AM, William Lupton wrote:

> Could pandoc.read(markup, "html") <https://pandoc.org/lua-filters.html#pandoc.read> help?
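With bracketed spans in the Markdown source, the content-based criterion mentioned in the original question becomes easy to express, because the span's content is already part of the element. The following is a minimal sketch; the rule "content starts with `bar`" is a made-up example criterion, not something from the thread.

```lua
-- Sketch: replace a "mark" span with Underline only if its text content
-- satisfies an extra (hypothetical) criterion.
local stringify = pandoc.utils.stringify

function Span(elem)
  local is_mark = elem.classes:includes('mark')
  -- Hypothetical criterion: the plain-text content starts with "bar".
  local matches = stringify(elem):match('^bar') ~= nil
  if is_mark and matches then
    return pandoc.Underline(elem.content)
  end
  return nil   -- leave every other span untouched
end
```

A filter like this would be passed to pandoc with `--lua-filter`, reading the input as Markdown.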
* Re: Is there a way to change the way Pandoc parses HTML inside of markdown documents?
From: pompez @ 2021-08-24 8:44 UTC
To: pandoc-discuss

Sorry for the late reply. In my case, I'd still like to recognize the contents inside the block.

On Tuesday, August 17, 2021 at 12:37:37 PM UTC+2 William Lupton wrote:

> Could pandoc.read(markup, "html") <https://pandoc.org/lua-filters.html#pandoc.read> help?