* Converting everything that’s inside a specific div (including other div) while excluding everything else
From: Butch @ 2020-09-29 20:45 UTC
To: pandoc-discuss

Hello,

I am trying to convert specific parts of an HTML file to Markdown. I want to convert everything that’s inside a specific div (including other div) while excluding everything else. Is that possible?

Here is an example. I want to take this:

    <div class="show">
      <p>This is the outer text.</p>
      <div class="inner">
        <p>This is the inner text.</p>
      </div>
    </div>
    <div class="hide">
      <p>This is the hidden text.</p>
    </div>

And convert it so I have this:

    ::: {.show}
    This is the outer text.

    ::: {.inner}
    This is the inner text.
    :::
    :::

I.e., I want to convert everything that’s inside <div class="show"> (including other div) and to exclude everything else in the document.

If I use a filter like this:

    function Div(el)
      if el.classes[1] == "show" then
        return el
      else
        return {}
      end
    end

The resulting Markdown will be:

    ::: {.show}
    This is the outer text.
    :::

Which is kind of expected. So what can I do to include in the conversion not only <div class="show">, but also all the other div inside it?

The actual HTML files I want to convert are very large, so I can’t list all the classes I want to include (or exclude from) in the conversion.

Thanks in advance.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/ee79a1ca-efb1-463c-ace9-5398c8e623e3n%40googlegroups.com.
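
A likely reason for the output above, assuming pandoc's default traversal order (nested elements are handled before the elements that contain them): the inner div is visited first, fails the class test, and is deleted before the outer div is reached. The following is only a diagnostic sketch for observing that order; it logs to stderr and changes nothing in the document.

```lua
-- Diagnostic sketch: log each Div as the filter visits it.
-- The function returns nothing, which leaves the element unchanged.
function Div (el)
  io.stderr:write('visiting Div: ', table.concat(el.classes, ' '), '\n')
end
```

Running the example HTML through this filter should make the visiting order visible without altering the converted Markdown.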
* Re: Converting everything that’s inside a specific div (including other div) while excluding everything else
From: John MacFarlane @ 2020-09-29 22:41 UTC
To: Butch, pandoc-discuss

This is the sort of thing that is currently a bit tricky with our filter architecture.

One idea might be to do several passes (i.e., several filters, which you can include in the same Lua file; see the docs).

In the first pass, you'd set a special attribute keep="false" on all Divs.

In the second pass, you'd match Divs with the 'show' class, and use walk_block to set keep="true" on all Divs inside it. You'd also set keep="true" on it.

In the third pass, you'd match Divs and remove them if keep="false".

I think something like this could work.
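
The three passes described above might be sketched as follows. This is an illustration under stated assumptions, not code from the thread: the function names (mark_all, mark_shown, prune) are hypothetical, and the helper attribute keep is left on the surviving Divs, so it would show up in the Markdown output unless removed in a further step. The sketch also only removes Divs; other top-level blocks are left in place.

```lua
-- Pass 1: mark every Div as "not kept".
local function mark_all (div)
  div.attributes.keep = 'false'
  return div
end

-- Pass 2: for each Div with the 'show' class, mark the Div itself
-- and every Div nested inside it as "kept".
local function mark_shown (div)
  if div.classes:includes('show') then
    -- walk_block applies the filter to the contents of the block.
    local marked = pandoc.walk_block(div, {
      Div = function (inner)
        inner.attributes.keep = 'true'
        return inner
      end
    })
    marked.attributes.keep = 'true'
    return marked
  end
end

-- Pass 3: delete every Div that is still marked "not kept".
local function prune (div)
  if div.attributes.keep == 'false' then
    return {}
  end
end

-- The filters in this list are applied one after another.
return {
  {Div = mark_all},
  {Div = mark_shown},
  {Div = prune}
}
```

One consequence of this design: a 'show' Div nested inside a Div that is not kept disappears together with its parent.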
* Re: Converting everything that’s inside a specific div (including other div) while excluding everything else
From: Butch @ 2020-09-30 5:11 UTC
To: pandoc-discuss

Thanks for the response. Yeah, I’ll either do what you suggested or use some external filters before passing the files through Pandoc.
* Re: Converting everything that’s inside a specific div (including other div) while excluding everything else
From: Albert Krewinkel @ 2020-09-30 8:49 UTC
To: pandoc-discuss

Maybe there's another way:

1. Collect all divs in the order that pandoc sees them.
2. In a second traversal, check whether we want to keep the div. If so, use the div that we stored before, as it will still contain all children.

Here's the code:

    local divs = pandoc.List()
    local div_index = 0

    -- First pass: store every Div, children intact, in visiting order.
    function collect (d)
      divs:insert(d)
    end

    -- Second pass: Divs are visited in the same order, so div_index
    -- selects the stored copy of the current Div.
    function filter (div)
      div_index = div_index + 1
      if div.classes[1] == 'show' then
        return divs[div_index]
      else
        return {}
      end
    end

    return {
      {Div = collect},
      {Div = filter}
    }

-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124
* Re: Converting everything that’s inside a specific div (including other div) while excluding everything else
From: John MacFarlane @ 2020-09-30 19:26 UTC
To: Albert Krewinkel, pandoc-discuss

Good idea -- I agree, that's probably a better approach!
* Re: Converting everything that’s inside a specific div (including other div) while excluding everything else
From: Butch @ 2020-10-01 5:24 UTC
To: pandoc-discuss

Thanks for your help, Albert.

This filter seems to work only if <div class="show"> is not inside another block. Would it be possible to change it to work with all the divs in the document? In my example, how could I keep only <div class="inner">?

In addition to that, I'd need to exclude all other elements outside the divs I want to keep, even if they are not inside a div. For example, a <p>Paragraph</p> that is not inside any div should be excluded. Would that be feasible?
* Re: Converting everything that’s inside a specific div (including other div) while excluding everything else
From: Albert Krewinkel @ 2020-10-02 17:43 UTC
To: pandoc-discuss

Butch writes:

> Thanks for your help, Albert.
>
> This filter seems to work only if <div class="show"> is not inside another
> block. Would it be possible to change it to work with all the divs in the
> document? In my example, how could I keep only <div class="inner">?

Taken by itself, that task is really difficult to achieve, but...

> In addition to that, I'd need to exclude all other elements outside the
> divs I want to keep, even if they are not inside a div. For example, a
> <p>Paragraph</p> that is not inside any div should be excluded. Would that
> be feasible?

with this additional requirement it becomes easy again: we can collect all Divs which we'd like to keep, then we replace the document with the list of collected divs:

    local keep = pandoc.List()

    function Div (div)
      if div.classes[1] == 'show' then
        -- Remember the div; returning nothing leaves it unchanged.
        keep:insert(div)
      end
    end

    function Pandoc (doc)
      -- Replace the whole document body with the collected divs.
      doc.blocks = keep
      return doc
    end

The limitation here is that we assume that "show" divs will not be nested; nested "show" divs would be included twice.

Cheers

-- 
Albert Krewinkel
GPG: 8eed e3e2 e8c5 6f18 81fe e836 388d c0b2 1f63 1124
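
The nested-div limitation mentioned above could possibly be avoided with a top-down traversal. The sketch below assumes pandoc 2.17 or later, a release newer than this thread, in which a filter can set traverse = 'topdown' and stop descending into an element by returning false as a second value. The table names collect and replace are hypothetical. Collection and replacement are kept in two separate filters so that the result does not depend on when the Pandoc function runs relative to the Div function.

```lua
local keep = pandoc.List()

-- Filter 1: visit Divs from the root towards the leaves. Once a 'show'
-- Div is found, store it and do not descend into it, so a nested
-- 'show' Div is kept only as part of its parent.
local collect = {
  traverse = 'topdown',
  Div = function (div)
    if div.classes:includes('show') then
      keep:insert(div)
      return div, false  -- false: skip the children of this Div
    end
  end
}

-- Filter 2: replace the document body with the collected Divs.
local replace = {
  Pandoc = function (doc)
    doc.blocks = keep
    return doc
  end
}

return { collect, replace }
```

Unlike the filter in the message above, this version matches 'show' anywhere in the class list rather than only in first position; that is a choice made for the sketch, not something the thread asked for.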
* Re: Converting everything that’s inside a specific div (including other div) while excluding everything else
From: Butch @ 2020-10-03 5:29 UTC
To: pandoc-discuss

Thanks a lot, Albert! That worked great for my purposes.