* Making a filter to convert ruby characters from HTML to ConTeXt
@ 2019-08-08 16:26 Patrick Kenny
From: Patrick Kenny @ 2019-08-08 16:26 UTC
To: pandoc-discuss

I'm trying to write a filter (in panflute) to convert ruby characters
<https://en.wikipedia.org/wiki/Ruby_character> from HTML to ConTeXt.

Input HTML:

<p>This is an example: <ruby>例<rt>レイ</rt></ruby></p>

After conversion (what I want my output to look like):

This is an example: \ruby{例}{レイ}

What's a good way to approach this kind of conversion? I don't know how to
target the ruby tags.

-- 
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/2e8d8fde-b107-41d3-ad59-bc249f8f0ae8%40googlegroups.com.

* Re: Making a filter to convert ruby characters from HTML to ConTeXt
@ 2019-08-08 17:49 John MacFarlane
From: John MacFarlane @ 2019-08-08 17:49 UTC
To: Patrick Kenny, pandoc-discuss

This will show you how pandoc parses this content:

% pandoc -t native
<ruby>aa<rt>bb</rt></ruby>
^D
[Para [RawInline (Format "html") "<ruby>",Str "aa",RawInline (Format "html") "<rt>",Str "bb",RawInline (Format "html") "</rt>",RawInline (Format "html") "</ruby>"]]

So now you know what kind of structure you'll have to
intercept and deal with in your filter. Does that help?
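
For readers who want to see the same structure from inside a filter: pandoc hands filters the AST as JSON, where each element is an object with a type tag "t" and contents "c". The sketch below uses only the Python standard library and walks such a tree to list the raw HTML inlines. The PARA literal is a hand-written JSON rendering of the native output above (an assumption about how that Para serializes, not output captured from pandoc), and raw_html_tags is a hypothetical helper name.

```python
import json

# Hand-written JSON form of the Para shown in the native output above
# (assumed to match what `pandoc -t json` would emit for that paragraph).
PARA = json.loads("""
{"t": "Para", "c": [
  {"t": "RawInline", "c": ["html", "<ruby>"]},
  {"t": "Str", "c": "aa"},
  {"t": "RawInline", "c": ["html", "<rt>"]},
  {"t": "Str", "c": "bb"},
  {"t": "RawInline", "c": ["html", "</rt>"]},
  {"t": "RawInline", "c": ["html", "</ruby>"]}
]}
""")


def raw_html_tags(node):
    """Yield the text of every raw HTML inline found anywhere under node."""
    if isinstance(node, dict):
        if node.get("t") == "RawInline" and node["c"][0] == "html":
            yield node["c"][1]
        else:
            # Not a raw HTML inline: keep searching inside its fields.
            for value in node.values():
                yield from raw_html_tags(value)
    elif isinstance(node, list):
        for item in node:
            yield from raw_html_tags(item)


print(list(raw_html_tags(PARA)))
```

The point is only that the ruby markup arrives as four separate RawInline elements with the base text and the reading as plain Str elements between them, so a filter has to match the tags one by one rather than a single "ruby" node.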

* Re: Making a filter to convert ruby characters from HTML to ConTeXt
@ 2019-08-09 16:52 Patrick Kenny
From: Patrick Kenny @ 2019-08-09 16:52 UTC
To: pandoc-discuss

Thank you, that does help.

I managed to walk through the document and identify the parts I want to
change (confirmed with debugging), but they don't get changed properly (in
the output document, the conversion doesn't occur).

How can I take the HTML and change it to TeX?

def ruby_convert(elem, doc):
    if isinstance(elem, pf.RawInline):
        if elem.text == '<ruby>':
            pf.debug(elem.text)
            elem = pf.RawInline('\\ruby', 'tex')
        elif elem.text == '<rt>':
            pf.debug(elem.text)
            elem = pf.RawInline('}{', 'tex')
        elif elem.text == '</rt>':
            pf.debug(elem.text)
            elem = pf.RawInline('}', 'tex')
        elif elem.text == '</ruby>':
            pf.debug(elem.text)
            # We can delete this because we already processed the end tag in </rt>
            return []

def action(elem, doc):
    if isinstance(elem, pf.Para) and (doc.format == 'context'):
        return elem.walk(ruby_convert)
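
A possible explanation, offered as a guess rather than a confirmed diagnosis: in the filter above, the first three branches rebind the local name elem and then fall off the end of the function, so ruby_convert returns None for them. Panflute treats a None return as "leave the element unchanged", so only the `return []` branch has any effect. Returning the new RawInline from each branch should make the replacements stick. Separately, '\\ruby' has no opening brace, so even with the returns fixed the output would be \ruby例}{レイ} rather than \ruby{例}{レイ}.

The sketch below shows the same replace-by-return logic on pandoc's JSON AST using only the standard library, so it does not depend on panflute. RUBY_TAGS, convert, walk and main are hypothetical names, and the 'tex' raw format is carried over from the filter above.

```python
import json
import sys

# Raw HTML tag -> ConTeXt fragment. None means "drop this element".
RUBY_TAGS = {
    "<ruby>": "\\ruby{",
    "<rt>": "}{",
    "</rt>": "}",
    "</ruby>": None,  # the closing brace was already emitted for </rt>
}


def convert(element):
    """Return the list of elements that should replace one AST element."""
    if element.get("t") == "RawInline":
        fmt, text = element["c"]
        if fmt == "html" and text in RUBY_TAGS:
            fragment = RUBY_TAGS[text]
            if fragment is None:
                return []  # delete the element
            return [{"t": "RawInline", "c": ["tex", fragment]}]
    return [element]  # anything else is kept as it is


def walk(node):
    """Rebuild the AST, splicing the result of convert() into each list."""
    if isinstance(node, list):
        out = []
        for item in node:
            if isinstance(item, dict) and "t" in item:
                out.extend(walk(new) for new in convert(item))
            else:
                out.append(walk(item))
        return out
    if isinstance(node, dict):
        return {key: walk(value) for key, value in node.items()}
    return node


def main():
    """Filter entry point: JSON AST on stdin, target format in argv[1]."""
    doc = json.load(sys.stdin)
    if len(sys.argv) > 1 and sys.argv[1] == "context":
        doc = walk(doc)
    json.dump(doc, sys.stdout)
```

To try it as a filter, call main() at the bottom of the script, make the file executable, and pass it to pandoc with --filter together with -t context, on input that parses into the RawInline structure John showed. The key design point is that convert always returns its replacement instead of assigning to a local variable, which is the same change the panflute version seems to need.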