public inbox archive for pandoc-discuss@googlegroups.com
From: Patrick Kenny <ptmkenny-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: pandoc-discuss <pandoc-discuss-/JYPxA39Uh5TLH3MbocFFw@public.gmane.org>
Subject: Re: Making a filter to convert ruby characters from HTML to ConTeXt
Date: Fri, 9 Aug 2019 09:52:42 -0700 (PDT)
Message-ID: <4121b894-b876-48cd-b5d1-1d110f5c98bc@googlegroups.com>
In-Reply-To: <yh480kh86rtum3.fsf-pgq/RBwaQ+zq8tPRBa0AtqxOck334EZe@public.gmane.org>


Thank you, that does help.

I managed to walk through the document and identify the elements I want to
change (confirmed with debug output), but the replacements never show up in
the output document.

How can I take the HTML and change it to TeX?

def ruby_convert(elem, doc):
    if isinstance(elem, pf.RawInline):
        if elem.text == '<ruby>':
            pf.debug(elem.text)
            elem = pf.RawInline('\\ruby', 'tex')
        elif elem.text == '<rt>':
            pf.debug(elem.text)
            elem = pf.RawInline('}{', 'tex')
        elif elem.text == '</rt>':
            pf.debug(elem.text)
            elem = pf.RawInline('}', 'tex')
        elif elem.text == '</ruby>':
            pf.debug(elem.text)
            # We can delete this because we already processed the end tag in </rt>
            return []

def action(elem, doc):
    if isinstance(elem, pf.Para) and (doc.format == 'context'):
        return elem.walk(ruby_convert)
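
One likely reason the output is unchanged: assigning to `elem` inside
`ruby_convert` only rebinds the local name, and panflute replaces a node only
with what the action returns (returning None keeps the element, returning []
deletes it). Below is a minimal sketch of a version that returns its
replacements, assuming the tags arrive as separate RawInline elements exactly
as in the native output quoted below. `TAG_TO_CONTEXT` and `context_for_tag`
are hypothetical helper names, not part of panflute, and the opening fragment
is written as `\ruby{` (with the brace) so that the pieces join into
`\ruby{例}{レイ}`. The try/except around the import is only there so the
mapping can be checked without panflute installed.

```python
try:
    import panflute as pf
except ImportError:
    pf = None  # the mapping below can still be checked without panflute

# Hypothetical helper table: raw HTML tag -> ConTeXt fragment.
TAG_TO_CONTEXT = {
    '<ruby>': '\\ruby{',   # open the base-text argument
    '<rt>': '}{',          # close the base text, open the ruby text
    '</rt>': '}',          # close the ruby text
    '</ruby>': '',         # nothing left to emit; the element is dropped
}


def context_for_tag(html_tag):
    """Return the ConTeXt fragment for a ruby-related tag, or None."""
    return TAG_TO_CONTEXT.get(html_tag)


def ruby_convert(elem, doc):
    """panflute action: swap ruby-related raw HTML inlines for raw TeX."""
    if doc.format != 'context':
        return None  # leave other output formats untouched
    if isinstance(elem, pf.RawInline) and elem.format == 'html':
        fragment = context_for_tag(elem.text)
        if fragment is None:
            return None  # some other raw HTML; keep it as is
        if fragment == '':
            return []    # an empty list deletes the element
        # Returning the new element is what actually replaces the old one.
        return pf.RawInline(fragment, format='tex')
    return None
```

Because this action checks `doc.format` itself, it should not need the
per-Para `walk` wrapper and can be passed to panflute's `run_filter` like any
other action.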


On Friday, August 9, 2019 at 2:50:10 AM UTC+9, John MacFarlane wrote:
>
>
> This will show you how pandoc parses this content: 
>
> % pandoc -t native 
> <ruby>aa<rt>bb</rt></ruby> 
> ^D 
> [Para [RawInline (Format "html") "<ruby>",Str "aa",RawInline 
> (Format "html") "<rt>",Str "bb",RawInline (Format "html") 
> "</rt>",RawInline (Format "html") "</ruby>"]] 
>
> So now you know what kind of structure you'll have to 
> intercept and deal with in your filter. Does that help? 
>
>
> Patrick Kenny <ptmk...-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> writes:
>
> > I'm trying to write a filter (in panflute) to convert ruby characters 
> > <https://en.wikipedia.org/wiki/Ruby_character> from HTML to ConTeXt. 
> > 
> > Input HTML: 
> > 
> > <p>This is an example: <ruby>例<rt>レイ</rt></ruby></p> 
> > 
> > After conversion (what I want my output to look like): 
> > 
> > This is an example: \ruby{例}{レイ} 
> > 
> > What's a good way to approach this kind of conversion?  I don't know how to
> > target the ruby tags.
>
>

