From mboxrd@z Thu Jan 1 00:00:00 1970
From: John MacFarlane
Newsgroups: gmane.text.pandoc
Subject: Re: Skipping commands in LaTeX document
Date: Mon, 06 Dec 2021 09:57:32 -0800
References: <0462fc42-ae24-4c52-b267-1126ed5834edn@googlegroups.com> <84e207d9-eaed-4b24-8b6b-62ea07bb2b5bn@googlegroups.com>
To: Greg S, pandoc-discuss

Is that because the whole tabular is being parsed as a raw TeX
chunk? Or is it treated as a table, but the contents of cells
are parsed differently than outside the table?

Greg S writes:

> Okay I've written a filter:
>
> ```
> #!/usr/bin/python
> import logging
> import re
> from pandocfilters import toJSONFilter, Emph, Para, RawInline
>
> ipa_regex = re.compile(r"\\IPA{(.*)}")
>
> def handle(key, value, format, meta):
>     logging.warning(f"KEY {key} VALUE {value} format {format} META {meta}")
>     if key == "RawInline":
>         if m := ipa_regex.match(value[1]):
>             return RawInline('html', f"{m.group(1)}")
>
> if __name__ == "__main__":
>     toJSONFilter(handle)
> ```
>
> and with the `-f latex+raw_tex` option passed to pandoc it looks like this
> is correctly capturing the text in the IPA macro.
>
> However, I noticed that the filter completely skips over text in the \IPA
> macro if that macro occurs within a latex table defined with
> \begin{tabular}.
> I'm using the makecell latex package and wrapping the cells with the
> \makecell command (i.e. `\makecell { \IPA{ some text } }`), but I tried
> removing the \makecell and the IPA macro still gets skipped in this context.
>
>
> On Sunday, December 5, 2021 at 12:12:44 PM UTC-8 John MacFarlane wrote:
>
>>
>> I should have mentioned before that you'll need to enable
>> the `raw_tex` extension as shown above, to allow inclusion
>> of RawBlock or RawInline.
>>
>> % pandoc -t native -f latex+raw_tex
>> \IPA{hi} there
>> ^D
>> [ Para
>>     [ RawInline (Format "latex") "\\IPA{hi}"
>>     , Space
>>     , Str "there"
>>     ]
>> ]
>>
>>
>> Greg S writes:
>>
>> > How can I write a filter that matches RawInline elements if the filter
>> > applies after the unknown latex macros have been applied in the parsing
>> > stage? I'm not seeing the text within the \IPA macro at all in the
>> > logging from the test filter I wrote - is there something I need to do
>> > to make that filter apply earlier?
>> >
>> > On Sunday, December 5, 2021 at 10:56:51 AM UTC-8 John MacFarlane wrote:
>> >
>> >>
>> >> You can't insert the macro with a filter, because the filter
>> >> is applied after parsing, and the macro would be resolved in
>> >> the parsing phase.
>> >>
>> >> However, you could have a filter that matches RawInline
>> >> elements that are "\IPA" commands, extracts their textual
>> >> content, and returns a Str element.
>> >>
>> >> Greg S writes:
>> >>
>> >> > Is there a way I can tell pandoc to insert a new Latex macro before
>> >> > processing that doesn't exist in the document? Using
>> >> > \renewcommand{\IPA}[1]{#1} makes the text appear in the output of the
>> >> > latex -> html conversion, but it breaks the formatting I care about in
>> >> > the pdf version so I don't want to have that line permanently in the
>> >> > latex source file.
>> >> >
>> >> > I think I'd ultimately like to use a filter to intercept the raw latex
>> >> > from \IPA{...} and do something specific with it in HTML (probably put
>> >> > it within a tag). I also have some other latex macros from
>> >> > specific packages that pandoc doesn't seem to understand, that I'd like
>> >> > to handle in a custom way. I tried creating a simple logging Python
>> >> > filter just to understand how they work.
>> >> >
>> >> > ```
>> >> > #!/usr/bin/python
>> >> > import logging
>> >> > from pandocfilters import toJSONFilter, Emph, Para
>> >> >
>> >> > def handle(key, value, format, meta):
>> >> >     logging.warning(f"KEY {key} VALUE {value} format {format} META {meta}")
>> >> >
>> >> > if __name__ == "__main__":
>> >> >     toJSONFilter(handle)
>> >> > ```
>> >> > And then running `pandoc --pdf-engine=xelatex --verbose test.tex -o
>> >> > test.html --filter filter.py`.
>> >> >
>> >> > But it seems like latex macros that pandoc doesn't understand are
>> >> > getting skipped before the filter is applied, so the `handle` function
>> >> > never gets called with the text contents of my \IPA macro.
>> >> >
>> >> > On Saturday, December 4, 2021 at 9:37:16 AM UTC-8 John MacFarlane wrote:
>> >> >
>> >> >>
>> >> >> Pandoc doesn't understand everything, especially outside of
>> >> >> core LaTeX. In particular, it doesn't understand
>> >> >>
>> >> >> \DeclareTextFontCommand
>> >> >>
>> >> >> from fontspec, so the \IPA macro isn't understood.
>> >> >>
>> >> >> You can work around this by adding your own macro
>> >> >> definition before you convert with pandoc:
>> >> >>
>> >> >> \renewcommand{\IPA}[1]{#1}
>> >> >>
>> >> >> and then the contents of \IPA will just be passed
>> >> >> through.
>> >> >>
>> >> >> I suppose you could alternatively redefine
>> >> >>
>> >> >> \renewcommand{\DeclareTextFontCommand}[2]{\newcommand{#1}[1]{##1}}
>> >> >>
>> >> >> before your fontspec stuff (untested and may not work).
>> >> >>
>> >> >> Another option is to use a filter and intercept the raw
>> >> >> LaTeX inline produced from \IPA{some text}, changing it
>> >> >> into textual content, but I think the first approach above
>> >> >> is the simplest.
>> >> >>
>> >> >>
>> >> >>
>> >> >> Greg S writes:
>> >> >>
>> >> >> > I have a minimal test latex file `test.tex`:
>> >> >> >
>> >> >> >
>> >> >> > \documentclass{article}
>> >> >> >
>> >> >> > \usepackage{fontspec}
>> >> >> >
>> >> >> > \newfontfamily\IPAFont{Doulos SIL}
>> >> >> > \DeclareTextFontCommand{\IPA}{\IPAFont}
>> >> >> >
>> >> >> > \begin{document}
>> >> >> >
>> >> >> > \section{Test}
>> >> >> > Hello \IPA{some IPA}
>> >> >> >
>> >> >> > \end{document}
>> >> >> >
>> >> >> >
>> >> >> > This builds fine with xelatex and produces a pdf I expect. When I try
>> >> >> > to convert this to an html document with `pandoc --pdf-engine=xelatex
>> >> >> > --verbose test.tex -o test.html`, I see the warnings:
>> >> >> >
>> >> >> > [INFO] Could not load include file fontspec.sty at test.tex line 3 column 22
>> >> >> > [INFO] Skipped '\newfontfamily' at test.tex line 5 column 15
>> >> >> > [INFO] Skipped '\IPAFont{Doulos SIL}' at test.tex line 5 column 35
>> >> >> > [INFO] Skipped '\DeclareTextFontCommand{\IPA}{\IPAFont}' at test.tex line 6 column 40
>> >> >> > [INFO] Skipped '\IPA{some IPA}' at test.tex line 11 column 21
>> >> >> >
>> >> >> > And the text within the custom \IPA command is skipped. How can I make
>> >> >> > pandoc not skip these?
>> >> >> >
>> >> >> >
>> >> >> > --
>> >> >> > You received this message because you are subscribed to the Google
>> >> >> > Groups "pandoc-discuss" group.
>> >> >> > To unsubscribe from this group and stop receiving emails from it, send
>> >> >> > an email to pandoc-discus...-/JYPxA39Uh5TLH3MbocFF+G/Ez6ZCGd0@public.gmane.org
>> >> >> > To view this discussion on the web visit
>> >> >> > https://groups.google.com/d/msgid/pandoc-discuss/0462fc42-ae24-4c52-b267-1126ed5834edn%40googlegroups.com.
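
The question at the top of this message (is the tabular a single raw TeX
chunk, or a table whose cells are parsed differently?) can be checked from
inside a filter. What follows is a minimal sketch, not a tested fix. It
assumes the `{"t": ..., "c": ...}` JSON shapes that pandocfilters hands to an
action, reuses the `handle` name from the filter quoted above, and introduces
a hypothetical `IPA_RE` pattern. It turns a `\IPA{...}` RawInline into a Str,
as suggested earlier in the thread, and only reports RawBlock elements that
mention `\IPA`, leaving them unchanged.

```python
#!/usr/bin/env python3
# Sketch only: IPA_RE and the stderr message are illustrative assumptions.
import re
import sys

# Matches a raw inline consisting of \IPA{...}; group 1 is the argument.
IPA_RE = re.compile(r"\\IPA\s*\{(.*)\}\s*$", re.DOTALL)

RAW_TEX_FORMATS = ("latex", "tex")


def handle(key, value, fmt, meta):
    """Action in the (key, value, format, meta) form used by toJSONFilter."""
    if key == "RawInline":
        raw_format, text = value
        m = IPA_RE.match(text)
        if raw_format in RAW_TEX_FORMATS and m:
            # Replace the raw inline with its textual content.
            return {"t": "Str", "c": m.group(1).strip()}
    elif key == "RawBlock":
        raw_format, text = value
        if raw_format in RAW_TEX_FORMATS and "\\IPA" in text:
            # Diagnostic only: the block is left as it is.
            print(f"raw block contains \\IPA: {text[:60]!r}", file=sys.stderr)
    return None  # None leaves the element unchanged


if __name__ == "__main__":
    # Demo on a hand-written AST fragment; in a real filter, pass `handle`
    # to pandocfilters.toJSONFilter instead.
    print(handle("RawInline", ["latex", "\\IPA{hi}"], "html", {}))
```

To use it as a real filter, pass `handle` to `toJSONFilter` as in the quoted
script and keep `-f latex+raw_tex`. A stderr message for the tabular would
suggest the first case (one raw chunk). No message and no RawInline call for
the cell contents would suggest the second.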