From: Elliott Slaughter
Newsgroups: gmane.text.pandoc
Subject: Re: Towards (better) Python filters for Pandoc with fluent queries
Date: Wed, 21 Jan 2015 21:25:56 -0800
References: <20150102165038.GA25833@localhost.hsd1.ca.comcast.net>

John,

How do you want to proceed?

I'm reasonably happy with where the API is now, but I'm also aware that it hasn't received much testing outside of my own use (and test suite). I would be happy to submit this for inclusion in pandocfilters, but it might make sense to publish it as a separate library first, to let the API air out and give people a chance to get their hands dirty and kick the tires. I'm always a bit nervous committing to an API that hasn't seen much real use yet, even if people have glanced at it and said it looks reasonable.

Thoughts?

On Fri, Jan 2, 2015 at 8:50 AM, John MacFarlane <jgm-TVLZxgkOlNX2fBVCVOL8/A@public.gmane.org> wrote:
Nice. When it stabilizes it would be good to add a note about it
in the documentation for pandocfilters. (Or maybe even integrate
it into pandocfilters.)

+++ Elliott Slaughter [Jan 01 15 22:58 ]:
I like being able to script Pandoc via filters in Python, but one of the major drawbacks of the approach as it currently stands is that Python has no pattern matching to speak of. As a result, code that needs to run
queries of the structure of Pandoc documents quickly turns into a
nightmare, especially if that code needs to check nested structures.

Consider the following partial function in Haskell, which matches against a
BlockQuote containing a Para where the first word is "Chapter" in small
caps:

    filter :: Block -> Block
    filter (BlockQuote [Para (SmallCaps [Str "Chapter"] : _)]) = ...

Without pattern matching, the equivalent code in Python is painful to
write, opaque, and quite brittle. Unfortunately, without support for
pattern matching, there is no possibility of a direct analogue in Python. Instead, I propose a fluent interface
<https://en.wikipedia.org/wiki/Fluent_interface> as a way to provide a
query language of sorts for Python. So for example, the same query might look like:

    m = (Matcher(block)
         .BlockQuote(length=1)[0]
         .Para(length=-1)[0]
         .SmallCaps(length=1)[0]
         .Str(content='Chapter'))
    if m.matches():
        ...

The code is not quite as dense because I've split it out for legibility,
but can be condensed better to fit on a single line if desired. It is at any rate a massive improvement over hand-written queries over the JSON
structure of the document.
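
For comparison, a hand-written check of the same shape over the JSON form of the document might look like the sketch below. It assumes the encoding in which every element is an object with a tag under `t` and its contents under `c`; the function name `is_chapter_quote` and the sample data are made up for illustration.

```python
def is_chapter_quote(block):
    """Return True for a BlockQuote holding exactly one Para whose first
    inline is a SmallCaps wrapping the single word 'Chapter'."""
    if not isinstance(block, dict) or block.get('t') != 'BlockQuote':
        return False
    blocks = block.get('c')
    if not isinstance(blocks, list) or len(blocks) != 1:
        return False
    para = blocks[0]
    if not isinstance(para, dict) or para.get('t') != 'Para':
        return False
    inlines = para.get('c')
    if not isinstance(inlines, list) or not inlines:
        return False
    first = inlines[0]
    if not isinstance(first, dict) or first.get('t') != 'SmallCaps':
        return False
    inner = first.get('c')
    if not isinstance(inner, list) or len(inner) != 1:
        return False
    word = inner[0]
    return (isinstance(word, dict)
            and word.get('t') == 'Str'
            and word.get('c') == 'Chapter')


# Illustrative input (assumed encoding): a block quote whose only
# paragraph starts with "Chapter" in small caps.
example = {'t': 'BlockQuote', 'c': [
    {'t': 'Para', 'c': [
        {'t': 'SmallCaps', 'c': [{'t': 'Str', 'c': 'Chapter'}]},
        {'t': 'Space', 'c': []},
        {'t': 'Str', 'c': 'One'},
    ]},
]}

found = is_chapter_quote(example)
```

Every level of nesting costs another guard, which is where the brittleness comes from: a single missed check turns into a KeyError or IndexError on a document that does not have the expected shape.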

A proof of concept library is available today, and has been demonstrated with the query above as well as other queries I have needed in my own
projects. Current coverage of the Pandoc API is at around 50%. The code is
made available under an MIT license:

https://bitbucket.org/elliottslaughter/pandocpatterns
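
To make the fluent idea concrete, here is a minimal sketch of how a matcher of this kind could be built. It is not the pandocpatterns implementation: the class below, its `length` and `content` parameters, the reading of `length=-1` as "any length", and the `t`/`c` JSON encoding are all assumptions chosen to mirror the query shown earlier.

```python
class Matcher(object):
    """Hypothetical fluent matcher over JSON-encoded pandoc elements."""

    def __init__(self, node, ok=True):
        self._node = node
        self._ok = ok

    def __getattr__(self, tag):
        # Only called for names not found normally, so any such name is
        # read as an element tag: m.Para(...) checks for a 'Para' element.
        if tag.startswith('_'):
            raise AttributeError(tag)

        def check(length=-1, content=None):
            node = self._node
            ok = self._ok and isinstance(node, dict) and node.get('t') == tag
            if ok and length >= 0:
                # Require an exact number of children (assumed meaning).
                children = node.get('c')
                ok = isinstance(children, list) and len(children) == length
            if ok and content is not None:
                ok = node.get('c') == content
            return Matcher(node if ok else None, ok)

        return check

    def __getitem__(self, index):
        # Step into the index-th child; a failed match stays failed.
        node = self._node
        children = None
        if self._ok and isinstance(node, dict):
            children = node.get('c')
        if isinstance(children, list) and 0 <= index < len(children):
            return Matcher(children[index], True)
        return Matcher(None, False)

    def matches(self):
        return self._ok


# Illustrative input, using the assumed encoding.
block = {'t': 'BlockQuote', 'c': [
    {'t': 'Para', 'c': [
        {'t': 'SmallCaps', 'c': [{'t': 'Str', 'c': 'Chapter'}]},
    ]},
]}

m = (Matcher(block)
     .BlockQuote(length=1)[0]
     .Para(length=-1)[0]
     .SmallCaps(length=1)[0]
     .Str(content='Chapter'))
found = m.matches()
```

The design choice that keeps the chain safe is that a failed step returns a matcher that is already marked as failed, so later steps never index into missing data and the caller only has to ask `matches()` once at the end.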

I would greatly appreciate any thoughts or feedback on the concept, design,
or implementation. Please feel free to take the code out for a test drive and kick the tires. If there is interest, I would be willing to invest the
effort to improve the library and make it more robust and useful.

Thank you for your time.


--
Elliott Slaughter

"Don't worry about what anybody else is going to do. The best way to
predict the future is to invent it." - Alan Kay

--
You received this message because you are subscribed to the Google Groups "pandoc-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to pandoc-discuss+unsubscribe@googlegroups.com.
To post to this group, send email to pandoc-discuss@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/pandoc-discuss/CAJ9X%3Dkb9W0_Jd4ufPcRiZSSZ%2B5Bpftg4hZ82zCuBLb-moadnSQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


--
Elliott Slaughter

"Don't worry about what anybody else is going to do. The best way to predict the future is to invent it." - Alan Kay
