Re: [Caml-list] [ANN] Planck: a small monadic parser combinator library for OCaml

From: Gabriel Scherer <gabriel.scherer@gmail.com>
To: Jun Furuse <jun.furuse@gmail.com>
Cc: caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] [ANN] Planck: a small monadic parser combinator library for OCaml
Date: Sun, 29 May 2011 11:14:32 +0200
Message-ID: <BANLkTimCRGWDeGiEwDYBiQ+Rf5VFEGNWvQ@mail.gmail.com>
In-Reply-To: <BANLkTi=p5TMM=s4qtEB_9ES=pzvMBnRJQQ@mail.gmail.com>

>
> The combinators in Planck are implemented simply as functional monadic
> combinators over streams (lazy lists). Unfortunately, it is very slow with
> the current OCaml compiler (3.12.0) due to its huge closure constructions.
> [...] I hope more aggressive in-lining optimizations in the compiler might
> speed up the performance of Planck greatly.
>

Have you considered writing a defunctionalized version of the parsing
library? There are two different kinds of closure construction in your
code:
- the "bind" interface must be passed a function, which is generally
a closure built on the fly
- the parsing monad itself is a function type (it is a state+error monad,
written in state-passing style)
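
To make the two sources concrete, here is a minimal sketch of a
closure-based state+error parser monad. It is a simplified model, not
Planck's actual code: the names `input`, `t`, `item` and `two_items` are
assumptions made for illustration, and a plain `char list` stands in for
the lazy stream.

```ocaml
(* Hypothetical, simplified model; these are not Planck's real types. *)
type input = char list            (* stand-in for a lazy stream *)

(* The monad is a function type: state in, result and new state out,
   with [None] playing the role of the error case. *)
type 'a t = input -> ('a * input) option

let return (x : 'a) : 'a t = fun s -> Some (x, s)

(* Closure source 1: the argument [f] is usually a closure allocated at
   the call site.
   Closure source 2: [bind] itself returns a freshly built function. *)
let bind (p : 'a t) (f : 'a -> 'b t) : 'b t =
  fun s ->
    match p s with
    | None -> None
    | Some (x, s') -> f x s'

(* Consume one character, failing on empty input. *)
let item : char t = function
  | [] -> None
  | c :: rest -> Some (c, rest)

(* Two nested binds: two user closures plus the functions built by [bind]. *)
let two_items : (char * char) t =
  bind item (fun a -> bind item (fun b -> return (a, b)))
```

Applying `two_items` to `['x'; 'y'; 'z']` consumes the first two
characters and leaves `['z']` as the remaining input.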

There is not much you can do for the first closure source, if you want to
keep a monadic interface, but the second cause is inessential. You may
defunctionalize the state monad by reifying it into an algebraic datatype.
The monad computation would return a big data structure (instead of a big
function), and you would then write an interpretation function passing the
state around, without any closure construction.
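
A rough sketch of that idea, under stated assumptions: the parser
becomes a datatype (`desc`, a hypothetical name) and a single
interpreter (`run`, also hypothetical) threads the state. The sketch
uses a GADT, which needs OCaml 4.00 or later and so is not available in
3.12; it only illustrates the shape of the transformation and is not
Planck's implementation. The function stored in `Bind` is still a
closure (the first source above), but the parser itself is now data.

```ocaml
type input = char list            (* stand-in for a lazy stream *)

(* Parsers reified as data instead of functions. *)
type _ desc =
  | Return : 'a -> 'a desc
  | Fail : 'a desc
  | Item : char desc
  | Bind : 'a desc * ('a -> 'b desc) -> 'b desc
  | Alt : 'a desc * 'a desc -> 'a desc

(* The interpreter passes the state around; it allocates no closures. *)
let rec run : type a. a desc -> input -> (a * input) option =
  fun p s ->
    match p with
    | Return x -> Some (x, s)
    | Fail -> None
    | Item -> (match s with [] -> None | c :: rest -> Some (c, rest))
    | Bind (q, f) ->
      (match run q s with
       | None -> None
       | Some (x, s') -> run (f x) s')
    | Alt (q, r) ->
      (match run q s with
       | Some _ as res -> res
       | None -> run r s)          (* backtrack to the original state *)

(* Example parser: a single decimal digit. *)
let digit : char desc =
  Bind (Item, (fun c -> if c >= '0' && c <= '9' then Return c else Fail))
```

For instance, `run digit ['7'; 'a']` succeeds with `'7'`, while
`run digit ['a']` fails.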

Moreover, the heavy use of "bind" (which needs a function/closure as
parameter) in your parsing code could be avoided. You may use more parsing
combinators (like <|>) and fewer bind operators. For example, your current
style is to write (v1 <-- p1; v2 <-- p2; return (v1, v2)); you may as well
write something like ((v1,v2) <-- p1 <*> p2; return (v1,v2)), with one less
bind call (the idea is to "currify" successive bindings with product-binding
combinators). I think you should promote the use of combinators over raw
"binds".
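
As an illustration only, here is how such a product combinator could
look on a simplified closure-based parser type. The definitions below
(`t`, `item`, and this particular `<*>`) are assumptions for the sketch
and are not taken from Planck. Note that `<*>` here pairs two results,
as in the example above; it is not Haskell's applicative operator.

```ocaml
type input = char list            (* stand-in for a lazy stream *)
type 'a t = input -> ('a * input) option

let return x = fun s -> Some (x, s)

let ( >>= ) p f = fun s ->
  match p s with
  | None -> None
  | Some (x, s') -> f x s'

(* Product: run [p1], then [p2], and pair the results. The sequencing is
   written once here, so callers pass no continuation closure. *)
let ( <*> ) p1 p2 = fun s ->
  match p1 s with
  | None -> None
  | Some (x, s') ->
    (match p2 s' with
     | None -> None
     | Some (y, s'') -> Some ((x, y), s''))

let item : char t = function
  | [] -> None
  | c :: rest -> Some (c, rest)

(* Bind-heavy style: two binds, two user closures. *)
let pair_two_binds : (char * char) t =
  item >>= fun v1 -> item >>= fun v2 -> return (v1, v2)

(* Combinator style: no bind at all when the pair is the result. *)
let pair_product : (char * char) t = item <*> item

(* One bind is still available when the pair must be post-processed. *)
let swapped : (char * char) t =
  (item <*> item) >>= fun (v1, v2) -> return (v2, v1)
```

Both `pair_two_binds` and `pair_product` accept the same inputs and
return the same pairs; only the number of closures built along the way
differs.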

Your blog article concludes that the OCaml compiler does not optimize
functions and closures enough, making the use of monads impractical. I think
this is only true for the closure-using implementations, which are not the
only way to write such monads. The functional programming literature is full
of tricks to turn functions into data structures, and this is almost always a
win. In particular you may be able to implement domain-specific
"optimizations" that are out of reach for general compilers.
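
A small sketch of what such a domain-specific optimization might look
like once parsers are data. The `recognizer` type and `simplify` pass
are hypothetical and cover only a first-order recognizer without result
values; the point is that the rewrite inspects the parser's structure,
which is not possible when the parser is an opaque closure.

```ocaml
(* First-order recognizer: it accepts or rejects input, no result values. *)
type recognizer =
  | Fail
  | Char of char
  | Seq of recognizer * recognizer
  | Alt of recognizer * recognizer

(* Remove branches that can never succeed, bottom-up. *)
let rec simplify = function
  | Seq (p, q) ->
    (match simplify p, simplify q with
     | Fail, _ | _, Fail -> Fail       (* a failing step kills the sequence *)
     | p', q' -> Seq (p', q'))
  | Alt (p, q) ->
    (match simplify p, simplify q with
     | Fail, r | r, Fail -> r          (* a dead alternative is dropped *)
     | p', q' -> Alt (p', q'))
  | (Fail | Char _) as p -> p
```

For example, `simplify (Alt (Seq (Char 'a', Fail), Char 'b'))` reduces
to `Char 'b'`.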

I'm sorry I have nothing more concrete here; I think the transformations to
defunctionalize the core should not be too complicated, but unfortunately I
have failed to build Planck in a reasonable time so I didn't try it myself.

On Sat, May 28, 2011 at 4:11 AM, Jun Furuse <jun.furuse@gmail.com> wrote:

> Hi,
>
> I've released a small monadic parser combinator library for OCaml, Planck
> version 1.0.0, available at:
>
>     https://bitbucket.org/camlspotter/planck/get/v1.0.0.tar.gz
>
> It is firstly just for my fun to learn what is Parsec/parser combinators,
> but it is now elaborated to something useful:
>
>     - input positions by lines and columns
>     - char specialized stream for better performance
>     - operator precedence/associativity resolver
>     - memoization module for efficient backtracks
>
> For example I could implement OCaml syntax lexer and parser using Planck.
>
> REQUIREMENTS: unfortunately Planck depends on many things:
>
>   - ocaml 3.12.0 or higher
>   - findlib
>   - omake
>   - type-conv 2.3.0 and sexplib 5.2.1 (available from
> http://ocaml.janestreet.com/?q=node/13)
>   - spotlib (my small utility functions, available at
> http://bitbucket.org/camlspotter/spotlib/ )
>
>   The following are required to compile the OCaml syntax parser example:
>   - pa_monad_custom ( http://bitbucket.org/camlspotter/pa_monad_custom/ )
>   - ocaml 3.12.0 source tree and lablgtk-2.14.2 source code tree for
> testing
>
> The combinators in Planck are implemented simply as functional monadic
> combinators over streams (lazy lists). Unfortunately, it is very slow with
> the current OCaml compiler (3.12.0) due to its huge closure constructions:
> it is about x100 slower than the traditional ocamllex+ocamlyacc. I hope more
> aggressive in-lining optimizations in the compiler might speed up the
> performance of Planck greatly. You can read some of my rough considerations
> in this topic at:
>
>
> http://camlspotter.blogspot.com/2011/05/planck-small-parser-combinator-library.html
>
> Enjoy,
>
> Jun
>
>
