caml-list - the Caml user's mailing list
* [Caml-list] [ANN] Planck: a small monadic parser combinator library for OCaml
From: Jun Furuse @ 2011-05-28  2:11 UTC
  To: caml-list

Hi,

I've released a small monadic parser combinator library for OCaml, Planck
version 1.0.0, available at:

    https://bitbucket.org/camlspotter/planck/get/v1.0.0.tar.gz

It started as a fun project to learn about Parsec and parser combinators,
but it has since grown into something useful, with:

    - input positions by line and column
    - a char-specialized stream for better performance
    - an operator precedence/associativity resolver
    - a memoization module for efficient backtracking

For example, I was able to implement a lexer and parser for the OCaml syntax
using Planck.

REQUIREMENTS: Unfortunately, Planck depends on many things:

  - ocaml 3.12.0 or higher
  - findlib
  - omake
  - type-conv 2.3.0 and sexplib 5.2.1 (available from
    http://ocaml.janestreet.com/?q=node/13)
  - spotlib (my small utility functions, available at
    http://bitbucket.org/camlspotter/spotlib/)

  The following are required to compile the OCaml syntax parser example:
  - pa_monad_custom (http://bitbucket.org/camlspotter/pa_monad_custom/)
  - ocaml 3.12.0 source tree and lablgtk-2.14.2 source tree for testing

The combinators in Planck are implemented simply as functional monadic
combinators over streams (lazy lists). Unfortunately, Planck is very slow with
the current OCaml compiler (3.12.0) because of its heavy closure construction:
it is about 100x slower than the traditional ocamllex+ocamlyacc. I hope that
more aggressive inlining optimizations in the compiler could speed Planck up
greatly. You can read some of my rough thoughts on this topic at:

http://camlspotter.blogspot.com/2011/05/planck-small-parser-combinator-library.html

Enjoy,

Jun

* Re: [Caml-list] [ANN] Planck: a small monadic parser combinator library for OCaml
From: Gabriel Scherer @ 2011-05-29  9:14 UTC
  To: Jun Furuse; +Cc: caml-list

> The combinators in Planck are implemented simply as functional monadic
> combinators over streams (lazy lists). Unfortunately, Planck is very slow with
> the current OCaml compiler (3.12.0) because of its heavy closure construction.
> [...] I hope that more aggressive inlining optimizations in the compiler
> could speed Planck up greatly.

Have you considered writing a defunctionalized version of the parsing
library? There are two different kinds of closure construction in your
code:
- the "bind" interface must be passed a function, which is generally a
closure built on the fly
- the parsing monad itself is a function type (it is a state+error monad,
and so is written in state-passing style)

There is not much you can do about the first source of closures if you want
to keep a monadic interface, but the second is inessential. You can
defunctionalize the state monad by reifying it into an algebraic datatype.
A monadic computation would then return a big data structure (instead of a
big function), and you would write an interpretation function that passes
the state around without constructing any closures.
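
As a rough illustration of the suggestion above, here is a minimal sketch of
a reified parser type with a separate interpreter. It is not Planck's code:
the type, the constructor names and the `run` function are all hypothetical,
the input is a plain string rather than a lazy stream, and it assumes a
recent OCaml (GADTs and the `result` type, neither of which exists in 3.12).

```ocaml
(* Hypothetical sketch: parsers are data, not functions. *)
type _ parser =
  | Return : 'a -> 'a parser
  | Token : (char -> bool) -> char parser
  | Seq : 'a parser * 'b parser -> ('a * 'b) parser
  | Alt : 'a parser * 'a parser -> 'a parser
  | Bind : 'a parser * ('a -> 'b parser) -> 'b parser

(* The interpreter threads the state (here, an offset into the input
   string) explicitly; Seq and Alt nodes allocate no closures. *)
let rec run : type a. a parser -> string -> int -> (a * int, string) result =
  fun p s pos ->
    match p with
    | Return v -> Ok (v, pos)
    | Token accept ->
      if pos < String.length s && accept s.[pos] then Ok (s.[pos], pos + 1)
      else Error (Printf.sprintf "unexpected input at offset %d" pos)
    | Seq (p1, p2) ->
      (match run p1 s pos with
       | Error e -> Error e
       | Ok (a, pos1) ->
         (match run p2 s pos1 with
          | Error e -> Error e
          | Ok (b, pos2) -> Ok ((a, b), pos2)))
    | Alt (p1, p2) ->
      (match run p1 s pos with
       | Ok _ as ok -> ok
       | Error _ -> run p2 s pos)  (* backtrack to the same offset *)
    | Bind (p1, k) ->
      (match run p1 s pos with
       | Error e -> Error e
       | Ok (a, pos1) -> run (k a) s pos1)

(* Example grammar: the character 'a' followed by 'b'. *)
let ab = Seq (Token (fun c -> c = 'a'), Token (fun c -> c = 'b'))
```

With these definitions, `run ab "ab" 0` evaluates to `Ok (('a', 'b'), 2)`.
The `Bind` constructor still carries a function, which matches the remark
that the first source of closures cannot be removed while keeping a monadic
interface; only the state-passing part has become data.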

Moreover, the heavy use of "bind" (which needs a function/closure as a
parameter) in your parsing code could be avoided. You could use more parsing
combinators (like <|>) and fewer binding operators. For example, your current
style is to write (v1 <-- p1; v2 <-- p2; return (v1, v2)); you could instead
write something like ((v1,v2) <-- p1 <*> p2; return (v1,v2)), with one less
bind call (the idea is to "currify" successive bindings with product-binding
combinators). I think you should promote the use of combinators over raw
"binds".
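
To make the comparison concrete, here is a small closure-based sketch of the
two styles. The parser type, `item`, and the operator definitions are
assumptions made for illustration (a string plus an offset stands in for
Planck's streams), and `<*>` is defined as the product combinator described
above, not as Haskell's applicative operator.

```ocaml
(* Hypothetical closure-based parser: a function from input and offset
   to an optional result and new offset. *)
type 'a parser = string -> int -> ('a * int) option

let return v : 'a parser = fun _ pos -> Some (v, pos)

(* bind: the continuation f is usually a closure built on the fly. *)
let ( >>= ) (p : 'a parser) (f : 'a -> 'b parser) : 'b parser =
  fun s pos ->
    match p s pos with
    | None -> None
    | Some (v, pos1) -> f v s pos1

(* Product: run p, then q, and pair the results. No continuation needed. *)
let ( <*> ) (p : 'a parser) (q : 'b parser) : ('a * 'b) parser =
  fun s pos ->
    match p s pos with
    | None -> None
    | Some (a, pos1) ->
      (match q s pos1 with
       | None -> None
       | Some (b, pos2) -> Some ((a, b), pos2))

(* Accept exactly the character c. *)
let item c : char parser =
  fun s pos ->
    if pos < String.length s && s.[pos] = c then Some (c, pos + 1) else None

(* Two binds: the inner continuation is allocated again on every parse. *)
let pair_bind : (char * char) parser =
  item 'a' >>= fun v1 ->
  item 'b' >>= fun v2 ->
  return (v1, v2)

(* One bind: the product parser is built once, when pair_prod is defined. *)
let pair_prod : (char * char) parser =
  (item 'a' <*> item 'b') >>= fun (v1, v2) -> return (v1, v2)
```

Both parsers accept the same input: `pair_bind "ab" 0` and
`pair_prod "ab" 0` each evaluate to `Some (('a', 'b'), 2)`.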

Your blog article concludes that the OCaml compiler does not optimize
functions and closures enough, making the use of monads impractical. I think
this is only true for closure-based implementations, which are not the only
way to write such monads. The functional programming literature is full of
tricks for turning functions into data structures, and this is almost always
a win. In particular, you may be able to implement domain-specific
"optimizations" that are out of reach for general-purpose compilers.

I'm sorry I have nothing more concrete to offer; I think the transformations
needed to defunctionalize the core should not be too complicated, but
unfortunately I failed to build Planck in a reasonable time, so I didn't try
it myself.

* Re: [Caml-list] [ANN] Planck: a small monadic parser combinator library for OCaml
From: Jun Furuse @ 2011-06-06  6:42 UTC
  To: Gabriel Scherer; +Cc: caml-list

Sorry for the late reply.

> Have you considered writing a defunctionalized version of the parsing
> library? There are two different kinds of closure construction in your
> code:
> - the "bind" interface must be passed a function, which is generally a
> closure built on the fly
> - the parsing monad itself is a function type (it is a state+error monad,
> and so is written in state-passing style)
>
> There is not much you can do about the first source of closures if you want
> to keep a monadic interface, but the second is inessential. You can
> defunctionalize the state monad by reifying it into an algebraic datatype.
> A monadic computation would then return a big data structure (instead of a
> big function), and you would write an interpretation function that passes
> the state around without constructing any closures.

Yes, I had just started a test implementation of this kind, but I
removed it from the repository before the release.

I had set April 1st as the deadline for this personal weekend
project, and it was already mid-May, so I released it. I am now
working on something different (porting ocamlspot to the 3.13 binannot
branch) and have no spare time to work on Planck, but feedback is
always welcome. Thanks!

> Moreover, the heavy use of "bind" (which needs a function/closure as a
> parameter) in your parsing code could be avoided. You could use more parsing
> combinators (like <|>) and fewer binding operators. For example, your current
> style is to write (v1 <-- p1; v2 <-- p2; return (v1, v2)); you could instead
> write something like ((v1,v2) <-- p1 <*> p2; return (v1,v2)), with one less
> bind call (the idea is to "currify" successive bindings with product-binding
> combinators). I think you should promote the use of combinators over raw
> "binds".

The preprocessor hack described in my blog post is almost the same
idea, and it improved performance a lot.

The idea of using <*> is nice for two parsers, but it is not pretty if I
have to write (((((v1,v2), v3), v4), v5), v6) <-- p1 <*> p2 <*> p3
<*> ... <*> p6. Is there any good trick for writing
(v1, v2, v3, v4, v5, ..., vn) <-- p1 <*> p2 <*> ... <*> pn? I am afraid I
will have to use another Camlp4 trick here...

Jun
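
One commonly used way to avoid the nested pairs, sketched here as an
assumption and not as something proposed in this thread, is an
"apply"-style combinator: lift a curried tuple constructor into the parser
and feed it one parser result at a time. The names `pure`, `<@>` and `item`
are hypothetical, and the string-plus-offset parser type again stands in
for Planck's streams.

```ocaml
(* Hypothetical closure-based parser over a string and an offset. *)
type 'a parser = string -> int -> ('a * int) option

(* Succeed with v without consuming input. *)
let pure v : 'a parser = fun _ pos -> Some (v, pos)

(* Apply: run a parser that yields a function, then a parser that yields
   its argument, and return the application. Left-associative in OCaml. *)
let ( <@> ) (pf : ('a -> 'b) parser) (pa : 'a parser) : 'b parser =
  fun s pos ->
    match pf s pos with
    | None -> None
    | Some (f, pos1) ->
      (match pa s pos1 with
       | None -> None
       | Some (a, pos2) -> Some (f a, pos2))

(* Accept exactly the character c. *)
let item c : char parser =
  fun s pos ->
    if pos < String.length s && s.[pos] = c then Some (c, pos + 1) else None

(* A flat 4-tuple, with no nested pairs to destructure. *)
let tuple4 : (char * char * char * char) parser =
  pure (fun a b c d -> (a, b, c, d))
  <@> item 'w' <@> item 'x' <@> item 'y' <@> item 'z'
```

Here `tuple4 "wxyz" 0` evaluates to `Some (('w', 'x', 'y', 'z'), 4)`. This
addresses only the readability question: each partial application `f a`
still allocates a closure at parse time, so it is not a performance fix by
itself.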
