From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id p4T9F0vq021935 for ; Sun, 29 May 2011 11:15:00 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: At4BAIUN4k3RVdy2mGdsb2JhbAA/AQMSmDKGMYdVCBQBAQEBAQgJDQcUJYhxokGMHoI3gzQ5iGIBAQMGhhgEgjeOGIhmgkA7gzs X-IronPort-AV: E=Sophos;i="4.65,288,1304287200"; d="scan'208";a="100069455" Received: from mail-vx0-f182.google.com ([209.85.220.182]) by mail4-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 29 May 2011 11:14:54 +0200 Received: by vxc34 with SMTP id 34so4090524vxc.27 for ; Sun, 29 May 2011 02:14:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc:content-type; bh=ppZMX7Wtq5zNFDxL8+2twLm2988Km1OgxOW++MLJWbU=; b=tYJjI9w6pOcLi/NdOmjI3fr5uN211+RKDkD3kaTyDsv9m7H5OsqfQ8t/4hgj9jfs/i 1kNFleCoPMWsAfi8lKE3g6UpO5oso29O1e6BxkqrdVp4XmR8GFPTKcpUqcIvu9vlKNsH mo80ZsK4DglcHeHGZyLXJkXp+94DwMj7sk46s= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; b=YkvKxyXWMI50N2KwM5Zj1XzYKkXhNKu3FVCbgwdC7x5mNoAgUjHpa+3APHQp5yk3pu a+FxvrMm7rpppRZ+P8pYT3VPxZl6Y/3vP/v4GzIy0gBbDdtPQ7PwnDp5EmoVxmclon23 09mhuiupPxbFq5dkunYLOEwBRN82AtQA4ikvQ= Received: by 10.52.112.106 with SMTP id ip10mr1720515vdb.127.1306660493063; Sun, 29 May 2011 02:14:53 -0700 (PDT) MIME-Version: 1.0 Received: by 10.52.158.225 with HTTP; Sun, 29 May 2011 02:14:32 -0700 (PDT) In-Reply-To: References: From: Gabriel Scherer Date: Sun, 29 May 2011 11:14:32 +0200 Message-ID: To: Jun Furuse Cc: caml-list Content-Type: multipart/alternative; boundary=bcaec548a1cf87f7a104a4669aac Subject: Re: [Caml-list] [ANN] Planck: a small monadic parser combinator library for OCaml --bcaec548a1cf87f7a104a4669aac Content-Type: text/plain; charset=ISO-8859-1 > > The combinators in Planck are implemented simply as functional monadic > combinators over streams (lazy lists). Unfortunately, it is very slow with > the current OCaml compiler (3.12.0) due to its huge closure constructions. > [...] I hope more aggressive in-lining optimizations in the compiler might > speed up the performance of Planck greatly. > Have you considered writing a defunctionalized version of the parsing library ? There are two different kinds of closure construction in your code: - the "bind" interface requires to be passed a function, which is generally a closure built on the fly - the parsing monad itself is a function type (it is a state+error monad, so written in state-passing style) There is not much you can do for the first closure source, if you want to keep a monadic interface, but the second cause is inessential. You may defunctionalize the state monad by reifying it into an algebraic datatype. The monad computation would return a big data structure (instead of a big function), and you would then write an interpretation function passing the state around, without any closure construction. Moreover, the heavy use of "bind" (which needs a function/closure as parameter) in your parsing code could be avoided. You may use more parsing combinators (like <|>) and less binding operators. For example your current style is to write (v1 <-- p1; v2 <-- p2; return (p1, p2)), you may as well write something like ((v1,v2) <-- p1 <*> p2; return (v1,v2)), with one less bind call (the idea is to "currify" successive bindings with product-binding combinators). I think you should promote the use of combinators over raw "binds". Your blog article concludes that the OCaml compiler does not optimize enough function and closures, making the use of monads impractical. I think this is only true for the closure-using implementations, which are not the only way to write such monads. The functional programming litterature is full of tricks to turn functions into datastructure, and this is almost always a win. In particular you may be able to implement domain-specific "optimizations" that are out of reach for general compilers. I'm sorry I have nothing more concrete here; I think the transformations to defunctionalize the core should not be too complicated, but unfortunately I have failed to build planck in a reasonable time so I didn't try it myself. On Sat, May 28, 2011 at 4:11 AM, Jun Furuse wrote: > Hi, > > I've released a small monadic parser combinator library for OCaml, Planck > version 1.0.0, available at: > > https://bitbucket.org/camlspotter/planck/get/v1.0.0.tar.gz > > It is firstly just for my fun to learn what is Parsec/parser combinators, > but it is now elaborated to something useful: > > - input positions by lines and columns > - char specialized stream for better performance > - operator precedence/associativity resolver > - memoization module for efficient backtracks > > For example I could implement OCaml syntax lexer and parser using Planck. > > REQUIREMENTS: unfortunately Planck depends on many things: > > - ocaml 3.12.0 or higher > - findlib > - omake > - type-conv 2.3.0 and sexplib 5.2.1 (available from > http://ocaml.janestreet.com/?q=node/13) > - spotlib (my small utility functions, available at > http://bitbucket.org/camlspotter/spotlib/ ) > > The followings are required to compiler ocaml syntax parser example: > - pa_monad_custom ( http://bitbucket.org/camlspotter/pa_monad_custom/ > - ocaml 3.12.0 source tree and lablgtk-2.14.2 source code tree for > testing > > The combinators in Planck are implemented simply as functional monadic > combinators over streams (lazy lists). Unfortunately, it is very slow with > the current OCaml compiler (3.12.0) due to its huge closure constructions: > it is about x100 slower than the traditional ocamllex+ocamlyacc. I hope more > aggressive in-lining optimizations in the compiler might speed up the > performance of Planck greatly. You can read some of my rough considerations > in this topic at: > > > http://camlspotter.blogspot.com/2011/05/planck-small-parser-combinator-library.html > > Enjoy, > > Jun > > --bcaec548a1cf87f7a104a4669aac Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable
The combinators i= n Planck are implemented simply as functional monadic=20 combinators over streams (lazy lists). Unfortunately, it is very slow=20 with the current OCaml compiler (3.12.0) due to its huge closure=20 constructions. [...] I hope more aggressive in-lining optimizations in the = compiler might speed up the performance of Planck greatly.
=

Have you considered writing a defunctionalized version of the pars= ing library ? There are two different kinds of closure construction in your= code:
- the "bind" interface requires to be passed a function, which is= generally a closure built on the fly
- the parsing monad itself is a fu= nction type (it is a state+error monad, so written in state-passing style)<= br>
There is not much you can do for the first closure source, if you want = to keep a monadic interface, but the second cause is inessential. You may d= efunctionalize the state monad by reifying it into an algebraic datatype. T= he monad computation would return a big data structure (instead of a big fu= nction), and you would then write an interpretation function passing the st= ate around, without any closure construction.

Moreover, the heavy use of "bind" (which needs a function/clo= sure as parameter) in your parsing code could be avoided. You may use more = parsing combinators (like <|>) and less binding operators. For exampl= e your current style is to write (v1 <-- p1; v2 <-- p2; return (p1, p= 2)), you may as well write something like ((v1,v2) <-- p1 <*> p2; = return (v1,v2)), with one less bind call (the idea is to "currify"= ; successive bindings with product-binding combinators). I think you should= promote the use of combinators over raw "binds".

Your blog article concludes that the OCaml compiler does not opti= mize enough function and closures, making the use of monads impractical. I = think this is only true for the closure-using implementations, which are no= t the only way to write such monads. The functional programming litterature= is full of tricks to turn functions into datastructure, and this is almost= always a win. In particular you may be able to implement domain-specific &= quot;optimizations" that are out of reach for general compilers.

I'm sorry I have nothing more concrete here; I think the transforma= tions to defunctionalize the core should not be too complicated, but unfort= unately I have failed to build planck in a reasonable time so I didn't = try it myself.

On Sat, May 28, 2011 at 4:11 AM, Jun Furuse = <jun.furuse@gm= ail.com> wrote:
Hi,

I've released a small monadic parser combinator library for = OCaml, Planck version 1.0.0, available at:

=A0=A0=A0 h= ttps://bitbucket.org/camlspotter/planck/get/v1.0.0.tar.gz

It is firstly just for my fun to learn what is Parsec/parser combinator= s, but it is now elaborated to something useful:

=A0=A0=A0 - input p= ositions by lines and columns
=A0=A0=A0 - char specialized stream for be= tter performance
=A0=A0=A0 - operator precedence/associativity resolver
=A0=A0=A0 - memoi= zation module for efficient backtracks

For example I could implement= OCaml syntax lexer and parser using Planck.

REQUIREMENTS: unfortuna= tely Planck depends on many things:

=A0 - ocaml 3.12.0 or higher
=A0 - findlib
=A0 - omake
=A0 - t= ype-conv 2.3.0 and sexplib 5.2.1 (available from http://ocaml.janestreet.com/?q= =3Dnode/13)
=A0 - spotlib (my small utility functions, available at http://bitbucket.org/c= amlspotter/spotlib/ )

=A0 The followings are required to compiler ocaml syntax parser example= :
=A0 - pa_monad_custom ( http://bitbucket.org/camlspotter/pa_monad= _custom/
=A0 - ocaml 3.12.0 source tree and lablgtk-2.14.2 source code tree for test= ing

The combinators in Planck are implemented simply as functional monadic = combinators over streams (lazy lists). Unfortunately, it is very slow with = the current OCaml compiler (3.12.0) due to its huge closure constructions: = it is about x100 slower than the traditional ocamllex+ocamlyacc. I hope mor= e aggressive in-lining optimizations in the compiler might speed up the per= formance of Planck greatly. You can read some of my rough considerations in= this topic at:

=A0=A0=A0 http://camlspotter.blo= gspot.com/2011/05/planck-small-parser-combinator-library.html

En= joy,

Jun


--bcaec548a1cf87f7a104a4669aac--