caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: "Daniel Bünzli" <daniel.buenzli@erratique.ch>
To: Nicolas Ojeda Bar <no263@dpmms.cam.ac.uk>
Cc: Gerd Stolpmann <info@gerd-stolpmann.de>,
	Francois.Pottier@inria.fr,  caml-list@inria.fr
Subject: Re: [Caml-list] New release of Menhir (20141215)
Date: Thu, 18 Dec 2014 16:20:18 +0100	[thread overview]
Message-ID: <C5EC132E1EA94F8E9D38BE580D9990B8@erratique.ch> (raw)
In-Reply-To: <CAPunWhD=DgnPRXJo60ppx_sGbGeVbzYSXDCqffk2dMKJ8K=Vdw@mail.gmail.com>

Le jeudi, 18 décembre 2014 à 15:19, Nicolas Ojeda Bar a écrit :
> When programming monadically you are reifying the continuation at
> every `bind` point so you get the incremental bit for free (I think
> this goes by the fancy name of `iteratees` nowadays). On the other
> hand it suffers from poor time/space performance common to this type
> of parser. Also, while combinator parsers can handle arbitrary
> backtracking (and you end up paying for this), IMAP itself requires
> very little, so it would seem that the full flexibility of combinators
> are not needed.

For a long time I played with combinator parsers but I never got to the point of being satisfied with the result. I also played a little bit with iteratees but couldn't get the performance I wanted with the full model in all its compositional glory.   

Nowadays I simply write my streaming lexers/parsers manually in CPS and you can drive them in non-blocking mode with a single, fixed size, buffer. See here [1] for an interface and an implementation on a toy example.  

If you get the lowest continuation decoding bits correctly (e.g. don't bind on each byte) you can actually get good time and space performance. Once you have handled that (mainly the painful bits about read overlapping two input buffers) you get very nice parsing flexibility, e.g. for things like text encoding discovery where you need to patch the continuation with the appropriate character decoder. If you care for your users you can also get very good error reporting and error recovery capabilities by applying knowledge specific to the decoded protocol. That's the way Uutf [2], Jsonm [3] (on top of Uutf) and Dicomm [4] are programmed.  

Best,

Daniel

[1] https://github.com/dbuenzli/nbcodec/blob/master/RATIONALE.md
[2] http://erratique.ch/software/uutf
[3] http://erratique.ch/software/jsonm
[4] http://erratique.ch/software/dicomm
  



  reply	other threads:[~2014-12-18 15:20 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-17 20:14 Francois Pottier
2014-12-18 12:45 ` Gerd Stolpmann
2014-12-18 14:19   ` Nicolas Ojeda Bar
2014-12-18 15:20     ` Daniel Bünzli [this message]
2014-12-18 15:34       ` Simon Cruanes
2014-12-18 16:02         ` Nicolas Ojeda Bar
2014-12-18 15:25     ` Gerd Stolpmann
2014-12-18 17:25       ` Francois Pottier
2014-12-22 11:13     ` oleg
2014-12-22 18:40 ` Dario Teixeira
2014-12-24 23:30   ` Francois Pottier
2014-12-26 11:13     ` Dario Teixeira
2014-12-26 11:31       ` Frédéric Bour
2014-12-26 12:16         ` Dario Teixeira

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=C5EC132E1EA94F8E9D38BE580D9990B8@erratique.ch \
    --to=daniel.buenzli@erratique.ch \
    --cc=Francois.Pottier@inria.fr \
    --cc=caml-list@inria.fr \
    --cc=info@gerd-stolpmann.de \
    --cc=no263@dpmms.cam.ac.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).