From: Daniel Bünzli
To: Nicolas Ojeda Bar
Cc: Gerd Stolpmann, Francois.Pottier@inria.fr, caml-list@inria.fr
Date: Thu, 18 Dec 2014 16:20:18 +0100
Subject: Re: [Caml-list] New release of Menhir (20141215)

On Thursday, 18 December 2014 at 15:19, Nicolas Ojeda Bar wrote:
> When programming monadically you are reifying the continuation at
> every `bind` point so you get the incremental bit for free (I think
> this goes by the fancy name of `iteratees` nowadays). On the other
> hand it suffers from poor time/space performance common to this type
> of parser. Also, while combinator parsers can handle arbitrary
> backtracking (and you end up paying for this), IMAP itself requires
> very little, so it would seem that the full flexibility of combinators
> are not needed.

For a long time I played with combinator parsers but I never got to the
point of being satisfied with the result. I also played a little bit
with iteratees but couldn't get the performance I wanted with the full
model in all its compositional glory.

Nowadays I simply write my streaming lexers/parsers manually in CPS, and
you can drive them in non-blocking mode with a single, fixed-size
buffer. See [1] for an interface and an implementation on a toy example.

If you get the lowest-level continuation decoding bits right (e.g. don't
bind on each byte) you can actually get good time and space performance.
Once you have handled that (mainly the painful bits about reads that
overlap two input buffers) you get very nice parsing flexibility, e.g.
for things like text encoding discovery where you need to patch the
continuation with the appropriate character decoder. If you care about
your users you can also get very good error reporting and error
recovery capabilities by applying knowledge specific to the decoded
protocol. That's the way Uutf [2], Jsonm [3] (on top of Uutf) and
Dicomm [4] are programmed.

Best,

Daniel

[1] https://github.com/dbuenzli/nbcodec/blob/master/RATIONALE.md
[2] http://erratique.ch/software/uutf
[3] http://erratique.ch/software/jsonm
[4] http://erratique.ch/software/dicomm
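
For illustration, the CPS style described above might be sketched as
follows. This is a minimal toy under stated assumptions, not code from
nbcodec, Uutf, Jsonm or Dicomm: every name in it (`event`, `r_line`,
`src`, `decode_all`) is hypothetical, the "protocol" is just
newline-terminated lines, and an empty chunk is assumed to signal end of
input. The decoder stores its continuation in `k` whenever it runs out
of input, scans a whole chunk at a time rather than binding on each
byte, and carries partial lines across chunk boundaries.

```ocaml
(* Toy non-blocking line decoder in CPS. All names are hypothetical. *)

type event = Await | Line of string | End

type decoder = {
  mutable i : Bytes.t;            (* current input chunk *)
  mutable i_pos : int;            (* next byte to read in [i] *)
  mutable i_max : int;            (* last valid byte in [i] *)
  mutable eoi : bool;             (* true once end of input was signalled *)
  acc : Buffer.t;                 (* partial line, survives chunk boundaries *)
  mutable k : decoder -> event;   (* what to do on the next [decode] *)
}

(* Read up to the next '\n', then call [k terminated d]. When the chunk is
   exhausted, save the continuation and ask the client for more input. *)
let rec r_line k d =
  if d.i_pos > d.i_max then
    if d.eoi then k false d
    else (d.k <- r_line k; Await)
  else begin
    (* Scan the rest of the chunk in one go instead of one bind per byte.
       A match beyond [i_max] is stale data from an earlier refill. *)
    let stop =
      match Bytes.index_from_opt d.i d.i_pos '\n' with
      | Some j when j <= d.i_max -> j
      | _ -> d.i_max + 1
    in
    Buffer.add_subbytes d.acc d.i d.i_pos (stop - d.i_pos);
    if stop <= d.i_max
    then (d.i_pos <- stop + 1; k true d)   (* newline found *)
    else (d.i_pos <- stop; r_line k d)     (* line continues in next chunk *)
  end

let rec next d = r_line emit d
and emit terminated d =
  let l = Buffer.contents d.acc in
  Buffer.clear d.acc;
  if not terminated && l = "" then (d.k <- (fun _ -> End); End)
  else (d.k <- next; Line l)

let decoder () =
  { i = Bytes.empty; i_pos = 0; i_max = -1; eoi = false;
    acc = Buffer.create 64; k = next }

(* Provide [len] bytes of [b] starting at [pos]; [len = 0] means end of input. *)
let src d b pos len =
  if len = 0 then d.eoi <- true
  else (d.i <- b; d.i_pos <- pos; d.i_max <- pos + len - 1)

let decode d = d.k d

(* Driver: feeds [input] through a single buffer of [buf_size] bytes. *)
let decode_all ~buf_size input =
  let buf = Bytes.create buf_size in
  let d = decoder () in
  let pos = ref 0 in
  let rec loop acc =
    match decode d with
    | Line l -> loop (l :: acc)
    | End -> List.rev acc
    | Await ->
        let len = min buf_size (String.length input - !pos) in
        Bytes.blit_string input !pos buf 0 len;
        pos := !pos + len;
        src d buf 0 len;
        loop acc
  in
  loop []
```

In a real non-blocking setting the `Await` branch would return control
to the event loop instead of refilling synchronously; the decoded lines
should not depend on the chosen `buf_size`.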