From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by c5ff346549e7 (Postfix) with ESMTPS id 0C7065D5 for ; Thu, 25 Jul 2019 11:42:01 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.64,306,1559512800"; d="scan'208,217";a="393276992" Received: from sympa.inria.fr ([193.51.193.213]) by mail2-relais-roc.national.inria.fr with ESMTP; 25 Jul 2019 13:42:01 +0200 Received: by sympa.inria.fr (Postfix, from userid 20132) id 28F8A826CE; Thu, 25 Jul 2019 13:42:01 +0200 (CEST) Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id E3893826D8 for ; Thu, 25 Jul 2019 13:41:58 +0200 (CEST) Authentication-Results: mail2-smtp-roc.national.inria.fr; spf=None smtp.pra=gabriel.scherer@gmail.com; spf=Pass smtp.mailfrom=gabriel.scherer@gmail.com; spf=None smtp.helo=postmaster@mail-qt1-f176.google.com IronPort-PHdr: =?us-ascii?q?9a23=3Ac2TRJhXanRqeXTeqp34adBacIU/V8LGtZVwlr6E/?= =?us-ascii?q?grcLSJyIuqrYbBCFt8tkgFKBZ4jH8fUM07OQ7/m6HzxaqsrZ+Fk5M7V0Hycfjs?= =?us-ascii?q?sXmwFySOWkMmbcaMDQUiohAc5ZX0Vk9XzoeWJcGcL5ekGA6ibqtW1aFRrwLxd6?= =?us-ascii?q?KfroEYDOkcu3y/qy+5rOaAlUmTaxe7x/IAiooQnLtcQan4RuJ6ktxhDUvnZGZu?= =?us-ascii?q?NayH9yK1mOhRj8/MCw/JBi8yRUpf0s8tNLXLv5caolU7FWFSwqPG8p6sLlsxnD?= =?us-ascii?q?VhaP6WAHUmoKiBpIAhPK4w/8U5zsryb1rOt92C2dPc3rUbA5XCmp4ql3RBP0ji?= =?us-ascii?q?oMKiU0+3/LhMNukK1boQqhpx1hzI7SfIGVL+d1cqfEcd8HWWZNQsNdWipcCY2+?= =?us-ascii?q?coQPFfIMMulWr4b/p1UAoxiwCxSyCuzz0TJHnGP60Lcg3ug9DQ3L3gotFM8Ovn?= =?us-ascii?q?TOq9X1Mb8fX+Gvw6bT1zXDbu1Z2TPg44bVbh8hoe+DXahufsrL1EIiEAzFgU+L?= =?us-ascii?q?poz/PjOayOANv3KA7+V8VeKglXQnpB9rojW0yccsj5PGhoMRylze6Sp5x4M1KM?= =?us-ascii?q?S+RUVmb9CkF55QuDubN4twWs4iTGBouDo6yr0bopG3ZjQFyJMixxPZdveJcJCI?= =?us-ascii?q?7wr9WOqNJTp0nnFodbKlixqv8EWs1vfwWtS23VtLqCdOj8PCuWoX1xPJ78iKUv?= =?us-ascii?q?t98Vml2TaIzw3T7/tLIUEwlabCMp4h3qM8moMdsUjeHCL7mV/6jKCRdkUj9eio?= =?us-ascii?q?7/robq/6qZ+bMo94kgD+MqIwlcyjGek0LBQCUmyB9em/1LDv51D1TbRWgvEsj6?= =?us-ascii?q?XUspHXKdwepqGjAg9V1ogj6wy4DzejyNkYkmMII0lfeBKGkYfpP0vCIOvkAve/?= =?us-ascii?q?nVusiilkx+rdM73uB5XCNHnDkLP7cblh7E5czRI/zcpD6JJMFrEBPPXzV1ftu9?= =?us-ascii?q?PCCx85NxW4w+LmCNVmyoMTQnmPA6+cMKPKq1CE/OMvI++WZI8UojnxMfYl5+S9?= =?us-ascii?q?xUM+zH0adqStzJ0gU/GiGegud0eeanfok9FHCmoQuRYWUefjzlOYB219fXG3Co?= =?us-ascii?q?017Cs6BYbuNozDS5qgmvTV0y6xBJxbYiZdAVCBC3ryX4qBUvYILimVJ5kywXQ/?= =?us-ascii?q?SbG9Rtp5hlmVvwjgxu8id7KMo3FKhdfYzNFwotbru1Qq7zUtVpaS1miMSyd/mW?= =?us-ascii?q?ZaH2ZrjpA6mlR0zxK46YY9g/FcEoYNtfZAUwN/LIKFiuIjVI60VQXGcdOEDl2h?= =?us-ascii?q?R4f+WGBjfpcK29YLJn1FNZCnhxHH0TCtBuZMxbOODZ0wtKnb2iqoKg=3D=3D?= X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: =?us-ascii?q?A0BFAgAPlTldhrCgVdFmHgEGBwaBZ4MEU?= =?us-ascii?q?TMqhB2BHYJejRqCD4pZkCIJAQMBDCMMAQGEQAKCXBsHAQQ0EwEDAQEEAQECAQM?= =?us-ascii?q?DARMBAQEICwsIKYUlDII6KQGCZgEBAQECASMEGQEbHQEDAQsGAwILDSoCAiIBE?= =?us-ascii?q?QEFARwGE4MiAYFpAQMODw+RZpAOPIshfxYFAReCewWEQwoZJw1fA4E8AgEGEoE?= =?us-ascii?q?ii2CBVz+BEYJkLj6CSBkCAoIMgl6CWASqBG0HAoIcXQSFeIhsBYRGG5gMjFuIK?= =?us-ascii?q?ZAkDyGBRkuBLjMaI1AxgjuCQgwOCRSDOopVQDCOWwEB?= X-IPAS-Result: =?us-ascii?q?A0BFAgAPlTldhrCgVdFmHgEGBwaBZ4MEUTMqhB2BHYJejRq?= =?us-ascii?q?CD4pZkCIJAQMBDCMMAQGEQAKCXBsHAQQ0EwEDAQEEAQECAQMDARMBAQEICwsIK?= =?us-ascii?q?YUlDII6KQGCZgEBAQECASMEGQEbHQEDAQsGAwILDSoCAiIBEQEFARwGE4MiAYF?= =?us-ascii?q?pAQMODw+RZpAOPIshfxYFAReCewWEQwoZJw1fA4E8AgEGEoEii2CBVz+BEYJkL?= =?us-ascii?q?j6CSBkCAoIMgl6CWASqBG0HAoIcXQSFeIhsBYRGG5gMjFuIKZAkDyGBRkuBLjM?= =?us-ascii?q?aI1AxgjuCQgwOCRSDOopVQDCOWwEB?= X-IronPort-AV: E=Sophos;i="5.64,306,1559512800"; d="scan'208,217";a="393276949" X-MGA-submission: =?us-ascii?q?MDHvtSRzM8uWBG5Fw7tCLrKQur8QTXsLh93y9s?= =?us-ascii?q?1P7/pql5x19zNsqa7rinWQ15DAKWYwC7DApapINJzelViLQjw+A7ensE?= =?us-ascii?q?Xc75C4xO+xPKGoQSOv0jTYfDnCPVzxSNRsbEzF+mOYsmk0Cz3jriOgxe?= =?us-ascii?q?74qqkynBlc7XoUNwbPMuA7Yg=3D=3D?= Received: from mail-qt1-f176.google.com ([209.85.160.176]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/AES128-GCM-SHA256; 25 Jul 2019 13:41:51 +0200 Received: by mail-qt1-f176.google.com with SMTP id d17so48617743qtj.8 for ; Thu, 25 Jul 2019 04:41:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=ibRpsSlYIh/f6b1CIPZ7QphwiYkpZ9uutsSZvxTLfhM=; b=aqdUdoYNm2uxuB42X6gbaiMlIvdZsfn5Sk5nM712d3FJpxIilv8XtjQJ1ORanl6V4r DlZWMfc9R/f4SMWcNtM2XzvdZDB/EC5LQzNCoyt0MQcMBCB20sW9/1fcvzQSH75BSec4 ZpPe58aaBOdq9CLIfrogZbcxnn+DV5wbPeasmLucLJaBzlbfDit75DB/ieYJCSHh73sy ewcPMwBZjUTvdFol1Xu6WUwnXhnRr6bR54IXAoF+N/Osu4BSZhweuoFaCuo9s+m1hyXU mJK+ReuhmZFaFreMikWfNMgkfTEUU9ziQI8auo93tv9ke7d/7hOogzOD+3SFs+athL3J FLUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=ibRpsSlYIh/f6b1CIPZ7QphwiYkpZ9uutsSZvxTLfhM=; b=WtPiyLj4Y6sf6w9WqF3llQnupjgYNBs+whA/KnfiraMrayw0gIyow/zEF/OTaJbhan dxWFnQHj5JsiFEBGiQx4vR0ehRJ79SstbvCaiIEJY68eafzAtWrI/6LSfGth9OmHdUYq axnhVylQenLtSLUhdp833JvCL7KMDKhLHiG5kfmumQoATpXzSKQZOfqZAHTbAhSXsLnH FoUqs6NUMYHDJ7iXxssT9Pk9iZqsjeIMGpid2AtIXsvqND4XgPppXv2RBXwYhsxzPLmn 3T5DK7qOniKJn1Ki5K24kq5dLy46XT4Lor86qbypdZFM3hUPHdlWSksPQxmGowHMvS7N ZrZQ== X-Gm-Message-State: APjAAAVOxqeGJcR2SJKCSJeWUx36RqTzIx3pvna1YRDpZizjx+r0ufec PeXcvKlEs3v/nVhQy6U1iq5YszGUifrr0Qc9Xl0= X-Google-Smtp-Source: APXvYqxiacWnm6TLdVmJk2tEIGtqiJT3/NAObd3MHzlmWLfeES+hrOKPRhpTGzSENxpkc76lI4nnBzdNc1qExjwE2Fw= X-Received: by 2002:ac8:384c:: with SMTP id r12mr61351337qtb.153.1564054910191; Thu, 25 Jul 2019 04:41:50 -0700 (PDT) MIME-Version: 1.0 References: <6300BCD6-A188-46E1-BA13-D9ACBE80FD49@uca.fr> In-Reply-To: <6300BCD6-A188-46E1-BA13-D9ACBE80FD49@uca.fr> From: Gabriel Scherer Date: Thu, 25 Jul 2019 13:42:49 +0200 Message-ID: To: =?UTF-8?Q?Jocelyn_S=C3=A9rot?= Cc: caml users Content-Type: multipart/alternative; boundary="00000000000054ce43058e7fe91e" Subject: Re: [Caml-list] Camlp4-free implementation of stream parsers (was camlp4 & OCaml 4.08) Reply-To: Gabriel Scherer X-Loop: caml-list@inria.fr X-Sequence: 17706 Errors-to: caml-list-owner@inria.fr Precedence: list Precedence: bulk Sender: caml-list-request@inria.fr X-no-archive: yes List-Id: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --00000000000054ce43058e7fe91e Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi, The parser from https://github.com/jserot/lascar/blob/master/src/lib/fsm_expr.ml seems fairly trivial, have you considered just rewriting it to use the functions of the Stream primitives directly? For example, roughly (I have not tried to type-check or test the code), let rec aux =3D parser | [< 'Int n when n<0; t=3Daux >] -> [< 'Kwd "-"; 'Int (-n); t >] | [< 'h; t=3Daux >] -> [< 'h; t >] | [< >] -> [< >] in would become let aux s =3D let next =3D ref [] in Stream.from @@ fun _ -> match !next with | tok::toks -> next :=3D toks; Some tok | [] -> match Stream.next with | exception Stream.Failure -> None | Int n when n < 0 -> next :=3D [Int (-n)]; Some (Kwd "-") | tok -> Some tok and let rec p_exp0 =3D parser | [< 'Int n >] -> EConst n | [< 'Ident i >] -> EVar i | [< 'Kwd "("; e=3Dp_exp ; 'Kwd ")" >] -> e becomes let rec p_exp0 s =3D match Stream.next s with | Int n -> EConst n | Ident i -> EVar i | Kwd "(" -> let e =3D p_exp s in match Stream.peek s with | Some (Kwd ")") -> Stream.junk s; e | _ -> raise Stream.Failure This is not exactly exciting code to write, but it's not a lot of work either for such a simple grammar. On Thu, Jul 25, 2019 at 12:28 PM Jocelyn S=C3=A9rot = wrote: > HI Daniil, > > Thanks for the example. It clearly shows how to embed a Menhir-specified > parser into an existing program. > > I still think, however that using Menhir for parsing arithmetic > expressions is a bit overkill. > > I=E2=80=99m having a look at Angstrom (and all the other parser combinato= r libs > cited on the corresp. page). > It seems simpler. > > Jocelyn > > Le 24 juil. 2019 =C3=A0 17:31, Daniil Baturin a =C3= =A9crit : > > > Hi Jocelyn, > > > > I've completed the first version of my project, so now I can start > > looking into this again! > > > > There's a third option: parser combinators like angstrom. > > My experience with Menhir is very positive though. After initial > > struggle, I came to like its new incremental API and declarative error > > reporting. > > > > Here's my parser for an extended BNF: > > Menhir grammar: > > https://github.com/dmbaturin/bnfgen/blob/master/src/bnf_parser.mly > > Parser driver that feeds it tokens: > > https://github.com/dmbaturin/bnfgen/blob/master/src/parse_bnf.ml > > Error messages: > > https://github.com/dmbaturin/bnfgen/blob/master/src/bnf_parser.messages > > Error message module build: > > https://github.com/dmbaturin/bnfgen/blob/master/src/dune#L6-L8 > > > > On 7/24/19 10:10 PM, Jocelyn S=C3=A9rot wrote: > >> Hi Daniil (and everyone interested by the subject), > >> > >> Did you have a closer look at this ? > >> > >> I=E2=80=99m still hesitating between these three approaches for replac= ing the > implementation of the small arithm expression parser used in Lascar [1] : > >> > >> i. rewrite it using the basic fns provided by the Stream library (pro: > no additionnal dependency, cons: not so trivial..) > >> > >> ii. replace camlp4 by camlp5 (pro: straightforward, cons: long term > maintainability of camlp5 (?)) > >> > >> iii. rewrite it using ocamlex/menhir and embed it in the main code > (pro: =C2=AB standard =C2=BB soon; cons: a bit heavy) > >> > >> Jocelyn > >> > >> [1] https://github.com/jserot/lascar/blob/master/src/lib/fsm_expr.ml, > lines 70=E2=80=93112 > >> > >> Le 2 juil. 2019 =C3=A0 11:25, Daniil Baturin a = =C3=A9crit : > >> > >>> Hi Jocelyn, > >>> Camlp5 is still sort of maintained, but I don't think it's going to b= e > >>> developed beyond compatibility updates. > >>> For syntax extensions, everyone is switching to PPX. > >>> > >>> From a quick look, it seems like the only bit of camlp4 you use is > >>> stream expressions. > >>> This is one of the things PPX can't do (on purpose, since it doesn't > >>> allow _arbitrary_ extensions), > >>> but I don't think just using streams directly is going to make code > much > >>> longer. > >>> > >>> Or I missed some other camlp4 bits? > >>> > >>> I'm ready to work on a patch if you are open to it. > >>> > >>> On 7/2/19 1:44 PM, Jocelyn S=C3=A9rot wrote: > >>>> Le 29 juin 2019 =C3=A0 17:15, Daniil Baturin a = =C3=A9crit > : > >>>> > >>>>> Perhaps we should make some coordinated effort to help them. > >>>>> I've just sent a pull request to the ocamldot maintainer that enabl= es > >>>>> the graphviz files parsing and printing modules > >>>>> to build and work with 4.08. The GTK parts have their own issues. > >>>>> Next I'm going to look into LASCAR/RFSM (packages that interest me > first ;). > >>>>> > >>>> Hi Daniil, > >>>> > >>>> I=E2=80=99ve been been thinking of removing the dependency of Lascar= and RFSM > on camlp4 for a while. > >>>> Is switching to CamlP5 a good alternative ? > >>>> > >>>> Jocelyn > >>>> > >>>> > >>> > > > > > > > --00000000000054ce43058e7fe91e Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi,

The parser from https= ://github.com/jserot/lascar/blob/master/src/lib/fsm_expr.ml seems fairl= y trivial, have you considered just rewriting it to use the functions of th= e Stream primitives directly?

For example, roughly= (I have not tried to type-check or test the code),

=C2=A0 =C2=A0 let rec aux =3D parser
= =C2=A0 =C2=A0 | [< 'Int n when n<0; t=3Daux >] -> [< = 9;Kwd "-"; 'Int (-n); t >]
=C2=A0 =C2=A0 | [< 'h= ; t=3Daux >] -> [< 'h; t >]
=C2=A0 =C2=A0 | [< >] = -> [< >] in

would become

=C2=A0 =C2=A0 let aux s= =3D
=C2=A0 =C2=A0 =C2=A0 let next =3D ref [] in
=C2=A0 =C2=A0 =C2=A0= Stream.from @@ fun _ ->
=C2=A0 =C2=A0 =C2=A0 =C2=A0 match !next with=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 | tok::toks -> next :=3D toks; Some tok<= br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 | [] -> match Stream.next with
=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | exception Stream.Failure= -> None
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | In= t n when n < 0 ->
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 next :=3D [Int (-n)];
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 Some (Kwd "-")
=C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | tok -> Some tok
<= span class=3D"gmail-pl-k">=C2=A0
and

=C2=A0 =C2=A0 let rec p_exp0 =C2=A0=3D parser
= =C2=A0 =C2=A0 | [< 'Int n >] -> EConst n
=C2=A0 =C2=A0 | [&= lt; 'Ident i >] -> EVar i
=C2=A0 =C2=A0 | [< 'Kwd "= ;("; e=3Dp_exp ; 'Kwd ")" >] -> e =C2=A0 =C2=A0 = =C2=A0 =C2=A0

becomes

=C2=A0 = =C2=A0 let rec p_exp0 s =3D match Stream.next s with
=C2=A0 =C2=A0 | Int= n -> EConst n
=C2=A0 =C2=A0 | Ident i -> EVar i
=C2=A0 =C2=A0 = | Kwd "(" ->
=C2=A0 =C2=A0 =C2=A0 let e =3D p_exp s in
= =C2=A0 =C2=A0 =C2=A0 match Stream.peek s with
=C2=A0 =C2=A0 =C2=A0 | Som= e (Kwd ")") -> Stream.junk s; e
=C2=A0 =C2=A0 =C2=A0 | _ -&= gt; raise Stream.Failure

This is not exactly excit= ing code to write, but it's not a lot of work either for such a simple = grammar.

On Thu, Jul 25, 2019 at 12:28 PM Jocelyn S=C3=A9rot <<= a href=3D"mailto:jocelyn.serot@uca.fr">jocelyn.serot@uca.fr> wrote:<= br>
HI Daniil,

Thanks for the example. It clearly shows how to embed a Menhir-specified pa= rser into an existing program.

I still think, however that using Menhir for parsing arithmetic expressions= is a bit overkill.

I=E2=80=99m having a look at Angstrom (and all the other parser combinator = libs cited on the corresp. page).
It seems simpler.

Jocelyn

Le 24 juil. 2019 =C3=A0 17:31, Daniil Baturin <daniil@baturin.org> a =C3=A9crit :
> Hi Jocelyn,
>
> I've completed the first version of my project, so now I can start=
> looking into this again!
>
> There's a third option: parser combinators like angstrom.
> My experience with Menhir is very positive though. After initial
> struggle, I came to like its new incremental API and declarative error=
> reporting.
>
> Here's my parser for an extended BNF:
> Menhir grammar:
> https://github.com/dmbaturin/= bnfgen/blob/master/src/bnf_parser.mly
> Parser driver that feeds it tokens:
> https://github.com/dmbaturin/bn= fgen/blob/master/src/parse_bnf.ml
> Error messages:
> https://github.com/dmbat= urin/bnfgen/blob/master/src/bnf_parser.messages
> Error message module build:
> https://github.com/dmbaturin/bnfg= en/blob/master/src/dune#L6-L8
>
> On 7/24/19 10:10 PM, Jocelyn S=C3=A9rot wrote:
>> Hi Daniil (and everyone interested by the subject),
>>
>> Did you have a closer look at this ?
>>
>> I=E2=80=99m still hesitating between these three approaches for re= placing the implementation of the small arithm expression parser used in La= scar [1] :
>>
>> i. rewrite it using the basic fns provided by the Stream library (= pro: no additionnal dependency, cons: not so trivial..)
>>
>> ii. replace camlp4 by camlp5 (pro: straightforward, cons: long ter= m maintainability of camlp5 (?))
>>
>> iii. rewrite it using ocamlex/menhir and embed it in the main code= (pro: =C2=AB standard =C2=BB soon; cons: a bit heavy)
>>
>> Jocelyn
>>
>> [1] https://github.com/jser= ot/lascar/blob/master/src/lib/fsm_expr.ml, lines 70=E2=80=93112
>>
>> Le 2 juil. 2019 =C3=A0 11:25, Daniil Baturin <daniil@baturin.org> a =C3=A9c= rit :
>>
>>> Hi Jocelyn,
>>> Camlp5 is still sort of maintained, but I don't think it&#= 39;s going to be
>>> developed beyond compatibility updates.
>>> For syntax extensions, everyone is switching to PPX.
>>>
>>> From a quick look, it seems like the only bit of camlp4 you us= e is
>>> stream expressions.
>>> This is one of the things PPX can't do (on purpose, since = it doesn't
>>> allow _arbitrary_ extensions),
>>> but I don't think just using streams directly is going to = make code much
>>> longer.
>>>
>>> Or I missed some other camlp4 bits?
>>>
>>> I'm ready to work on a patch if you are open to it.
>>>
>>> On 7/2/19 1:44 PM, Jocelyn S=C3=A9rot wrote:
>>>> Le 29 juin 2019 =C3=A0 17:15, Daniil Baturin <daniil@baturin.org>= a =C3=A9crit :
>>>>
>>>>> Perhaps we should make some coordinated effort to help= them.
>>>>> I've just sent a pull request to the ocamldot main= tainer that enables
>>>>> the graphviz files parsing and printing modules
>>>>> to build and work with 4.08. The GTK parts have their = own issues.
>>>>> Next I'm going to look into LASCAR/RFSM (packages = that interest me first ;).
>>>>>
>>>> Hi Daniil,
>>>>
>>>> I=E2=80=99ve been been thinking of removing the dependency= of Lascar and RFSM on camlp4 for a while.
>>>> Is switching to CamlP5 a good alternative ?
>>>>
>>>> Jocelyn
>>>>
>>>>
>>>
>
>


--00000000000054ce43058e7fe91e--