From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id 57D2A7EE79 for ; Sun, 15 May 2016 12:32:45 +0200 (CEST) IronPort-PHdr: 9a23:Ibh1lR8hcBW7VP9uRHKM819IXTAuvvDOBiVQ1KB80u4cTK2v8tzYMVDF4r011RmSDdSdsa4P0bGempujcFJDyK7JiGoFfp1IWk1NouQttCtkPvS4D1bmJuXhdS0wEZcKflZk+3amLRodQ56mNBXsq3G/pQQfBg/4fVIsYL+lS8iI04/tjKibwN76XUZhvHKFe7R8LRG7/036l/I9ps9cEJs30QbDuXBSeu5blitCLFOXmAvgtI/rpMYwuwwZgf8q9tZBXKPmZOx4COUAVHV1BVso/9XmvgXvSg6G531UEjlH00kAPw+Qwgt7UhbrsyCynO1gwmHOM9f7Qb0uWD/k5aB2UjfsgSQOPTc/tmfalpojorhcpUeHphd4x4fPKKaXOfZ3NonUZ5tOQ2tKWcJYTGpGAI6wZs0FBvApOetIrof84VAJqE3tVkGXGOrzx2oQ1TfN1qog3rFkTFjL Authentication-Results: mail3-smtp-sop.national.inria.fr; spf=None smtp.pra=nicolas.ojeda.bar@lexifi.com; spf=None smtp.mailfrom=nicolas.ojeda.bar@lexifi.com; spf=None smtp.helo=postmaster@mx20.yaziba.net Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of nicolas.ojeda.bar@lexifi.com) identity=pra; client-ip=85.233.204.164; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="nicolas.ojeda.bar@lexifi.com"; x-sender="nicolas.ojeda.bar@lexifi.com"; x-conformance=sidf_compatible Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of nicolas.ojeda.bar@lexifi.com) identity=mailfrom; client-ip=85.233.204.164; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="nicolas.ojeda.bar@lexifi.com"; x-sender="nicolas.ojeda.bar@lexifi.com"; x-conformance=sidf_compatible Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@mx20.yaziba.net) identity=helo; client-ip=85.233.204.164; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="nicolas.ojeda.bar@lexifi.com"; x-sender="postmaster@mx20.yaziba.net"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A0DWAgC0TzhXkKTM6VVehAx+BoJ2pXuLd4Z7IoVvAoEPBzwQAQEBAQEBAQERAQEBAQkLCQkhL4ItghUBAQEDAQwGEQRHCwULCQILDQ0dAgIiEgEFAQoSBhMIChCHcwMPCAQKjg+PQoExPjGLO4waA4QxAQEBAQEFAQEBAQEiinKHP4JZAQSGOwyMaYR3hX6CeIUogjeMYo4DEh6BDAIPKIJCgVdsiAYBAQE X-IPAS-Result: A0DWAgC0TzhXkKTM6VVehAx+BoJ2pXuLd4Z7IoVvAoEPBzwQAQEBAQEBAQERAQEBAQkLCQkhL4ItghUBAQEDAQwGEQRHCwULCQILDQ0dAgIiEgEFAQoSBhMIChCHcwMPCAQKjg+PQoExPjGLO4waA4QxAQEBAQEFAQEBAQEiinKHP4JZAQSGOwyMaYR3hX6CeIUogjeMYo4DEh6BDAIPKIJCgVdsiAYBAQE X-IronPort-AV: E=Sophos;i="5.24,622,1454972400"; d="scan'208,217";a="177864659" Received: from mx20.yaziba.net ([85.233.204.164]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/ADH-AES256-SHA; 15 May 2016 12:32:44 +0200 Received: from mta20.int.yaziba.net (mta20.int.yaziba.net [10.4.20.31]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx20.yaziba.net (mx10.yaziba.net) with ESMTPS id 942CB1A74F2 for ; Sun, 15 May 2016 12:32:43 +0200 (CEST) Received: from mta20.int.yaziba.net (localhost [127.0.0.1]) by mta20.int.yaziba.net (Postfix) with ESMTPS id 8B40BCA657 for ; Sun, 15 May 2016 12:32:43 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by mta20.int.yaziba.net (Postfix) with ESMTP id 7A622CA795 for ; Sun, 15 May 2016 12:32:43 +0200 (CEST) X-Virus-Scanned: amavisd-new at mta20.int.yaziba.net Received: from mta20.int.yaziba.net ([127.0.0.1]) by localhost (mta20.int.yaziba.net [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id 65GHestaZfUZ for ; Sun, 15 May 2016 12:32:43 +0200 (CEST) Received: from mail-vk0-f53.google.com (mail-vk0-f53.google.com [209.85.213.53]) by mta20.int.yaziba.net (Postfix) with ESMTPSA id 0961FCA657 for ; Sun, 15 May 2016 12:32:43 +0200 (CEST) Received: by mail-vk0-f53.google.com with SMTP id c189so31045969vkb.1 for ; Sun, 15 May 2016 03:32:42 -0700 (PDT) X-Gm-Message-State: AOPr4FUimgElHoTymp2pJnI37ypGIzg8YgJ7lIOqi2o9kibIbgkfivfkEZtXXHfmBAlRsrs/UfDg8pnbzmQMpA== X-Received: by 10.31.92.129 with SMTP id q123mr12181129vkb.92.1463308361830; Sun, 15 May 2016 03:32:41 -0700 (PDT) MIME-Version: 1.0 Received: by 10.159.41.103 with HTTP; Sun, 15 May 2016 03:32:22 -0700 (PDT) In-Reply-To: <8f56349e-0e8d-4e3f-ca8b-3ff0733f88d7@lakaban.net> References: <8f56349e-0e8d-4e3f-ca8b-3ff0733f88d7@lakaban.net> From: Nicolas Ojeda Bar Date: Sun, 15 May 2016 12:32:22 +0200 X-Gmail-Original-Message-ID: Message-ID: To: =?UTF-8?B?RnLDqWTDqXJpYyBCb3Vy?= Cc: caml-list@inria.fr Content-Type: multipart/alternative; boundary=001a114e566a1aa14a0532df07c8 X-DRWEB-SCAN: ok X-VRSPAM-SCORE: -51 X-VRSPAM-STATE: legit X-VRSPAM-CAUSE: gggruggvucftvghtrhhoucdtuddrfeekledrvdeggdeftdcutefuodetggdotefrucfrrhhofhhilhgvmecuggftfghnshhusghstghrihgsvgdpjgetkgfkueetnecuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenogfuuhhsphgvtghtffhomhgrihhnucdlgeelmdenucfjughrpegjfhfhfffkuffvtgesrgdtreertddtjeenucfhrhhomheppfhitgholhgrshcuqfhjvggurgcuuegrrhcuoehnihgtohhlrghsrdhojhgvuggrrdgsrghrsehlvgigihhfihdrtghomheqnecuffhomhgrihhnpehinhhrihgrrdhfrhdphigrhhhoohdrtghomhenucfkphepvddtledrkeehrddvudefrdehfeenucfrrghrrghmpehmohguvgepshhmthhpohhuth X-VRSPAM-EXTCAUSE: mhhouggvpehsmhhtphhouhht Subject: Re: [Caml-list] ocamllex and polymorphic recursion --001a114e566a1aa14a0532df07c8 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Thanks Fr=C3=A9d=C3=A9ric! Works nicely. - Nicolas On Sat, May 14, 2016 at 11:29 PM, Fr=C3=A9d=C3=A9ric Bour wrote: > On 05/14/2016 09:50 PM, Nicolas Ojeda Bar wrote: > > Hi list, >> >> I am trying to write a parser using ocamllex for a language for which the >> usual character -> token -> ast does not make sense and instead one needs >> to produce the ast directly from the character level. >> >> Even though this is not its intended use case, it actually works quite >> well! Except that, while I can define higher-order rules, such as >> >> rule sp item =3D parse >> | ' ' { item lexbuf } >> >> I cannot actually use such a rule for 'item's which return different >> types. As far as I can see this is because the set of all rules is >> translated into a set of recursively defined functions and using 'sp' wi= th >> 'items' of different types would require the function corresponding to '= sp' >> to have some explicit polymorphism annotation. >> >> Is there a way to twist ocamllex a little more to make this work ? Or if >> not, would it make sense to expose a way for the user to provide these >> annotations herself ? >> > > Here is how I would do it: > > 1) Polymorphic recursion: > > - define a record type with one field for each rule, making types explici= t: > > type rules =3D { sp : 'a. (Lexing.lexbuf -> 'a) -> Lexing.lexbuf -> 'a; > ... } > > - make each rule take an argument of this type, and refer to rules by > projecting from the record: > > rule foo rules =3D parse > | 'int' { rules.sp rules.int_item lexbuf } > | 'string' { rules.sp rules.string_item lexbuf } > > - tie the knot outside of the lexer: > > let rec rules =3D { sp =3D (fun lexbuf -> sp rules lexbuf); ... } > > When invoking the lexer, pass the rules argument too. > > 2) Variation with open recursion: > > If you assume each rule take the "rules" argument, you can use type > recursion and avoid some annoyances when tying the knot. > You will have to pass the "rules" argument manually each time you call a > rule. > > type rules =3D { sp : 'a. (rules -> Lexing.lexbuf -> 'a) -> rules -> > Lexing.lexbuf -> 'a; ... } > > rule foo rules =3D parse > | 'int' { rules.sp rules.int_item rules lexbuf } > | 'string' { rules.sp rules.string_item rules lexbuf } > > 3) Perhaps there is no recursion in "sp" definition. > You can simply define 'sp' rule in a different lexer (and then call > "FirstLexer.sp"). > > I am not 100% sure this 3) will work, but I think there is no shared state > in the lexbuf after a rule succeed. Calling a different lexer should be > fine. > > Thanks! >> >> - Nicolas >> > > > -- > Caml-list mailing list. Subscription management and archives: > https://sympa.inria.fr/sympa/arc/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs > --001a114e566a1aa14a0532df07c8 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Thanks Fr=C3=A9d=C3=A9ric! Works nicely.

- Nicolas


On Sat, May 14, 2016 at 11:29 PM, Fr=C3=A9d=C3=A9r= ic Bour <frederic.bour@lakaban.net> wrote:
On 05/14/2016 09:50 PM, Nicolas O= jeda Bar wrote:

Hi list,

I am trying to write a parser using ocamllex for a language for which the u= sual character -> token -> ast does not make sense and instead one ne= eds to produce the ast directly from the character level.

Even though this is not its intended use case, it actually works quite well= ! Except that, while I can define higher-order rules, such as

=C2=A0 rule sp item =3D parse
=C2=A0 | ' ' { item lexbuf }

I cannot actually use such a rule for 'item's which return differen= t types.=C2=A0 As far as I can see this is because the set of all rules is = translated into a set of recursively defined functions and using 'sp= 9; with 'items' of different types would require the function corre= sponding to 'sp' to have some explicit polymorphism annotation.

Is there a way to twist ocamllex a little more to make this work ?=C2=A0 Or= if not, would it make sense to expose a way for the user to provide these = annotations herself ?

Here is how I would do it:

1) Polymorphic recursion:

- define a record type with one field for each rule, making types explicit:=

=C2=A0 type rules =3D { sp : 'a. (Lexing.lexbuf -> 'a) -> Lex= ing.lexbuf -> 'a; ...=C2=A0 }

- make each rule take an argument of this type, and refer to rules by proje= cting from the record:

=C2=A0 rule foo rules =3D parse
=C2=A0 =C2=A0 | 'int' { rules.sp rules.int_item lexbuf }
=C2=A0 =C2=A0 | 'string' { rules.sp rules.string_item lexbuf }

- tie the knot outside of the lexer:

let rec rules =3D { sp =3D (fun lexbuf -> sp rules lexbuf); ... }

When invoking the lexer, pass the rules argument too.

2) Variation with open recursion:

If you assume each rule take the "rules" argument, you can use ty= pe recursion and avoid some annoyances when tying the knot.
You will have to pass the "rules" argument manually each time you= call a rule.

=C2=A0 type rules =3D { sp : 'a. (rules -> Lexing.lexbuf -> '= a) -> rules -> Lexing.lexbuf -> 'a; ...=C2=A0 }

=C2=A0 rule foo rules =3D parse
=C2=A0 =C2=A0 | 'int' { rules.sp rules.int_item rules lexbuf }
=C2=A0 =C2=A0 | 'string' { rules.sp rules.string_item rules lexbuf = }

3) Perhaps there is no recursion in "sp" definition.
You can simply define 'sp' rule in a different lexer (and then call= "FirstLexer.sp").

I am not 100% sure this 3) will work, but I think there is no shared state = in the lexbuf after a rule succeed. Calling a different lexer should be fin= e.

Thanks!

- Nicolas


--
Caml-list mailing list.=C2=A0 Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocam= l_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

--001a114e566a1aa14a0532df07c8--