From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id pA37CYAf023253 for ; Thu, 3 Nov 2011 08:12:34 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApcCACA+sk7RVde2kGdsb2JhbABEmgyOTIEXCCIBAQEBCQkNGwQhgXIBAQEEDAYCExkBGxILAQMMBgULDQ0hIQEBEQEFAQoSBhMSEIdol0kKi1SCYIVKPYhwAgUKgz2FVwSIBowQijuCfj2Dfg X-IronPort-AV: E=Sophos;i="4.69,448,1315173600"; d="scan'208";a="116501919" Received: from mail-ey0-f182.google.com ([209.85.215.182]) by mail4-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 03 Nov 2011 08:12:30 +0100 Received: by eyd10 with SMTP id 10so1415873eyd.27 for ; Thu, 03 Nov 2011 00:12:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=yj6if0dq0MmjkwWdTpj44tZrmV+H2EabHXe5u7ZTUOo=; b=woFFEajmo8EyMXHgzj2gkrtIPoRA+/z6I/I7amNFz/lAFB3ZqMbBJixIKeZZ/VrHkY oJyygydhUEAN3q0c2HFD/TIEFwJWl6fdDlDkU4AGSX00kGX6dYDbPId8+3zhikoJ275H 9TdX7hemrvH30NeryhamPY3e2ShZUSRTM2TVA= MIME-Version: 1.0 Received: by 10.14.11.87 with SMTP id 63mr730641eew.9.1320304349620; Thu, 03 Nov 2011 00:12:29 -0700 (PDT) Received: by 10.14.95.197 with HTTP; Thu, 3 Nov 2011 00:12:29 -0700 (PDT) In-Reply-To: References: Date: Thu, 3 Nov 2011 16:12:29 +0900 Message-ID: From: Jun Furuse To: Gabriel Scherer Cc: caml-list Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by walapai.inria.fr id pA37CYAf023253 Subject: Re: [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4? Gabriel, Thanks for the info. But what I want cannot be achieved by the lex filter. I want to have pcre regexp literals in the same syntax as Perl i.e. /hello\sworld\\n/. Currently what we do in OCaml is Pcre.regexp "hello\\sworld\\\\n", where the backslash char must be escaped in a OCaml string literal. This is lousy for scripting in OCaml. To have the same or similar syntax as Perl, the lexer must be really modified. Currently I am using a modified CamlP4 where I can replace its lexer function, but it is an adhoc way, and I am seeking any healthier way without such a modification. Jun On Thu, Nov 3, 2011 at 7:52 AM, Gabriel Scherer wrote: >> I have tried to override whole the syntax as follows, but it seems >> that it changes nothing...: > > Camlp4 is designed around mutable state. That you Make module produces > a new grammar doesn't make it the current grammar used by the > preprocessor. What need to happen is that the current state of the > lexer/parser is *mutated* by your Make module (whose evaluation is > then delayed and controlled by Camlp4 itself thanks to registration > with Register). This is what EXTEND does at the grammar level. > > If it suits your need, you can define your modification as a filter > that will post-process the output of the original lexer. The Token > module expose a define_filter function to imperatively update the set > of such active stream transformers. > > This idea was suggested to me by Jérémie Dimino, and works well. It is > used for example in pa_comprehension to define "[?" and "?]" as new > OCaml keywords (asking for "["; "?" at the camlp4 grammar level would > allow spaces in between): >  https://github.com/ocaml-batteries-team/batteries-included/blob/master/src/syntax/pa_comprehension/pa_comprehension.ml > > Below is the relevant code: > >  module Make (Syntax : Sig.Camlp4Syntax) = struct >    open Sig; >    include Syntax; > > >    (* "[?" and "?]" are not recognized as delimiters by the Camlp4 >      lexer; This token parser will spot "["; "?" and "?"; "]" token >      and insert "[?" and "?]" instead. >     Thanks to Jérémie Dimino for the idea. *) >    value rec delim_filter older_filter stream = >      let rec filter = parser >      [ [: `(KEYWORD "[", loc); rest :] -> >          match rest with parser >          [ [: `(KEYWORD "?", _) :] -> [: `(KEYWORD "[?", loc); filter rest :] >          | [: :] -> [: `(KEYWORD "[", loc); filter rest :] ] >      | [: `(KEYWORD "?", loc); rest :] -> >          match rest with parser >          [ [: `(KEYWORD "]", loc) :] -> [: `(KEYWORD "?]", loc); filter rest :] >          | [: :] -> [: `(KEYWORD "?", loc); filter rest :] ] >      | [: `other; rest :] -> [: `other; filter rest :] ] in >      older_filter (filter stream); > >    value _ = Token.Filter.define_filter (Gram.get_filter ()) delim_filter; > >    (* REST OF THE CAMLP4 EXTENSION ... *) >  end; > > > On Wed, Nov 2, 2011 at 9:34 PM, Jun Furuse wrote: >> Hi, >> >> Is it possible for Camlp4 to implement an OCaml syntax extension (i.e. >> pa_*) which modifies the lexer of OCaml syntax? >> >> I have tried to override whole the syntax as follows, but it seems >> that it changes nothing...: >> >> ----------------------------------------------------------- >> open Camlp4 >> >> module Id : Sig.Id = struct >>  let name = "pa_extlex" >>  let version = "1.0" >> end >> >> module XLexer = Xlexer.Make(PreCast.Token)        (* XLexer >> reimplements OCaml lexer with some extra rules *) >> module XGram = PreCast.MakeGram(XLexer) >> >> module Make (Syntax : Sig.Camlp4Syntax) = struct >>  let _ = prerr_endline "Creating OCaml syntax with lexer extension" >>  module M1 = OCamlInitSyntax.Make(PreCast.Ast)(XGram)(PreCast.Quotation) >>  module M2 = Camlp4OCamlRevisedParser.Make(M1) >>  module M3 = Camlp4OCamlParser.Make(M2) >>  include M3 >> end >> >> let module M = Register.OCamlSyntaxExtension(Id)(Make) in () >> ----------------------------------- >> >> Jun >> >> -- >> Caml-list mailing list.  Subscription management and archives: >> https://sympa-roc.inria.fr/wws/info/caml-list >> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners >> Bug reports: http://caml.inria.fr/bin/caml-bugs >> >> >