caml-list - the Caml user's mailing list
* [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4?
From: Jun Furuse @ 2011-11-02 20:34 UTC
  To: caml-list

Hi,

Is it possible for Camlp4 to implement an OCaml syntax extension (i.e.
pa_*) which modifies the lexer of OCaml syntax?

I have tried to override the whole syntax as follows, but it seems
that it changes nothing:

-----------------------------------------------------------
open Camlp4

module Id : Sig.Id = struct
  let name = "pa_extlex"
  let version = "1.0"
end

(* XLexer reimplements OCaml lexer with some extra rules *)
module XLexer = Xlexer.Make(PreCast.Token)
module XGram = PreCast.MakeGram(XLexer)

module Make (Syntax : Sig.Camlp4Syntax) = struct
  let _ = prerr_endline "Creating OCaml syntax with lexer extension"
  module M1 = OCamlInitSyntax.Make(PreCast.Ast)(XGram)(PreCast.Quotation)
  module M2 = Camlp4OCamlRevisedParser.Make(M1)
  module M3 = Camlp4OCamlParser.Make(M2)
  include M3
end

let module M = Register.OCamlSyntaxExtension(Id)(Make) in ()
-----------------------------------

Jun


* Re: [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4?
From: Gabriel Scherer @ 2011-11-02 22:52 UTC
  To: Jun Furuse; +Cc: caml-list

> I have tried to override the whole syntax as follows, but it seems
> that it changes nothing:

Camlp4 is designed around mutable state. The fact that your Make module
produces a new grammar doesn't make it the current grammar used by the
preprocessor. What needs to happen is that the current state of the
lexer/parser is *mutated* by your Make module (whose evaluation is
then delayed and controlled by Camlp4 itself thanks to registration
with Register). This is what EXTEND does at the grammar level.

If it suits your needs, you can define your modification as a filter
that will post-process the output of the original lexer. The
Token.Filter module exposes a define_filter function to imperatively
update the set of such active stream transformers.

This idea was suggested to me by Jérémie Dimino, and works well. It is
used for example in pa_comprehension to define "[?" and "?]" as new
OCaml keywords (asking for "["; "?" at the camlp4 grammar level would
allow spaces in between):
  https://github.com/ocaml-batteries-team/batteries-included/blob/master/src/syntax/pa_comprehension/pa_comprehension.ml

Below is the relevant code:

  module Make (Syntax : Sig.Camlp4Syntax) = struct
    open Sig;
    include Syntax;


    (* "[?" and "?]" are not recognized as delimiters by the Camlp4
      lexer; This token parser will spot "["; "?" and "?"; "]" token
      and insert "[?" and "?]" instead.
     Thanks to Jérémie Dimino for the idea. *)
    value rec delim_filter older_filter stream =
      let rec filter = parser
      [ [: `(KEYWORD "[", loc); rest :] ->
          match rest with parser
          [ [: `(KEYWORD "?", _) :] -> [: `(KEYWORD "[?", loc); filter rest :]
          | [: :] -> [: `(KEYWORD "[", loc); filter rest :] ]
      | [: `(KEYWORD "?", loc); rest :] ->
          match rest with parser
          [ [: `(KEYWORD "]", loc) :] -> [: `(KEYWORD "?]", loc); filter rest :]
          | [: :] -> [: `(KEYWORD "?", loc); filter rest :] ]
      | [: `other; rest :] -> [: `other; filter rest :] ] in
      older_filter (filter stream);

    value _ = Token.Filter.define_filter (Gram.get_filter ()) delim_filter;

    (* REST OF THE CAMLP4 EXTENSION ... *)
  end;
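
For readers without Camlp4 at hand, the logic of this filter can be
sketched in plain OCaml. This is only an illustration under simplifying
assumptions, not the Camlp4 API: the token type, merge_delims,
current_filter and this define_filter are hypothetical stand-ins that
work on lists instead of streams and drop the locations.

```ocaml
(* Simplified stand-in for Camlp4 tokens; real tokens also carry a location. *)
type token =
  | KEYWORD of string
  | LIDENT of string

(* Merge adjacent "[" "?" into "[?" and "?" "]" into "?]";
   every other token passes through unchanged. *)
let rec merge_delims = function
  | KEYWORD "[" :: KEYWORD "?" :: rest -> KEYWORD "[?" :: merge_delims rest
  | KEYWORD "?" :: KEYWORD "]" :: rest -> KEYWORD "?]" :: merge_delims rest
  | tok :: rest -> tok :: merge_delims rest
  | [] -> []

(* The currently active filter, held in mutable state (assumed design). *)
let current_filter : (token list -> token list) ref = ref (fun tokens -> tokens)

(* Install a new filter built on top of the previously active one. *)
let define_filter make_filter =
  current_filter := make_filter !current_filter

(* The new filter runs first, then hands its output to the older filter. *)
let delim_filter older_filter tokens = older_filter (merge_delims tokens)

let () = define_filter delim_filter

let () =
  let input = [KEYWORD "["; KEYWORD "?"; LIDENT "x"; KEYWORD "?"; KEYWORD "]"] in
  assert (!current_filter input = [KEYWORD "[?"; LIDENT "x"; KEYWORD "?]"])
```

The point of the sketch is the composition step: registering a filter
mutates the active filter in place, which is why it takes effect where
building a fresh grammar does not.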



* Re: [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4?
From: Jun Furuse @ 2011-11-03  7:12 UTC
  To: Gabriel Scherer; +Cc: caml-list

Gabriel,

Thanks for the info. But what I want cannot be achieved by the lex filter.

I want to have pcre regexp literals in the same syntax as Perl, i.e.
/hello\sworld\\n/. Currently what we do in OCaml is Pcre.regexp
"hello\\sworld\\\\n", where the backslash char must be escaped in an
OCaml string literal. This is lousy for scripting in OCaml.
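
The escaping overhead can be made concrete with a small plain-OCaml
sketch. It uses no Pcre call and only compares the string contents. The
quoted string literal on the second binding is an assumption about the
reader's compiler: that syntax needs OCaml 4.02 or later, which
postdates this thread.

```ocaml
(* Regexp text as Perl would write it between slashes: hello\sworld\\n *)

(* Ordinary string literal: every backslash has to be doubled. *)
let escaped = "hello\\sworld\\\\n"

(* Quoted string literal (OCaml 4.02 or later): no escaping at all. *)
let quoted = {|hello\sworld\\n|}

let () =
  (* Both literals denote the same 15-character string. *)
  assert (escaped = quoted);
  assert (String.length escaped = 15);
  print_endline escaped
```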

To have the same or similar syntax as Perl, the lexer itself really
has to be modified. Currently I am using a modified CamlP4 where I can
replace its lexer function, but that is an ad hoc solution, and I am
looking for a cleaner way that does not require such a modification.

Jun


* Re: [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4?
From: Jérémie Dimino @ 2011-11-03  9:16 UTC
  To: Jun Furuse; +Cc: caml-list

Hi,

On Thu, Nov 03, 2011 at 04:12:29PM +0900, Jun Furuse wrote:
> I want to have pcre regexp literals in the same syntax as Perl, i.e.
> /hello\sworld\\n/. Currently what we do in OCaml is Pcre.regexp
> "hello\\sworld\\\\n", where the backslash char must be escaped in an
> OCaml string literal. This is lousy for scripting in OCaml.

Have you looked at camlp4 quotations? Basically, you can define a new
quotation named "foo", and in your code you can write:

  <:foo<...>>

The ... can be any string, except that it cannot contain >>.

Also you may be interested in the Mikmatch syntax extension:

  http://martin.jambon.free.fr/micmatch.html

Cheers,

-- 
Jérémie


* Re: [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4?
From: Nicolas Pouillard @ 2011-11-05 21:19 UTC
  To: Jun Furuse, caml-list

On Thu, Nov 3, 2011 at 10:16 AM, Jérémie Dimino <jeremie@dimino.org> wrote:
> Hi,
>
> On Thu, Nov 03, 2011 at 04:12:29PM +0900, Jun Furuse wrote:
>> I want to have pcre regexp literals in the same syntax as Perl, i.e.
>> /hello\sworld\\n/. Currently what we do in OCaml is Pcre.regexp
>> "hello\\sworld\\\\n", where the backslash char must be escaped in an
>> OCaml string literal. This is lousy for scripting in OCaml.

As said earlier, Camlp4's lexer is not extensible. One can change
the meaning of the token stream using token filters, but this
won't work in your case. The third option is to use quotations; this
is really the feature suited to this task. Of course the syntax
won't be as concise as /bla/...

Regarding OCaml lexing, you may be interested in camllexer [1], which
is not intended to be extensible but is very small and self-contained.
If you really want to hack your own lexical syntax, I suggest you fork
camllexer and change it for your purpose.

[1]: https://github.com/np/camllexer

Best regards,

-- 
Nicolas Pouillard
http://nicolaspouillard.fr



* Re: [Caml-list] Is it possible to extend OCaml lexer rules via Camlp4?
From: Jun Furuse @ 2011-11-06  0:58 UTC
  To: Nicolas Pouillard; +Cc: caml-list

Hi,

Unfortunately the conclusion seems to be that there is currently no way
to change the lexer from pa_*.cmo modules.

So I will stick to my patched p4 approach for now. With it I can use
the $/regexp\n/i and $`find . -iname hoo` syntax, while those using
vanilla p4 can still use <:m<regexp\n/i>> and <:qx<find . -iname
hoo>>:

  https://bitbucket.org/camlspotter/orakuda/src/50d736f39428/test

Thanks,
Jun
