caml-list - the Caml user's mailing list
* ocamlyacc/lex reentrancy
@ 1999-12-24 21:31 skaller
  1999-12-30 18:30 ` Francois Rouaix
  0 siblings, 1 reply; 4+ messages in thread
From: skaller @ 1999-12-24 21:31 UTC (permalink / raw)
  To: caml-list

At present, ocamllex cannot be used reentrantly by the client,
since there is nowhere to put auxiliary data.

The obvious place to attach this data is the lexbuf;
(say by using get and set functions):
this would be transparent (allow existing lexers to work
without source code modification). Also, no changes to the
lexer would be required (only the lexbuf).

At present, ocamlyacc parsers accept a lexer and a lexbuf.
In principle, this is the wrong interface: the lexer and
parser should be decoupled; the parser should only
require a function which acts as a token source.

However, there is also no place for auxiliary data
to be stored while parsing. If the lexbuf is extended to
provide access to client data, the current interface would
support supplying that data to client code processing
non-terminals on reduction.

This suggests that each start symbol should generate two
parsers: one using the old interface, and one using a new
interface that is passed a function:

	'a -> token

The old interface can call the new interface, after fetching
the client data from the lexbuf.

These changes together would seem to allow all existing
ocamllex/ocamlyacc sources to continue to work unmodified,
while allowing new code for both ocamllex and ocamlyacc
to be written which is reentrant, and also allowing alternate
sources of tokens to be used with ocamlyacc.
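
As a rough illustration of the token-source idea, the sketch below uses a
hand-written stand-in for a parser entry point (`parse_sum`, hypothetical)
with the same shape as an ocamlyacc entry point,
`(Lexing.lexbuf -> token) -> Lexing.lexbuf -> result`. The `token` type and
the list-backed source are assumptions made for the example. Since the lexer
argument is just a function, a closure that ignores its lexbuf can act as the
token source; a real generated parser would still read position information
from the dummy lexbuf.

```ocaml
(* Hypothetical token type; a real one would come from %token declarations. *)
type token = INT of int | PLUS | EOF

(* Stand-in with the same shape as an ocamlyacc entry point.
   It adds up the INT tokens it sees until EOF. *)
let parse_sum (lexer : Lexing.lexbuf -> token) (lexbuf : Lexing.lexbuf) : int =
  let rec loop acc =
    match lexer lexbuf with
    | INT n -> loop (acc + n)
    | PLUS -> loop acc
    | EOF -> acc
  in
  loop 0

(* A token source that owns its own state and ignores the lexbuf.
   Each call to source_of_list yields an independent source. *)
let source_of_list (tokens : token list) : Lexing.lexbuf -> token =
  let remaining = ref tokens in
  fun _lexbuf ->
    match !remaining with
    | [] -> EOF
    | t :: rest -> remaining := rest; t

(* The lexbuf is only there to satisfy the old interface. *)
let dummy_lexbuf () = Lexing.from_string ""

let total =
  parse_sum (source_of_list [INT 1; PLUS; INT 2; PLUS; INT 39]) (dummy_lexbuf ())
```

Because the state lives in the closure rather than in a global, two such
sources can be active at the same time without interfering.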

One minor problem: the type 'a could not be inferred for the lexbuf
if client code never used it. Is there a good way to modify
ocamlyacc/lex, so existing code works, but which also
supports supplying auxiliary data, so that both
the lexer and parser can be reentered?

-- 
John Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
voice: 61-2-9660-0850


* Re: ocamlyacc/lex reentrancy
  1999-12-24 21:31 ocamlyacc/lex reentrancy skaller
@ 1999-12-30 18:30 ` Francois Rouaix
  1999-12-30 19:09   ` skaller
  0 siblings, 1 reply; 4+ messages in thread
From: Francois Rouaix @ 1999-12-30 18:30 UTC (permalink / raw)
  To: skaller; +Cc: caml-list

I don't use ocamlyacc,  but I do have lots of ocamllex lexers that need
to be reentrant. I don't remember if I've  posted that trick to the list
already, but this is what you can do:

In your opening section of the lexer:

{
type t = <something to store the data>

let create_data () = ....    (* unit -> t *)

}

and then for all rules, use

rule somerule = parse
 | somepattern { (fun lexdata -> action) }


And, when invoking a lexer function or entry point, you need to pass the
additional argument, as in
  somerule lexbuf lexdata
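
A minimal sketch of the same shape in plain OCaml, so it can be run without
ocamllex: `next_token` is a hand-written stand-in for a generated rule, with
`src` and `pos` playing the role of the lexbuf. The record fields, the `token`
type and the helper `count` are assumptions for the example. As in the trick
above, every branch returns a function that still expects the lexer data, so
no state is kept in globals.

```ocaml
(* Per-lexer state; the fields are assumptions for this example. *)
type t = { mutable lines : int; mutable words : int }

let create_data () = { lines = 0; words = 0 }

type token = WORD of string | NEWLINE | EOF

(* Stand-in for a rule whose actions are written { (fun lexdata -> ...) }.
   The result has type t -> token: the caller supplies the data last. *)
let next_token (src : string) (pos : int ref) : t -> token =
  let len = String.length src in
  (* Skip blanks between tokens. *)
  while !pos < len && src.[!pos] = ' ' do incr pos done;
  if !pos >= len then (fun _lexdata -> EOF)
  else if src.[!pos] = '\n' then begin
    incr pos;
    (fun lexdata -> lexdata.lines <- lexdata.lines + 1; NEWLINE)
  end else begin
    let start = !pos in
    while !pos < len && src.[!pos] <> ' ' && src.[!pos] <> '\n' do
      incr pos
    done;
    let w = String.sub src start (!pos - start) in
    (fun lexdata -> lexdata.words <- lexdata.words + 1; WORD w)
  end

(* Run one lexer to completion with its own private data. *)
let count (src : string) : t =
  let data = create_data () and pos = ref 0 in
  let rec loop () =
    match next_token src pos data with
    | EOF -> data
    | _ -> loop ()
  in
  loop ()
```

Each call to `count` creates fresh data, so nested or interleaved runs do not
share counters.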

--f

* Re: ocamlyacc/lex reentrancy
  1999-12-30 18:30 ` Francois Rouaix
@ 1999-12-30 19:09   ` skaller
  2000-01-03  9:46     ` Markus Mottl
  0 siblings, 1 reply; 4+ messages in thread
From: skaller @ 1999-12-30 19:09 UTC (permalink / raw)
  To: frouaix; +Cc: caml-list

Francois Rouaix wrote:
> 
> I don't use ocamlyacc,  but I do have lots of ocamllex lexers that need
> to be reentrant. I don't remember if I've  posted that trick to the list
> already, but this is what you can do:
> 
> In your opening section of the lexer:
> 
> {
> type t = <something to store the data>
> 
> let create_data () = ....    (* unit -> t *)
> 
> }
> 
> and then for all rules, use
> 
> rule somerule = parse
>  | somepattern { (fun lexdata -> action) }
> 
> And, when invoking a lexer function or entry point, you need to pass the
> additional argument, as in
>   somerule lexbuf lexdata

	Ahhhh! Thank you. I see how this works.

-- 
John Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
voice: 61-2-9660-0850


* Re: ocamlyacc/lex reentrancy
  1999-12-30 19:09   ` skaller
@ 2000-01-03  9:46     ` Markus Mottl
  0 siblings, 0 replies; 4+ messages in thread
From: Markus Mottl @ 2000-01-03  9:46 UTC (permalink / raw)
  To: skaller; +Cc: OCAML

> > And, when invoking a lexer function or entry point, you need to pass the
> > additional argument, as in
> >   somerule lexbuf lexdata

There is another way to get a similar behaviour with some syntactic sugar.
Use the following implementation of a "flow" (hm, just invented this name),
which is actually just a queue with some additional operators:

---------------------------------------------------------------------------
type 'a flow = Nil | Cons of 'a * 'a flow ref

type 'a t = { mutable head: 'a flow; mutable tail: 'a flow }

let create () = { head = Nil; tail = Nil }

let add fl x = match fl.tail with
  | Nil -> let c = Cons (x, ref Nil) in fl.head <- c; fl.tail <- c
  | Cons (_, last_ref) -> let c = Cons (x, ref Nil) in
                          last_ref := c; fl.tail <- c

let append fl1 fl2 = match fl1.tail, fl2.tail with
  | _, Nil -> ()  (* fl2 is empty: leave fl1 untouched so its tail stays valid *)
  | Nil, _ -> fl1.head <- fl2.head; fl1.tail <- fl2.tail
  | Cons (_, last_ref), _ -> last_ref := fl2.head; fl1.tail <- fl2.tail

let rec iter_aux f = function Nil -> () | Cons (x, t) -> f x; iter_aux f !t
let iter f q = iter_aux f q.head

let (%) fl el = add fl el; fl
let (!%) el = create () % el
let (%@) fl1 fl2 = append fl1 fl2; fl1
---------------------------------------------------------------------------
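
A small usage sketch of the three operators, assuming the definitions above
live in a module named `Flow`; they are repeated here so the block stands
alone, with `append` guarded against an empty second flow. The helper
`to_list` is an addition for the example only.

```ocaml
module Flow = struct
  type 'a flow = Nil | Cons of 'a * 'a flow ref
  type 'a t = { mutable head : 'a flow; mutable tail : 'a flow }

  let create () = { head = Nil; tail = Nil }

  (* Append one element at the end in constant time. *)
  let add fl x =
    let c = Cons (x, ref Nil) in
    (match fl.tail with
     | Nil -> fl.head <- c
     | Cons (_, last_ref) -> last_ref := c);
    fl.tail <- c

  (* Link fl2 onto the end of fl1; the cells are shared, not copied. *)
  let append fl1 fl2 =
    match fl1.tail, fl2.tail with
    | _, Nil -> ()
    | Nil, _ -> fl1.head <- fl2.head; fl1.tail <- fl2.tail
    | Cons (_, last_ref), _ -> last_ref := fl2.head; fl1.tail <- fl2.tail

  let rec iter_aux f = function Nil -> () | Cons (x, t) -> f x; iter_aux f !t
  let iter f q = iter_aux f q.head

  let ( % ) fl el = add fl el; fl
  let ( !% ) el = create () % el
  let ( %@ ) fl1 fl2 = append fl1 fl2; fl1
end

open Flow

(* Helper for the example: collect the elements in order. *)
let to_list q =
  let acc = ref [] in
  iter (fun x -> acc := x :: !acc) q;
  List.rev !acc

(* Parsed as ((!% 1) % 2) %@ ((!% 3) % 4): prefix !% binds tightest,
   and % and %@ share one left-associative precedence level. *)
let items = to_list (!% 1 % 2 %@ (!% 3 % 4))
```

Since `%@` shares cells between the two flows, the right-hand flow is best
treated as consumed once it has been appended.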

I use this in ocamlyacc files as follows:

---------------------------------------------------------------------------
%{ open Semantics.Impl
   open Flow
%}

... some tokens ...

%start main
%type <Semantics.Impl.cmd Flow.t> main

%%

actions_or_paths
  : action_or_path                          { $1 }
  | actions_or_paths useless action_or_path { $1 %@ $3 }

action_or_path : path { $1 } | action { !% $1 }

action
  : POP_PATH         { pop_dir }
  | EMPTY_PATH_STACK { clear_dir_stack }
  | AT ID            { dir_stack_at $2 }

path : op_rel_path { $1 % push_dir } | cl_rel_path { $1 % top_dir }

op_rel_path
  : cl_rel_path SLASH       { $1 }
  | op_el                   { !% (cd $1) }
  | cl_rel_path SLASH op_el { $1 % (cd $3) }

cl_rel_path
  : cl_el                    { !% (cd $1) }
  | set_el                   { !% (dir_set $1) }
  | cl_rel_path SLASH cl_el  { $1 % (cd $3) }
  | cl_rel_path SLASH set_el { $1 % (dir_set $3) }

---------------------------------------------------------------------------

This should lead to very compact and readable specifications. The result of
parsing is a "flow" (queue) of commands (functions) that transform state.
See below for an interface of Semantics.Impl:

---------------------------------------------------------------------------
...
  type state
  type cmd = state -> state
  
  val fresh : state
  val cd : Dir.Spec.el -> cmd
  val cd_root : cmd
  val cd_up : cmd
  val apply_flow : cmd Flow.t -> cmd
...
---------------------------------------------------------------------------

"apply_flow parse_result fresh" will apply all the state transformations of
the "flow" to a "fresh" state. Here is the implementation of this function:

  let apply_flow flow s =
    let s' = ref s in Flow.iter (fun f -> s' := f !s') flow; !s'
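
To make the idea concrete without the rest of the project, here is a toy
version under stated assumptions: the state is a plain directory stack, the
commands `push` and `pop` are hypothetical, and the standard `Queue` module
stands in for `Flow` (which is described above as essentially a queue). The
body of `apply_queue` follows the same pattern as the function above.

```ocaml
(* Toy state: a stack of directory names. The real state type is abstract. *)
type state = string list
type cmd = state -> state

(* Hypothetical commands for the example. *)
let push d : cmd = fun s -> d :: s
let pop : cmd = function [] -> [] | _ :: rest -> rest

(* Thread the state through every command, in queue order. *)
let apply_queue (q : cmd Queue.t) (s : state) : state =
  let s' = ref s in
  Queue.iter (fun f -> s' := f !s') q;
  !s'

let result =
  let q = Queue.create () in
  List.iter (fun c -> Queue.add c q) [push "usr"; push "lib"; pop; push "bin"];
  apply_queue q []
```

The parse result is thus a value that can be applied to any starting state,
and applied more than once, since nothing is executed during parsing itself.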

Maybe someone else will also find it useful.

Regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl



