From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: weis
Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id LAA20913 for caml-redistribution; Thu, 30 Dec 1999 11:40:03 +0100 (MET)
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id WAA27462 for <caml-list@pauillac.inria.fr>; Fri, 24 Dec 1999 22:29:31 +0100 (MET)
Received: from ruby (krusty117.zip.com.au [61.8.16.117])
	by nez-perce.inria.fr (8.8.7/8.8.7) with ESMTP id WAA23563
	for <caml-list@inria.fr>; Fri, 24 Dec 1999 22:29:23 +0100 (MET)
Received: from maxtal.com.au (IDENT:root@localhost [127.0.0.1])
	by ruby (8.9.3/8.9.3) with ESMTP id IAA02633
	for <caml-list@inria.fr>; Sat, 25 Dec 1999 08:31:13 +1100
Sender: weis
Message-ID: <3863E61F.E27029EF@maxtal.com.au>
Date: Sat, 25 Dec 1999 08:31:11 +1100
From: skaller <skaller@maxtal.com.au>
Organization: Maxtal
X-Mailer: Mozilla 4.51 [en] (X11; I; Linux 2.2.12 i586)
X-Accept-Language: en
MIME-Version: 1.0
To: caml-list@inria.fr
Subject: ocamlyacc/lex reentrancy
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

At present, ocamllex cannot be used reentrantly by the client,
since there is nowhere to put auxilliary data.

The obvious place to attach this data is the lexbuf;
(say by using get and set functions):
this would be transparent (allow existing lexers to work
without source code modification). Also, no changes to the
lexer would be required (only the lexbuf).

At present, ocamlyacc parsers accept a lexer and a lexbuf.
In principle, this is the wrong interface: the lexer and
parser should be decoupled; the parser should only
require a function which acts as a token source.

However, there is also no place for auxilliary data 
to be stored while parsing. If the lexbuf is extended to
provide access to client data, the current interface would
support supplying that data to client code processing
non-terminals on reduction.

This suggests that each start symbol should generate two
parsers: one using the old interface, and a new interface
passing a function:

	'a -> token

The old interface can call the new interface, after fetching
the client data from the lexbuf.

These changes together would seem to allow all existing
ocamllex/ocamlyacc sources to continue to work unmodified,
while allowing new codes for both ocamllex and ocamlyacc
to be written which are reentrant, and also allow alternate
sources of tokens to be used with ocamlyacc.

One minor problem: the type 'a could not be infered for the lexbuf,
if client code never used it? Is there a good way to modify
ocamlyacc/lex, so existing code works, but which also
supports supplying auxilliary data, so that both
the lexer and parser can be reentered?

-- 
John Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia
homepage: http://www.maxtal.com.au/~skaller
voice: 61-2-9660-0850