From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: * X-Spam-Status: No, score=1.0 required=5.0 tests=AWL,SPF_NEUTRAL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from discorde.inria.fr (discorde.inria.fr [192.93.2.38]) by yquem.inria.fr (Postfix) with ESMTP id 97CE0BC69 for ; Sun, 19 Aug 2007 12:13:30 +0200 (CEST) Received: from ausone.inria.fr (peray.inria.fr [128.93.8.98]) by discorde.inria.fr (8.13.6/8.13.6) with SMTP id l7JADSHe011837; Sun, 19 Aug 2007 12:13:29 +0200 Received: by ausone.inria.fr (sSMTP sendmail emulation); Sun, _d Aug 2007 12:13:05 +0200 From: "Nicolas Pouillard" Cc: "O'Caml Mailing List" Subject: Re: [Caml-list] lexer library To: Jon Harrop References: <1187462392.23296.5.camel@rosella.wigram> <200708182334.44751.jon@ffconsultancy.com> In-Reply-To: <200708182334.44751.jon@ffconsultancy.com> Date: Sun, 19 Aug 2007 12:13:05 +0200 Message-Id: <1187518168-sup-4112@ert.local> User-Agent: Sup/0.1 Content-Disposition: inline Content-Type: text/plain; charset=utf-8 X-j-chkmail-Score: MSGID : 46C817C8.001 on discorde : j-chkmail score : X : 0/20 1 0.000 -> 1 X-Miltered: at discorde with ID 46C817C8.001 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; lexer:01 lexer:01 parser:01 combinators:01 camlp:01 wiki:01 camlp:01 ocamllex:01 token:01 token:01 wrote:01 ideally:01 dynamically:01 dynamically:01 extensible:01 Excerpts from Jon Harrop's message of Sun Aug 19 00:34:44 +0200 2007: > On Saturday 18 August 2007 19:39:52 skaller wrote: > > I am interested in building a dynamically extensible lexer.. can > > anyone recommend any libraries or tools that would help? > > > > I could use Dypgen (extensible GLR parser) for this, but it is > > probably too slow. > > > > A system using combinators (rather than strings) would be preferred, > > and Unicode support would be nice too. Ideally the lexer automaton > > could be saved with Marshal, in which case the time to build the > > automaton isn't important. > > The new camlp4 looks like it supports this but I'm damned if I can figure out > how you use it. There is a beautiful tutorial on the Wiki that fizzles out > just when it gets exciting. > Not really, with the new camlp4 you can change the lexer and the token type (thanks to the system module). One can also dynamically add lexer filters that transforms the token stream. However the provided default lexer is not extensible (it's an ocamllex one). -- Nicolas Pouillard aka Ertai