From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id 89DEEBC48 for ; Sat, 2 Apr 2005 09:39:07 +0200 (CEST) Received: from kurims.kurims.kyoto-u.ac.jp (kurims.kurims.kyoto-u.ac.jp [130.54.16.1]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j327d5GQ032654 for ; Sat, 2 Apr 2005 09:39:06 +0200 Received: from localhost (suiren [130.54.16.25]) by kurims.kurims.kyoto-u.ac.jp (8.13.1/8.13.1) with ESMTP id j327d1hb017223; Sat, 2 Apr 2005 16:39:01 +0900 (JST) Date: Sat, 02 Apr 2005 16:38:51 +0900 (JST) Message-Id: <20050402.163851.88994843.garrigue@math.nagoya-u.ac.jp> To: effbiae@ivorykite.com Cc: caml-list@yquem.inria.fr Subject: Re: [Caml-list] some comments on ocaml{lex,yacc} from a novice's POV From: Jacques Garrigue In-Reply-To: <50130.202.164.198.46.1112418604.squirrel@www.ivorykite.com> References: <49464.202.164.198.46.1112355123.squirrel@www.ivorykite.com> <424DA923.7020106@tfb.com> <50130.202.164.198.46.1112418604.squirrel@www.ivorykite.com> X-Mailer: Mew version 4.2 on Emacs 21.3 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Miltered: at concorde with ID 424E4C19.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 ocaml:01 ocaml:01 experimented:01 parsing:01 uchicago:01 tokens:01 grammar:01 tokens:01 corresponds:01 parsers:01 parsers:01 parser:01 combinators:01 ocamllex:01 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.2 X-Spam-Level: From: "Jack Andrews" > this is a little long. i'm new to ocaml, but like most, have been > educated in FLs and experimented with and applied functional languages and > techniques. python has been the first language i turn to for a few years > now. > > i need to parse text as a sequence of records (with odd variations). i > have used ply (python lex-yacc) most recently for parsing and believe it > to be one of the more elegant mechanisms i've seen. > http://systems.cs.uchicago.edu/ply/ply.html > > elegant because there are no lex and yacc input files, but rather the > tokens and grammar rules are defined in python code -- succinctly! eg: > > # calclex.py > import lex > tokens = ( 'NUMBER', 'PLUS', 'MINUS', 'TIMES', 'DIVIDE', 'LPAREN', 'RPAREN',) > t_PLUS = r'\+' # in python, the r prefix to a string literal > t_MINUS = r'-' # means as-is. r'\' in python is "\\" in c > [snip] Interestingly, your example corresponds exactly to the one in the ocaml tutorial, where it is solved using stream parsers. Stream parsers are a bit more involved than just writing yacc rules, but they give you more control on how to combine rules (you can write parser combinators.) And they are completely integrated in the language using camlp4. Alternatively, you can also use ocamllex/ocamlyacc (a little bit more overhead, but really easy once you're set up), or use the Scanf module (if you don't need a real parser.) By the way, Caml 3.1 had complete yacc integration (15 years ago). I suppose this was considered too monolithic, but it was very nice to use in practice. Jacques Garrigue