From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: weis Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id OAA01661 for caml-redistribution; Fri, 8 Nov 1996 14:51:00 +0100 (MET) Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id OAA28856 for ; Fri, 8 Nov 1996 14:21:57 +0100 (MET) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.7.6/8.7.1) with ESMTP id OAA29872; Fri, 8 Nov 1996 14:21:55 +0100 (MET) Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id OAA28852; Fri, 8 Nov 1996 14:21:55 +0100 (MET) From: Pierre Weis Message-Id: <199611081321.OAA28852@pauillac.inria.fr> Subject: Re: another question on lexer function In-Reply-To: <199611071007.LAA25922@aenegada.inria.fr> from Olivier Pons at "Nov 7, 96 11:07:04 am" To: opons@aenegada.inria.fr (Olivier Pons) Date: Fri, 8 Nov 1996 14:21:55 +0100 (MET) Cc: caml-list@inria.fr MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: weis > is it forbidden to use the ";;" token in a make_lexer function ? You cannot use an arbitrary sequence of characters as a declared keyword: the sequences you write must be recognized as an identifier by the ``next_token'' function inside make_lexer, (or be a single non-alphanumeric character). The next_token function recognizes two kinds of identifiers: regular ones (roughly speaking sequences of alphanumeric characters), and special ones or symbols (sequences of non alphanumeric characters, such as ++). Once an identifier is recognized, it is compared with the list of declared keywords: if found in the list, a Kwd token is emitted, otherwise an Ident token is returned. Since the sequence ;; is recognized as two single characters by next_token (same treatment as for parens, brackets or commas). But the rule for single non-alphanumeric characters is that they must have been declared as keywords, or it is a lexical error. Since ; has not been declared as a keyword of the lexer, an error occurs. Now, if you want to deal with a token ``;;'', you may: 1) Declare ";" as a keyword, then interpret to successive `;' tokens as a ";;" in your grammar rule. 2) Adapt the genlex module to your specific needs. I strongly recommend this solution, in particular if you want to understand stream parsing, or if you will use the lexer in an intensive way. Best regards, Pierre Weis INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis