Message-ID: <47D161C1.2050401@exalead.com>
Date: Fri, 07 Mar 2008 16:39:45 +0100
From: Berke Durak
To: Till Varoquaux
Cc: OCaml
Subject: Re: [Caml-list] Ocamllex newline counting...
In-Reply-To: <9d3ec8300803070157y6bb173eew865e2f232cb4c789@mail.gmail.com>

Till Varoquaux wrote:
> The title is pretty self-explanatory: Ocamllex is able to keep track
> of positions automatically, but it needs help with newlines (you need
> to register newlines with a function like:
>
> let newline lexbuf =
>   let pos = lexbuf.lex_curr_p in
>   lexbuf.lex_curr_p <-
>     { pos with pos_lnum = pos.pos_lnum + 1; pos_bol = pos.pos_cnum }
> ).
> This tends to pollute the code and requires you to add additional rules
> and underlying machinery. I can see one easy workaround: pipe the
> function you build your lexer from through an additional function that
> registers newlines. This seems a bit costly at run time but should be
> just fine in most cases.
>
> Is there any fundamental reason I am missing why newlines are not
> handled natively in the generated automaton?

Well, as you said, wrapping the refill function you pass to
Lexing.from_function so that it counts newlines (or does whatever else
you want) is "just fine" in most cases, especially given that you'll
often want to build a byte-offset-to-line-number table while parsing
anyway. Putting specialized logic in the lexing engine doesn't seem
very elegant...

My wild guess would be that the Lexing.position record got its extra
line-number fields (pos_lnum and pos_bol) because they were needed in
the compiler, and that they didn't want to change the interface by
adding a type parameter for the positions. Am I right?

--
Berke DURAK
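
The wrapping approach discussed above could look roughly like the sketch below. It assumes the current Lexing.from_function signature, where the refill function has type bytes -> int -> int (in 2008 the buffer was a string). The names with_line_table, refill_of_string, line_of_offset and lexbuf_of_string are hypothetical helpers made up for this illustration, not part of the standard library. Because the lexer engine reads ahead, the wrapper does not touch lex_curr_p; it only records where each line starts, and line numbers are looked up afterwards by byte offset.

```ocaml
(* Wrap a refill function so that every newline passing through it is
   recorded. The second component of the result holds the byte offsets
   at which lines begin, most recent first; offset 0 starts line 1. *)
let with_line_table (refill : bytes -> int -> int) =
  let offset = ref 0 in          (* bytes delivered to the lexer so far *)
  let line_starts = ref [0] in
  let wrapped buf n =
    let got = refill buf n in
    for i = 0 to got - 1 do
      if Bytes.get buf i = '\n' then
        line_starts := (!offset + i + 1) :: !line_starts
    done;
    offset := !offset + got;
    got
  in
  (wrapped, line_starts)

(* Hypothetical refill function that serves the contents of a string. *)
let refill_of_string (s : string) : bytes -> int -> int =
  let pos = ref 0 in
  fun buf n ->
    let len = min n (String.length s - !pos) in
    Bytes.blit_string s !pos buf 0 len;
    pos := !pos + len;
    len

(* 1-based line number of a byte offset: the number of recorded line
   starts that are not after it. Linear in the number of lines, which
   is enough for a sketch. *)
let line_of_offset line_starts (pos : int) : int =
  List.length (List.filter (fun start -> start <= pos) !line_starts)

(* Build a lexbuf whose input is tracked by the line table. *)
let lexbuf_of_string (s : string) =
  let wrapped, line_starts = with_line_table (refill_of_string s) in
  (Lexing.from_function wrapped, line_starts)
```

In a semantic action one would then call something like line_of_offset line_starts (Lexing.lexeme_start lexbuf), with no newline rule in the lexer itself. For large inputs, a sorted array with binary search would be a better fit than the list used here.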