From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <700d600f0704051358g294d540ey96a546f7f57a08e7@mail.gmail.com>
Date: Thu, 5 Apr 2007 23:58:00 +0300
From: "Janne Hellsten" <jjhellst@gmail.com>
To: caml-list@inria.fr
Subject: Re: [Caml-list] Matching start of input in lexer created with ocamllex
In-Reply-To: <1175802901.5274.14.camel@rosella.wigram>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
References: <700d600f0704050737r3ea45a16gb318ac7acf8e3178@mail.gmail.com>
	 <1175802901.5274.14.camel@rosella.wigram>

> let table = ["for", FOR; "while", WHILE]
> ..
> | space-not-newline + { WHITE }
> | newline { NEWLINE }
> | ident as id { try assoc id table with Not_found -> IDENT id }
>
> An alternative to the WHITE and NEWLINE tokens is a tail
> recursive call to the lexer:
>
> | space + { initial lexbuf }
>
> which just skips over the spaces.

I do have the latter construct in my lexer.  But how does the above
help me match the beginning of a line in this rule?

  | '!' [' ' '\t']* "for" { FOR (current_loc ()) }

I'd like it to look like this:

  | "^!for" { FOR }

where ^ would denote the start of input.  To simplify my question, we
can assume there are no newline characters in my input.  This regexp:

'!' [' ' '\t']* "for"

could actually be simplified to

| "!for"

and I would still need to match the "beginning of input" somehow.
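
To make concrete what I am after, here is a rough sketch of one possible
workaround: a separate entry point that is only tried before anything has
been consumed. The file name, the token type and the `first` and `token`
names are made up for illustration, and the `initial` rule is cut down to
the bare minimum. It relies on `Lexing.lexeme_end` still being 0 before
the first non-empty lexeme has been matched.

```ocaml
(* start_anchor.mll (hypothetical file name); run through ocamllex first. *)
{
  (* Hypothetical token type, for illustration only. *)
  type token = FOR | IDENT of string | EOF
}

(* Tried only while nothing has been consumed yet. *)
rule first = parse
  | "!for" { FOR }
  | ""     { initial lexbuf }  (* empty match: fall through to the normal rules *)

and initial = parse
  | [' ' '\t']+ { initial lexbuf }             (* skip blanks *)
  | ['a'-'z' 'A'-'Z' '_']+ as id { IDENT id }
  | eof { EOF }

{
  (* Entry point handed to the parser.  lexeme_end stays at 0 until a
     non-empty lexeme has been matched, i.e. while we are still at the
     start of input. *)
  let token lexbuf =
    if Lexing.lexeme_end lexbuf = 0 then first lexbuf else initial lexbuf
}
```

With this, "!for" is only recognised as FOR at offset 0; anywhere else
the '!' reaches `initial`, which in this cut-down sketch has no rule for
it and so fails with a lexing error.  It is not the "^" anchor I was
hoping for, though.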

I must've missed your point.

Janne