caml-list - the Caml user's mailing list
From: james woodyatt <jhw@wetware.com>
To: Ocaml Trade <caml-list@inria.fr>
Subject: Re: [Caml-list] ocamllex problem
Date: Thu, 4 Aug 2005 23:15:48 -0700
Message-ID: <A885EC6D-313A-4938-9590-F31D773B8D5D@wetware.com>
In-Reply-To: <1123203791.6720.63.camel@localhost.localdomain>

On 04 Aug 2005, at 18:03, skaller wrote:
>
> Alain Frisch pointed me at some nasty papers on this, one with a  
> regexp -> NFA conversion and the other with a NFA-> DFA conversion,  
> but I couldn't figure out how to do the direct regexp->DFA  
> conversion, I'd sure like to find an algorithm for that..

In my OCaml NAE Core Foundation, there is something you may find
interesting.  See the [Cf_lex] module and its subordinate [Cf_dfa].
Since it isn't trying to be a multi-stage programming tool like
[ocamllex], it produces a parser monad that executes a Lazy-DFA
instead of a fully space-time optimized DFA.  At some point, I may
implement a [study] function that fully evaluates the Lazy-DFA and
optimizes it, but I don't yet see a compelling need for that.
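
To make the Lazy-DFA idea concrete, here is a minimal sketch.  It is
not the code in [Cf_dfa], and it may not resemble how [Cf_dfa] works
internally.  As an assumption for illustration, it builds states
directly from the regexp using Brzozowski derivatives, which is one
known way to skip the explicit NFA.  All the names ([re], [deriv],
[compile], [matches]) are hypothetical.  Each state holds a table of
lazy transitions, so a state is only constructed the first time the
input actually reaches it.

```ocaml
(* Regular expressions over bytes.  [Rng (lo, hi)] matches one char in lo..hi. *)
type re =
  | Empty                 (* matches nothing *)
  | Eps                   (* matches only the empty string *)
  | Rng of char * char
  | Seq of re * re
  | Alt of re * re
  | Star of re

(* Smart constructors simplify as they build, so equal states are
   recognized more often.  This is not a full normalization. *)
let seq a b = match a, b with
  | Empty, _ | _, Empty -> Empty
  | Eps, r | r, Eps -> r
  | _ -> Seq (a, b)

let alt a b = match a, b with
  | Empty, r | r, Empty -> r
  | _ when a = b -> a
  | _ -> Alt (a, b)

(* Does the expression accept the empty string? *)
let rec nullable = function
  | Empty | Rng _ -> false
  | Eps | Star _ -> true
  | Seq (a, b) -> nullable a && nullable b
  | Alt (a, b) -> nullable a || nullable b

(* Brzozowski derivative: what remains to be matched after reading [c]. *)
let rec deriv c = function
  | Empty | Eps -> Empty
  | Rng (lo, hi) -> if lo <= c && c <= hi then Eps else Empty
  | Seq (a, b) ->
    let left = seq (deriv c a) b in
    if nullable a then alt left (deriv c b) else left
  | Alt (a, b) -> alt (deriv c a) (deriv c b)
  | Star a as r -> seq (deriv c a) r

(* A DFA state: an accept flag and 256 transitions, each computed on demand. *)
type state = { accept : bool; next : state Lazy.t array }

let compile (r : re) : state =
  (* Memo table so that the same residual expression maps to one state. *)
  let table : (re, state) Hashtbl.t = Hashtbl.create 16 in
  let rec state_of r =
    match Hashtbl.find_opt table r with
    | Some s -> s
    | None ->
      let s = {
        accept = nullable r;
        next = Array.init 256 (fun i -> lazy (state_of (deriv (Char.chr i) r)));
      } in
      Hashtbl.add table r s;
      s
  in
  state_of r

(* Run the automaton over the whole string, forcing transitions as needed. *)
let matches (start : state) (s : string) : bool =
  let st = ref start in
  String.iter (fun c -> st := Lazy.force (!st).next.(Char.code c)) s;
  (!st).accept

(* The pattern ':' (letter | ' ')* without the [as s] binding. *)
let letter = alt (Rng ('a', 'z')) (Rng ('A', 'Z'))
let tagged = compile (seq (Rng (':', ':')) (Star (alt letter (Rng (' ', ' ')))))
```

A [study] function in this setting would amount to forcing every
reachable transition up front and then minimizing the result; the
sketch does neither.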

One thing: the pattern [':'((letter|' ')* as s)] is interesting.   
You're definitely right that something non-trivial is happening  
inside the DFA.  My [Cf_dfa] module does not keep a stack of  
backtracking sequences because I did something else to resolve the  
problem.  Look at the ( $@ ) operators, which allow you to use a  
parser monad on the recognized input sequence to obtain the result of  
a lexical rule.  Using this, you can implement something like the  
feature you're interested in by defining a nested hierarchy of parsers.
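
The nesting idea can be sketched without the [Cf_lex] API at all.
What follows is an illustration only: the names [recognizer], [rule],
and [tagged] are hypothetical, and [rule] is just a stand-in for the
general shape of "recognize a lexeme, then run a second parser on it".
It makes no claim about the real signature of ( $@ ).

```ocaml
(* A recognizer reports the length of the prefix it accepts at [pos], if any. *)
type recognizer = string -> int -> int option

let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')

(* Exactly one occurrence of the character [c]. *)
let chr (c : char) : recognizer = fun s pos ->
  if pos < String.length s && s.[pos] = c then Some 1 else None

(* Longest run of characters satisfying [p]; an empty run also succeeds. *)
let many (p : char -> bool) : recognizer = fun s pos ->
  let n = String.length s in
  let rec go i = if i < n && p s.[i] then go (i + 1) else i in
  Some (go pos - pos)

(* Sequence two recognizers. *)
let ( >> ) (a : recognizer) (b : recognizer) : recognizer = fun s pos ->
  match a s pos with
  | None -> None
  | Some la ->
    (match b s (pos + la) with
     | None -> None
     | Some lb -> Some (la + lb))

(* Once [r] has recognized a lexeme, run [inner] on that lexeme alone.
   Returns the inner result and the position just past the lexeme. *)
let rule (r : recognizer) (inner : string -> 'a option)
  : string -> int -> ('a * int) option =
  fun s pos ->
    match r s pos with
    | None -> None
    | Some len ->
      (match inner (String.sub s pos len) with
       | None -> None
       | Some v -> Some (v, pos + len))

(* Outer pattern: ':' (letter | ' ')*.  The inner parser drops the
   leading ':' to recover the part that [as s] would have bound. *)
let tagged =
  rule (chr ':' >> many (fun c -> is_letter c || c = ' '))
    (fun lexeme -> Some (String.sub lexeme 1 (String.length lexeme - 1)))
```

The outer recognizer never needs to remember where the sub-pattern
started; the inner parser rediscovers that by parsing the lexeme
again, at the cost of scanning those characters twice.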

I know.  This is probably not what you're looking for.  To get what  
you're looking for, I'd have to extend [Cf_dfa] to handle marker  
nodes in the NFA.  I thought that would be more appropriate for  
[ocamllex] and similar tools, so I didn't do it.  Nice to see  
[ocamllex] did.


-- 
j h woodyatt <jhw@wetware.com>
that's my village calling... no doubt, they want their idiot back.


