Subject: Re: [Caml-list] Thread safe Str
From: skaller
Reply-To: skaller@users.sourceforge.net
To: Martin Jambon
Cc: Xavier Leroy, Christophe TROESTLER, "O'Caml Mailing List"
Message-Id: <1105415669.3534.55.camel@pelican.wigram>
Date: 11 Jan 2005 14:54:30 +1100

On Tue, 2005-01-11 at 07:25, Martin Jambon wrote:

> However, my concerns are more about how to integrate regular expressions
> in the language so that they can be checked at compile-time just like the
> rest of the program. My Camlp4 syntax extension
> (http://martin.jambon.free.fr/micmatch.html) works just fine for this
> purpose and I am not asking for any change in the core OCaml syntax ;-)

But Str is a hack. Like all NFA-based solutions, it's unreliable
because it is unsound: possible infinite recursion, indeterminate
capture results, and exponential performance. In addition,
regular expressions have poor scalability and fail to provide
simple i18n support.

I think Felix does this the right way: core language support
for regular definitions, linear classification and tokenisation,
and no captures. If you want captures, use the proper tool, namely
a parser, which is also supported directly in the language
(this is more problematic due to the number of parsers around;
Felix currently provides the GLR parser Elkhound).

So I would love to see integration of regexp support in OCaml,
but I'm very much against Str. If some technology is to
be integrated, please use the right technology and
integrate ocamllex. See the ulex package for a model.

The problem with micmatch is precisely that it does use Str.
BTW: it probably doesn't work either, due to the bug in Str
I mentioned in an earlier post.

-- 
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Check out the Felix programming language http://felix.sf.net