Date: Wed, 13 Feb 2002 15:23:06 +0100
From: Jerome Vouillon
To: Francois Rouaix
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Regexp libraries

On Sun, Feb 10, 2002 at 01:00:18AM -0800, Francois Rouaix wrote:
> I'm cleaning up some old code of mine that uses regexps quite heavily...
> Since those days when we only had Str, it seems that I now have the
> choice between at least Pcre and Libre. Would anyone care to give their
> opinion on those? I'd like to get rid of Str because of the threads
> issues (and bugs with large input). I've used Pcre a bit, but would like
> to go to an all-Ocaml solution because of Win2k porting requirements.
> However, the Libre documentation is nonexistent (the .mli files contain
> barely more than the types).

Here is a list of disadvantages and advantages of RE compared to Pcre.

Disadvantages
- Pattern compilation is more costly
- Not fully thread-safe (you cannot use the same pattern
  simultaneously in different threads)
- Slower than Pcre when compiled to bytecode
- Many missing features: back-references, look-ahead and
  look-behind assertions, ...

Advantages
- Choice between different matching semantics (first match,
  longest match, shortest match)
- Regular expressions can be combined using operators such as
  union or concatenation
- Much faster than Pcre once start-up time is amortized
- All regular expressions are executed at the same speed once
  start-up time is amortized, so you don't have to do any
  performance tweaking.

> Would somebody on the list, who has knowledge of the library
> (Jerome?), be willing to enhance the Libre docs a tiny bit?

I should really take the time to write some documentation, but I
don't have the time at the moment... Patrick Doane has contributed
a pretty large test suite (located in tests/test_*.ml). This test
suite may help you understand the library.

-- Jerome

-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs  FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-request@inria.fr  Archives: http://caml.inria.fr
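
The difference between the three matching semantics mentioned above (first match, longest match, shortest match), and the idea of building patterns with union and concatenation operators, can be illustrated with a small self-contained toy model. This is only a sketch: every name in it (`re`, `lit`, `ends`, `match_end`, `pat`) is hypothetical and is not RE's API, and the naive backtracking enumeration used here is not how RE works internally. It only shows what each semantics would select among the candidate matches.

```ocaml
(* Toy regular-expression combinators. Hypothetical names; not RE's API. *)
type re =
  | Eps                 (* matches the empty string *)
  | Chr of char         (* matches one literal character *)
  | Seq of re * re      (* concatenation *)
  | Alt of re * re      (* union; the left branch has priority *)

(* Build a concatenation of literal characters from a string. *)
let lit w =
  let rec go k =
    if k = String.length w then Eps else Seq (Chr w.[k], go (k + 1))
  in
  go 0

(* End positions of every match of [r] in [s] starting at index [i],
   listed in backtracking priority order (left alternative first). *)
let rec ends r s i =
  match r with
  | Eps -> [i]
  | Chr c -> if i < String.length s && s.[i] = c then [i + 1] else []
  | Seq (a, b) -> List.concat (List.map (fun j -> ends b s j) (ends a s i))
  | Alt (a, b) -> ends a s i @ ends b s i

type semantics = First | Longest | Shortest

(* Pick one match end among the candidates, according to the semantics. *)
let match_end sem r s =
  match ends r s 0 with
  | [] -> None
  | (e :: _) as es ->
    Some
      (match sem with
       | First -> e
       | Longest -> List.fold_left max e es
       | Shortest -> List.fold_left min e es)

(* The pattern ab|a|abc, built with the union and concatenation operators. *)
let pat = Alt (lit "ab", Alt (lit "a", lit "abc"))

let () =
  let show name sem =
    match match_end sem pat "abc" with
    | Some e -> Printf.printf "%s: matched %S\n" name (String.sub "abc" 0 e)
    | None -> Printf.printf "%s: no match\n" name
  in
  show "first" First; show "longest" Longest; show "shortest" Shortest
```

On the input "abc", the pattern ab|a|abc has three candidate matches. In this model, first-match semantics picks "ab" (the leftmost alternative that succeeds), longest-match picks "abc", and shortest-match picks "a".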