From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id TAA17591; Tue, 25 Sep 2001 19:19:47 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id TAA17538 for caml-list@pauillac.inria.fr; Tue, 25 Sep 2001 19:19:47 +0200 (MET DST) Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id TAA17400 for ; Tue, 25 Sep 2001 19:12:32 +0200 (MET DST) Received: from saul.cis.upenn.edu (SAUL.CIS.UPENN.EDU [158.130.12.4]) by nez-perce.inria.fr (8.11.1/8.10.0) with ESMTP id f8PHCUT10537 for ; Tue, 25 Sep 2001 19:12:31 +0200 (MET DST) Received: (from vouillon@localhost) by saul.cis.upenn.edu (8.10.1/8.10.1) id f8PHCTA24593; Tue, 25 Sep 2001 13:12:29 -0400 (EDT) Date: Tue, 25 Sep 2001 13:12:29 -0400 From: Jerome Vouillon To: caml-list@inria.fr Subject: [Caml-list] RE: a regular expression library Message-ID: <20010925131229.A22868@saul.cis.upenn.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 1.0i Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk Hello, I've started to write a regular expression library. It supports several styles of regular expressions: - Perl-style regular expressions; - Posix extended regular expressions; - Emacs-style regular expressions; - Shell-style file globbing It is also possible to build regular expressions by combining simpler regular expressions. The library is still under developpement, but already quite usable. The most notable missing features are back-references and look-ahead/look-behind assertions. I would greatly appreciate your comments about the library (and, in particular, about its API). Contributions and bug reports are also welcome. The library can be downloaded from http://sourceforge.net/projects/libre/ The library seems to be pretty fast when compiled to native code. Here are some timing results (Pentium III 500Mhz): * Scanning a 1Mb string containing only 'a's, except for the last character which is a 'b', searching for the pattern "aa?b" (repeated 100 times). - RE: 2.6s - PCRE: 68s * Regular expression example from http://www.bagley.org/~doug/shootout/ - RE: 0.43s - PCRE: 3.68s (The library is much slower when compiled to bytecode though, as it is entirely written in O'Caml. I plan to rewrite the critical sections of the code in C.) -- Jerome ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr