From: Markus Mottl <mottl@miss.wu-wien.ac.at>
To: OCAML <caml-list@inria.fr>
Subject: features of PCRE-OCaml
Date: Wed, 6 Dec 2000 01:51:39 +0100
Message-ID: <20001206015139.D31140@miss.wu-wien.ac.at>
Hello,
it seems that many people haven't yet heard of PCRE-OCaml (the
OCaml-interface to the PCRE-library) and have asked for more information
on its advantages over the Str-library (or over Perl).
Here is a list of features, taken from the README:
* The PCRE-library by Philip Hazel has been under development for
quite some time now and is fairly advanced and stable. It implements
just about all of the convenient functionality of regular expressions
as one can find them in PERL. The higher-level functions written
in OCaml (split, replace) are likewise compatible with the corresponding
PERL-functions (to the extent that OCaml allows). Most people find
the syntax of PERL-style regular expressions more straightforward
than the Emacs-style one used in the "Str"-module.
* In contrast to PERL, the library creates DFAs (deterministic finite
automata) instead of NFAs (nondeterministic finite automata). DFAs
generally allow much faster pattern matching, because they never
need to backtrack. Especially patterns with many alternations can
see a great speedup.
* It is reentrant - and thus thread safe. This is not the case with
the "Str"-module of OCaml, which builds on the GNU "regex"-library.
Using reentrant libraries also means more convenience for
programmers: they do not have to reason about the states the
library might be in.
* The high-level functions for replacement and substitution, all of
them implemented in OCaml, are much faster than those of the
"Str"-module. In fact, when compiled to native code, they even seem
to be significantly faster than those of PERL (which are written in C).
Somebody reported to me that he had tested OCaml with PCRE-OCaml
against PERL and Python on several hundred megabytes of data that
had to be matched/manipulated. If his claims are to be trusted, the
OCaml-version (native code) was overall 15 times faster than Perl
and 45 times faster than Python, which is probably also due to the
high quality of the OCaml-compiler.
* You can rely on the data returned being unique. In other words:
if the result of a function is a string, you can safely update it
destructively without having to fear side effects.
* The interface to the library makes use of labels and default
arguments to give you a high degree of programming comfort.
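To give a rough idea of what the labelled interface looks like in practice, here is a minimal sketch. It assumes a recent release of the pcre package, where Pcre.regexp, Pcre.split, Pcre.replace and Pcre.pmatch take the labelled arguments ~rex, ~pat and ~templ; the exact names in older releases may differ. The sample strings and the names comma, digits, fields, masked and has_word are made up for illustration.

```ocaml
(* Sketch only: needs the pcre package, e.g.
   ocamlfind ocamlopt -package pcre -linkpkg demo.ml *)

(* Compile patterns once and reuse them. Perl-style syntax: \s and \d
   work as in Perl (backslashes doubled inside OCaml string literals). *)
let comma = Pcre.regexp ",\\s*"
let digits = Pcre.regexp "\\d+"

(* Split on a comma followed by optional whitespace. *)
let fields = Pcre.split ~rex:comma "a, b,c"

(* Replace every run of digits; ~templ is the substitution template. *)
let masked = Pcre.replace ~rex:digits ~templ:"N" "order 66 of 1138"

(* ~pat passes the pattern as a string, skipping the explicit
   compilation step; all other optional arguments keep their defaults. *)
let has_word = Pcre.pmatch ~pat:"\\bfoo\\b" "a foo b"

let () =
  List.iter print_endline fields;
  print_endline masked;
  Printf.printf "%b\n" has_word
```

Compiling a pattern once with Pcre.regexp and passing it via ~rex avoids recompiling it on every call, while ~pat is handy for one-off matches.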
I hope this answers most questions!
Best regards,
Markus Mottl
--
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl