From: Isaac <idunham@lavabit.com>
To: musl@lists.openwall.com
Subject: Re: regex libs (was Re: [musl] embedded newbies site.)
Date: Tue, 16 Jul 2013 22:38:11 -0700 [thread overview]
Message-ID: <20130717053811.GA1719@newbook> (raw)
In-Reply-To: <20130716153220.GP29800@brightrain.aerifal.cx>
On Tue, Jul 16, 2013 at 11:32:20AM -0400, Rich Felker wrote:
> On Tue, Jul 16, 2013 at 11:10:34AM -0400, LM wrote:
> > On Tue, Jul 16, 2013 at 10:00 AM, Rich Felker <dalias@aerifal.cx> wrote:
> >
> > > The whole concept of regular expressions is that they're regular,
> > > meaning they're matchable in O(n) time with O(1) space. PCRE (the
> > > implementation) uses backtracking for everything, giving it
> > > exponentially-bad performance (JIT cannot fix this), and PCRE (the
> > > language) has a lot of features that are fundamentally not regular and
> > > thus can't be implemented efficiently. Also, the behavior of some of
> > > the features (e.g. greedy vs non-greedy matching) were not designed
> > > intentionally but just arose out of the backtracking implementation,
> > > and thus don't make a lot of sense unless you think from the
> > > standpoint of such an implementation.
> > >
> >
> > Went back and rechecked the documentation (
> > http://www.pcre.org/readme.txt). You're both right, PCRE is offering
> > the Perl regular expressions
> > implementation even when one uses the pcreposix interface. Would have been
> > nice if they offered actual regular expressions handling if you only want
> > to use the POSIX compatible part of the interface.
The pcreposix stuff is only for those who want perl regexes with an existing
POSIX codebase.
> > So what are some good regex library solutions? I'm also wondering if there
>
> POSIX systems provide the regcomp and regexec api as part of libc. On
> musl, glibc, uclibc, and (afaik) all the bsd libcs, this gives you a
> good implementation. However musl's is the only one I know of with no
> conformance bugs, but the conformance bugs in the others are isolated
> to obscure subexpression match issues.
>
> > are some good cross-platform portable library solutions (or if PCRE is the
> > best pattern matching solution from a portability standpoint even if it's
> > not strictly regex compatible).
>
> TRE, on which musl's is based, is the closest to being good, in my
> opinion. It has optimal big-O but fairly suboptimal constant. However
> it's unmaintained and has at least two bugs that aren't fixed
> upstream, one of which may be exploitable if an untrusted party can
> supply arbitrary regular expressions, and the other which just causes
> some POSIX BRE expressions (the legacy style used by sed and non-e
> grep) to be misinterpreted.
Have these been mentioned to NetBSD? I ask because they use TRE for their libc
(the relicense happened at their request).
Thanks,
Isaac Dunham
next prev parent reply other threads:[~2013-07-17 5:38 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-07-16 15:10 LM
2013-07-16 15:32 ` Rich Felker
2013-07-17 5:38 ` Isaac [this message]
2013-07-16 15:41 ` Justin Cormack
2013-07-16 16:55 ` Szabolcs Nagy
2013-07-16 17:13 ` Strake
2013-07-16 17:14 ` Kurt H Maier
2013-07-16 17:38 ` Szabolcs Nagy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130717053811.GA1719@newbook \
--to=idunham@lavabit.com \
--cc=musl@lists.openwall.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).