mailing list of musl libc
From: Rich Felker <dalias@aerifal.cx>
To: musl@lists.openwall.com
Subject: Re: REG_STARTEND (regex)
Date: Tue, 15 Jan 2013 08:42:44 -0500
Message-ID: <20130115134244.GW20323@brightrain.aerifal.cx>
In-Reply-To: <CAPLrYER6=Guv4EUxs9rtQov1x=aYm4jVbiYSDvoqi05zBp_2hg@mail.gmail.com>

On Tue, Jan 15, 2013 at 11:34:59AM +0100, Daniel Cegiełka wrote:
> Hi,
> Is there a chance that musl will support REG_STARTEND? It is used
> quite often in *BSD.
> 
> http://www.sourceware.org/ml/libc-alpha/2004-03/msg00038.html

Probably not, at least not in the immediate future. The original TRE
code actually worked with strings as base+length internally rather
than as null-terminated strings, which meant a lot of things were a
lot more expensive than they should be; if I remember correctly, even
searches for text guaranteed to be found near the beginning of the
string required strlen for the whole string, i.e. the whole operation
was needlessly O(n). In one of the cleanup rounds, I changed it to use
null termination, which simplified a lot of the tests; many checks
collapsed away since \0 was automatically not in the set being checked
against and thus no second check was required.
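
To illustrate the kind of check that collapses, here is a minimal
sketch. It is not the TRE or musl code; in_set, span_len and span_nul
are hypothetical names, and the digit set is only an example of a set
that does not contain \0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical set membership test: ASCII digits. '\0' is not in the set. */
static int in_set(unsigned char c)
{
	return c >= '0' && c <= '9';
}

/* base+length style: each step needs a bounds check and a set check. */
static size_t span_len(const char *s, size_t len)
{
	assert(s != NULL);
	size_t i = 0;
	while (i < len && in_set((unsigned char)s[i]))
		i++;
	return i;
}

/* null-terminated style: the terminator fails the set test on its own,
 * so the set check is the only check and no length is needed up front. */
static size_t span_nul(const char *s)
{
	assert(s != NULL);
	size_t i = 0;
	while (in_set((unsigned char)s[i]))
		i++;
	return i;
}
```

Both functions return the same count for a null-terminated input, but
only span_len needs the caller to know the length first, which is
where a strlen over the whole string would come in.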

If/when we overhaul regex again, I'll certainly consider this request
and see if the design can be made such that it's not expensive. But I
don't see any easy way to do it right now short of making a temp copy
of the string. That _would_ be possible; \0 could be replaced with
\xff, and \xff replaced with \xfe, and special logic added to allow
\xff (which is otherwise an invalid byte and never matchable) while
still rejecting \xfe and other invalid bytes. This would require no
changes to the internals, but it would have the property of requiring
an O(n) malloc/memcpy, which is certainly not very appealing.
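
A rough sketch of the temp-copy step described above, under stated
assumptions: startend_copy is a hypothetical helper that does not
exist in musl, and so/eo stand for the rm_so/rm_eo offsets a
REG_STARTEND caller would pass in pmatch[0]. It covers only the copy
and byte remapping, not the special logic in the matcher.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of musl: copy bytes [so, eo) of buf into
 * a fresh null-terminated buffer, remapping 0x00 -> 0xff and 0xff -> 0xfe
 * so that no embedded terminator remains. Returns NULL on allocation
 * failure; the caller frees the result. */
static char *startend_copy(const char *buf, size_t so, size_t eo)
{
	assert(buf != NULL && so <= eo);
	size_t n = eo - so;
	unsigned char *tmp = malloc(n + 1);
	if (!tmp)
		return NULL;
	memcpy(tmp, buf + so, n);	/* the O(n) copy */
	for (size_t i = 0; i < n; i++) {
		if (tmp[i] == 0x00)
			tmp[i] = 0xff;	/* embedded NUL: to be allowed by the matcher */
		else if (tmp[i] == 0xff)
			tmp[i] = 0xfe;	/* original 0xff: still an invalid byte */
	}
	tmp[n] = 0;
	return (char *)tmp;
}
```

The matcher would then run on the copy as an ordinary null-terminated
string. Any match offsets it reports are relative to the copy, so a
wrapper would presumably need to add so back before returning them.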

Rich

