From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/7234 Path: news.gmane.org!not-for-mail From: Rich Felker Newsgroups: gmane.linux.lib.musl.general Subject: Re: buffer overflow in regcomp and a way to find more of those Date: Fri, 20 Mar 2015 21:30:16 -0400 Message-ID: <20150321013016.GS23507@brightrain.aerifal.cx> References: <20150320235227.GE16260@port70.net> <20150321002616.GF16260@port70.net> <20150321004637.GQ23507@brightrain.aerifal.cx> <20150321010043.GR23507@brightrain.aerifal.cx> <20150321012341.GG16260@port70.net> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1426901452 9594 80.91.229.3 (21 Mar 2015 01:30:52 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Sat, 21 Mar 2015 01:30:52 +0000 (UTC) To: Konstantin Serebryany , musl@lists.openwall.com Original-X-From: musl-return-7247-gllmg-musl=m.gmane.org@lists.openwall.com Sat Mar 21 02:30:47 2015 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1YZ8FH-0002Ig-Ph for gllmg-musl@m.gmane.org; Sat, 21 Mar 2015 02:30:35 +0100 Original-Received: (qmail 24558 invoked by uid 550); 21 Mar 2015 01:30:29 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: Original-Received: (qmail 24539 invoked from network); 21 Mar 2015 01:30:29 -0000 Content-Disposition: inline In-Reply-To: <20150321012341.GG16260@port70.net> User-Agent: Mutt/1.5.21 (2010-09-15) Original-Sender: Rich Felker Xref: news.gmane.org gmane.linux.lib.musl.general:7234 Archived-At: On Sat, Mar 21, 2015 at 02:23:41AM +0100, Szabolcs Nagy wrote: > * Konstantin Serebryany [2015-03-20 18:10:18 -0700]: > > After your fix the fuzzer did not find anything else so far, but it > > suffers from slow performance on some cases. > > Not sure if this qualifies for a bug, but the following example takes > > ~2 seconds to run (runs instantly with glibc): > > i think the problem is stacked repetitions > tre doesnt handle them in a sane way > and uses huge amount of ram Sadly there doesn't seem to be any sane way to handle them... > for * it would be easy to solve, but > the general case is theoretically impossible to > solve: x{255}{255} will be a 255*255 state machine > > this is the only part in the musl regex > engine that's allowed to have super linear > space/time complexity > > (you might want to add some logic to avoid > such stacked repetitions to speed up the search) > > (btw the standard does not allow these, but if > the pattern is parenthesized around every repetition > then that's ok: (x*)* is a valid pattern, x** is not, > so there is not much point rejecting these patterns > the problem does not go away since grouping is allowed) > > > int main() { > > regex_t preg; > > const char *s = ".****\\Z$<\\0)_"; Isn't the \0 an invalid backreference? Could it be getting processed in a way that's causing the slowdown, but simply rejected by glibc? Rich