From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/10577
Path: news.gmane.org!.POSTED!not-for-mail
From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general,gmane.comp.version-control.git
Subject: Re: Re: Regression: git no longer works with musl libc's regex impl
Date: Wed, 5 Oct 2016 09:15:59 -0400
Message-ID: <20161005131559.GG19318@brightrain.aerifal.cx>
References: <20161004150848.GA7949@brightrain.aerifal.cx>
 <20161004152722.ex2nox43oj5ak4yi@sigill.intra.peff.net>
 <20161004154045.GT19318@brightrain.aerifal.cx>
 <20161004173926.GA19318@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: blaine.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: blaine.gmane.org 1475673403 1745 195.159.176.226 (5 Oct 2016 13:16:43 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Wed, 5 Oct 2016 13:16:43 +0000 (UTC)
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: Jeff King, git@vger.kernel.org, musl@lists.openwall.com
To: Johannes Schindelin
Original-X-From: musl-return-10589-gllmg-musl=m.gmane.org@lists.openwall.com Wed Oct 05 15:16:39 2016
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
 by blaine.gmane.org with smtp (Exim 4.84_2) (envelope-from )
 id 1brm3d-00071z-Jt for gllmg-musl@m.gmane.org; Wed, 05 Oct 2016 15:16:25 +0200
Original-Received: (qmail 22043 invoked by uid 550); 5 Oct 2016 13:16:25 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
Original-Received: (qmail 22018 invoked from network); 5 Oct 2016 13:16:24 -0000
Content-Disposition: inline
Original-Sender: Rich Felker
Xref: news.gmane.org gmane.linux.lib.musl.general:10577 gmane.comp.version-control.git:306193

On Wed, Oct 05, 2016 at 01:17:49PM +0200, Johannes Schindelin wrote:
> Hi Rich,
>
> On Tue, 4 Oct 2016, Rich Felker wrote:
>
> > On Tue, Oct 04, 2016 at 06:08:33PM +0200, Johannes Schindelin wrote:
> > >
> > > And lastly, the best alternative would be to teach musl about
> > > REG_STARTEND, as it is rather useful a feature.
> >
> > Maybe, but it seems fundamentally costly to support -- it's extra
> > state in the inner loops that imposes costly spill/reload on archs
> > with too few registers (x86).
>
> It is true that it could cause that.
>
> I had a brief look at the source code (you use backtracking...

Where did you get that idea? Backtracking is the most utterly
incompetent way to implement regex -- it throws away the whole
property that makes regex useful, being regular. Unfortunately, POSIX
BRE is not regular, as it contains backreferences, so any
implementation of regcomp/regexec requires at least a minimal
backtracking code path for BREs that contain backreferences.

> hopefully
> nobody uses musl to parse regular expressions from untrusted, or

On the contrary, musl's is the only system regcomp/regexec I'm aware
of that actually attempts to be safe with untrusted input -- when
using REG_EXTENDED (ERE). Other implementations provide backreferences
in ERE as an extension, making ERE unsafe just like BRE. musl
intentionally disallows them as a feature.

At least until recently, glibc also crashed on malloc failures in
regcomp, making it unsafe on untrusted input for that reason too.

Rich
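
To make the point about untrusted input concrete, here is a minimal sketch of compiling a caller-supplied pattern as ERE only and treating every regcomp failure as a rejection. The helper name match_untrusted_ere is hypothetical and appears in neither git nor musl; only the standard POSIX regcomp/regexec/regfree calls and flags are used.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not code from
 * git or musl): match an untrusted pattern against text as POSIX ERE.
 * Returns 1 on match, 0 on no match, and -1 if the pattern is
 * rejected or regcomp/regexec fails for any other reason. */
int match_untrusted_ere(const char *pattern, const char *text)
{
    regex_t re;

    /* REG_EXTENDED selects ERE rather than BRE; REG_NOSUB skips
     * submatch reporting because only a yes/no answer is needed. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1; /* compile failed, so there is nothing to regfree */

    int rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);

    if (rc == 0)
        return 1;
    if (rc == REG_NOMATCH)
        return 0;
    return -1; /* any other regexec error */
}
```

A call such as match_untrusted_ere("a+b", "xaab") reports a match, while a malformed pattern such as "[a" is rejected at compile time. How a backreference like \1 inside an ERE is handled depends on the implementation; as the message says, some libraries accept it as an extension, so this sketch only bounds matching cost where the library itself keeps ERE free of backreferences.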