From: Jeff King
Newsgroups: gmane.comp.version-control.git,gmane.linux.lib.musl.general
Subject: Re: [musl] Re: Test failures when Git is built with libpcre and grep is built without it
Date: Wed, 11 Jan 2017 05:04:01 -0500
Message-ID: <20170111100400.vhd5ytarqpujigbn@sigill.intra.peff.net>
References: <58688C9F.4000605@adelielinux.org>
 <20170102065351.7ymrm77asjbghgdg@sigill.intra.peff.net>
 <58736B2A.40003@adelielinux.org>
 <871swcjsd3.fsf@linux-m68k.org>
 <20170109213303.4rupe5cqwejfp6af@sigill.intra.peff.net>
 <5874B942.7070402@adelielinux.org>
 <20170110113959.GL17692@port70.net>
In-Reply-To: <20170110113959.GL17692@port70.net>
To: git@vger.kernel.org
Cc: musl@lists.openwall.com, Andreas Schwab, "A. Wilcox"

On Tue, Jan 10, 2017 at 12:40:00PM +0100, Szabolcs Nagy wrote:

> > > I'm not sure if musl is wrong for failing to complain about a
> > > bogus regex. Generally making something that would break into
> > > something that works is an OK way to extend the standard. So our
> > > test is at fault for assuming that the regex will fail. I guess
>
> \x is undefined in posix and musl is based on tre which
> supports \x{hexdigits} in ere.

Thanks for confirming; I figured it was something like that.

> > > we'd need to find some more exotic syntax that pcre supports, but
> > > that ERE doesn't. Maybe "(?:)" or something.
>
> i think you would have to use something that's invalid
> in posix ere, ? after empty expression is undefined,
> not an error so "(?:)" is a valid ere extension.

Reading through POSIX[1], hardly anything is explicitly labeled as
"invalid". Most things are just "undefined", which leaves room for
implementations to do what they like. That's a good thing for a standard
to do, but a bad thing when you are trying to find behavior that differs
reliably between PCRE and ERE. :)

In most cases, PCRE constructs could be viable extensions to ERE.

> since most syntax is either defined or undefined in ere
> instead of being invalid, distinguishing pcre using
> syntax is not easy.
>
> there are semantic differences in subexpression matching:
> leftmost match has higher priority in pcre, longest match
> has higher priority in ere.
>
> $ echo ab | grep -o -E '(a|ab)'
> ab
> $ echo ab | grep -o -P '(a|ab)'
> a
>
> unfortunately grep -o is not portable.

In this case we're testing whether Git has internally fed the regex to
pcre or to regcomp(), not a system grep. So we'd need something like
"-o" for "git grep", which I don't think exists.

Another difference I found is that "[\d]" matches a literal "\" or "d"
in ERE, but behaves like "[0-9]" in PCRE. I'll work up a patch based on
that.

Thanks for your answer. I'll drop the musl list from the cc when I
follow up, as this is most definitely not a musl problem, but a git one.

-Peff
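
For illustration only, here is a minimal sketch of the "[\d]" difference on the regcomp() side. It is not taken from Git or from any patch in this thread; the helper name ere_matches() is hypothetical, and the sketch assumes a POSIX <regex.h> in which a backslash inside a bracket expression is an ordinary character, as described above.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of Git): compile `pattern` as a POSIX
 * ERE and report whether it matches anywhere in `subject`.
 *
 * Returns 1 on a match, 0 on no match, and -1 if regcomp() rejects
 * the pattern.
 */
int ere_matches(const char *pattern, const char *subject)
{
	regex_t re;
	int status;

	assert(pattern != NULL && subject != NULL);

	/* REG_NOSUB: we only care whether it matches, not where. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;

	status = regexec(&re, subject, 0, NULL, 0);
	regfree(&re);

	return status == 0 ? 1 : 0;
}
```

With this helper, ere_matches("[\\d]", "d") and ere_matches("[\\d]", "\\") should both report a match, while ere_matches("[\\d]", "5") should not. Going by the description above, a PCRE-backed match would be expected to give the opposite answers for "d" and "5", which is what makes the pattern usable for telling the two engines apart.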