From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Regex: behaviour of ? after () atom
Date: Fri, 7 Sep 2018 11:33:02 -0400
Message-ID: <20180907153302.GM1878@brightrain.aerifal.cx>
References: <20180907133805.FZif_%steffen@sdaoden.eu> <20180907151821.GL1878@brightrain.aerifal.cx> <20180907152517.QGi3S%steffen@sdaoden.eu>
In-Reply-To: <20180907152517.QGi3S%steffen@sdaoden.eu>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com
Cc: Steffen Nurpmeso
User-Agent: Mutt/1.5.21 (2010-09-15)

On Fri, Sep 07, 2018 at 05:25:17PM +0200, Steffen Nurpmeso wrote:
> Rich Felker wrote in <20180907151821.GL1878@brightrain.aerifal.cx>:
> |On Fri, Sep 07, 2018 at 03:38:05PM +0200, Steffen Nurpmeso wrote:
> |> Hello.
> |>
> |> In perl this is
> |>
> |>   $x="print 1 2";
> |>   if($x =~ /^(:[[:space:]]+)?([^[:space:]]+)(.*)$/){
> |>     print "<$0> -> <$1> <$2> <$3>\n"
> |>   }
> |>
> |> and the result is
> |>
> |>   -> <> < 1 2>
> |>
> |> Now the same on AlpineLinux edge and musl-1.1.19-r10 with the MUA
> |> i maintain, which uses the normal regex stuff and calls it via
> |>
> |>   echo eins=$3
> |>   vput vexpr i regex "${3}" \
> |>     '^(:[[:space:]]+)?([^[:space:]]+)(.*)$' \
> |>     '<\$0> -> <\$1> <\$2> <\$3>'
> |>   echo i=$i
> |>
> |> which in C code does
> |>
> |>   if((reflrv = regcomp(&re, argv[2], reflrv))){
> |>     ...
> |>     goto jestr;
> |>   }
> |>   fprintf(stderr, "GOING for <%s> -> <%s> %u\n",
> |>     argv[1],argv[2],n_NELEM(rema));
> |>   reflrv = regexec(&re, argv[1], n_NELEM(rema), rema, 0);
> |>
> |> and overall prints
> |>
> |>   eins=print 1 2
> |>   GOING for -> <^(:[[:space:]]+)?([^[:space:]]+)(.*)$> 17
> |>   i= -> <> <> <>
> |>
> |> It works correctly if i remove the ()? atom, so i thought i should
> |> report that.
> |
> |What is the value of the flags argument you passed to regcomp?
> |
>
> REG_EXTENDED, optional REG_ICASE:
>
>   reflrv = REG_EXTENDED;
>   if(f & a_ICASE)
>     reflrv |= REG_ICASE;
>   if((reflrv = regcomp(&re, argv[2], reflrv))){

OK, it looks like that should work, and seemed to work here when I
passed the regex to grep -E linked with musl's regex. Can you provide
a minimal self-contained C program to demonstrate the issue you're
having?

BTW which "()?" are you talking about? The whole first parenthesized
subexpression and the ? after it? I wouldn't call that an atom, but
nothing seems wrong with it.

Rich