From: "liheng (P)"
To: Szabolcs Nagy
CC: "musl@lists.openwall.com", Florian Weimer, Rich Felker, "Xiangrui (Euler)", Lizefan
Date: Sun, 19 Apr 2020 12:26:58 +0000
Subject: RE: [musl] regex Back reference matching result not same as glibc and tre.

OK, you are right. I retested matching "aba" with pat[] = "\\(.\\?\\).\\?\\1", and it succeeds without flags (basic regular expression mode, I think):

regcomp(&rbuf, pat, 0);

But my question is: why does pat[] = "(.?).?\\1" match "aba" in extended regular expression mode with glibc but fail with musl? Are musl's regex and glibc's regex different?

-----Original Message-----
From: Szabolcs Nagy [mailto:nsz@port70.net]
Sent: Saturday, April 18, 2020 10:07 PM
To: liheng (P)
Cc: musl@lists.openwall.com; Florian Weimer; Rich Felker; Xiangrui (Euler); Lizefan
Subject: Re: [musl] regex Back reference matching result not same as glibc and tre.

* liheng (P) [2020-04-18 11:37:13 +0000]:
> static const char pat[] = "\\(.?\\).?\\1"; str = "aba"
>
> ok, I retest this pat with no tag.

why?

? is not special in bre.

you need "\\{0,1\\}" or "\\?" instead of "?" to match "aba"

your pat would match str="a?b?a?" in a standard conform implementation.
>
> regcomp(&rbuf, pat, 0);
> regexec(&rbuf, str, N, m, 0);
>
> glibc:
> # ./test
> regexec failed
> test regex failed
>
> musl:
> # ./test
> regexec failed
> test regex failed
>
>
>
> -----Original Message-----
> From: Szabolcs Nagy [mailto:nsz@port70.net]
> Sent: Saturday, April 18, 2020 7:13 PM
> To: liheng (P)
> Cc: Florian Weimer; Rich Felker; musl@lists.openwall.com; Xiangrui (Euler); Lizefan
> Subject: Re: [musl] regex Back reference matching result not same as glibc and tre.
>
> * liheng (P) [2020-04-18 11:07:20 +0000]:
> > static const char pat[] = "\\(.?\\).?\\1";
> > string: "aba";
>
> ? is not special in bre
>
> it should be \{0,1\} (i think we support \? as an extension, but
> unescaped ? only matches literal ?). try one of
>
> static const char pat[] = "\\(.\\{0,1\\}\\).\\{0,1\\}\\1";
> static const char pat[] = "\\(.\\?\\).\\?\\1";
>
> >
> > I tested this pattern by my test case just now.
> >
> > musl:
> > # ./test
> > regexec failed
> > test regex failed
> >
> > glibc:
> > # ./test
> > Invalid back reference
> > test regex failed
> >
> > tre:
> > # ./test
> > Invalid back reference
> > test regex failed
> >
> > -----Original Message-----
> > From: Florian Weimer [mailto:fw@deneb.enyo.de]
> > Sent: Saturday, April 18, 2020 6:29 PM
> > To: liheng (P)
> > Cc: Rich Felker; musl@lists.openwall.com; Xiangrui (Euler); Lizefan
> > Subject: Re: [musl] regex Back reference matching result not same as glibc and tre.
> >
> > * liheng:
> >
> > > static const char pat[] = "(.?).?\\1";
> >
> > > This commit reminds me that if i want to use back reference i
> > > should not to tag REG_EXTENDED, but this test case matching still failed.
> >
> >
> > Did you change the expression to this for the basic regular expression test?
> >
> > static const char pat[] = "\\(.?\\).?\\1";
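
For reference, the basic regular expression (BRE) behaviour discussed in this thread can be reproduced with a small sketch like the one below. It is only an illustration, not the test program used above: the helper name try_match and the variable names are hypothetical, and it assumes a POSIX <regex.h> in which an unescaped ? is a literal character in BRE mode, as described in the replies.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): compile `pattern` with `cflags`
 * and run it against `subject`. Returns 0 on a match, REG_NOMATCH when the
 * pattern compiles but does not match, or the regcomp() error code when
 * compilation fails. */
int try_match(const char *pattern, int cflags, const char *subject)
{
    regex_t re;
    int rc = regcomp(&re, pattern, cflags);
    if (rc != 0)
        return rc;                      /* compile error: nothing to free */

    /* nmatch = 0: only the match/no-match result is needed here. */
    rc = regexec(&re, subject, 0, NULL, 0);
    regfree(&re);
    return rc;
}

/* BRE patterns quoted in the thread; cflags = 0 selects BRE mode. */

/* Here '?' is a literal character, so each ".?" needs a real '?' in the input. */
const char bre_literal_qm[] = "\\(.?\\).?\\1";

/* "\{0,1\}" is the standard BRE spelling of "zero or one". */
const char bre_interval[] = "\\(.\\{0,1\\}\\).\\{0,1\\}\\1";
```

Under those assumptions, try_match(bre_interval, 0, "aba") should return 0, while try_match(bre_literal_qm, 0, "aba") should return REG_NOMATCH and try_match(bre_literal_qm, 0, "a?b?a?") should return 0. Passing REG_EXTENDED together with a pattern such as "(.?).?\\1" is not portable: POSIX defines back references only for BRE, so the result depends on the implementation, which is consistent with the glibc and musl difference reported above.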