Date: Thu, 20 Apr 2023 23:04:03 +0200
From: наб
Cc: musl@lists.openwall.com
Reply-To: musl@lists.openwall.com
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Message-ID: <871973dd402e10af32e48118663af9bfe3fa23b9.1682024413.git.nabijaczleweli@nabijaczleweli.xyz>
References: <73caac41e70db544c53b1aa947627206d3eb625b.1682024413.git.nabijaczleweli@nabijaczleweli.xyz>
In-Reply-To: <73caac41e70db544c53b1aa947627206d3eb625b.1682024413.git.nabijaczleweli@nabijaczleweli.xyz>
User-Agent: NeoMutt/20230407
Subject: [musl] [PATCH 2/2] regex: increase TRE_CHAR_MAX and use it for NUL with REG_STARTEND

The NUL character cannot be named in a pattern, but it can be matched
with catch-alls like . and negated bracket expressions such as [^w].

This brings us to feature parity with NetBSD:
    $ ./a.out '^a[^w]c$'  # matching "a\0c"
    0 1, 4; -1, -1
    $ ./a.out '^a.c$'
    0 1, 4; -1, -1
    $ ./a.out '.c$'
    0 2, 4; -1, -1
    $ ./a.out '.*'
    0 1, 4; -1, -1
    $ sed -i 's/cdef/adef/' a.c
    $ ./a.out '^\(a\).\1$'  # matching "a\0a"
    0 1, 4; 1, 2
---
Please keep me in CC, as I'm not subscribed.

I haven't encountered any issues with this. TRE_CHAR_MAX seems to mean
"domain of characters from GET_NEXT_WCHAR()", not "real characters in
the current locale's encoding", so expanding the domain with a special
character for NUL seems fine.

 src/regex/regexec.c | 2 +-
 src/regex/tre.h     | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/regex/regexec.c b/src/regex/regexec.c
index 2a2bded5..f09fdae1 100644
--- a/src/regex/regexec.c
+++ b/src/regex/regexec.c
@@ -60,7 +60,7 @@ tre_fill_pmatch(size_t nmatch, regmatch_t pmatch[], int cflags,
     if(!max_len) { next_c = '\0'; pos_add_next = 1; }                         \
     else if ((pos_add_next = mbtowc(&next_c, str_byte, max_len)) <= 0) {      \
       if (pos_add_next < 0) { ret = REG_NOMATCH; goto error_exit; }           \
-      else { pos_add_next++; if (startend) next_c = -1; };                    \
+      else { pos_add_next++; if (startend) next_c = TRE_CHAR_MAX; };          \
     }                                                                         \
     str_byte += pos_add_next;                                                 \
   } while (0)
diff --git a/src/regex/tre.h b/src/regex/tre.h
index 9aae851f..e913899a 100644
--- a/src/regex/tre.h
+++ b/src/regex/tre.h
@@ -50,7 +50,7 @@ typedef wchar_t tre_char_t;
 
 /* Wide characters. */
 typedef wint_t tre_cint_t;
-#define TRE_CHAR_MAX 0x10ffff
+#define TRE_CHAR_MAX (0x10ffff + 1)
 
 #define tre_isalnum iswalnum
 #define tre_isalpha iswalpha
-- 
2.30.2
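
For readers who want to poke at the idea outside of musl, here is a minimal
standalone sketch of what the patched GET_NEXT_WCHAR() step does with a NUL
byte inside a REG_STARTEND window. It is not musl code: every sketch_* name
is hypothetical, and the two matcher helpers only model the assumption that
. and a negated bracket expression compile down to code point ranges ending
at TRE_CHAR_MAX, which is why a sentinel of TRE_CHAR_MAX is caught by them
while -1 (a huge value once stored in an unsigned tre_cint_t) would not be.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Mirrors the patched definition in src/regex/tre.h. */
#define SKETCH_CHAR_MAX (0x10ffff + 1)

/* Hypothetical stand-in for GET_NEXT_WCHAR(): decode one character from
 * s[0..max_len) into *out and return the number of bytes consumed, or -1
 * for an invalid multibyte sequence. */
static int sketch_next_char(const char *s, size_t max_len, int startend,
                            unsigned long *out)
{
    wchar_t wc = 0;
    int n;

    assert(s && out);
    if (!max_len) {             /* end of the window: report a plain NUL */
        *out = 0;
        return 1;
    }
    n = mbtowc(&wc, s, max_len);
    if (n < 0)
        return -1;
    if (n == 0) {               /* a NUL byte inside the window */
        *out = startend ? SKETCH_CHAR_MAX : 0;
        return 1;
    }
    *out = (unsigned long)wc;
    return n;
}

/* Assumption: '.' without REG_NEWLINE is the range [0, SKETCH_CHAR_MAX]. */
static int sketch_dot_matches(unsigned long c)
{
    return c <= SKETCH_CHAR_MAX;
}

/* Assumption: "[^w]" is [0, 'w' - 1] plus ['w' + 1, SKETCH_CHAR_MAX]. */
static int sketch_not_w_matches(unsigned long c)
{
    return c != 'w' && c <= SKETCH_CHAR_MAX;
}
```

Calling sketch_next_char() on the second byte of "a\0c" with startend set
yields SKETCH_CHAR_MAX, which both helpers accept. No valid Unicode code
point equals that value, so a pattern still cannot name the character
directly.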