mailing list of musl libc
From: наб <nabijaczleweli@nabijaczleweli.xyz>
Cc: musl@lists.openwall.com
Subject: [musl] [PATCH 2/2] regex: increase TRE_CHAR_MAX and use it for NUL with REG_STARTEND
Date: Thu, 20 Apr 2023 23:04:03 +0200
Message-ID: <871973dd402e10af32e48118663af9bfe3fa23b9.1682024413.git.nabijaczleweli@nabijaczleweli.xyz>
In-Reply-To: <73caac41e70db544c53b1aa947627206d3eb625b.1682024413.git.nabijaczleweli@nabijaczleweli.xyz>

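With REG_STARTEND, an embedded NUL is now read as TRE_CHAR_MAX, which is
raised to 0x10ffff + 1, one past the last Unicode code point.
This character cannot be named in a pattern, but can be matched with
catch-alls like . and [^...]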

This brings us to feature parity with NetBSD:
	$ ./a.out '^a[^w]c$'  # matching "a\0c"
	0
	1, 4; -1, -1
	$ ./a.out '^a.c$'
	0
	1, 4; -1, -1
	$ ./a.out '.c$'
	0
	2, 4; -1, -1
	$ ./a.out '.*'
	0
	1, 4; -1, -1

	$ sed -i 's/cdef/adef/' a.c
	$ ./a.out '^\(a\).\1$'  # matching "a\0a"
	0
	1, 4; 1, 2
---
Please keep me in CC, as I'm not subscribed.

I haven't encountered any issues with this. TRE_CHAR_MAX seems to bound
the "domain of characters from GET_NEXT_WCHAR()", not the
"real characters in the current locale's encoding",
so expanding the domain with a special character for NUL seems fine.
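
To illustrate the reasoning above, here is a simplified, standalone sketch
of the sentinel idea. It is not the musl code: next_char(), matches_any(),
matches_literal(), LAST_CODE_POINT and NUL_SENTINEL are hypothetical names,
and the sketch assumes a catch-all is treated as the range from 0 up to the
maximum character value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical names for this sketch; these are not musl identifiers. */
#define LAST_CODE_POINT 0x10ffff
#define NUL_SENTINEL    (LAST_CODE_POINT + 1)

/*
 * Decode one character from the first `len` bytes at `s`.
 * Stores the value in *out and returns the number of bytes consumed,
 * or -1 on an invalid multibyte sequence.
 * A NUL inside the bounded input yields NUL_SENTINEL, while running
 * out of input (len == 0) yields a plain 0.
 */
static int next_char(const char *s, size_t len, long *out)
{
    wchar_t wc;
    int n;

    assert(out != NULL);
    if (len == 0) { *out = 0; return 1; }          /* end of bounded input */
    n = mbtowc(&wc, s, len);
    if (n < 0) return -1;                          /* invalid sequence */
    if (n == 0) { *out = NUL_SENTINEL; return 1; } /* embedded NUL */
    *out = (long)wc;
    return n;
}

/* Assumption: a catch-all accepts the whole range 0..NUL_SENTINEL. */
static int matches_any(long c)
{
    return c >= 0 && c <= NUL_SENTINEL;
}

/* A literal is always a real character, so it never equals the sentinel. */
static int matches_literal(long c, long lit)
{
    return c == lit;
}
```

Decoding the three bytes "a\0c" with this helper gives 'a', NUL_SENTINEL
and 'c'. matches_any() accepts all three, while matches_literal() can only
ever accept the first and the last.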

 src/regex/regexec.c | 2 +-
 src/regex/tre.h     | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/regex/regexec.c b/src/regex/regexec.c
index 2a2bded5..f09fdae1 100644
--- a/src/regex/regexec.c
+++ b/src/regex/regexec.c
@@ -60,7 +60,7 @@ tre_fill_pmatch(size_t nmatch, regmatch_t pmatch[], int cflags,
     if(!max_len) { next_c = '\0'; pos_add_next = 1; }                         \
     else if ((pos_add_next = mbtowc(&next_c, str_byte, max_len)) <= 0) {      \
         if (pos_add_next < 0) { ret = REG_NOMATCH; goto error_exit; }         \
-        else { pos_add_next++; if (startend) next_c = -1; };                  \
+        else { pos_add_next++; if (startend) next_c = TRE_CHAR_MAX; };        \
     }                                                                         \
     str_byte += pos_add_next;                                                 \
   } while (0)
diff --git a/src/regex/tre.h b/src/regex/tre.h
index 9aae851f..e913899a 100644
--- a/src/regex/tre.h
+++ b/src/regex/tre.h
@@ -50,7 +50,7 @@ typedef wchar_t tre_char_t;
 
 /* Wide characters. */
 typedef wint_t tre_cint_t;
-#define TRE_CHAR_MAX 0x10ffff
+#define TRE_CHAR_MAX (0x10ffff + 1)
 
 #define tre_isalnum iswalnum
 #define tre_isalpha iswalpha
-- 
2.30.2

Thread overview: 7+ messages
2023-04-20 21:01 [musl] [PATCH 1/2] regex: add BSD-style REG_STARTEND наб
2023-04-20 21:04 ` наб [this message]
2023-04-21 15:48 ` [musl] REG_STARTEND tests наб
2023-04-28 11:39 ` [musl] [PATCH v2 1/2] regex: add BSD-style REG_STARTEND наб
2023-05-14 15:17   ` [musl] [PATCH v3 " наб
2023-05-14 15:17   ` [musl] [PATCH v3 2/2] regex: increase TRE_CHAR_MAX and use it for NUL with REG_STARTEND наб
2023-04-28 11:40 ` [musl] [PATCH v2 " наб
