zsh-workers
From: Oliver Kiddle <opk@zsh.org>
To: Bart Schaefer <schaefer@brasslantern.com>
Cc: chris0e3@gmail.com, Zsh hackers list <zsh-workers@zsh.org>
Subject: Re: [PATCH?] Re: [BUG] `$match` is haunting my regex’s trailing, optional, capture
Date: Tue, 12 Dec 2023 00:49:50 +0100
Message-ID: <34734-1702338590.864931@1x0T.Klos.9utN>
In-Reply-To: <CAH+w=7bSrq8p8-LNbn-M-Fkigo1GP3S=5+uXho5zw3bJxXBbBQ@mail.gmail.com>

Bart Schaefer wrote:
> On Fri, Dec 8, 2023 at 10:23 PM Bart Schaefer <schaefer@brasslantern.com>
> wrote:
>
>     On Fri, Dec 8, 2023 at 9:14 PM <chris0e3@gmail.com> wrote:
>     >
>     >   setopt rematch_pcre
>     >   [[ 'REQUIRE. OPT' =~ 'REQUIRE.(\s*OPT)?' ]] && printf '\tA. ‹%s›\n' $match
>     >   [[ 'REQUIRE.'     =~ 'REQUIRE.(\s*OPT)?' ]] && printf '\tB. ‹%s›\n'

Without rematch_pcre, and with \s changed to a plain space, the second
test sets match=( '' ), which seems the most logical result to me.

> Is "unset match" OK here?  There doesn't seem to be an obvious way to
> distinguish "there are capture expressions, but none matched anything" from
> "there were no capture expressions".  Maybe Oliver has a better clue.

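pcre2_get_ovector_count() tells us how many capture expressions the
pattern contains: for match data created from the pattern, it returns
that number plus one, the extra pair being the whole match. The following:
  [[ 'REQUIRE.1' =~ 'REQUIRE.(\s*O(P)T)?(1)' ]]
results in match=( '' '' 1 ), so adding empty elements at the end too is
consistent with that. pcre2_match's return status only tells us the
last capture element that was set (it is one more than that element's
number).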
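
To illustrate why the choice of count matters, here is a minimal
sketch in plain C. It does not call PCRE2 and it is not the code in
Src/Modules/pcre.c: sketch_captures() and SKETCH_UNSET are hypothetical
names, and the offset array is only assumed to be laid out like a PCRE2
ovector, as (start, end) pairs with pair 0 being the whole match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for PCRE2_UNSET: a group that took no part
 * in the match. */
#define SKETCH_UNSET ((size_t)-1)

/*
 * Copy capture groups 1 .. pair_count-1 out of an ovector-style array
 * into newly allocated strings stored in out[0 .. pair_count-2].
 * Unset groups become "" instead of being dropped, so the number of
 * strings produced depends only on pair_count.
 * Returns the number of strings written, or -1 if allocation fails.
 */
static int sketch_captures(const char *subject, const size_t *ovector,
                           int pair_count, char **out)
{
    for (int i = 1; i < pair_count; i++) {
        size_t start = ovector[2 * i];
        size_t end = ovector[2 * i + 1];
        size_t len = 0;

        if (start != SKETCH_UNSET && end != SKETCH_UNSET && end > start)
            len = end - start;

        char *copy = malloc(len + 1);
        if (copy == NULL) {
            /* Release the strings already handed out. */
            while (--i >= 1)
                free(out[i - 1]);
            return -1;
        }
        if (len > 0)
            memcpy(copy, subject + start, len);
        copy[len] = '\0';
        out[i - 1] = copy;
    }
    return pair_count > 0 ? pair_count - 1 : 0;
}
```

For 'REQUIRE.' matched against 'REQUIRE.(\s*OPT)?', passing a
pair_count of 1 (what the match's return status would suggest) yields
no strings at all, while passing 2 (the pattern's pair count) yields a
single empty string, which corresponds to match=( '' ).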

I didn't find anything in the documentation to confirm that later
elements of the ovector will have been initialised empty, but they do
appear to be. If you get garbage instead of empty elements, that'll be
the cause.

Oliver

diff --git a/Src/Modules/pcre.c b/Src/Modules/pcre.c
index e48ae3ae5..a49d1a307 100644
--- a/Src/Modules/pcre.c
+++ b/Src/Modules/pcre.c
@@ -391,6 +391,8 @@ bin_pcre_match(char *nam, char **args, Options ops, UNUSED(int func))
 	pcre_mdata = pcre2_match_data_create_from_pattern(pcre_pattern, NULL);
 	ret = pcre2_match(pcre_pattern, (PCRE2_SPTR) plaintext, subject_len,
 		offset_start, 0, pcre_mdata, mcontext);
+	if (ret > 0)
+	    ret = pcre2_get_ovector_count(pcre_mdata);
     }
 
     if (ret==0) return_value = 0;
@@ -479,7 +481,8 @@ cond_pcre_match(char **a, int id)
 		    break;
 		}
                 else if (r>0) {
-		    zpcre_get_substrings(pcre_pat, lhstr_plain, pcre_mdata, r, svar, avar,
+		    uint32_t ovec_count = pcre2_get_ovector_count(pcre_mdata);
+		    zpcre_get_substrings(pcre_pat, lhstr_plain, pcre_mdata, ovec_count, svar, avar,
 			    ".pcre.match", 0, isset(BASHREMATCH), !isset(BASHREMATCH));
 		    return_value = 1;
 		    break;


