zsh-workers
From: Bart Schaefer <schaefer@brasslantern.com>
To: Zsh hackers list <zsh-workers@zsh.org>
Subject: [PATCH (not final)] (take three?) unset "array[$anything]"
Date: Wed, 2 Jun 2021 19:04:01 -0700
Message-ID: <CAH+w=7a3hgsag-1WsLxn1yTUja8ooZ+K4eFQhNzHKc32o4a3yg@mail.gmail.com>
In-Reply-To: <CAH+w=7bDA-nOH-OX1M9fCGACavpf=kN9rMYqN9vmBUZ+YrxRYQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 2349 bytes --]

On Wed, Jun 2, 2021 at 8:59 AM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> I've just had a hand-slaps-forehead
> moment ... take 3 to follow in another thread.

What I realized is that for any unset of an array element, the closing
bracket must always be the last character of the argument.  There's no
reason to parse the subscript or skip over matching brackets; if a '['
is found, just make sure the last character is ']' and the subscript
must be everything in between.

% typeset -A ax
% for k in '' '\' '`' '(' '[' ')' ']'; do
for> ax[$k]=$k
for> done
% typeset -p ax
typeset -A ax=( ['']='' ['(']='(' [')']=')' ['[']='[' ['\']='\'
[']']=']' ['`']='`' )
% for k in "${(@k)ax}"; do
for> unset "ax[$k]"
for> done
% typeset -p ax
typeset -A ax=( )
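
To make the rule above concrete, here is a minimal standalone sketch in C of
"find the first '[', require the last character to be ']', and take
everything in between as the subscript".  It is not the code from the
attached patch: split_unset_arg() and its return convention are hypothetical
names invented for illustration, and it uses malloc() where zsh itself would
use dupstring().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of zsh): split an unset argument of the
 * form name[subscript] without parsing the subscript at all.
 *
 * Returns  1 if a subscript was found; *subscript is a malloc'd copy of
 *            everything between the first '[' and the final ']'.
 * Returns  0 if there is no '[' (plain parameter name).
 * Returns -1 if there is a '[' but the argument does not end in ']',
 *            or if allocation fails.
 * *name_len is set to the length of the part before the first '['
 * (or of the whole argument when there is no '[').
 */
static int
split_unset_arg(const char *arg, size_t *name_len, char **subscript)
{
    assert(arg && name_len && subscript);
    *subscript = NULL;

    const char *open = strchr(arg, '[');
    if (!open) {
        *name_len = strlen(arg);
        return 0;
    }
    *name_len = (size_t)(open - arg);

    /* A '[' exists, so the string is non-empty and len - 1 is valid. */
    size_t len = strlen(arg);
    const char *close = arg + len - 1;
    if (*close != ']')
        return -1;

    /* Everything strictly between the first '[' and the last ']'. */
    size_t sub_len = (size_t)(close - open) - 1;
    char *sub = malloc(sub_len + 1);
    if (!sub)
        return -1;
    memcpy(sub, open + 1, sub_len);
    sub[sub_len] = '\0';
    *subscript = sub;
    return 1;
}
```

With this sketch, "ax[]]" yields the one-character subscript "]" and "ax[]"
yields the empty subscript, with no bracket matching or backslashes involved.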

Given this realization, it's easy to make { unset "hash[$key]" } work
"like it always should have".  The trouble comes in with (not)
breaking past workarounds.  Because the current (and theoretically
experimental, though we forgot about that for 5 years) code uses
parse_subscript(), we get a partly-tokenized (as if double-quoted,
actually) string in the cases where backslashes are used to force the
closing bracket to be found.  If those backslashes aren't needed any
more, there's no clean way to ignore them upon untokenize, to get back
to something that actually matches the intended hash key.

The attached not-ready-for-push patch has 4 variations that can be
chosen by #define, currently set up as follows:

#define unset_workers_37914 0
#define unset_hashelem_empty_only 0
#define unset_hashelem_literal 1
#define unset_hashelem_stripquote 0

The first one is just the current code.  The second one allows { unset
"hash[]" } (and gives "invalid subscript" for array[] instead of
"invalid parameter name").  The third one, which I have defined by
default, uses the subscript literally, so if you can do { hash[$k]=v }
you can also do { unset "hash[$k]" } (at least for all cases I
tested).  The fourth one requires { unset "hash[${(q)k}]" } instead,
but I think it otherwise works for all cases.  Both of those also work
for "hash[]".
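
As a rough illustration of how the third and fourth variations differ for an
existing workaround such as a backslash before ']', consider the sketch
below.  strip_backslash_level() is a hypothetical, deliberately simplified
stand-in: the patch calls parse_subst_string(), which removes one level of
quoting in general, whereas this helper only understands backslashes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of zsh): remove one level of backslash
 * quoting from s.  Each backslash that is followed by another character
 * is dropped and that character is kept; a lone trailing backslash is
 * kept as-is.  Returns a malloc'd string, or NULL on allocation failure.
 */
static char *
strip_backslash_level(const char *s)
{
    assert(s);
    char *out = malloc(strlen(s) + 1);
    if (!out)
        return NULL;

    char *p = out;
    while (*s) {
        if (*s == '\\' && s[1])
            s++;                /* skip the backslash, keep what follows */
        *p++ = *s++;
    }
    *p = '\0';
    return out;
}
```

Under the literal variation the subscript text a\]b is used as the key
unchanged, backslash included; under a strip-one-level variation the same
text would name the key a]b.  That difference is why existing scripts that
added backslashes could behave differently depending on which variation is
chosen.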

Therefore I think the best option is to choose one of the latter two,
possibly depending on which one induces the least damage to any
workarounds for the current behavior that are known in the wild,
though aesthetically I'd rather use the literal version.

[-- Attachment #2: unset_subscripts.txt --]
[-- Type: text/plain, Size: 2174 bytes --]

diff --git a/Src/builtin.c b/Src/builtin.c
index a16fddcb7..ec0376367 100644
--- a/Src/builtin.c
+++ b/Src/builtin.c
@@ -1933,10 +1933,14 @@ getasg(char ***argvp, LinkList assigns)
     asg.flags = 0;
 
     /* search for `=' */
-    for (; *s && *s != '='; s++);
+    for (; *s && *s != '[' && *s != '=' /* && *s != '+' */; s++);
+    if (s > asg.name && *s == '[') {
+	char *se = parse_subscript(s + 1, 1, ']');
+	if (se && *se == ']') s = se + 1;
+    }
 
     /* found `=', so return with a value */
-    if (*s) {
+    if (*s && *s == '=') {
 	*s = '\0';
 	asg.value.scalar = s + 1;
     } else {
@@ -3724,6 +3728,11 @@ bin_unset(char *name, char **argv, Options ops, int func)
     while ((s = *argv++)) {
 	char *ss = strchr(s, '['), *subscript = 0;
 	if (ss) {
+#define unset_workers_37914 0
+#define unset_hashelem_empty_only 0
+#define unset_hashelem_literal 1
+#define unset_hashelem_stripquote 0
+#if unset_workers_37914
 	    char *sse;
 	    *ss = 0;
 	    if ((sse = parse_subscript(ss+1, 1, ']'))) {
@@ -3733,6 +3742,47 @@ bin_unset(char *name, char **argv, Options ops, int func)
 		remnulargs(subscript);
 		untokenize(subscript);
 	    }
+#elif unset_hashelem_empty_only
+	    char *sse;
+	    *ss = 0;
+	    if (ss[1] == ']' && !ss[2] ? (sse = ss+1) :
+		(sse = parse_subscript(ss+1, 1, ']'))) {
+		*sse = 0;
+		subscript = dupstring(ss+1);
+		*sse = ']';
+		remnulargs(subscript);
+		untokenize(subscript);
+	    }
+#else
+	    char *sse = ss + strlen(ss)-1;
+	    *ss = 0;
+	    if (*sse == ']') {
+# if unset_hashelem_literal
+		*sse = 0;
+		subscript = dupstring(ss+1);
+		*sse = ']';
+# elif unset_hashelem_stripquote
+		int ne = noerrs;
+		noerrs = 2;
+		*ss = 0;
+		*sse = 0;
+		subscript = dupstring(ss+1);
+		*sse = ']';
+		/*
+		 * parse_subst_string() removes one level of quoting.
+		 * If it returns nonzero, substring is unchanged, else
+		 * it has been re-tokenized in place, so clean it up.
+		 */
+		if (!parse_subst_string(subscript)) {
+		    remnulargs(subscript);
+		    untokenize(subscript);
+		}
+		noerrs = ne;
+# else
+		XXX parse error XXX
+# endif
+	    }
+#endif
 	}
 	if ((ss && !subscript) || !isident(s)) {
 	    if (ss)
