zsh-workers
* [PATCH] Support the mksh's ${|func;} substitution
@ 2019-09-06  0:52 Sebastian Gniazdowski
  2019-09-06  0:54 ` Sebastian Gniazdowski
                   ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Sebastian Gniazdowski @ 2019-09-06  0:52 UTC (permalink / raw)
  To: Zsh hackers list

Hello
Some notes on the patch:
- the subst is handled at the top of paramsubst, because it's made
incompatible with (...)-flags (${|(U)funct;} looks awful, more on this
in the patch),
- then the `s' variable is advanced past the semicolon, so that the
rest of the function can safely proceed doing (almost) nothing,
- one thing that the function still does is a fetchvalue, which I
prevent from setting vunset to 1 in the case of ${|func;}.

If committed, the substitution will be super useful in // substitution. E.g.:

arr=( val1 val2 abc1 abc3 )
func() { REPLY="${(C)match[1]}"; }
print -rl ${arr[@]//(#b)(*)/${|func;}}

Output:
Val1
Val2
Abc1
Abc3

PS. I did install mksh and test the substitution, it works the same.
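
For readers without a patched zsh, the example above can be spelled out
by hand. The following is only a minimal sketch in portable sh, not the
proposed syntax: `capitalize`, `elem` and `result` are names made up for
the illustration, the loop stands in for the per-element // replacement,
and the `tr` pipeline is an assumed rough stand-in for zsh's (C) flag.

```shell
# Hypothetical helper: hands its result back in REPLY instead of printing it.
capitalize() {
    # Upper-case the first character. This step forks; it is only a rough
    # stand-in for zsh's ${(C)...} and is an assumption of this sketch.
    first=$(printf '%s' "$1" | cut -c1 | tr '[:lower:]' '[:upper:]')
    REPLY="$first${1#?}"
}

# What ${arr[@]//(#b)(*)/${|func;}} computes, written as an explicit loop:
# call the function once per element and collect $REPLY each time.
result=
for elem in val1 val2 abc1 abc3; do
    capitalize "$elem"
    result="$result$REPLY "
done
result=${result% }      # drop the trailing separator
printf '%s\n' "$result"
```

The point of the substitution is that the call and the use of $REPLY
collapse into a single expansion, so no explicit loop is needed.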
-- 
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org

[-- Attachment #2: 0001-Support-the-mksh-s-substitution-func.patch.txt --]
[-- Type: text/plain, Size: 3021 bytes --]

From fb4744562d2f03246288da2692559d7f9c4014f2 Mon Sep 17 00:00:00 2001
From: Sebastian Gniazdowski <sgniazdowski@gmail.com>
Date: Fri, 6 Sep 2019 02:35:14 +0200
Subject: [PATCH] Support the mksh's substitution ${|func;}

---
 Src/subst.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/Src/subst.c b/Src/subst.c
index b132f251b..80a572e17 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -29,6 +29,7 @@
 
 #include "zsh.mdh"
 #include "subst.pro"
+#include "exec.pro"
 
 #define LF_ARRAY	1
 
@@ -1847,8 +1848,17 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
      * nested (P) flags.
      */
     int fetch_needed;
+    /* Indicates ${|func;} */
+    int rplyfunc = 0;
+    /* The name of the function to be run by ${|...;} */
+    char *cmdarg = NULL;
+    /* The length of the input string */
+    int slen = 0;
+    /* The closing brace pointer */
+    char *outbracep;
 
     *s++ = '\0';
+    slen = strlen(s);
     /*
      * Nothing to do unless the character following the $ is
      * something we recognise.
@@ -1876,6 +1886,41 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
     if (c == Inbrace) {
 	inbrace = 1;
 	s++;
+
+        /* Short-path for the function-running substitution ${|func;}
+         * The function name is extracted and called, and the
+         * substitution assigned. There's no (...)-flags processing,
+         * i.e. no ${|(U)func;}, because it looks quite awful and
+         * also requires a change to the manual, part about the
+         * substitution order. Use ${(U)${|func;}} instead, it looks
+         * cleaner. */
+        if ( ((outbracep=strchr(s,Outbrace)) ||
+             (outbracep=strchr(s,'}'))) &&
+                (s[0] == Bar || s[0] == '|') &&
+                    outbracep[-1] == ';' )
+        {
+            rplyfunc = 1;
+            cmdarg = dupstrpfx(s+1, outbracep-s-2);
+            s=outbracep;
+
+            HashNode hn = NULL;
+            if( (hn = shfunctab->getnode(shfunctab, cmdarg)) ) {
+                /* Execute the shell function */
+                doshfunc((Shfunc) hn, NULL, 1);
+                val = getsparam("REPLY");
+                if (val)
+                    vunset = 0;
+                else {
+                    vunset = 1;
+                    val = dupstring("");
+                }
+            } else {
+                zerr("no such function: %s", cmdarg);
+                return NULL;
+            }
+            fetch_needed = 0;
+        }
+
 	/*
 	 * In ksh emulation a leading `!' is a special flag working
 	 * sort of like our (k).
@@ -2519,7 +2564,11 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
 			     scanflags)) ||
 	    (v->pm && (v->pm->node.flags & PM_UNSET)) ||
 	    (v->flags & VALFLAG_EMPTY))
-	    vunset = 1;
+        {
+            if (!rplyfunc) {
+                vunset = 1;
+            }
+        }
 
 	if (wantt) {
 	    /*
-- 
2.21.0


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-06  0:52 [PATCH] Support the mksh's ${|func;} substitution Sebastian Gniazdowski
@ 2019-09-06  0:54 ` Sebastian Gniazdowski
  2019-09-06 23:16 ` Sebastian Gniazdowski
  2019-09-07 15:07 ` Stephane Chazelas
  2 siblings, 0 replies; 32+ messages in thread
From: Sebastian Gniazdowski @ 2019-09-06  0:54 UTC (permalink / raw)
  To: Zsh hackers list

PS. I forgot to say what the substitution does: ${|func;} is
substituted with the value of $REPLY after running func().
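
The practical difference from $(func) is that the function runs in the
current shell. A minimal sketch in portable sh, emulating the
substitution by calling the function and then reading $REPLY; the
function and variable names are hypothetical and exist only for this
illustration:

```shell
counter=0

# Classic style: result on stdout, captured with $(...), which runs the
# function in a subshell environment.
next_id_stdout() {
    counter=$((counter + 1))
    printf 'id-%s\n' "$counter"
}

# REPLY style: result in a variable, function runs in the current shell.
next_id_reply() {
    counter=$((counter + 1))
    REPLY="id-$counter"
}

a=$(next_id_stdout)
after_subshell=$counter     # the increment happened in the subshell and is lost

next_id_reply
b=$REPLY
after_direct=$counter       # the increment is visible here

printf '%s %s %s %s\n' "$a" "$after_subshell" "$b" "$after_direct"
```

Both calls produce "id-1", but only the REPLY-style call leaves the
counter changed afterwards.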


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-06  0:52 [PATCH] Support the mksh's ${|func;} substitution Sebastian Gniazdowski
  2019-09-06  0:54 ` Sebastian Gniazdowski
@ 2019-09-06 23:16 ` Sebastian Gniazdowski
  2019-09-07 12:16   ` Daniel Shahaf
  2019-09-07 15:07 ` Stephane Chazelas
  2 siblings, 1 reply; 32+ messages in thread
From: Sebastian Gniazdowski @ 2019-09-06 23:16 UTC (permalink / raw)
  To: Zsh hackers list

I see no response. Is it because the substitution isn't zsh-like, i.e.
flag-based? I can prepare such a patch, i.e. assign a flag and integrate
it nicely into the zsh substitution stack.

On Fri, 6 Sep 2019 at 02:52, Sebastian Gniazdowski <
sgniazdowski@gmail.com> wrote:

> Hello
> Some notes on the patch:
> - the subst is being handled at top of paramsubst, because it's made
> uncompatible with (...)-flags (${|(U)funct;} looks awful, more on this
> in the patch),
> - then the `s' variable is advanced past the semicolon, so that the
> rest of the function can safely progress doing (almost) nothing
> - one thing that the function still does is a fetchvalue, which I
> prevent from setting vunset to 1 in case of ${|func;}
>
> If commited, the substitution will be super useful in // substitution.
> E.g.:
>
> arr=( val1 val2 abc1 abc3 )
> func() { REPLY="${(C)match[1]}"; }
> print -rl ${arr[@]//(#b)(*)/${|func;}}
>
> Output:
> Val1
> Val2
> Abc1
> Abc3
>
> PS. I did install mksh and test the substitution, it works the same.
> --
> Sebastian Gniazdowski
> News: https://twitter.com/ZdharmaI
> IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
> Blog: http://zdharma.org
>


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-06 23:16 ` Sebastian Gniazdowski
@ 2019-09-07 12:16   ` Daniel Shahaf
  0 siblings, 0 replies; 32+ messages in thread
From: Daniel Shahaf @ 2019-09-07 12:16 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list

Sebastian Gniazdowski wrote on Sat, Sep 07, 2019 at 01:16:00 +0200:
> I see no response. Is it because the substitution isn't zsh-like, ie. flag
> based? I can prepare such patch, ie. assign a flag and integrate it nicely
> into the zsh substitution stack.

I have no opinion as to whether the feature in question would be a good
addition.

Assuming arguendo that it is:

You forgot to update the documentation and test suite and to link us to the
mksh documentation (and/or test suite — I'm assuming there will be no copyright
issues with borrowing their test cases).

You may wish to wait until someone weighs in on whether the substitution in
question would be an acceptable addition before addressing the points from the
previous paragraph.  I haven't reviewed the patch beyond those points, and I'm
NOT committing to reviewing further iterations.


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-06  0:52 [PATCH] Support the mksh's ${|func;} substitution Sebastian Gniazdowski
  2019-09-06  0:54 ` Sebastian Gniazdowski
  2019-09-06 23:16 ` Sebastian Gniazdowski
@ 2019-09-07 15:07 ` Stephane Chazelas
  2019-09-07 18:09   ` Sebastian Gniazdowski
  2 siblings, 1 reply; 32+ messages in thread
From: Stephane Chazelas @ 2019-09-07 15:07 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list

2019-09-06 02:52:39 +0200, Sebastian Gniazdowski:
> Hello
> Some notes on the patch:
> - the subst is being handled at top of paramsubst, because it's made
> uncompatible with (...)-flags (${|(U)funct;} looks awful, more on this
> in the patch),
> - then the `s' variable is advanced past the semicolon, so that the
> rest of the function can safely progress doing (almost) nothing
> - one thing that the function still does is a fetchvalue, which I
> prevent from setting vunset to 1 in case of ${|func;}
> 
> If commited, the substitution will be super useful in // substitution. E.g.:
> 
> arr=( val1 val2 abc1 abc3 )
> func() { REPLY="${(C)match[1]}"; }
> print -rl ${arr[@]//(#b)(*)/${|func;}}
[...]


Note that mksh's operator can do ${|REPLY=value}, it's not
limited to functions. The ; is also not necessary, contrary to
the ${ code; } variant from ksh93 (but which mksh implements
with temporary files instead of changing the whole I/O framework).

Note that in zsh anything is a valid function name, just like
just about anything is a valid command name. So your operator
would be incompatible with mksh's if it accepted arbitrary
function names unless you handled quoting in there:

$ ./Src/zsh -c '"REPLY=value"() { REPLY=x; echo done; }; REPLY\=value; echo ${|REPLY=value;}'
done
zsh:1: no such function: REPLY=value
$ ./Src/zsh -c '"REPLY=value"() { REPLY=x; echo done; }; REPLY\=value; echo ${|"REPLY=value";}'
done
zsh:1: no such function: "REPLY=value"

So at the moment I'd say it has a few problems in that:
- it doesn't accept all function names
- the ; is unnecessary here
- it doesn't allow arbitrary code.

With those fixed, i.e. when it's really like mksh's ${|code},
I'd agree the feature could be useful, but I suspect that would
be harder to implement as it would mean changing the parsing.

Note that besides the math functions, zsh already has something
similar with its "dynamic named directory" framework (a feature
I always found quite obscure/far-fetched myself).

echo "${| REPLY=value}"

could be done in zsh with:

zsh_directory_name() {
  [[ $1 = d ]] && [[ $2 = //* ]] || return
  eval " ${2#//}"
  reply=("$REPLY" ${#2})
}


echo ${${${(D):-//REPLY=value}#\~\[}%\]}

(even more convoluted than your math function approach).

-- 
Stephane


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-07 15:07 ` Stephane Chazelas
@ 2019-09-07 18:09   ` Sebastian Gniazdowski
  2019-09-07 20:19     ` Stephane Chazelas
  0 siblings, 1 reply; 32+ messages in thread
From: Sebastian Gniazdowski @ 2019-09-07 18:09 UTC (permalink / raw)
  To: Sebastian Gniazdowski, Zsh hackers list

On Sat, 7 Sep 2019 at 17:07, Stephane Chazelas
<stephane.chazelas@gmail.com> wrote:
> Note that mksh's operator can do ${|REPLY=value}, it's not
> limited to functions.

Ok, true, it can also run binary commands.

> The ; is also not necessary

I think that this is an undocumented feature, as the docs say:

"Another variant of substitution are the valsubs (value substitutions)
${|command;} which are also executed in the current environment, like
funsubs, but share their I/O with the parent; instead, they evaluate
to whatever the, initially empty, expression-local variable REPLY is
set to within the commands."
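
The semantics quoted above (arbitrary code, run in the current shell,
with an initially empty, expression-local REPLY) can be approximated by
hand. This is a minimal sketch in portable sh, not mksh's
implementation; `valsub`, `valsub_saved` and `VALSUB_RESULT` are
hypothetical names used only here, and nested calls are not handled:

```shell
# Hypothetical helper: runs the code in $1 and leaves the value in VALSUB_RESULT.
valsub() {
    valsub_saved=$REPLY     # keep the caller's REPLY ("expression-local")
    REPLY=                  # "initially empty"
    eval "$1"               # arbitrary code, run in the current shell
    VALSUB_RESULT=$REPLY
    REPLY=$valsub_saved     # restore the caller's REPLY
}

REPLY=outer
valsub 'REPLY=value'
printf '%s / %s\n' "$VALSUB_RESULT" "$REPLY"

# Not limited to a single function call: any code that sets REPLY works.
valsub 'for w in a b c; do REPLY="$REPLY$w"; done'
printf '%s\n' "$VALSUB_RESULT"
```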

> Note that in zsh anything is a valid function name, just like
> just about anything is a valid command name. So your operator
> would be incompatible with mksh's if it accepted arbitrary
> function names unless you handled quoting in there:
>
> $ ./Src/zsh -c '"REPLY=value"() { REPLY=x; echo done; }; REPLY\=value; echo ${|REPLY=value;}'
> done
> zsh:1: no such function: REPLY=value

That's a valid and nice point, and it does raise some concerns.
However, for any problems to appear the user would have to have a
function with "targetvariable=..." in its name, i.e. a coincidence of
two places in the code: the targetvar= in the ${|...;} and in the
function name. This can be controlled and seems unlikely to occur by
accident.

> With those fixed, i.e. when it's really like mksh's ${|code},
> I'd agree the feature could be useful, but I suspect that would
> be harder to implement as it would mean changing the parsing.

The parsing would have to be changed to prevent the "=" in function names?

I think that I've chosen the wrong initial direction: to implement the
substitution "as-is", in its form, treating it as a model. Instead,
I should have implemented the feature, not the substitution. Zsh has
its own ways to set up complex substitutions, and this is done via the
parens flags.

Would you still consider such a method useful, i.e. not imposing mksh's
substitution ways on Zsh, but instead assigning a flag, e.g.
${(|)funct}?

> Note that beside the math functions, zsh already has something
> similar with its "dynamic named directory" framework (a feature
> I always found quite obscure/far fetched myself).
>
> echo "${| REPLY=value}"
>
> could be done in zsh with:
>
> zsh_directory_name() {
>   [[ $1 = d ]] && [[ $2 = //* ]] || return
>   eval " ${2#//}"
>   reply=("$REPLY" ${#2})
> }
>
> echo ${${${(D):-//REPLY=value}#\~\[}%\]}
>
> (even more convoluted than your math function approach).

That's interesting, it actually allows one to do:

arr=( val1 val2 abc1 abc3 )
funct() { REPLY="${(C)MATCH}"; }
zsh_directory_name() { ... }
print -rl ${arr[@]//(#m)*/${${${(D):-//funct}#\~\[}%\]}}

Output:
Val1
Val2
Abc1
Abc3

-- 
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-07 18:09   ` Sebastian Gniazdowski
@ 2019-09-07 20:19     ` Stephane Chazelas
  2019-09-07 21:19       ` Sebastian Gniazdowski
  0 siblings, 1 reply; 32+ messages in thread
From: Stephane Chazelas @ 2019-09-07 20:19 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list

2019-09-07 20:09:57 +0200, Sebastian Gniazdowski:
> On Sat, 7 Sep 2019 at 17:07, Stephane Chazelas
> <stephane.chazelas@gmail.com> wrote:
> > Note that mksh's operator can do ${|REPLY=value}, it's not
> > limited to functions.
> 
> Ok, true, it can also run binary commands.
> 
> > The ; is also not necessary
> 
> I think that this is undocumented feature, as the docs say:
> 
> "Another variant of substitution are the valsubs (value substitutions)
> ${|command;} which are also executed in the current environment, like
> funsubs, but share their I/O with the parent; instead, they evaluate
> to whatever the, initially empty, expression-local variable REPLY is
> set to within the commands."
[...]

Those operators are shaped after the ksh93 ${ cmdsubst;}
operator. ksh93 man page also mentions the ; there, but it's not
necessary either. In ksh93, ${(uname)} or ${ uname } (which for
those who wouldn't be familiar with those is the same as
$(cmdsubst) but without creating a subshell environment) also
work.

That is the { must be delimited from the following code, and }
delimited from the previous code.

The ${ cmd;} is reminiscent of { cmd;}, it makes sense to
document that form as it makes it more obvious what we're
talking about, but just like {(cmd)} also works ${(cmd)} works
as well (in ksh93, not in mksh). { cmd} doesn't work, but
${ cmd} does though (in mksh, not in ksh93 where you need
${ cmd }).

That discrepancy causes confusion:

$ ksh -c '{ echo }; }'
}
$ ksh -c 'echo ${ { echo }; }; }'
ksh: syntax error at line 1: `}' unexpected

(you can no longer use a bare "}" inside ${ ...; })

$ mksh -c 'echo ${ { echo }; echo x } y'
x y
$ ksh -c 'echo ${ { echo }; echo x } y'
ksh: syntax error at line 1: `{' unmatched
$ mksh -c '{ echo x }'
mksh: syntax error: unmatched '{'

It is quite messy.

In ksh93 ${ print foo;} is efficient because in that case
"print" doesn't write "foo\n", the "foo" makes up the result of
the expansion without any I/O being made. And it's also the case
in ${ myfunction; }. ksh93 only ever forks for executing
external commands or in pipelines. When inside a subshell, ksh93
adds the would-be-output data to the command-substitution-to-be.

ksh93 was a complete rewrite (compared to ksh88). For mksh to be
able to do that, it would probably have had to be completely
rewritten as well.

Instead, in mksh, for ${ code; }, for the code not to run in a
subshell, the code's output is written to a temp file which is
read afterwards, which is less efficient as it involves I/O.
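
The temp-file idea described above can be sketched in portable sh. This
is only an illustration of the approach, not mksh's actual code;
`funsub_tmpfile`, `funsub_tmp` and `FUNSUB_RESULT` are hypothetical
names, and `mktemp` is assumed to be available:

```shell
# Hypothetical helper: runs the code in $1 in the current shell and
# leaves its standard output in FUNSUB_RESULT.
funsub_tmpfile() {
    funsub_tmp=$(mktemp) || return 1
    eval "$1" >"$funsub_tmp"              # current shell; stdout goes to the file
    FUNSUB_RESULT=$(cat "$funsub_tmp")    # read it back (trailing newlines stripped)
    rm -f "$funsub_tmp"
}

x=0
funsub_tmpfile 'x=1; echo hello'
printf '%s %s\n' "$FUNSUB_RESULT" "$x"
```

The assignment to x survives because the code was not run in a
subshell, but the output makes a round trip through the file system,
which is the cost being discussed.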

I suppose that's why the ${| cmd;} variant that uses the $REPLY
variable to transfer the data and avoids I/O was introduced
(you'd still get I/O through a pipe if you do ${|REPLY=$(print
test)}).

[...]
> > With those fixed, i.e. when it's really like mksh's ${|code},
> > I'd agree the feature could be useful, but I suspect that would
> > be harder to implement as it would mean changing the parsing.
> 
> The parsing would have to be changed to prevent the "=" in function names?
[...]

No, I meant that you'd need the parser to handle that case of a
pseudo-command group  (a {any shell code here} but with {|
instead of {)).

So you can do:

echo ${|
  whatever $(...)
  for i do
    ...
  done}

Whether it would actually be difficult or not I can't comment,
I've not looked at the parser code.

Having an operator that *only* invokes a function to do an
expansion is less useful IMO. That just sounds like a very
limited form of command substitution where you could have done a
more complete form by allowing any code instead of just one
function invocation without argument.

Note that mksh calls it "function substitution" not because you
can invoke a function within it but because the code in ${ code;
} is like in a function body, where it can have a local scope,
call return, but is a bit buggy when it comes to positional
parameters:

$ mksh -c 'echo "$@"; : "${ shift}"; echo "$@"' sh 1 2
1 2
sh 2

-- 
Stephane


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-07 20:19     ` Stephane Chazelas
@ 2019-09-07 21:19       ` Sebastian Gniazdowski
  2019-09-10  2:20         ` Sebastian Gniazdowski
  0 siblings, 1 reply; 32+ messages in thread
From: Sebastian Gniazdowski @ 2019-09-07 21:19 UTC (permalink / raw)
  To: Sebastian Gniazdowski, Zsh hackers list

On Sat, 7 Sep 2019 at 22:19, Stephane Chazelas
<stephane.chazelas@gmail.com> wrote:
>
> 2019-09-07 20:09:57 +0200, Sebastian Gniazdowski:
> > The parsing would have to be changed to prevent the "=" in function names?
> [...]
>
> No, I meant that you'd need the parser to handle that case of a
> pseudo-command group  (a {any shell code here} but with {|
> instead of {)).
>
> So you can do:
>
> echo ${|
>   whatever $(...)
>   for i do
>     ...
>   done}
>
> Whether it would actually be difficult or not I can't comment,
> I've not looked at the parser code.

I think that it's already supported with the single exception that }
would have to be quoted:

% print ${|
  whatever $(...)
  for i do
    ...
  done;}
zsh: no such function: \n  whatever $(...)\n  for i do\n    ...\n  done

> Having an operator that *only* invokes a function to do an
> expansion is less useful IMO. That just sound like a very
> limited form of command substitution where you could have done a
> more complete form by allowing any code instead of just one
> function invocation without argument.

Ok, I agree, the lambda-function-like version is better. It also
isn't much harder to implement: instead of doshfunc(), just
bin_eval() would have to be called. I attach such a patch. However, it
has some problems:

arr=( val1 val2 abc1 abc3 )
print ${arr[@]//(#b)(*)/${|REPLY\=test;}}
Output: test test test test

So the = has to be quoted. Also, not much more works: neither
REPLY\=$MATCH nor REPLY\=\$MATCH works. I wonder why, as it doesn't
look that bad in general:

whatever() { echo func ran; }
echo ${|
  whatever $(...)
  for i in a b c; do
    REPLY\=1
  done;}

Output:
func ran
1

-- 
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org

[-- Attachment #2: 0001-Support-the-mksh-s-substitution-func.patch-(2).txt --]
[-- Type: text/plain, Size: 2785 bytes --]

From 6c4a2b778ca4c625bf15cf700d6ec3ce1aea1bcf Mon Sep 17 00:00:00 2001
From: Sebastian Gniazdowski <sgniazdowski@gmail.com>
Date: Fri, 6 Sep 2019 02:35:14 +0200
Subject: [PATCH] Support the mksh's substitution ${|func;}

---
 Src/subst.c | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/Src/subst.c b/Src/subst.c
index b132f251b..dc2b58cc7 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -29,6 +29,7 @@
 
 #include "zsh.mdh"
 #include "subst.pro"
+#include "builtin.pro"
 
 #define LF_ARRAY	1
 
@@ -1847,8 +1848,17 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
      * nested (P) flags.
      */
     int fetch_needed;
+    /* Indicates ${|func;} */
+    int rplyfunc = 0;
+    /* The code to be run by ${|...;} */
+    char *rplycode[2] = {NULL, NULL};
+    /* The length of the input string */
+    int slen = 0;
+    /* The closing brace pointer */
+    char *outbracep;
 
     *s++ = '\0';
+    slen = strlen(s);
     /*
      * Nothing to do unless the character following the $ is
      * something we recognise.
@@ -1876,6 +1886,36 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
     if (c == Inbrace) {
 	inbrace = 1;
 	s++;
+
+        /* Short-path for the function-running substitution ${|func;}
+         * The function name is extracted and called, and the
+         * substitution assigned. There's no (...)-flags processing,
+         * i.e. no ${|(U)func;}, because it looks quite awful and
+         * also requires a change to the manual, part about the
+         * substitution order. Use ${(U)${|func;}} instead, it looks
+         * cleaner. */
+        if ( ((outbracep=strchr(s,Outbrace)) ||
+             (outbracep=strchr(s,'}'))) &&
+                (s[0] == Bar || s[0] == '|') &&
+                    outbracep[-1] == ';' )
+        {
+            rplyfunc = 1;
+            rplycode[0] = dupstrpfx(s+1, outbracep-s-2);
+            s=outbracep;
+
+            /* Evaluate the code */
+            bin_eval(NULL, rplycode, NULL, 0);
+
+            val = getsparam("REPLY");
+            if (val)
+                vunset = 0;
+            else {
+                vunset = 1;
+                val = dupstring("");
+            }
+            fetch_needed = 0;
+        }
+
 	/*
 	 * In ksh emulation a leading `!' is a special flag working
 	 * sort of like our (k).
@@ -2519,7 +2559,11 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
 			     scanflags)) ||
 	    (v->pm && (v->pm->node.flags & PM_UNSET)) ||
 	    (v->flags & VALFLAG_EMPTY))
-	    vunset = 1;
+        {
+            if (!rplyfunc) {
+                vunset = 1;
+            }
+        }
 
 	if (wantt) {
 	    /*
-- 
2.21.0



* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-07 21:19       ` Sebastian Gniazdowski
@ 2019-09-10  2:20         ` Sebastian Gniazdowski
  2019-09-10  5:29           ` Bart Schaefer
  0 siblings, 1 reply; 32+ messages in thread
From: Sebastian Gniazdowski @ 2019-09-10  2:20 UTC (permalink / raw)
  To: Zsh hackers list

I thought I'd ask: should I continue working on the feature? I.e.
does it have a chance of being accepted?

The feature would be: a new substitution flag (x for execute, or | to
mark similarity to mksh's ${|code;}) that would execute the provided
code and substitute the value of $REPLY. E.g.:

- var='REPLY=test'; echo ${(x)var} -> test
- echo ${(x):-REPLY=test2} -> test2
- noglob print -rl ${(x):-for val (test test3) {
REPLY=\$val
}}
-> test3

The usefulness is the ability to map code onto array elements (with
the (#m) or (#b) flags) and the general lambda-like use case.





--
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-10  2:20         ` Sebastian Gniazdowski
@ 2019-09-10  5:29           ` Bart Schaefer
  2019-09-10 18:21             ` Sebastian Gniazdowski
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Schaefer @ 2019-09-10  5:29 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list

On Mon, Sep 9, 2019 at 7:21 PM Sebastian Gniazdowski
<sgniazdowski@gmail.com> wrote:
>
> The feature would be: a new substitution flag (x for execute, or | to
> mark similarity to mksh's ${|code;}) that would execute the provided
> code and substitute the value of $REPLY.
[...]
> The usefulness is the ability to map code onto array elements (with
> (#m) or (#b) flags) and general lambda-like use-case.

I've been kinda-sorta following this thread amidst a bunch of other
"real life" distractions.  Is the deep meaning here the desire to have
a $(...) that doesn't fork?

I don't particularly like either mksh's or ksh93's choice of syntax
for these variations, but they do have the advantage of being real
parser tokens, so the stuff that follows them can be parsed at the
statement level rather than gobbled up by the parameter substitution
code.  That is, ideally these two examples --

> - echo ${(x):-REPLY=test2} -> test2
> - noglob print -rl ${(x):-for val (test test3) {
> REPLY=\$val
> }}
> -> test3

-- would be parsed more like $(...) is parsed (and at roughly the same
place in the parser), so that (among other things) you would not have
to quote \$val like that.

On the other hand the "var='...'; echo ${(x)var}" example seems
reasonable and would enable those other two uses as a side-effect.

I still have a nagging feeling that it should be more like the
(e^...^) globbing flag, in particular the part about returning arrays
through reply=(...) but also whether it might look like
${(x^code^)var} where "code" would receive the current value of the
substitution as $REPLY and return the new value in $reply.  Your "for"
example could still I think come out like:

${(x^eval $REPLY^):-for val (test test3) {
reply=\$val
}}
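
To make the proposed contract concrete (code receives the current value
in $REPLY and hands the new value back in $reply), here is a minimal
sketch in portable sh. It is an emulation only: `apply_x` is a
hypothetical stand-in for ${(x^code^)var}, and reply is a plain scalar
here, whereas the proposal has it as an array.

```shell
# Hypothetical stand-in for ${(x^code^)var}: $1 is the code, $2 the value.
apply_x() {
    REPLY=$2        # current value of the substitution
    reply=          # the code is expected to put the new value here
    eval "$1"
}

var='hello world'
apply_x 'reply="<$REPLY>"' "$var"
printf '%s\n' "$reply"

# The "eval $REPLY" form: the value itself is the code to run.
apply_x 'eval "$REPLY"' 'for val in test test3; do reply=$val; done'
printf '%s\n' "$reply"
```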

Other things that need to be thought about before this gets a go/no-go
are nested substitutions and how to fit (x) into the order-of-events
subsect(Rules) as laid out in expn.yo.


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-10  5:29           ` Bart Schaefer
@ 2019-09-10 18:21             ` Sebastian Gniazdowski
  2019-09-10 19:38               ` Bart Schaefer
  0 siblings, 1 reply; 32+ messages in thread
From: Sebastian Gniazdowski @ 2019-09-10 18:21 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh hackers list

On Tue, 10 Sep 2019 at 07:29, Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Mon, Sep 9, 2019 at 7:21 PM Sebastian Gniazdowski
> <sgniazdowski@gmail.com> wrote:
> I've been kinda-sorta following this thread amidst a bunch of other
> "real life" distractions.  Is the deep meaning here the desire to have
> a $(...) that doesn't fork?

Yes, this was the origin of the idea, but it came to me again
when I wanted to apply a function to array elements.

> parser tokens, so the stuff that follows them can be parsed at the
> statement level rather than gobbled up by the parameter substitution
> code.  That is, ideally these two examples --
>
> > - echo ${(x):-REPLY=test2} -> test2
> > ...
>
> -- would be parsed more like $(...) is parsed (and at roughly the same
> place in the parser), so that (among other things) you would not have
> to quote \$val like that.

Would it be hard to accomplish? Because currently the ${...} contents
are passed fairly unparsed to stringsubst or paramsubst.

> On the other hand the "var='...'; echo ${(x)var}" example seems
> reasonable and would enable those other two uses as a side-effect.

I also like that method. I think that the need for the quoting would
be natural because of the way :- works. It would even be less natural
to not need to quote.

> I still have a nagging feeling that it should be more like the
> (e^...^) globbing flag, in particular the part about returning arrays
> through reply=(...) but also whether it might look like
> ${(x^code^)var} where "code" would receive the current value of the
> substitution as $REPLY and return the new value in $reply.  Your "for"
> example could still I think come out like:
>
> ${(x^eval $REPLY^):-for val (test test3) {
> reply=\$val
> }}

I'm not following the example. Why is there reply= and not reply+=? Why
is it reply that's altered in the :- part, while (x) uses REPLY?

> Other things that need to be thought about before this gets a go/no-go
> are nested substitutions and how to fit (x) into the order-of-events
> subsect(Rules) as laid out in expn.yo.

I think that the (x) flag should be at the top of the list, first.

-- 
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-10 18:21             ` Sebastian Gniazdowski
@ 2019-09-10 19:38               ` Bart Schaefer
  2019-09-12  0:08                 ` Sebastian Gniazdowski
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Schaefer @ 2019-09-10 19:38 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list

On Tue, Sep 10, 2019 at 11:21 AM Sebastian Gniazdowski
<sgniazdowski@gmail.com> wrote:
>
> On Tue, 10 Sep 2019 at 07:29, Bart Schaefer <schaefer@brasslantern.com> wrote:
> >
> > parser tokens, so the stuff that follows them can be parsed at the
> > statement level rather than gobbled up by the parameter substitution
> > code.
>
> Would it be hard to accomplish?

Probably not -- it could mostly share the code for $(...), it would
just have to have different opening and closing tokens.  Without
having given it a huge amount of thought, I suspect I might have
chosen $(|...) instead of ${|...} had I been implementing for mksh ...
but I suppose the idea is that parens imply a subshell where braces
imply the current shell.

> > I still have a nagging feeling that it should be more like the
> > (e^...^) globbing flag, in particular the part about returning arrays
> > through reply=(...) but also whether it might look like
> > ${(x^code^)var} where "code" would receive the current value of the
> > substitution as $REPLY and return the new value in $reply.  Your "for"
> > example could still I think come out like:
> >
> > ${(x^eval $REPLY^):-for val (test test3) {
> > reply=\$val
> > }}
>
> I'm not following the example. Why there's reply= and not reply+=? Why
> in the :- it's reply that's altered, while in (x) there's REPLY?

I'm writing by analogy to the (e) glob qualifier.  Consider:

% touch 'echo HELLO WORLD; reply=HERE'
% print e*(e^'eval $REPLY'^)
HELLO WORLD
HERE

So ${(x^'eval $REPLY'^)var} would be analogous to (minus the implied fork)
  $(REPLY="$var"
    eval $REPLY
    print -r -- "${reply[@]}")

If you change ${var} to ${:-string} then you get
  $(REPLY="string"
    eval $REPLY
    print -r -- "${reply[@]}")
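
As a rough illustration of the expansion above, the same three steps can
be run directly in the current shell, without the enclosing $(...) and so
without a fork. This is only a sketch of what the proposed (x) flag would
do, not an implementation; it assumes var holds the string that served as
the file name in the glob example, and it is written to behave the same
in zsh and bash.

```shell
# Sketch only: spell out what the proposed ${(x^'eval $REPLY'^)var} would do.
var='echo HELLO WORLD; reply=HERE'   # assumed value, taken from the glob example

unset reply                  # output parameter starts out unset
REPLY=$var                   # input: the current value of the substitution
eval "$REPLY"                # the code given between the ^ delimiters
printf '%s\n' "${reply[@]}"  # result: whatever the code left in reply
```

The eval prints HELLO WORLD as a side effect, and the substitution result
is the single word HERE, as in the glob case.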

> > Other things that need to be thought about before this gets a go/no-go
> > are nested substitutions and how to fit (x) into the order-of-events
> > subsect(Rules) as laid out in expn.yo.
>
> I think that the (x) flag should be at the top of the list, first.

That can't be right.  It's got to at least be after nested
substitution or the parsing for ${...:-...} and similar doesn't make
sense, and it's probably got to be after (P) flag handling if not also
after double-quoted joining.


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-10 19:38               ` Bart Schaefer
@ 2019-09-12  0:08                 ` Sebastian Gniazdowski
  2019-09-12  1:03                   ` Bart Schaefer
  0 siblings, 1 reply; 32+ messages in thread
From: Sebastian Gniazdowski @ 2019-09-12  0:08 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh hackers list

On Tue, 10 Sep 2019 at 21:38, Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Tue, Sep 10, 2019 at 11:21 AM Sebastian Gniazdowski
> <sgniazdowski@gmail.com> wrote:
>
> > Would it be hard to accomplish?
>
> Probably not -- it could mostly share the code for $(...), it would
> just have to have different opening and closing tokens.  Without
> having given it a huge amount of thought, I suspect I might have
> chosen $(|...) instead of ${|...} had I been implementing for mksh ...
> but I suppose the idea is that parens imply a subshell where braces
> imply the current shell.

So you've implemented ${|code;} for mksh?

> > > ${(x^eval $REPLY^):-for val (test test3) {
> > > reply=\$val
> > > }}
> >
> > I'm not following the example. Why there's reply= and not reply+=? Why
> > in the :- it's reply that's altered, while in (x) there's REPLY?
>
> I'm writing by analogy to the (e) glob qualifier.  Consider:
>
> % touch 'echo HELLO WORLD; reply=HERE'
> % print e*(e^'eval $REPLY'^)
> HELLO WORLD
> HERE

Ok, but this could use ...RLD; REPLY=HERE'. The use of reply is just
to simplify, i.e. to use different vars for the (x) embedded code and
for the substitution code?

> So ${(x^'eval $REPLY'^)var} would be analogous to (minus the implied fork)
>   $(REPLY="$var"
>     eval $REPLY
>     print -r -- "${reply[@]}")
>
> If you change ${var} to ${:-string} then you get
>   $(REPLY="string"
>     eval $REPLY
>     print -r -- "${reply[@]}")

Ahso. So the ${(x^'eval $REPLY'^)var} in the end could be ${(x^'eval
func^):-} ? I.e. the code to evaluate would be provided within the (x)
delimiters, and the connection with REPLY <-> substitution value (i.e.
$var or :-string) would be additional?

> > I think that the (x) flag should be at the top of the list, first.
>
> That can't be right.  It's got to at least be after nested
> substitution or the parsing for ${...:-...} and similar doesn't make
> sense, and it's probably got to be after (P) flag handling if not also
> after double-quoted joining.

Ok, I was too hasty. The :- is point 7:

       7. Modifiers
              Any modifiers, as specified by a trailing `#', `%', `/' (possi‐
              bly doubled) or by a set of modifiers of the form  `:...'  (see
              the  section  `Modifiers'  in the section `History Expansion'),
              are applied to the words of the value at this level.

That seems to be a good place to put (x), but with the point broken
down into :- and the other modifiers: the latter modify the value, and
it makes more sense for them to modify the output of the evaluated code
than the code itself.

All the following points seem to operate in an on-value-like fashion, so
they should go after (x). The preceding ones, OTOH, operate in a
select-value-like fashion, e.g.:
- subscripting: in ${(x)arr[2]} it makes more sense for the subscript to
operate on $arr than on $REPLY (the evaluation result),
- (P): in ${(xP)var} it makes more sense for P to operate on $var than
on the result of the evaluation. What would it mean to provide a
parameter name for P through an evaluation of code? It does sound
tempting, but the clear situation that comes from P always being first
is safer. A programmer who wants to provide a parameter name through
evaluation can do the evaluation manually before the ${(P)var}. That
will be possible even in an on-array mapping situation:

arr=( "REPLY=param1" "REPLY=param2" )
: ${arr[@]/(#m)*/${${param::=${(x)MATCH}}:+}${(P)param}}

- the same exception as for (P) in point "2. Internal parameter flags"
seems to make sense also for (x)

-- 
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-12  0:08                 ` Sebastian Gniazdowski
@ 2019-09-12  1:03                   ` Bart Schaefer
  2019-09-12  2:06                     ` Sebastian Gniazdowski
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Schaefer @ 2019-09-12  1:03 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list

On Wed, Sep 11, 2019 at 5:09 PM Sebastian Gniazdowski
<sgniazdowski@gmail.com> wrote:
>
> So you've implemented ${|code;} for mksh?

Good grief, no.  I have never even (knowingly) USED mksh.  I just said
that IF it had been me, I probably would have made a different syntax
choice.

> On Tue, 10 Sep 2019 at 21:38, Bart Schaefer <schaefer@brasslantern.com> wrote:
> >
> > I'm writing by analogy to the (e) glob qualifier.  Consider:
> >
> > % touch 'echo HELLO WORLD; reply=HERE'
> > % print e*(e^'eval $REPLY'^)
> > HELLO WORLD
> > HERE
>
> Ok, but this could use ...RLD; REPLY=HERE'. The use of reply is just
> to simplify, i.e. to use different vars for the (x) embedded code and
> for the substitution code?

Again, by analogy to the glob qualifier -- the input ($REPLY) is a
scalar, the output ($reply) is an array (which might or might not get
string-ified by double-quoting).

In an (e) glob, REPLY is set to each possible file name in turn, but
$reply can provide multiple names to the final result.  I'm assuming
you want ${(x...)array} to apply to each array element in turn, not to
the entire array at once, so again REPLY is a scalar (each array
element) but could be replaced by multiple elements in $reply.

> Ahso. So the ${(x^'eval $REPLY'^)var} in the end could be ${(x^'eval
> func^):-} ? I.e. the code to evaluate would be provided within the (x)
> delimiters, and the connection with REPLY <-> substitution value (i.e.
> $var or :-string) would be additional?

I think you grasped it, yes.

Whether the syntax ${(x+func)...} would call "func" once for each
array element (again by analogy to glob (e+func)) would be an
implementation choice.  Too bad (x) for globs and (e) for parameters
already have other meanings, so there's no way to make them use the
identical key character.

> > > I think that the (x) flag should be at the top of the list, first.
> >
> > That can't be right.
>
> Ok, I've did hurry too much. The :- is point 7.

This sounds more sensible.


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-12  1:03                   ` Bart Schaefer
@ 2019-09-12  2:06                     ` Sebastian Gniazdowski
  2019-09-12  5:35                       ` Bart Schaefer
  0 siblings, 1 reply; 32+ messages in thread
From: Sebastian Gniazdowski @ 2019-09-12  2:06 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh hackers list

On Thu, 12 Sep 2019 at 03:03, Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Wed, Sep 11, 2019 at 5:09 PM Sebastian Gniazdowski
> <sgniazdowski@gmail.com> wrote:
> >
> > So you've implemented ${|code;} for mksh?
>
> Good grief, no.  I have never even (knowingly) USED mksh.  I just said
> that IF it had been me, I probably would have made a different syntax
> choice.

Ah, I've misread it then.

> > Ok, but this could use ...RLD; REPLY=HERE'. The use of reply is just
> > to simplify, i.e. to use different vars for the (x) embedded code and
> > for the substitution code?
>
> Again, by analogy to the glob qualifier -- the input ($REPLY) is a
> scalar, the output ($reply) is an array (which might or might not get
> string-ified by double-quoting).
>
> In an (e) glob, REPLY is set to each possible file name in turn, but
> $reply can provide multiple names to the final result.  I'm assuming
> you want ${(x...)array} to apply to each array element in turn, not to
> the entire array at once, so again REPLY is a scalar (each array
> element) but could be replaced by multiple elements in $reply.

That's interesting. The mapping could then look like:

array=( "a" "b c" "d" )
func() { reply=( ${=REPLY} ); }
array=( ${(x^func^)array} )

and it would result in FOUR array elements, not three. This is a
concrete extension of the mapping via //(#m)*/${(x):-func}, which
can neither extend the array nor contract it. I think that in order to
allow contraction of the result, the (x) flag could use only reply,
without REPLY (unlike (e)), unless we decide that merely setting reply
invalidates REPLY and so allows the empty result.

But also, the advanced (x) flag is quite complex. It might be a good
reason to lower the complexity by making the flag substitute only
reply contents.

> I think you grasped it, yes.
>
> Whether the syntax ${(x+func)...} would call "func" once for each
> array element (again by analogy to glob (e+func)) would be an
> implementation choice.  Too bad (x) for globs and (e) for parameters
> already have other meanings, so there's no way to make them use the
> identical key character.

Why only one + in the examples? (I've tried this syntax with (e), it
doesn't seem to support it).

I think that yes, it should call func once per array element, to
allow a mapping that can expand or contract the array. Also, I wonder
what other interesting things can result from providing code and data
to the substitution in two separate steps.

> > > > I think that the (x) flag should be at the top of the list, first.
> > >
> > > That can't be right.
> >
> > Ok, I've did hurry too much. The :- is point 7.
>
> This sounds more sensible.

Great. It seems that we're getting close to a final "draft".

-- 
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-12  2:06                     ` Sebastian Gniazdowski
@ 2019-09-12  5:35                       ` Bart Schaefer
  2019-09-12  6:00                         ` Sebastian Gniazdowski
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Schaefer @ 2019-09-12  5:35 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list

On Wed, Sep 11, 2019 at 7:07 PM Sebastian Gniazdowski
<sgniazdowski@gmail.com> wrote:
>
> On Thu, 12 Sep 2019 at 03:03, Bart Schaefer <schaefer@brasslantern.com> wrote:
> >
> > In an (e) glob, REPLY is set to each possible file name in turn, but
> > $reply can provide multiple names to the final result.
>
> array=( "a" "b c" "d" )
> func() { reply=( ${=REPLY} ); }
> array=( ${(x^func^)array} )
>
> and it would result in FOUR array elements, not three. [...]
> I think that in order to
> allow contraction of the result, the (x) flag could use only reply,
> without REPLY as opposed to the (e), unless we do that reply being
> just set invalidates REPLY and allows the empty result.

Hm, perhaps you're still confused about something.  The value of
$REPLY means nothing after the code has run -- it's strictly an input.
$reply is the only output.  So if we were to do as I'm suggesting,

array=( "a" "b c" "d" )
func() {
  if (( ${#${=REPLY}} > 1 ))
  then reply=()
  else reply=( $REPLY )
  fi
}
array=( ${(x^func^)array} )

would result in two elements ("a" "d").

Oh, I've forgotten an important bit -- the return value of the code
matters as well.  The full semantics is:
1. on entry, REPLY is set to one element and reply is unset
2. If the code returns nonzero (false, failure) then the result is the
empty array
   (so the corresponding element is removed from the input set)
3. else if reply has become set, that array is used (even if empty)
   (so the corresponding element may become zero, one, or more elements)
4. else if REPLY is set, the value of REPLY is used
   (so the corresponding element changes, possibly to an empty string)
5. else the original element is unchanged
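
The five rules above can be emulated for a list of elements roughly as
follows. This is only a sketch under stated assumptions: map_x and
map_x_result are hypothetical names (no such builtin exists), the callback
is a plain function name rather than a code string, and the code is
written to run in either zsh or bash rather than to mirror how the flag
would be implemented in C.

```shell
# Hypothetical emulation of the proposed per-element rules.
# Usage: map_x FUNC ELEMENT...   (result is left in the array map_x_result)
map_x() {
  local fn=$1 elem
  shift
  map_x_result=()
  for elem in "$@"; do
    REPLY=$elem            # rule 1: REPLY holds one element ...
    unset reply            # ... and reply is unset on entry
    if ! "$fn"; then
      continue             # rule 2: nonzero status removes the element
    fi
    if declare -p reply >/dev/null 2>&1; then
      map_x_result+=("${reply[@]}")   # rule 3: reply is set, use it even if empty
    elif [ -n "${REPLY+set}" ]; then
      map_x_result+=("$REPLY")        # rule 4: otherwise use REPLY if still set
    else
      map_x_result+=("$elem")         # rule 5: otherwise keep the original element
    fi
  done
}

# Same idea as func above, written portably: drop elements containing a space.
drop_multiword() {
  case $REPLY in
    *' '*) reply=() ;;
    *)     reply=("$REPLY") ;;
  esac
}

map_x drop_multiword "a" "b c" "d"
printf '%s\n' "${map_x_result[@]}"
```

With the three elements shown, the result holds two elements, a and d,
which matches the outcome described for the example above.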

> > Whether the syntax ${(x+func)...} would call "func" once for each
> > array element (again by analogy to glob (e+func))
>
> Why only one + in the examples? (I've tried this syntax with (e), it
> doesn't seem to support it).

Oops, I typo'd.  As a glob qualifier, (+func) is shorthand for
(e:func:); you don't write (e+).  I don't think we can get away with
using a bare leading "+" like that in parameter flags.

Here are examples globbing in my zsh source (gmail is probably going
to line wrap some of this, sorry):

% i=0; echo *(oNe:'REPLY=$((++i))':)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
% echo *(e:'() return 1':)
zsh: no matches found: *(e:() return 1:)
% func() { if (( $#REPLY > 4 )); then REPLY=; fi }
% echo *(oNP:,:+func)
, Misc ,  ,  ,  ,  , Test , Util ,  ,  ,  ,  ,  ,  ,  ,  , Etc ,  ,  ,
 ,  ,  ,  ,  ,  ,  , NEWS ,  ,  ,  ,  ,  ,  ,  ,  , Doc ,  ,  , zsh ,
,  ,  , Src
% func() { if (( $#REPLY > 4 )); then reply=(); fi }
% echo *(+func)
Doc Etc Misc NEWS Src Test Util zsh


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-12  5:35                       ` Bart Schaefer
@ 2019-09-12  6:00                         ` Sebastian Gniazdowski
  2019-09-12  6:55                           ` Bart Schaefer
  0 siblings, 1 reply; 32+ messages in thread
From: Sebastian Gniazdowski @ 2019-09-12  6:00 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh hackers list

On Thu, 12 Sep 2019 at 07:35, Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Wed, Sep 11, 2019 at 7:07 PM Sebastian Gniazdowski
> <sgniazdowski@gmail.com> wrote:
> > (...)
> > and it would result in FOUR array elements, not three. [...]
> > I think that in order to
> > allow contraction of the result, the (x) flag could use only reply,
> > without REPLY as opposed to the (e), unless we do that reply being
> > just set invalidates REPLY and allows the empty result.
>
> Hm, perhaps you're still confused about something.  The value of
> $REPLY means nothing after the code has run -- it's strictly an input.
> $reply is the only output.

Yes, I was opting for such a settlement; however, what you wrote seems
to contradict:

> 4. else if REPLY is set, the value of REPLY is used
>    (so the corresponding element changes, possibly to an empty string)
--
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-12  6:00                         ` Sebastian Gniazdowski
@ 2019-09-12  6:55                           ` Bart Schaefer
  2019-09-13 20:28                             ` Sebastian Gniazdowski
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Schaefer @ 2019-09-12  6:55 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list

On Wed, Sep 11, 2019 at 11:01 PM Sebastian Gniazdowski
<sgniazdowski@gmail.com> wrote:
>
> On Thu, 12 Sep 2019 at 07:35, Bart Schaefer <schaefer@brasslantern.com> wrote:
> >
> > $REPLY means nothing after the code has run -- it's strictly an input.
> > $reply is the only output.
>
> Yes I was opting for such settlement, however what you wrote seems
> contradict to:
>
> > 4. else if REPLY is set, the value of REPLY is used
> >    (so the corresponding element changes, possibly to an empty string)

Yes, I was imprecise/absentminded in the former paragraph ... that
first sentence should have said:  REPLY means nothing if reply is set
after the code has run.

As the examples demonstrated, the latter paragraph numbered 4 is the
correct interpretation.


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-12  6:55                           ` Bart Schaefer
@ 2019-09-13 20:28                             ` Sebastian Gniazdowski
  2019-09-13 21:33                               ` Bart Schaefer
  0 siblings, 1 reply; 32+ messages in thread
From: Sebastian Gniazdowski @ 2019-09-13 20:28 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh hackers list

Ok, thanks for the clarification. I've gathered the information in a
markdown document:

https://github.com/psprint/zsh/blob/master/x-flag.md

It seems fairly complete; now someone who knows how to evaluate
code would have to implement it or provide hints, as my attempts at
using bin_eval (the eval function that's behind it is declared static)
didn't result in a working ${|...;} substitution.

On Thu, 12 Sep 2019 at 08:55, Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Wed, Sep 11, 2019 at 11:01 PM Sebastian Gniazdowski
> <sgniazdowski@gmail.com> wrote:
> >
> > On Thu, 12 Sep 2019 at 07:35, Bart Schaefer <schaefer@brasslantern.com> wrote:
> > >
> > > $REPLY means nothing after the code has run -- it's strictly an input.
> > > $reply is the only output.
> >
> > Yes I was opting for such settlement, however what you wrote seems
> > contradict to:
> >
> > > 4. else if REPLY is set, the value of REPLY is used
> > >    (so the corresponding element changes, possibly to an empty string)
>
> Yes, I was imprecise/absentminded in the former paragraph ... that
> first sentence should have said:  REPLY means nothing if reply is set
> after the code has run.
>
> As the examples demonstrated, the latter paragraph numbered 4 is the
> correct interpretation.



-- 
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-13 20:28                             ` Sebastian Gniazdowski
@ 2019-09-13 21:33                               ` Bart Schaefer
  2019-09-13 21:36                                 ` Bart Schaefer
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Schaefer @ 2019-09-13 21:33 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list

On Fri, Sep 13, 2019 at 1:28 PM Sebastian Gniazdowski
<sgniazdowski@gmail.com> wrote:
>
> It seems fairly complete, now only someone who knows how to evaluate
> code would have to implement it or provide hints

Again, this is intended to be very similar to the (e) glob operator,
so have a look at glob_exec_string in Src/glob.c


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-13 21:33                               ` Bart Schaefer
@ 2019-09-13 21:36                                 ` Bart Schaefer
  2019-09-14  0:41                                   ` Sebastian Gniazdowski
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Schaefer @ 2019-09-13 21:36 UTC (permalink / raw)
  To: Sebastian Gniazdowski; +Cc: Zsh hackers list

On Fri, Sep 13, 2019 at 2:33 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Fri, Sep 13, 2019 at 1:28 PM Sebastian Gniazdowski
> <sgniazdowski@gmail.com> wrote:
> >
> > It seems fairly complete, now only someone who knows how to evaluate
> > code would have to implement it or provide hints
>
> Again, this is intended to be very similar to the (e) glob operator,
> so have a look at glob_exec_string in Src/glob.c

Sorry, that's just finding the string to be exec'd.  The actual magic
in glob.c happens right after this comment:
/* Parsed OK, execute for each name */


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-13 21:36                                 ` Bart Schaefer
@ 2019-09-14  0:41                                   ` Sebastian Gniazdowski
  2019-09-14  0:44                                     ` Sebastian Gniazdowski
  0 siblings, 1 reply; 32+ messages in thread
From: Sebastian Gniazdowski @ 2019-09-14  0:41 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh hackers list

On Fri, 13 Sep 2019 at 23:36, Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> On Fri, Sep 13, 2019 at 2:33 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
> >
> > On Fri, Sep 13, 2019 at 1:28 PM Sebastian Gniazdowski
> > <sgniazdowski@gmail.com> wrote:
> > >
> > > It seems fairly complete, now only someone who knows how to evaluate
> > > code would have to implement it or provide hints
> >
> > Again, this is intended to be very similar to the (e) glob operator,
> > so have a look at glob_exec_string in Src/glob.c
>
> Sorry, that's just finding the string to be exec'd.  The actual magic
> in glob.c happens right after this comment:
> /* Parsed OK, execute for each name */



-- 
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2019-09-14  0:41                                   ` Sebastian Gniazdowski
@ 2019-09-14  0:44                                     ` Sebastian Gniazdowski
  0 siblings, 0 replies; 32+ messages in thread
From: Sebastian Gniazdowski @ 2019-09-14  0:44 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh hackers list

> On Fri, 13 Sep 2019 at 23:36, Bart Schaefer <schaefer@brasslantern.com> wrote:
> >
> > On Fri, Sep 13, 2019 at 2:33 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
> > >
> > > On Fri, Sep 13, 2019 at 1:28 PM Sebastian Gniazdowski
> > > <sgniazdowski@gmail.com> wrote:
> > > >
> > > > It seems fairly complete, now only someone who knows how to evaluate
> > > > code would have to implement it or provide hints
> > >
> > > Again, this is intended to be very similar to the (e) glob operator,
> > > so have a look at glob_exec_string in Src/glob.c
> >
> > Sorry, that's just finding the string to be exec'd.  The actual magic
> > in glob.c happens right after this comment:
> > /* Parsed OK, execute for each name */

(Oops, the previous email was sent by accident.)

Thanks, the code looks very accessible. I should have the feature
implemented tomorrow when I have some free time.

-- 
Sebastian Gniazdowski
News: https://twitter.com/ZdharmaI
IRC: https://kiwiirc.com/client/chat.freenode.net:+6697/#zplugin
Blog: http://zdharma.org


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2023-07-06  4:00           ` Bart Schaefer
@ 2023-07-18  3:14             ` Bart Schaefer
  0 siblings, 0 replies; 32+ messages in thread
From: Bart Schaefer @ 2023-07-18  3:14 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 3058 bytes --]

On Mon, Jul 3, 2023 at 6:55 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> Also in that thread Stephane points out that it would be ideal if
> (paraphrasing his example) "${| echo { }" were parsed similarly to "{
> echo { }" and that to do so would require changes to the parser

More specifically, it's going to require changes to the tokenizer.
There would need to be an analog of skipcomm() that knows how to stop
at the appropriately significant Outbrace instead of at Outpar, and
similarly a skipparens() analog that does the same for a string.  For
the nonce it gets most of the way there to require that braces inside
${|...} are either balanced or quoted.

One of the harder things to get right is the "%_" prompt replacement
for this.  I ended up with "braceparam cursh" because that's otherwise
impossible, as opposed to "braceparam cmdsubst" or just "cmdsubst" or
even "cmdsubst cursh" any of which could result from other valid
syntaxes.

As an aside ... compatibility-wise we're stuck with using braces for
this, but why not parens like $(|...) instead, by analogy with
$(<...)?  Parsing would be so much ... saner ... and the whole
fake-it-with $REPLY bit could be skipped, go straight to capturing
stdout.

On Tue, Jul 4, 2023 at 10:32 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> -- REPLY is treated as a local, i.e., its value gets saved and
> restored around the substitution

I've implemented this, but after experimenting with it both ways, I've
done so only for REPLY, that is, other cases of $VAR as in ${|VAR|...}
remain non-localized.  Again I wonder whether instead of being
initialized as unset, REPLY should be initialized to the value from
the calling scope, so that ${+REPLY} is usable.  The drawback of
course is that the called code must then explicitly clobber $REPLY if
an empty result is desired.
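
To make the save/restore behaviour concrete, here is a rough emulation in
portable shell. It is not the C code from the attached patch, only a
sketch of the semantics described above; nofork_reply and nofork_result
are hypothetical names, and the "substitution" is modelled as a command
that leaves its answer in REPLY.

```shell
# Hypothetical helper: run a command with REPLY initially unset, capture the
# REPLY it produces, then restore the caller's REPLY (or its unset state).
nofork_reply() {
  local saved_set=0 saved
  if [ -n "${REPLY+set}" ]; then
    saved_set=1
    saved=$REPLY
  fi
  unset REPLY              # the command starts with REPLY unset
  "$@"
  nofork_result=${REPLY-}  # empty result if the command never set REPLY
  if [ "$saved_set" -eq 1 ]; then
    REPLY=$saved           # caller's value is restored afterwards
  else
    unset REPLY
  fi
}

REPLY=outer
set_reply() { REPLY=inner; }
nofork_reply set_reply
printf '%s %s\n' "$nofork_result" "$REPLY"
```

Initializing REPLY from the calling scope instead would amount to dropping
the "unset REPLY" line, at which point a command wanting an empty result
would have to clear REPLY itself.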

> -- "local" works inside the substitution as it would inside a function
> body, but $@ refers to the calling environment.

Implemented this.  Combined with the foregoing, this means that
${|VAR|local VAR; ...} substitutes the value of $VAR from the caller,
not from the command (so, don't do that unless you mean it).

> -- "return" also behaves as if in a function body.

Also implemented this.  Following Chet's lead, though, "exit" does
exit the shell.  I could instead make "exit" work like ${notset?error}
and thus exit only if the shell is not interactive.  Thoughts?

I have not yet implemented the anonymous tempfile for ${
capture_stdout }.  However, this equivalent construction does work
(extra newlines for clarity):
  ${|
    () {
      capture_stdout > $1
      REPLY=$(<$1)
    } =(</dev/null)
  }

Both $(<...) and =(<...) are non-forking shortcuts already, so this
acts as proof of concept.

Should test cases go in D04parameter, or D08cmdsubst, or a new D10 file?

Attached patch should apply in either order with the named references
patch from workers/51945 although there will be line-number fuzz in
the docs.

[-- Attachment #2: nofork.txt --]
[-- Type: text/plain, Size: 9860 bytes --]

diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo
index 7bc736470..44867e655 100644
--- a/Doc/Zsh/expn.yo
+++ b/Doc/Zsh/expn.yo
@@ -1875,23 +1875,51 @@ sect(Command Substitution)
 cindex(command substitution)
 cindex(substitution, command)
 A command enclosed in parentheses preceded by a dollar sign, like
-`tt($LPAR())...tt(RPAR())', or quoted with grave
-accents, like `tt(`)...tt(`)', is replaced with its standard output, with
-any trailing newlines deleted.
-If the substitution is not enclosed in double quotes, the
-output is broken into words using the tt(IFS) parameter.
+`tt($LPAR())...tt(RPAR())', or quoted with grave accents, like
+`tt(`)...tt(`)', is executed in a subshell and replaced by its
+standard output, with any trailing newlines deleted.  If the
+substitution is not enclosed in double quotes, the output is broken
+into words using the tt(IFS) parameter.
 vindex(IFS, use of)
 
 The substitution `tt($LPAR()cat) var(foo)tt(RPAR())' may be replaced
 by the faster `tt($LPAR()<)var(foo)tt(RPAR())'.  In this case var(foo)
 undergoes single word shell expansions (em(parameter expansion),
 em(command substitution) and em(arithmetic expansion)), but not
-filename generation.
+filename generation.  No subshell is created.
 
 If the option tt(GLOB_SUBST) is set, the result of any unquoted command
 substitution, including the special form just mentioned, is eligible for
 filename generation.
 
+A command with a leading pipe character, enclosed in braces prefixed by
+a dollar sign, as in `tt(${|)...tt(})', is executed in the current shell
+context, rather than in a subshell, and is replaced by the value of the
+parameter tt(REPLY) at the end of the command.  There em(must not) be
+any whitespace between the opening brace and the pipe character.  Any
+prior value of tt($REPLY) is saved and restored around this substitution,
+in the manner of a function local parameter.  Other parameters declared
+within the substitution also behave as locals, as if in a function,
+unless `tt(typeset -g)' is used.  Trailing newlines are em(not) deleted
+from the final replacement in this case, and it is subject to filename
+generation in the same way as `tt($LPAR())...tt(RPAR())' but is em(not)
+split on tt(IFS) unless the tt(SH_WORD_SPLIT) option is set.
+
+Substitutions of the form `tt(${|)var(param)tt(|)...tt(})' are similar,
+except that the substitution is replaced by the value of the parameter
+named by var(param).  No implicit save or restore applies to var(param)
+except as noted for tt(REPLY), and var(param) should em(not) be declared
+within the command.  If var(param) names an array, array expansion rules
+apply.
+
+COMMENT(To be implemented later:
+A command enclosed in braces preceded by a dollar sign, and set off from
+the braces by whitespace, like `tt(${ )...tt( })', is replaced by its
+standard output.  Like `tt(${|)...tt(})' and unlike
+`tt($LPAR())...tt(RPAR())', the command executes in the current shell
+context with function local behaviors and does not create a subshell.
+)
+
 texinode(Arithmetic Expansion)(Brace Expansion)(Command Substitution)(Expansion)
 sect(Arithmetic Expansion)
 cindex(arithmetic expansion)
diff --git a/Src/lex.c b/Src/lex.c
index 2f7937410..73d94264e 100644
--- a/Src/lex.c
+++ b/Src/lex.c
@@ -937,7 +937,7 @@ static enum lextok
 gettokstr(int c, int sub)
 {
     int bct = 0, pct = 0, brct = 0, seen_brct = 0, fdpar = 0;
-    int intpos = 1, in_brace_param = 0;
+    int intpos = 1, in_brace_param = 0, cmdsubst = 0;
     int inquote, unmatched = 0;
     enum lextok peek;
 #ifdef DEBUG
@@ -1157,8 +1157,11 @@ gettokstr(int c, int sub)
 	    if (in_brace_param) {
 		cmdpop();
 	    }
-	    if (bct-- == in_brace_param)
-		in_brace_param = 0;
+	    if (bct-- == in_brace_param) {
+		if (cmdsubst)
+		    cmdpop();
+		in_brace_param = cmdsubst = 0;
+	    }
 	    c = Outbrace;
 	    break;
 	case LX2_COMMA:
@@ -1405,10 +1408,15 @@ gettokstr(int c, int sub)
        }
        add(c);
        c = hgetc();
-	if (intpos)
+       if (intpos)
 	    intpos--;
-	if (lexstop)
+       if (lexstop)
 	    break;
+       if (!cmdsubst && in_brace_param && act == LX2_STRING &&
+	   (c == '|' || c == Bar || inblank(c))) {
+	   cmdsubst = in_brace_param;
+	   cmdpush(CS_CURSH);
+       }
     }
   brk:
     if (errflag) {
@@ -1459,7 +1467,7 @@ gettokstr(int c, int sub)
 static int
 dquote_parse(char endchar, int sub)
 {
-    int pct = 0, brct = 0, bct = 0, intick = 0, err = 0;
+    int pct = 0, brct = 0, bct = 0, intick = 0, err = 0, cmdsubst = 0;
     int c;
     int math = endchar == ')' || endchar == ']' || infor;
     int zlemath = math && zlemetacs > zlemetall + addedx - inbufct;
@@ -1529,11 +1537,21 @@ dquote_parse(char endchar, int sub)
 		c = Qstring;
 	    }
 	    break;
+	case '{':
+	    if (cmdsubst && !intick) {
+		/* In nofork substitution, tokenize as if unquoted */
+		c = Inbrace;
+		bct++;
+	    }
+	    break;
 	case '}':
 	    if (intick || !bct)
 		break;
 	    c = Outbrace;
-	    bct--;
+	    if (bct-- == cmdsubst) {
+		cmdpop();
+		cmdsubst = 0;
+	    }
 	    cmdpop();
 	    break;
 	case '`':
@@ -1588,12 +1606,24 @@ dquote_parse(char endchar, int sub)
 	if (err || lexstop)
 	    break;
 	add(c);
+	if (!cmdsubst && c == Inbrace) {
+	    /* Check for ${|...} nofork command substitution */
+	    if ((c = hgetc())) {
+		if (c == '|' || inblank(c)) {
+		    cmdsubst = bct;
+		    cmdpush(CS_CURSH);
+		}
+		hungetc(c);
+	    }
+	}
     }
     if (intick == 2)
 	ALLOWHIST
     if (intick) {
 	cmdpop();
     }
+    if (bct && bct == cmdsubst)
+	cmdpop();
     while (bct--)
 	cmdpop();
     if (lexstop)
diff --git a/Src/subst.c b/Src/subst.c
index 14947ae36..92a53d99a 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -1860,6 +1860,8 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
      * joining the array into a string (for compatibility with ksh/bash).
      */
     int quoted_array_with_offset = 0;
+    /* Indicates ${|...;} */
+    char *rplyvar = NULL;
 
     *s++ = '\0';
     /*
@@ -1887,8 +1889,104 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
      * flags in parentheses, but also one ksh hack.
      */
     if (c == Inbrace) {
+	/* The command string to be run by ${|...;} */
+	char *cmdarg = NULL;
+	size_t slen = 0;
 	inbrace = 1;
 	s++;
+
+        /* Short-path for the nofork command substitution ${|cmd;}
+	 * See other comments about kludges for why this is here.
+	 *
+         * The command string is extracted and executed, and the
+         * substitution assigned. There's no (...)-flags processing,
+         * i.e. no ${|(U)cmd;}, because it looks quite awful and
+         * should not be part of command substitution in any case.
+         * Use ${(U)${|cmd;}} as you would for ${(U)$(cmd;)}.
+	 */
+	if (*s == '|' || *s == Bar) {
+	    char *outbracep = s;
+	    char sav = *s;
+	    *s = Inbrace;
+	    if (skipparens(Inbrace, Outbrace, &outbracep) == 0) {
+		slen = outbracep - s - 1;
+		if ((*s = sav) != Bar) {
+		    sav = *outbracep;
+		    *outbracep = '\0';
+		    tokenize(s);
+		    *outbracep = sav;
+		}
+	    }
+	}
+	if (slen > 1) {
+	    char *outbracep = s + slen;
+	    if (*outbracep == Outbrace) {
+		if ((rplyvar = itype_end(s+1, INAMESPC, 0))) {
+		    if (*rplyvar == Inbrack &&
+			(rplyvar = parse_subscript(++rplyvar, 1, ']')))
+			++rplyvar;
+		}
+		if (rplyvar == s+1 && *rplyvar == Bar) {
+		    /* Is ${||...} a subtitution error or a syntax error?
+		    zerr("bad substitution");
+		    return NULL;
+		    */
+		    rplyvar = NULL;
+		}
+		if (rplyvar && *rplyvar == Bar) {
+		    cmdarg = dupstrpfx(rplyvar+1, outbracep-rplyvar-1);
+		    rplyvar = dupstrpfx(s+1,rplyvar-s-1);
+		} else {
+		    cmdarg = dupstrpfx(s+1, outbracep-s-1);
+		    rplyvar = "REPLY";
+		}
+		s = outbracep;
+	    }
+	}
+
+	if (rplyvar) {
+	    Param pm;
+	    /* char *rplyval = getsparam("REPLY"); */
+	    startparamscope(); /* "local" behaves as if in a function */
+	    pm = createparam("REPLY", PM_LOCAL|PM_UNSET);
+	    if (pm)	/* Shouldn't createparam() do this? */
+		pm->level = locallevel;
+	    /* if (rplyval) setsparam("REPLY", ztrdup(rplyval)); */
+	}
+
+	if (rplyvar && cmdarg && *cmdarg) {
+	    /* Execute the shell command */
+	    untokenize(cmdarg);
+	    execstring(cmdarg, 1, 0, "cmdsubst");
+	}
+
+	if (rplyvar) {
+	    if (strcmp(rplyvar, "REPLY") == 0) {
+		if ((val = ztrdup(getsparam("REPLY"))))
+		    vunset = 0;
+		else {
+		    vunset = 1;
+		    val = dupstring("");
+		}
+	    } else {
+		s = dyncat(rplyvar, s);
+		rplyvar = NULL;
+	    }
+	    endparamscope();
+	    if (exit_pending) {
+		if (mypid == getpid()) {
+		    /*
+		     * paranoia: don't check for jobs, but there
+		     * shouldn't be any if not interactive.
+		     */
+		    stopmsg = 1;
+		    zexit(exit_val, ZEXIT_NORMAL);
+		} else
+		    _exit(exit_val);
+	    }
+	    retflag = 0; /* "return" behaves as if in a function */
+	}
+
 	/*
 	 * In ksh emulation a leading `!' is a special flag working
 	 * sort of like our (k).  This is true only for arrays or
@@ -2583,14 +2681,14 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
 	 * we let fetchvalue set the main string pointer s to
 	 * the end of the bit it's fetched.
 	 */
-	if (!(v = fetchvalue(&vbuf, (subexp ? &ov : &s),
-			     (wantt ? -1 :
-			      ((unset(KSHARRAYS) || inbrace) ? 1 : -1)),
-			     scanflags)) ||
-	    (v->pm && (v->pm->node.flags & PM_UNSET)) ||
-	    (v->flags & VALFLAG_EMPTY))
+	if (!rplyvar &&
+	    (!(v = fetchvalue(&vbuf, (subexp ? &ov : &s),
+			      (wantt ? -1 :
+			       ((unset(KSHARRAYS) || inbrace) ? 1 : -1)),
+			      scanflags)) ||
+	     (v->pm && (v->pm->node.flags & PM_UNSET)) ||
+	     (v->flags & VALFLAG_EMPTY)))
 	    vunset = 1;
-
 	if (wantt) {
 	    /*
 	     * Handle the (t) flag: value now becomes the type


* Re: [PATCH] Support the mksh's ${|func;} substitution
  2023-07-06  2:27         ` Mikael Magnusson
@ 2023-07-06  4:00           ` Bart Schaefer
  2023-07-18  3:14             ` Bart Schaefer
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Schaefer @ 2023-07-06  4:00 UTC (permalink / raw)
  To: zsh-workers

On Wed, Jul 5, 2023 at 7:28 PM Mikael Magnusson <mikachu@gmail.com> wrote:
>
> % echo "${:-a{}b}"
> a{b}
> ^ People occasionally get confused by this in #zsh, it comes up in
> relation to %F{} or so usually.
> It would probably be bad if this behavior changed.

Thanks; my first tweak would in fact have changed that, but I can
account for it.



* Re: [PATCH] Support the mksh's ${|func;} substitution
  2023-07-06  1:39       ` Bart Schaefer
@ 2023-07-06  2:27         ` Mikael Magnusson
  2023-07-06  4:00           ` Bart Schaefer
  0 siblings, 1 reply; 32+ messages in thread
From: Mikael Magnusson @ 2023-07-06  2:27 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

On 7/6/23, Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Tue, Jul 4, 2023 at 10:32 PM Bart Schaefer <schaefer@brasslantern.com>
> wrote:
>>
>> The first three of those are implementable within the structure of the
>> patch from workers/51898 although it's a little messy because of the
>> multiple "return" points in paramsubst().
>
> Fiddling with this has led me to find a few other issues with either
> of Sebastian's or my earlier patch in this thread, namely, both need
> to be more clever about finding the closing brace of the ${|...}
> substitution.  This in turn leads me to ask if one of the old hands
> with the lexer can explain why, inside double quotes, "}" is always
> replaced by Outbrace but "{" is not replaced by Inbrace unless it is
> preceded by "$"?
>
> A small tweak to the lexer still passes all tests and makes
> skipparens(Inbrace, Outbrace, &str) work more consistently, but I
> don't want to have missed something.

I'm not one of said old hands, but presumably it's because { on its
own is not special inside double quotes? Possibly related,
% echo ${:-a{}b}
a{}b
% echo "${:-a{}b}"
a{b}
^ People occasionally get confused by this in #zsh, it comes up in
relation to %F{} or so usually.
It would probably be bad if this behavior changed.

-- 
Mikael Magnusson



* Re: [PATCH] Support the mksh's ${|func;} substitution
  2023-07-05  5:32     ` Bart Schaefer
  2023-07-05  6:30       ` Bart Schaefer
@ 2023-07-06  1:39       ` Bart Schaefer
  2023-07-06  2:27         ` Mikael Magnusson
  1 sibling, 1 reply; 32+ messages in thread
From: Bart Schaefer @ 2023-07-06  1:39 UTC (permalink / raw)
  To: zsh-workers

On Tue, Jul 4, 2023 at 10:32 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> The first three of those are implementable within the structure of the
> patch from workers/51898 although it's a little messy because of the
> multiple "return" points in paramsubst().

Fiddling with this has led me to find a few other issues with either
of Sebastian's or my earlier patch in this thread, namely, both need
to be more clever about finding the closing brace of the ${|...}
substitution.  This in turn leads me to ask if one of the old hands
with the lexer can explain why, inside double quotes, "}" is always
replaced by Outbrace but "{" is not replaced by Inbrace unless it is
preceded by "$"?

A small tweak to the lexer still passes all tests and makes
skipparens(Inbrace, Outbrace, &str) work more consistently, but I
don't want to have missed something.
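
The closing-brace problem can be sketched in portable shell rather than in the lexer itself. This is only an illustration under assumptions: match_brace is a hypothetical helper, not anything from Src/, and it ignores quoting entirely. It contrasts stopping at the first "}" with counting nested pairs.

```shell
# Hypothetical helper: given the text that follows an already-opened "{",
# print everything up to the "}" that balances it, counting nested pairs.
match_brace() {
    rest=$1 depth=1 out=
    while [ -n "$rest" ]; do
        c=${rest%"${rest#?}"}        # first character of $rest
        rest=${rest#?}               # remainder after that character
        case $c in
            '{') depth=$((depth + 1)) ;;
            '}') depth=$((depth - 1))
                 if [ "$depth" -eq 0 ]; then
                     printf '%s\n' "$out"
                     return 0
                 fi ;;
        esac
        out=$out$c
    done
    return 1                         # no balancing close brace found
}

body='|f() { REPLY=x; }; f} tail'

naive=${body%%\}*}                   # cut at the first "}" seen
matched=$(match_brace "$body")       # cut at the balancing "}"

printf 'naive:   %s\n' "$naive"
printf 'matched: %s\n' "$matched"
```

The naive cut stops inside the function body, while the depth-counting scan returns the whole command. A real implementation also has to account for quoting and tokenized characters, which this sketch does not attempt.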



* Re: [PATCH] Support the mksh's ${|func;} substitution
  2023-07-05  5:32     ` Bart Schaefer
@ 2023-07-05  6:30       ` Bart Schaefer
  2023-07-06  1:39       ` Bart Schaefer
  1 sibling, 0 replies; 32+ messages in thread
From: Bart Schaefer @ 2023-07-05  6:30 UTC (permalink / raw)
  To: Lawrence Velázquez; +Cc: zsh-workers

On Tue, Jul 4, 2023 at 10:32 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> -- REPLY is treated as a local, i.e., its value gets saved and
> restored around the substitution

(Quoting the initial message on the bash nofork thread)
> Bash creates 'REPLY' as an initially-unset
> local variable when COMMAND executes, and restores 'REPLY' to the value
> it had before the command substitution after COMMAND completes, as with
> any local variable.

This has the side-effect that testing the enclosing-scope value of
$REPLY from inside the substitution is impossible.  I'm not sure how I
feel about that.
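
A rough emulation of that save/unset/restore behavior in portable shell shows the side effect. This is a sketch under assumptions, not how bash or zsh implement it: with_fresh_reply, probe, and SUBST_RESULT are hypothetical names, and the helper uses globals, so it is not reentrant.

```shell
# Hypothetical helper: run a command with REPLY initially unset, capture the
# REPLY it leaves behind in SUBST_RESULT, then restore the caller's REPLY.
with_fresh_reply() {
    if [ "${REPLY+set}" = set ]; then
        saved_reply=$REPLY had_reply=1
    else
        saved_reply= had_reply=0
    fi
    unset REPLY                      # "initially-unset local"
    "$@"
    SUBST_RESULT=${REPLY-}
    if [ "$had_reply" -eq 1 ]; then
        REPLY=$saved_reply           # restore the enclosing-scope value
    else
        unset REPLY
    fi
}

# The command tries to look at the enclosing REPLY but cannot see it.
probe() { REPLY="inner saw: ${REPLY-nothing}"; }

REPLY=outer
with_fresh_reply probe
printf '%s\n' "$SUBST_RESULT"
printf '%s\n' "$REPLY"
```

The command sees REPLY as unset even though the caller had set it, and the caller's value comes back afterwards.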



* Re: [PATCH] Support the mksh's ${|func;} substitution
  2023-07-04 18:53   ` Lawrence Velázquez
@ 2023-07-05  5:32     ` Bart Schaefer
  2023-07-05  6:30       ` Bart Schaefer
  2023-07-06  1:39       ` Bart Schaefer
  0 siblings, 2 replies; 32+ messages in thread
From: Bart Schaefer @ 2023-07-05  5:32 UTC (permalink / raw)
  To: Lawrence Velázquez; +Cc: zsh-workers

On Tue, Jul 4, 2023 at 11:54 AM Lawrence Velázquez <larryv@zsh.org> wrote:
>
> For reference/comparison, a similar feature was recently added to
> bash's devel branch:
>
> https://lists.gnu.org/archive/html/bug-bash/2023-05/msg00042.html

Hmm, some interesting things from there --

-- REPLY is treated as a local, i.e., its value gets saved and
restored around the substitution, and it's implied mksh does the same.
Zsh never does that anywhere; are there other places (in other shells)
where REPLY implicitly behaves like that?
-- "local" works inside the substitution as it would inside a function
body, but $@ refers to the calling environment.
-- "return" also behaves as if in a function body.
-- Robert Elz noticed that mksh will allow ${|foo} rather than
${|foo;} and Chet calls that a bug ... I suspect both zsh and mksh
consider allowing { foo } rather than { foo; } to be a feature, and
this "bug" is merely a reflection of that?
-- it's really not possible to implement ${(command)} in zsh because
${(flags)param} is already valid syntax, and this would break some
scripts that try to be zsh and bash at once.  Chet says he's going to
require ${ (command); } instead, though.
-- ${ command; } is implemented using an anonymous tempfile rather
than something like Perl's IO::String.  (Whew.)

The first three of those are implementable within the structure of the
patch from workers/51898 although it's a little messy because of the
multiple "return" points in paramsubst().  The anonymous tempfile for
the last one requires some extra stuff.

The trick with "local" vs. $@ actually might address some of the
points raised by Oliver in past discussion about the "private" module.



* Re: [PATCH] Support the mksh's ${|func;} substitution
  2023-07-04  1:55 ` Bart Schaefer
@ 2023-07-04 18:53   ` Lawrence Velázquez
  2023-07-05  5:32     ` Bart Schaefer
  0 siblings, 1 reply; 32+ messages in thread
From: Lawrence Velázquez @ 2023-07-04 18:53 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

On Mon, Jul 3, 2023, at 9:55 PM, Bart Schaefer wrote:
> Given the foregoing, here's a proposed alternative to Sebastian's
> patch.  Formal doc and tests pending reaction.

For reference/comparison, a similar feature was recently added to
bash's devel branch:

https://lists.gnu.org/archive/html/bug-bash/2023-05/msg00042.html

(NB: I think the implementation has changed slightly since the
original mailing list message, but the in-tree documentation should
be up-to-date.)

-- 
vq



* Re: [PATCH] Support the mksh's ${|func;} substitution
  2023-05-03 14:46 Sebastian Gniazdowski
@ 2023-07-04  1:55 ` Bart Schaefer
  2023-07-04 18:53   ` Lawrence Velázquez
  0 siblings, 1 reply; 32+ messages in thread
From: Bart Schaefer @ 2023-07-04  1:55 UTC (permalink / raw)
  To: Zsh hackers list

[-- Attachment #1: Type: text/plain, Size: 2980 bytes --]

On Wed, May 3, 2023 at 7:46 AM Sebastian Gniazdowski
<sgniazdowski@gmail.com> wrote:
>
> I was looking at some of my old patches and stumbled upon this one
> which implements mksh's ${|func;} substitution.
> [...] Original thread:
> https://zsh.org/mla/workers/2019/msg00768.html

I'm not going to try to recapitulate all the objections/discussions
from that thread, but here are some thoughts.

> [...] The old thread went
> over a complex (x) flag [but] ${|func;} doesn't block us from adding a
> more advanced x-flag in the future. [...]

An argument put forward in the code comments in this most recent
patch, namely that adding an (x) flag requires further revisions of
"14.3.3 Rules" in the documentation, does have some merit.  It would
be complicated to combine (x) with other flags in the same
substitution step, and even more convoluted to try to explain it.  I
think it's preferable to treat it more like a command substitution ala
$(...).

> To recall - the ${|func;} substitution runs function "func" and
> substitutes (returned string in…) $REPLY after it has completed,
> without (!) doing forks, which is a performance and functionality
> benefit over currently available equivalent: $(func;print -rn --
> $REPLY).

One of the significant bits from the previous thread was the notion
that the mksh syntax is actually ${|command} rather than just limited
to functions.  I looked at Sebastian's later patch (workers/44742)
that tried to implement this, but I think calling bin_eval() isn't
necessary.

Also in that thread Stephane points out that it would be ideal if
(paraphrasing his example) "${| echo { }" were parsed similarly to "{
echo { }" and that to do so would require changes to the parser, not
just to the substitution process.  However, we can get most of the way
there with what Sebastian has proposed and any future parser revisions
to take this out of paramsubst() would not (I think) need to
invalidate code written to this first approximation.

The other thing that bothered me about the idea is the hardcoded
reliance on $REPLY, so I took a stab at something to address that.

Given the foregoing, here's a proposed alternative to Sebastian's
patch.  Formal doc and tests pending reaction.  Informally,

${|code} passes the code through execstring() and then substitutes the
value of $REPLY.
${|VAR|code} passes code through execstring() and then substitutes the
value of $VAR.

This even works where VAR includes a namespace prefix or a subscript
expression, but VAR must otherwise look like a scalar parameter
reference.  The main limitation on code so interpolated is that it
can't contain unquoted { or } ... multi-line code is OK.
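
The intended semantics of the two forms can be approximated in portable shell with eval. This is a sketch under assumptions only: subst_from and SUBST_RESULT are hypothetical names, and it shows the "run the code in the current shell, then read a named parameter" idea, not the paramsubst() implementation.

```shell
# Hypothetical helper: subst_from VAR CODE
# Runs CODE in the current shell, then copies the value of the parameter
# named VAR into SUBST_RESULT (empty if VAR ends up unset).
subst_from() {
    eval "$2"
    eval "SUBST_RESULT=\${$1-}"
}

# Roughly what ${|REPLY=$(( 6 * 7 ))} is meant to yield.
subst_from REPLY 'REPLY=$(( 6 * 7 ))'
first=$SUBST_RESULT

# Roughly what ${|total|...} is meant to yield: the value of $total.
subst_from total 'total=0; for n in 1 2 3; do total=$((total + n)); done'
second=$SUBST_RESULT

printf '%s %s\n' "$first" "$second"
```

Because the code runs in the current shell, any parameters it sets (here total) remain visible to the caller afterwards, unlike with $(...).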

A remark about the last hunk of the patch:  I kept this from
Sebastian's original patch, it has the effect that ${|code} never
appears to be an unset parameter, even if that code does not set
REPLY.  I'm not certain that's actually necessary.

[-- Attachment #2: mksh-exec-subst-plus.txt --]
[-- Type: text/plain, Size: 2598 bytes --]

diff --git a/Src/subst.c b/Src/subst.c
index 14947ae36..c5c060d56 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -1860,6 +1860,8 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
      * joining the array into a string (for compatibility with ksh/bash).
      */
     int quoted_array_with_offset = 0;
+    /* Indicates ${|...;} */
+    char *rplyvar = NULL;
 
     *s++ = '\0';
     /*
@@ -1887,8 +1889,53 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
      * flags in parentheses, but also one ksh hack.
      */
     if (c == Inbrace) {
+	/* The command string to be run by ${|...;} */
+	char *cmdarg = NULL;
+	size_t slen = 0;
 	inbrace = 1;
 	s++;
+
+        /* Short-path for the command-running substitution ${|cmd;}
+         * The command string is extracted and executed, and the
+         * substitution assigned. There's no (...)-flags processing,
+         * i.e. no ${|(U)cmd;}, because it looks quite awful and
+         * also requires a change to the manual description of the
+         * substitution order. Use ${(U)${|cmd;}} instead, it looks
+         * cleaner. */
+	if (*s == Bar && ((slen = strlen(s) - 1) > 1)) {
+	    char *outbracep = s + slen;
+	    if (outbracep[0] == Outbrace /* && outbracep[-1] == ';' */) {
+		
+		if ((rplyvar = itype_end(s+1, INAMESPC, 0))) {
+		    if (*rplyvar == Inbrack &&
+			(rplyvar = parse_subscript(++rplyvar, 1, ']')))
+			++rplyvar;
+		}
+		if (rplyvar == s+1 && *rplyvar == Bar) {
+		    /* Is ${||...} a subtitution error or a syntax error?
+		    zerr("bad substitution");
+		    return NULL;
+		    */
+		    rplyvar = NULL;
+		}
+		if (rplyvar && *rplyvar == Bar) {
+		    cmdarg = dupstrpfx(rplyvar+1, outbracep-rplyvar-1);
+		    rplyvar = dupstrpfx(s+1,rplyvar-s-1);
+		} else {
+		    cmdarg = dupstrpfx(s+1, outbracep-s-1);
+		    rplyvar = "REPLY";
+		}
+		s = outbracep;
+	    }
+	}
+
+	if (rplyvar && cmdarg && *cmdarg) {
+	    /* Execute the shell command */
+	    untokenize(cmdarg);
+	    execstring(cmdarg, 1, 0, "cmdsubst");
+	    s = dyncat(rplyvar, s);
+	}
+
 	/*
 	 * In ksh emulation a leading `!' is a special flag working
 	 * sort of like our (k).  This is true only for arrays or
@@ -2588,8 +2635,11 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
 			      ((unset(KSHARRAYS) || inbrace) ? 1 : -1)),
 			     scanflags)) ||
 	    (v->pm && (v->pm->node.flags & PM_UNSET)) ||
-	    (v->flags & VALFLAG_EMPTY))
-	    vunset = 1;
+	    (v->flags & VALFLAG_EMPTY))	{
+	    if (!rplyvar) {
+		vunset = 1;
+	    }
+	}
 
 	if (wantt) {
 	    /*


* Re: [PATCH] Support the mksh's ${|func;} substitution
@ 2023-05-03 14:46 Sebastian Gniazdowski
  2023-07-04  1:55 ` Bart Schaefer
  0 siblings, 1 reply; 32+ messages in thread
From: Sebastian Gniazdowski @ 2023-05-03 14:46 UTC (permalink / raw)
  To: Zsh hackers list

[-- Attachment #1: Type: text/plain, Size: 1057 bytes --]

Hi,
I was looking at some of my old patches and stumbled upon this one,
which implements mksh's ${|func;} substitution. The old thread settled
on a complex (x) flag as the expected candidate for this use case;
however, I now think that something is better than nothing and that
implementing the simpler ${|func;} doesn't block us from adding a
more advanced x-flag in the future. Also, the simplicity of this
substitution is a plus for it, as opposed to the x-flag's complexity
(think of the existing e flag, whose use is fairly complex).

So I thought I would send the patch again, after resolving the
conflicts, for consideration. Or should we revive the advanced x-flag?

To recall - the ${|func;} substitution runs the function "func" and
substitutes (returned string in…) $REPLY after it has completed,
without (!) doing forks, which is a performance and functionality
benefit over the currently available equivalent: $(func;print -rn --
$REPLY). Original thread:
https://zsh.org/mla/workers/2019/msg00768.html
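
The functionality difference can be seen in portable shell without the new syntax. This is a minimal sketch, assuming a function that keeps state in a counter (counter and the other variable names are illustrative): the forking form loses the function's side effects, while running it in the current shell keeps them.

```shell
counter=0
func() {
    counter=$((counter + 1))
    REPLY="call $counter"
}

# Forking form: func runs in a subshell, so its side effects are discarded.
forked=$(func; printf '%s' "$REPLY")
after_fork=$counter

# Current-shell form, roughly what ${|func;} would do without a fork.
func
direct=$REPLY
after_direct=$counter

printf '%s / counter=%s\n' "$forked" "$after_fork"
printf '%s / counter=%s\n' "$direct" "$after_direct"
```

Both forms produce the same string here, but only the current-shell form leaves counter incremented in the caller.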

-- 
Best regards,
Sebastian Gniazdowski

[-- Attachment #2: mksh-exec-subst.patch --]
[-- Type: text/x-patch, Size: 3087 bytes --]

From 5129820336782902f0dc43d7f196bb66c4579fb9 Mon Sep 17 00:00:00 2001
From: Sebastian Gniazdowski <sgniazdowski@gmail.com>
Date: Wed, 3 May 2023 14:47:01 +0059
Subject: Support the mksh's ${|func;} substitution

---
 Src/subst.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 50 insertions(+), 1 deletion(-)

diff --git a/Src/subst.c b/Src/subst.c
index 974d6171e..abb458195 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -29,6 +29,7 @@
 
 #include "zsh.mdh"
 #include "subst.pro"
+#include "exec.pro"
 
 #define LF_ARRAY	1
 
@@ -1860,8 +1861,17 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
      * joining the array into a string (for compatibility with ksh/bash).
      */
     int quoted_array_with_offset = 0;
+    /* Indicates ${|func;} */
+    int rplyfunc = 0;
+    /* The name of the function to be ran by ${|...;} */
+    char *cmdarg = NULL;
+    /* The length of the input string */
+    int slen = 0;
+    /* The closing brace pointer */
+    char *outbracep;
 
     *s++ = '\0';
+    slen = strlen(s);
     /*
      * Nothing to do unless the character following the $ is
      * something we recognise.
@@ -1889,6 +1899,41 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
     if (c == Inbrace) {
 	inbrace = 1;
 	s++;
+
+        /* Short-path for the function-running substitution ${|func;}
+         * The function name is extracted and called, and the
+         * substitution assigned. There's no (...)-flags processing,
+         * i.e. no ${|(U)func;}, because it looks quite awful and
+         * also requires a change to the manual, part about the
+         * substitution order. Use ${(U)${|func;}} instead, it looks
+         * cleaner. */
+        if ( ((outbracep=strchr(s,Outbrace)) ||
+             (outbracep=strchr(s,'}'))) &&
+                (s[0] == Bar || s[0] == '|') &&
+                    outbracep[-1] == ';' )
+        {
+            rplyfunc = 1;
+            cmdarg = dupstrpfx(s+1, outbracep-s-2);
+            s=outbracep;
+
+            HashNode hn = NULL;
+            if( (hn = shfunctab->getnode(shfunctab, cmdarg)) ) {
+                /* Execute the shell function */
+                doshfunc((Shfunc) hn, NULL, 1);
+                val = getsparam("REPLY");
+                if (val)
+                    vunset = 0;
+                else {
+                    vunset = 1;
+                    val = dupstring("");
+            }
+        } else {
+                zerr("no such function: %s", cmdarg);
+                return NULL;
+        }
+            fetch_needed = 0;
+    }
+
 	/*
 	 * In ksh emulation a leading `!' is a special flag working
 	 * sort of like our (k).  This is true only for arrays or
@@ -2589,7 +2634,11 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
 			     scanflags)) ||
 	    (v->pm && (v->pm->node.flags & PM_UNSET)) ||
 	    (v->flags & VALFLAG_EMPTY))
-	    vunset = 1;
+        {
+            if (!rplyfunc) {
+                vunset = 1;
+        }
+    }
 
 	if (wantt) {
 	    /*
-- 
2.28.0



Thread overview: 32+ messages
2019-09-06  0:52 [PATCH] Support the mksh's ${|func;} substitution Sebastian Gniazdowski
2019-09-06  0:54 ` Sebastian Gniazdowski
2019-09-06 23:16 ` Sebastian Gniazdowski
2019-09-07 12:16   ` Daniel Shahaf
2019-09-07 15:07 ` Stephane Chazelas
2019-09-07 18:09   ` Sebastian Gniazdowski
2019-09-07 20:19     ` Stephane Chazelas
2019-09-07 21:19       ` Sebastian Gniazdowski
2019-09-10  2:20         ` Sebastian Gniazdowski
2019-09-10  5:29           ` Bart Schaefer
2019-09-10 18:21             ` Sebastian Gniazdowski
2019-09-10 19:38               ` Bart Schaefer
2019-09-12  0:08                 ` Sebastian Gniazdowski
2019-09-12  1:03                   ` Bart Schaefer
2019-09-12  2:06                     ` Sebastian Gniazdowski
2019-09-12  5:35                       ` Bart Schaefer
2019-09-12  6:00                         ` Sebastian Gniazdowski
2019-09-12  6:55                           ` Bart Schaefer
2019-09-13 20:28                             ` Sebastian Gniazdowski
2019-09-13 21:33                               ` Bart Schaefer
2019-09-13 21:36                                 ` Bart Schaefer
2019-09-14  0:41                                   ` Sebastian Gniazdowski
2019-09-14  0:44                                     ` Sebastian Gniazdowski
2023-05-03 14:46 Sebastian Gniazdowski
2023-07-04  1:55 ` Bart Schaefer
2023-07-04 18:53   ` Lawrence Velázquez
2023-07-05  5:32     ` Bart Schaefer
2023-07-05  6:30       ` Bart Schaefer
2023-07-06  1:39       ` Bart Schaefer
2023-07-06  2:27         ` Mikael Magnusson
2023-07-06  4:00           ` Bart Schaefer
2023-07-18  3:14             ` Bart Schaefer
