* [PATCH?] Nofork and removing newlines @ 2024-03-05 5:52 Bart Schaefer 2024-03-05 6:56 ` Stephane Chazelas 0 siblings, 1 reply; 29+ messages in thread From: Bart Schaefer @ 2024-03-05 5:52 UTC (permalink / raw) To: Zsh hackers list [-- Attachment #1: Type: text/plain, Size: 410 bytes --] On Tue, Feb 27, 2024 at 12:53 PM Bart Schaefer <schaefer@brasslantern.com> wrote: > > The intent was to have ${ ... } act more like parameter substitution. > It might be possible/reasonable to have ${ ... } strip newlines and > "${ ... }" keep them, if that feels better. The attached patch implements this, for purposes of discussion. The doc updates are much larger than the actual code change. [-- Attachment #2: nofork-nonewlines.txt --] [-- Type: text/plain, Size: 3711 bytes --] diff --git a/Doc/Zsh/expn.yo b/Doc/Zsh/expn.yo index 183ca6e03..b77942697 100644 --- a/Doc/Zsh/expn.yo +++ b/Doc/Zsh/expn.yo @@ -1950,6 +1950,9 @@ the braces by whitespace, like `tt(${ )...tt( })', is replaced by its standard output. Like `tt(${|)...tt(})' and unlike `tt($LPAR())...tt(RPAR())', the command executes in the current shell context with function local behaviors and does not create a subshell. +Word splitting does not apply unless tt(SH_WORD_SPLIT) is set, but +trailing newlines em(are) stripped unless the substitution is enclosed +in double quotes. Note that because the `tt(${|)...tt(})' and `tt(${ )...tt( })' forms must be parsed at once as both string tokens and commands, all other diff --git a/Etc/FAQ.yo b/Etc/FAQ.yo index 4a86050e6..0515d2fca 100644 --- a/Etc/FAQ.yo +++ b/Etc/FAQ.yo @@ -1092,10 +1092,11 @@ sect(Comparisons of forking and non-forking command substitution) affects the caller. mytt($(command)) removes trailing newlines from the output of mytt(command) - when substituting, whereas mytt(${ command }) and its variants do not. 
- The latter is consistent with mytt(${|...}) from mksh but differs from - bash and ksh, so in emulation modes, newlines are stripped from command - output (not from tt(REPLY) assignments). + when substituting, as does mytt(${ command }) when not quoted. Placing + double quotes around mytt("${ command }"), or using either mytt(${|...}) + format, retains newlines. The latter is consistent with mytt(${|...}) + from mksh, but mytt("${ command }") differs from bash and ksh, so in + emulation modes, newlines stripped even from quoted command output. When not enclosed in double quotes, the expansion of mytt($(command)) is split on tt(IFS) into an array of words. In contrast, and unlike both diff --git a/Src/subst.c b/Src/subst.c index 49f7336bb..785137357 100644 --- a/Src/subst.c +++ b/Src/subst.c @@ -2005,7 +2005,7 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags, int onoerrs = noerrs, rplylen; noerrs = 2; rplylen = zstuff(&cmdarg, rplytmp); - if (! EMULATION(EMULATE_ZSH)) { + if (! EMULATION(EMULATE_ZSH) || !qt) { /* bash and ksh strip trailing newlines here */ while (rplylen > 0 && cmdarg[rplylen-1] == '\n') rplylen--; diff --git a/Test/D10nofork.ztst b/Test/D10nofork.ztst index d6a5588df..1c6a30cb0 100644 --- a/Test/D10nofork.ztst +++ b/Test/D10nofork.ztst @@ -159,7 +159,7 @@ F:Why not use this error in the previous case as well? 
1:unbalanced braces, part 4+ ?(eval):1: closing brace expected - purr ${ purr STDOUT } + purr "${ purr STDOUT }" 0:capture stdout >STDOUT > @@ -322,7 +322,7 @@ F:Fiddly here to get EOF past the test syntax 0:here-string behavior >in a here string - <<<${ purr $'stdout as a here string' } + <<<"${ purr $'stdout as a here string' }" 0:another capture stdout >stdout as a here string > @@ -331,7 +331,7 @@ F:Fiddly here to get EOF past the test syntax wrap=${ purr "capture in environment assignment" } typeset -p wrap 0:assignment context >typeset -g wrap='REPLY in environment assignment' ->typeset -g wrap=$'capture in environment assignment\n' +>typeset -g wrap='capture in environment assignment' # Repeat return and exit tests with stdout capture @@ -410,7 +410,7 @@ F:must do this before evaluating the next test block 0:ignored braces, part 1 >buried} - purr ${ purr ${REPLY:-buried}}} + purr "${ purr ${REPLY:-buried}}}" 0:ignored braces, part 2 >buried >} @@ -418,7 +418,6 @@ F:must do this before evaluating the next test block purr ${ { echo nested ;} } 0:ignored braces, part 3 >nested -> purr ${ { echo nested } } DONE 1:ignored braces, part 4 ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines 2024-03-05 5:52 [PATCH?] Nofork and removing newlines Bart Schaefer @ 2024-03-05 6:56 ` Stephane Chazelas 2024-03-05 22:48 ` Bart Schaefer 0 siblings, 1 reply; 29+ messages in thread From: Stephane Chazelas @ 2024-03-05 6:56 UTC (permalink / raw) To: Bart Schaefer; +Cc: Zsh hackers list 2024-03-04 21:52:02 -0800, Bart Schaefer: [...] > mytt($(command)) removes trailing newlines from the output of mytt(command) > - when substituting, whereas mytt(${ command }) and its variants do not. > - The latter is consistent with mytt(${|...}) from mksh but differs from > - bash and ksh, so in emulation modes, newlines are stripped from command > - output (not from tt(REPLY) assignments). > + when substituting, as does mytt(${ command }) when not quoted. Placing > + double quotes around mytt("${ command }"), or using either mytt(${|...}) > + format, retains newlines. The latter is consistent with mytt(${|...}) > + from mksh, but mytt("${ command }") differs from bash and ksh, so in > + emulation modes, newlines stripped even from quoted command output. ^^^ typo missing "are". To me ${ cmd; } being the non-forking version of $(...) should behave like $(...) in that regard. IMO, it's a bug in Bourne-like shells (and some others) that $(...) removes *all* trailing newline characters, but removing *one* is usually desired. As in: basename=$(basename -- "$file") should remove the newline added by basename, but not the newline characters that are found at the end of $file. In any case, I agree ${|cmd} should expand to the value of $REPLY as-is and trimming newlines there would not make sense. IIRC I already mentioned it here but maybe having a: ZSH_CMDSUBST_TRIM=<extendedglobpattern> (defaulting to $'\n##' for backward compatibility) could address the general issue with cmdsubst trimming too many newlines (for both $(...) and ${...;}). 
One would change it to ZSH_CMDSUBST_TRIM=$'\n' to get a saner default, or ZSH_CMDSUBST_TRIM= to not remove anything or ZSH_CMDSUBST_TRIM=$'(\r|)\n' or ZSH_CMDSUBST_TRIM='[[:space:]]##' to handle MSDOS line delimiters or remove any whitespace. > > When not enclosed in double quotes, the expansion of mytt($(command)) is > split on tt(IFS) into an array of words. unless called in non-list contexts such as in scalar variable assignment or [[ $var ]] or case $var in... See also: $ ./Src/zsh -c 'a=( "${(s[:])${ getconf PATH }}" ); typeset -p a' typeset -a a=( /bin $'/usr/bin\n' ) -- Stephane ^ permalink raw reply [flat|nested] 29+ messages in thread
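The trimming behaviour of $(...) described above can be reproduced in any POSIX shell. The following is a minimal sketch for illustration only: the variable names and the trailing "x" sentinel are assumptions of this example, not something proposed in the thread.

```shell
#!/bin/sh
# $(...) removes every trailing newline from the captured output,
# not just the last one.
stripped=$(printf 'line\n\n\n')

# Common workaround: print a sentinel character after the output so the
# newlines are no longer trailing, then remove the sentinel afterwards.
kept=$(printf 'line\n\n\n'; printf x)
kept=${kept%x}

# ${#var} is the length in characters: "line" is 4 long, and "line"
# followed by three newlines is 7 long.
printf 'stripped length: %s\n' "${#stripped}"
printf 'kept length: %s\n' "${#kept}"
```

The sentinel approach keeps all newlines, so it is the caller's job to decide how many to remove afterwards.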
* Re: [PATCH?] Nofork and removing newlines 2024-03-05 6:56 ` Stephane Chazelas @ 2024-03-05 22:48 ` Bart Schaefer 2024-03-06 17:57 ` Stephane Chazelas 2024-03-06 19:43 ` Stephane Chazelas 0 siblings, 2 replies; 29+ messages in thread From: Bart Schaefer @ 2024-03-05 22:48 UTC (permalink / raw) To: Zsh hackers list On Mon, Mar 4, 2024 at 10:56 PM Stephane Chazelas <stephane@chazelas.org> wrote: > > To me ${ cmd; } being the non-forking version of $(...) should > behave like $(...) in that regard. That's the starting point of this discussion, yes. > IMO, it's a bug in Bourne-like shells (and some others) that > $(...) removes *all* trailing newline characters, but removing > *one* is usually desired. Ignoring the many-vs.-one issue, the pivotal word here is "usually". We can't change the behavior of $(...) but parameter expansions already behave differently with respect to SH_WORD_SPLIT so we have precedent for leeway on ${ ... }. The suggested change would provide $(...)-like behavior for the usual case and a simple way to keep the newline(s) in the less-usual cases. > IIRC I already mentioned it here but maybe having a: > > ZSH_CMDSUBST_TRIM=<extendedglobpattern> This is both IMO way too complicated and also misses the point that newline trimming or not ought to be easily switchable in the context of a single expansion, not globally. So when I started the thread about ${ ... } the consensus was that it would be OK to always keep the newlines and if you don't want them in a particular case, you can write ${${ command }%$'\n'}. Since then it's been pointed out that a lot of uses of $(...) that would be replace-able with ${ ... } will break if the newlines are not stripped, and it's a bit of a pain to have to remember that nesting all the time. So the proposal made here has two goals: 1) Make it easy to replace many uses of $(...) 2) Make it easy to choose case-by-case whether to keep newlines Thus ${ ... } strips newlines like $(...) for #1 "${ ... 
}" keeps them for handling #2 and if you want full SH_WORD_SPLIT behavior you can still write ${=${ ... }} which is shorter and easier than the %$'\n' thing and strips newlines too. My strong inclination is to either go with this patch or leave it as is. The code change to implement this patch is literally two tokens. Thanks for the doc proofread. ^ permalink raw reply [flat|nested] 29+ messages in thread
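The ${...%$'\n'} idiom mentioned above removes at most one trailing newline per application, unlike $(...) which removes all of them. A minimal sketch of that difference, written against a plain variable because the ${ ... } form needs a development zsh; the variable names are assumptions, and a literal newline is used instead of $'\n' so that the example also runs in shells without that quoting form.

```shell
#!/bin/sh
# A variable holding exactly one newline character.
nl='
'

# A value with two trailing newlines, as a command might print.
value="name${nl}${nl}"

# Each application of % with a one-newline pattern removes one newline.
one_less=${value%"$nl"}
two_less=${one_less%"$nl"}

printf 'original length: %s\n' "${#value}"
printf 'after one trim: %s\n' "${#one_less}"
printf 'after two trims: %s\n' "${#two_less}"
```

With the value above the three lengths are 6, 5 and 4, so a caller can choose how many newlines to drop rather than losing all of them.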
* Re: [PATCH?] Nofork and removing newlines 2024-03-05 22:48 ` Bart Schaefer @ 2024-03-06 17:57 ` Stephane Chazelas 2024-03-06 19:45 ` Bart Schaefer 2024-03-06 19:43 ` Stephane Chazelas 1 sibling, 1 reply; 29+ messages in thread From: Stephane Chazelas @ 2024-03-06 17:57 UTC (permalink / raw) To: Bart Schaefer; +Cc: Zsh hackers list 2024-03-05 14:48:00 -0800, Bart Schaefer: > On Mon, Mar 4, 2024 at 10:56 PM Stephane Chazelas <stephane@chazelas.org> wrote: > > > > To me ${ cmd; } being the non-forking version of $(...) should > > behave like $(...) in that regard. > > That's the starting point of this discussion, yes. > > > IMO, it's a bug in Bourne-like shells (and some others) that > > $(...) removes *all* trailing newline characters, but removing > > *one* is usually desired. > > Ignoring the many-vs.-one issue, the pivotal word here is "usually". > We can't change the behavior of $(...) but parameter expansions > already behave differently with respect to SH_WORD_SPLIT so we have > precedent for leeway on ${ ... }. The suggested change would provide > $(...)-like behavior for the usual case and a simple way to keep the > newline(s) in the less-usual cases. Sorry, I hadn't realised ${ cmd } also didn't do IFS-splitting, so it is indeed departing a lot from command substitution and assuming we don't care about keep compatibility with ksh93/mksh/bash, I agree the proposed behaviour makes sense and it's useful to have a command substitution that doesn't trim all newlines, so as you say I can do for my previous example: basename="${${ basename -- "$file" }%$'\n'}" To properly get the basename of $file with basename. (yes, I know it's a bad example as we can also do basename=$file:t). > > IIRC I already mentioned it here but maybe having a: > > > > ZSH_CMDSUBST_TRIM=<extendedglobpattern> > > This is both IMO way too complicated and also misses the point that > newline trimming or not ought to be easily switchable in the context > of a single expansion, not globally. 
The idea would be to allow users to fix command substitution once and for all with ZSH_CMDSUBST_TRIM=$'\n'. So things like: basename=$(basename -- "$file") become correct regardless of the value of $file without having to resort to ugly workarounds. set -o fixcmdsubstrnewlinetrimming would work as well but be less versatile. (I agree that in any case that's rather tangential to the question of what to do with ${ ... }) > My strong inclination is to either go with this patch or leave it as > is. The code change to implement this patch is literally two tokens. Either way or always removing all newlines or always removing one newline or removing one newline when not quoted are fine with me. -- Stephane ^ permalink raw reply [flat|nested] 29+ messages in thread
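The basename case can be made concrete with a file name that ends in a newline. This is a sketch under stated assumptions: the path /tmp/odd-name is hypothetical (it does not need to exist), the "x" sentinel is an illustrative workaround rather than anything from the thread, and the basename utility is assumed to accept "--" as the common implementations do.

```shell
#!/bin/sh
# A variable holding exactly one newline character.
nl='
'

# Hypothetical path whose last component ends with a newline.
file="/tmp/odd-name${nl}"

# Plain command substitution: the newline added by basename and the one
# that belongs to the file name are both removed.
naive=$(basename -- "$file")

# Sentinel workaround: protect all newlines, then drop the sentinel and
# exactly one newline (the one basename itself appended).
safe=$(basename -- "$file"; printf x)
safe=${safe%x}
safe=${safe%"$nl"}

printf 'naive length: %s\n' "${#naive}"
printf 'safe length: %s\n' "${#safe}"
```

Here the naive result is one character shorter than the real base name, which is the incorrect case the message above is concerned with.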
* Re: [PATCH?] Nofork and removing newlines 2024-03-06 17:57 ` Stephane Chazelas @ 2024-03-06 19:45 ` Bart Schaefer 2024-03-06 22:22 ` Mikael Magnusson 0 siblings, 1 reply; 29+ messages in thread From: Bart Schaefer @ 2024-03-06 19:45 UTC (permalink / raw) To: Zsh hackers list On Wed, Mar 6, 2024 at 9:57 AM Stephane Chazelas <stephane@chazelas.org> wrote: > > Sorry, I hadn't realised ${ cmd } also didn't do IFS-splitting, > so it is indeed departing a lot from command substitution and > assuming we don't care about keep compatibility with > ksh93/mksh/bash, I agree the proposed behaviour makes sense If SH_WORD_SPLIT is in fact set (as when emulating) then it is applied, so that's the other-shell-compatibility path. > Either way or always removing all newlines or always removing one > newline or removing one newline when not quoted are fine with me. Thanks. Anyone else waiting to weigh in? ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines 2024-03-06 19:45 ` Bart Schaefer @ 2024-03-06 22:22 ` Mikael Magnusson 2024-03-06 22:42 ` Bart Schaefer ` (2 more replies) 0 siblings, 3 replies; 29+ messages in thread From: Mikael Magnusson @ 2024-03-06 22:22 UTC (permalink / raw) To: Bart Schaefer; +Cc: Zsh hackers list On 3/6/24, Bart Schaefer <schaefer@brasslantern.com> wrote: > On Wed, Mar 6, 2024 at 9:57 AM Stephane Chazelas <stephane@chazelas.org> > wrote: >> >> Sorry, I hadn't realised ${ cmd } also didn't do IFS-splitting, >> so it is indeed departing a lot from command substitution and >> assuming we don't care about keep compatibility with >> ksh93/mksh/bash, I agree the proposed behaviour makes sense > > If SH_WORD_SPLIT is in fact set (as when emulating) then it is > applied, so that's the other-shell-compatibility path. > >> Either way or always removing all newlines or always removing one >> newline or removing one newline when not quoted are fine with me. > > Thanks. Anyone else waiting to weigh in? These are just some observations with no real conclusion probably. 1) $(foo) will optimize away an extra fork if foo is an external command 2) ${ foo } will fork the same amount of times as 1) if foo is external and not at all if foo is a function. If you write a function that prints stuff, it is presumably pretty easy to just make it not print the extra newlines in the first place. If foo calls some external command that prints a newline then I suppose 1) and 2) are not super relevant arguments. "${ foo}" and ${ foo} having the same wordsplitting behavior but only differing in stripping newlines feels a bit magical and weird. I would feel surprised if it did wordsplitting without shwordsplit since it is an extension of the ${} syntax which doesn't do it. We could in theory add some new () flag, T for trim is free eg, ${(T)${ foo}} is somewhat more ergonomic than ${${ foo}%$'\n'} Is there some strong reason we could not allow ${(T) foo} btw? 
The space is syntactically kind of similar to other stuff that does work like ${(f)^param} and would save the extra ${}, but I didn't take a look at the code yet. -- Mikael Magnusson ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines 2024-03-06 22:22 ` Mikael Magnusson @ 2024-03-06 22:42 ` Bart Schaefer 2024-03-07 4:53 ` Bart Schaefer 2024-03-07 6:52 ` Lawrence Velázquez 2 siblings, 0 replies; 29+ messages in thread From: Bart Schaefer @ 2024-03-06 22:42 UTC (permalink / raw) To: Mikael Magnusson; +Cc: Zsh hackers list Have to go to an appointment so just one quick thing now: On Wed, Mar 6, 2024 at 2:22 PM Mikael Magnusson <mikachu@gmail.com> wrote: > > Is there some strong reason we could not allow ${(T) foo} btw? "{ " (curly bracket followed by space) is recognized like a syntax token. Can't break it up by sticking an arbitrary chunk of flags in parens in the middle of it. > space is syntactically kind of similar to other stuff that does work In that case "{" is the token and all the stuff following is parsed later. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines 2024-03-06 22:22 ` Mikael Magnusson 2024-03-06 22:42 ` Bart Schaefer @ 2024-03-07 4:53 ` Bart Schaefer 2024-03-07 7:02 ` Lawrence Velázquez 2024-03-07 7:10 ` Stephane Chazelas 2024-03-07 6:52 ` Lawrence Velázquez 2 siblings, 2 replies; 29+ messages in thread From: Bart Schaefer @ 2024-03-07 4:53 UTC (permalink / raw) To: Mikael Magnusson; +Cc: Zsh hackers list On Wed, Mar 6, 2024 at 2:22 PM Mikael Magnusson <mikachu@gmail.com> wrote: > > 1) $(foo) will optimize away an extra fork if foo is an external command > 2) ${ foo } will fork the same amount of times as 1) if foo is > external and not at all if foo is a function. You're almost quoting the FAQ entry. :-) > "${ foo}" and ${ foo} having the same wordsplitting behavior but only > differing in stripping newlines feels a bit magical and weird. One question (and sort of the point) is whether anyone would really notice. If you put it in quotes you're expecting a literal result, and if you (for example) assign it unquoted to a scalar you're expecting it to "just work" the way assigning $(foo) would. It's a bit unusual but it seems to preserve the principle of least surprise, and it uses the least amount of extra syntax. On the other hand I'm not highly invested in this. In the absence of this (no)quoting behavior, I've found I nearly always want ${=${ foo }} or ${(f)${ foo }}, each of which gives exactly the same result with or without trimming. > We could in theory add some new () flag, T for trim is free eg, > ${(T)${ foo}} is somewhat more ergonomic than ${${ foo}%$'\n'} I admittedly (still pre-patch) have used (f) for this when I know there's only one line of output. I'm just struggling to think of where else I would use a (T). Returning to this other bit ... 
On Wed, Mar 6, 2024 at 2:42 PM Bart Schaefer <schaefer@brasslantern.com> wrote: > > On Wed, Mar 6, 2024 at 2:22 PM Mikael Magnusson <mikachu@gmail.com> wrote: > > > > Is there some strong reason we could not allow ${(T) foo} btw? > > "{ " (curly bracket followed by space) is recognized like a syntax > token. Code-wise, a sequence starting with ${ (with or without the space) and ending with } is lexed into a single STRING token. (If it's inside double quotes, the entire double-quoted thing is a STRING token, but you can have nested quotes inside the dollar-brace inside the double quotes, etc., so this has to work recursively, and so on.) So the lexer has to decide when it sees dollar-brace how to find the closing brace. Skipping over parameter flags before deciding to switch to parsing something that looks like a function body might be possible, but doesn't really fit into the structure of the lexer. Deciding based on the very next character (space or pipe for a command, or any other for a parameter) makes it tractable. The lexical problem aside, when you get to the point of performing the substitution, even if the command interpretation were deferred until after all the flags are collected, it would still have to function much like ${(flags)"$(cmdsubst)"} would, so it's a lot easier if it's already structured as a nested substitution. ^ permalink raw reply [flat|nested] 29+ messages in thread
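The one-character decision described above can be modelled in a few lines of shell. This is only an illustrative sketch: the real lexer is written in C, and the function name classify_dollar_brace is hypothetical. It takes the text that follows "${" and looks at nothing but its first character.

```shell
#!/bin/sh
# Hypothetical model of the decision: given the text after "${", look
# only at the first character. A space or a pipe selects the command
# (nofork) form; anything else is treated as a parameter expansion.
classify_dollar_brace() {
    case $1 in
        ' '*|'|'*) echo command ;;
        *)         echo parameter ;;
    esac
}

classify_dollar_brace ' purr STDOUT }'   # prints "command"
classify_dollar_brace '|var| purr }'     # prints "command"
classify_dollar_brace '(T) foo}'         # prints "parameter"
classify_dollar_brace '= cmd }'          # prints "parameter"
```

Under this model, flags in parentheses or a leading "=" are seen before any space, so they commit the lexer to the parameter form, which matches the reason given for not supporting ${(T) foo}.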
* Re: [PATCH?] Nofork and removing newlines 2024-03-07 4:53 ` Bart Schaefer @ 2024-03-07 7:02 ` Lawrence Velázquez 2024-03-07 8:09 ` ${<file} (Was: [PATCH?] Nofork and removing newlines) Stephane Chazelas 2024-03-08 1:29 ` [PATCH?] Nofork and removing newlines Bart Schaefer 2024-03-07 7:10 ` Stephane Chazelas 1 sibling, 2 replies; 29+ messages in thread From: Lawrence Velázquez @ 2024-03-07 7:02 UTC (permalink / raw) To: Bart Schaefer, Mikael Magnusson; +Cc: zsh-workers On Wed, Mar 6, 2024, at 11:53 PM, Bart Schaefer wrote: > On Wed, Mar 6, 2024 at 2:42 PM Bart Schaefer <schaefer@brasslantern.com> wrote: >> >> On Wed, Mar 6, 2024 at 2:22 PM Mikael Magnusson <mikachu@gmail.com> wrote: >> > >> > Is there some strong reason we could not allow ${(T) foo} btw? >> >> "{ " (curly bracket followed by space) is recognized like a syntax >> token. > > Code-wise, a sequence starting with ${ (with or without the space) and > ending with } is lexed into a single STRING token. (If it's inside > double quotes, the entire double-quoted thing is a STRING token, but > you can have nested quotes inside the dollar-brace inside the double > quotes, etc., so this has to work recursively, and so on.) So the > lexer has to decide when it sees dollar-brace how to find the closing > brace. Skipping over parameter flags before deciding to switch to > parsing something that looks like a function body might be possible, > but doesn't really fit into the structure of the lexer. Deciding > based on the very next character (space or pipe for a command, or any > other for a parameter) makes it tractable. Hm, would it be feasible to create an explicit LF-preserving form using a different character (e.g., ${&cmd})? If so, would it be undesirable for some other reason? (Sorry if you already said something ruling this out; I only had time to quickly skim today's messages.) -- vq ^ permalink raw reply [flat|nested] 29+ messages in thread
* ${<file} (Was: [PATCH?] Nofork and removing newlines) 2024-03-07 7:02 ` Lawrence Velázquez @ 2024-03-07 8:09 ` Stephane Chazelas 2024-03-08 1:29 ` [PATCH?] Nofork and removing newlines Bart Schaefer 1 sibling, 0 replies; 29+ messages in thread From: Stephane Chazelas @ 2024-03-07 8:09 UTC (permalink / raw) To: Lawrence Velázquez; +Cc: Bart Schaefer, Mikael Magnusson, zsh-workers By the way, if ${ cmd } preserves trailing newlines, it would be useful to also have ${<file} as a variant of $(<file) that preserves trailing newlines (and remove the need for a zslurp). "${ <file}" already does but that's via running $READNULLCMD so that could be optimized. ksh93 and mksh both support optimised ${ <file;} (also ${<file;} in ksh93), but they do trim trailing newline characters so AFAICT, they're no different from $(<file). See also the $(<<'EOF' multi-line text EOF) of mksh which actually skips the creation of the here-doc and is in effect a form of multi-line quoting (though also trims trailing newlines). Also works with: ${ <<'EOF' multi-line test EOF } -- Stephane ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines 2024-03-07 7:02 ` Lawrence Velázquez 2024-03-07 8:09 ` ${<file} (Was: [PATCH?] Nofork and removing newlines) Stephane Chazelas @ 2024-03-08 1:29 ` Bart Schaefer 2024-03-08 22:15 ` Oliver Kiddle 1 sibling, 1 reply; 29+ messages in thread From: Bart Schaefer @ 2024-03-08 1:29 UTC (permalink / raw) To: zsh-workers On Wed, Mar 6, 2024 at 11:02 PM Lawrence Velázquez <larryv@zsh.org> wrote: > > Hm, would it be feasible to create an explicit LF-preserving form > using a different character (e.g., ${&cmd})? If so, would it be > undesirable for some other reason? Other than that we've just about run out of characters? ${< should be reserved for reading a file, as already suggested elsewhere (no, I'm not going to implement that yet, though it seems to be an undocumented ksh93 feature). ${> might work, but it "looks wrong" to have a command instead of a file to the right of the pointy end. ${& looks like you're running something asynchronously, or perhaps changing a file descriptor. Every other character already has another meaning in that position, as far as I can tell. There is one other possibility: ${||command}, that is, ${|var|command} with an empty var name. That's already passed through the lexer, so it could be picked out at the necessary place in subst.c (I think, haven't actually tried yet). It looks a little odd, too, given "||" usually means "or", but it's at least sort of logical to treat "assign this output to nothing" as "return the output in place", and the other ${|...} forms do preserve trailing newlines. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines 2024-03-08 1:29 ` [PATCH?] Nofork and removing newlines Bart Schaefer @ 2024-03-08 22:15 ` Oliver Kiddle 2024-03-08 23:28 ` Bart Schaefer 0 siblings, 1 reply; 29+ messages in thread From: Oliver Kiddle @ 2024-03-08 22:15 UTC (permalink / raw) To: Bart Schaefer; +Cc: zsh-workers Bart Schaefer wrote: > ${< should be reserved for reading a file, as already suggested > elsewhere (no, I'm not going to implement that yet, though it seems to > be an undocumented ksh93 feature). > ${> might work, but it "looks wrong" to have a command instead of a > file to the right of the pointy end. I agree. I'd sooner expect that to be running $NULLCMD redirected to a file. Not that that would be even remotely useful. > Every other character already has another meaning in that position, as > far as I can tell. It could be nice to have ${= cmd } as a shorter alternative to ${=${ cmd }} particularly if the default is to be newline preserving. That would need to do word splitting but trailing IFS characters also get removed so it would work for some cases. > There is one other possibility: ${||command}, that is, > ${|var|command} with an empty var name. That's already passed through > the lexer, so it could be picked out at the necessary place in subst.c > (I think, haven't actually tried yet). It looks a little odd, too, > given "||" usually means "or", but it's at least sort of logical to > treat "assign this output to nothing" as "return the output in place", > and the other ${|...} forms do preserve trailing newlines. The logic does at least follow from the usage with a variable. One way to avoid the resemblance to an "or" is if ${| |command} also works. It could perhaps be combined so ${||<file} slurps a file unmodified. Why does it print command not found errors for things like ${|=|:}, ${|*|:} and ${|?|:}, I'd rather have $? than it globbing for a single character file. Oliver ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines 2024-03-08 22:15 ` Oliver Kiddle @ 2024-03-08 23:28 ` Bart Schaefer 2024-03-09 20:43 ` Oliver Kiddle 0 siblings, 1 reply; 29+ messages in thread From: Bart Schaefer @ 2024-03-08 23:28 UTC (permalink / raw) To: Oliver Kiddle; +Cc: zsh-workers On Fri, Mar 8, 2024 at 2:15 PM Oliver Kiddle <opk@zsh.org> wrote: > > Bart Schaefer wrote: > > Every other character already has another meaning in that position, as > > far as I can tell. > > It could be nice to have ${= cmd } as a shorter alternative to > ${=${ cmd }} Unfortunately the lexer needs to be able to do this with one-character peek-ahead. So it can't distinguish dollar-brace-equal-space from dollar-brace-equal, and the latter has to be treated as a parameter expansion. > > There is one other possibility: ${||command}, that is, > > ${|var|command} with an empty var name. > > The logic does at least follow from the usage with a variable. One way > to avoid the resemblance to an "or" is if ${| |command} also works. That might be possible. Right now the lexer sees "${|" and branches to scanning something that looks like a function body (closely approximate to how $(command) scans ahead to the closing paren without really "understanding" what it's skipping over). That happens to not care whether the next "|" is in a sensible position, just that it's something that can be skipped while looking for the closing brace. Then at the point of actual substitution, when there's a leading "|" it looks for an identifier followed by another "|". So you can't write ... ${|paste|read} ... and expect $REPLY to be set as the default by read, instead $paste will be set (probably to nothing). Anyway the upshot is it could probably also look for whitespace followed by another "|" without confusing anything. Right now it just attempts to evaluate the equivalent of { |command } which is a parse error. > It could perhaps be combined so ${||<file} slurps a file unmodified. 
That's messy because you can write <file somecommand and it means the same as somecommand <file so again it's not enough to see "||<" ... we'd actually have to special-case READNULLCMD or something. > Why does it print command not found errors for things like ${|=|:}, > ${|*|:} and ${|?|:}, I'd rather have $? than it globbing for a single > character file. See above about the requirement for it to look like ${|ident|...}. Since = * and ? are not identifiers, this is like writing { =|: } etc. and you get the same errors. All of the non-identifier special parameters are read-only so it doesn't make sense to assign to them, and the |ident| has to be assignable for the expansion to mean anything, so why allow them in that position? Unless you're just going for side-effects, but then why use the |var| form? ^ permalink raw reply [flat|nested] 29+ messages in thread
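The point about redirection placement can be checked in any POSIX shell: a redirection written before the command name behaves the same as one written after it. A minimal sketch, assuming mktemp is available; the variable names are illustrative only.

```shell
#!/bin/sh
# Create a scratch file with known contents.
tmp=$(mktemp) || exit 1
printf 'hello\n' >"$tmp"

# Redirection after the command name ...
after=$(cat <"$tmp")

# ... and before it. Both forms feed the file to cat on standard input.
before=$(<"$tmp" cat)

rm -f "$tmp"

printf 'after:  %s\n' "$after"
printf 'before: %s\n' "$before"
```

Because a leading "<file" may still be followed by a command, seeing "<" right after the opening characters is not enough to know that a bare file read is intended.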
* Re: [PATCH?] Nofork and removing newlines 2024-03-08 23:28 ` Bart Schaefer @ 2024-03-09 20:43 ` Oliver Kiddle 2024-03-10 6:11 ` Bart Schaefer 0 siblings, 1 reply; 29+ messages in thread From: Oliver Kiddle @ 2024-03-09 20:43 UTC (permalink / raw) To: Bart Schaefer; +Cc: zsh-workers Bart Schaefer wrote: > See above about the requirement for it to look like ${|ident|...}. > Since = * and ? are not identifiers, this is like writing { =|: } etc. Ok, that makes sense. Thanks > and you get the same errors. All of the non-identifier special > parameters are read-only so it doesn't make sense to assign to them, > and the |ident| has to be assignable for the expansion to mean > anything, so why allow them in that position? Unless you're just going > for side-effects, but then why use the |var| form? You may not be able to assign to it directly but I can think of uses for $? (and perhaps also $!) if supported there. That is assuming $? is the return status for the command running inside the expansion. Being an identifier, $_ does work there, not that it's especially useful. $1, $2 etc also work. Oliver ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines 2024-03-09 20:43 ` Oliver Kiddle @ 2024-03-10 6:11 ` Bart Schaefer 2024-03-12 17:54 ` Bart Schaefer 0 siblings, 1 reply; 29+ messages in thread From: Bart Schaefer @ 2024-03-10 6:11 UTC (permalink / raw) To: Oliver Kiddle; +Cc: zsh-workers On Sat, Mar 9, 2024 at 12:44 PM Oliver Kiddle <opk@zsh.org> wrote: > > Bart Schaefer wrote: > > ... the |ident| has to be assignable for the expansion to mean > > anything, so why allow them in that position? > > You may not be able to assign to it directly but I can think of uses > for $? (and perhaps also $!) if supported there. $? is also $status and ${|status|...} is fine. % print ${|status| return 9} 9 Also: % x=${ return 9 } % echo $? 9 (Just like with $(exit 9).) Pondering $! ... hm. ^ permalink raw reply [flat|nested] 29+ messages in thread
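The $(exit 9) comparison relies on standard behaviour: when an assignment has no command name, its exit status is that of the last command substitution performed. A minimal POSIX sketch of the forking case only (the ${ ... } form needs a development zsh); the variable names are assumptions, and the if/else is used so the example also behaves under "set -e".

```shell
#!/bin/sh
# The assignment takes its exit status from the command substitution,
# while the captured output is still stored in the variable.
if x=$(printf out; exit 9); then
    status_seen=0
else
    status_seen=$?
fi

printf 'captured: %s\n' "$x"
printf 'status:   %s\n' "$status_seen"
```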
* Re: [PATCH?] Nofork and removing newlines 2024-03-10 6:11 ` Bart Schaefer @ 2024-03-12 17:54 ` Bart Schaefer 2024-03-12 23:19 ` Oliver Kiddle 0 siblings, 1 reply; 29+ messages in thread From: Bart Schaefer @ 2024-03-12 17:54 UTC (permalink / raw) To: Oliver Kiddle; +Cc: zsh-workers On Fri, Mar 8, 2024 at 2:15 PM Oliver Kiddle <opk@zsh.org> wrote: > > Why does it print command not found errors for things like ${|=|:}, > ${|*|:} and ${|?|:}, I'd rather have $? than it globbing for a single > character file. Bart Schaefer wrote: > See above about the requirement for it to look like ${|ident|...}. > Since = * and ? are not identifiers, this is like writing { =|: } etc. On Sat, Mar 9, 2024 at 12:44 PM Oliver Kiddle <opk@zsh.org> wrote: > > You may not be able to assign to it directly but I can think of uses > for $? (and perhaps also $!) if supported there. On Sat, Mar 9, 2024 at 10:11 PM Bart Schaefer <schaefer@brasslantern.com> wrote: > > $? is also $status and ${|status|...} is fine. > > Pondering $! ... hm. This can be done with e.g. typeset -n bang=! ... ${|bang|...} ... And that doesn't even run afoul of history expansion, though I would not expect $! to be used that much in an interactive context. However: Returning to the original context here, we were talking about how to make ${ ... } more newline-trimming-compatible with $(...) while still providing a way to specify that newlines not be trimmed, and using ${||...} for the latter came up. In thinking about ${|?|...} etc. I realized that there's no real reason a set of non-identifier characters couldn't be allowed to follow the first vertical bar. It'd have to be simpler than just tossing parameter expansion flags in there, but I could investigate whether we could do things like ${|=|...} is the same as ${=${ ... }}, ${|~|...} is ${~${ ... }}, etc. That only saves 1 character, though, and I'm not sure it's clearer. 
It does mean, though, that we could use something like ${|<|...} for non-trimming command substitution, instead of "empty" || meaning that. Just from a "clean look" standpoint, though, I still like the quoting approach better. Separately, it's definitely possible to make zsh-mode ${ ... } trim only one newline instead of all of them. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines 2024-03-12 17:54 ` Bart Schaefer @ 2024-03-12 23:19 ` Oliver Kiddle 2024-03-13 4:13 ` Bart Schaefer 0 siblings, 1 reply; 29+ messages in thread From: Oliver Kiddle @ 2024-03-12 23:19 UTC (permalink / raw) To: Bart Schaefer; +Cc: zsh-workers Bart Schaefer wrote: > On Fri, Mar 8, 2024 at 2:15 PM Oliver Kiddle <opk@zsh.org> wrote: > > > > Why does it print command not found errors for things like ${|=|:}, > > ${|*|:} and ${|?|:}, I'd rather have $? than it globbing for a single > > Bart Schaefer wrote: > > See above about the requirement for it to look like ${|ident|...}. > > Since = * and ? are not identifiers, this is like writing { =|: } etc. Considering this explanation, it is apparent that allowing |ident| is not fully compatible with mksh where ${|ls| cat -} runs ls. Not that I think that matters as such. In usage, it is probably wise to make a convention of always having a space before the command starts. And this leads on to the later question as we probably don't want to expand considerably on what is valid between the vertical bars. > This can be done with e.g. > > typeset -n bang=! > ... ${|bang|...} ... Yes that works. Is nice to see namerefs coming up in nifty solutions. I hadn't checked the code for what supporting ? / ! would involve. If trivial why not, but I well understand not wanting to do anything that involves the lexer. > However: > > Returning to the original context here, we were talking about how to > make ${ ... } more newline-trimming-compatible with $(...) while still > providing a way to specify that newlines not be trimmed, and using > ${||...} for the latter came up. > > In thinking about ${|?|...} etc. I realized that there's no real > reason a set of non-identifier characters couldn't be allowed to > follow the first vertical bar. 
It'd have to be simpler than just > tossing parameter expansion flags in there, but I could investigate > whether we could do things like ${|=|...} is the same as ${=${ ... }}, > ${|~|...} is ${~${ ... }}, etc. That only saves 1 character, though, > and I'm not sure it's clearer. Would that potentially also extend to something like ${|=var| ... } That might look like a default value assignment to someone used to a language where vertical bars delimit closure parameters. Coming within the vertical bars the character has a closer syntactic attachment to the variable implying a semantic attachment. If it is hard to support ${= ... } then not doing it at all is probably better. Given that the ${|var| ... } form appears to create a function-like scope, should var perhaps be auto-declared local for that scope and the local value be substituted? > It does mean, though, that we could use something like ${|<|...} for > non-trimming command substitution, instead of "empty" || meaning that. > Just from a "clean look" standpoint, though, I still like the quoting > approach better. The quoting approach is clean and logical and is probably my preferred option. I was initially bothered by the lack of consistency with $(...) (where quoting prevents word splitting) but it can be useful if the lack of fork is not the only thing which makes ${ ... } different and because of the syntactic resemblance, consistency with ${var} is perhaps more important - it does word splitting based on the shwordsplit option. > Separately, it's definitely possible to make zsh-mode ${ ... } trim > only one newline instead of all of them. Only one is probably the most useful. I would mostly associate the fact that $(...) strips multiple with the fact that it does word splitting and so drops repeated newlines (empty words) also from the middle. Admittedly "$(...)" preserves empty words in the middle but still drops those at the end. Oliver ^ permalink raw reply [flat|nested] 29+ messages in thread
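The point about "$(...)" keeping empty lines in the middle while dropping every trailing newline can be checked with a short POSIX-shell sketch. The names out, expected and nl are made up for this illustration only.

```shell
nl='
'
# Output is: a, empty line, b, followed by three newlines in total.
out="$(printf 'a\n\nb\n\n\n')"

# The embedded empty line survives; all trailing newlines are gone.
expected="a${nl}${nl}b"
if [ "$out" = "$expected" ]; then
    echo "middle blank line kept, trailing newlines stripped"
fi
```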
* Re: [PATCH?] Nofork and removing newlines 2024-03-12 23:19 ` Oliver Kiddle @ 2024-03-13 4:13 ` Bart Schaefer 2024-03-14 22:15 ` Oliver Kiddle 0 siblings, 1 reply; 29+ messages in thread From: Bart Schaefer @ 2024-03-13 4:13 UTC (permalink / raw) To: zsh-workers On Tue, Mar 12, 2024 at 4:19 PM Oliver Kiddle <opk@zsh.org> wrote: > > > Bart Schaefer wrote: > > > See above about the requirement for it to look like ${|ident|...}. > > > Since = * and ? are not identifiers, this is like writing { =|: } etc. > > Considering this explanation, it is apparent that allowing |ident| is > not fully compatible with mksh where ${|ls| cat -} runs ls. Hm, yes. Although I wasn't really aiming for compatibility, rather for borrowing the idea (via Sebastian's original attempt at it). I was also I confess a bit stuck on the idea that every case would look like ${|REPLY=...} when of course piping to "read" etc. are also valid ways to assign to REPLY. How often would there be a command name with no arguments in that position? > And this leads on to the later question as we probably don't want to > expand considerably on what is valid between the vertical bars. I hesitate in suggesting this, but ... is there any existing case in which "${{" is valid? If not, I think I can change ${|var|...} to be ${{var}...} without too much violence (except to the doc, bleah). > Yes that works. Is nice to see namerefs coming up in nifty solutions. I > hadn't checked the code for what supporting ? / ! would involve. Mostly it involves rejiggering valid_refname() to behave more like itype_end(), if you mean supporting e.g. ${|?|...}. > If trivial why not, but I well understand not wanting to do anything > that involves the lexer. That (and using {var} instead of |var|) would except for a single conditional test all happen in subst.c, the lexer already skips ahead. > Bart Schaefer wrote: > > [...] I could investigate > > whether we could do things like ${|=|...} is the same as ${=${ ... 
}}, > > ${|~|...} is ${~${ ... }}, etc. That only saves 1 character, though, > > and I'm not sure it's clearer. > > Would that potentially also extend to something like ${|=var| ... } It could, yes. > That might look like a default value assignment to someone Would ${{=var}...} look better? The doubled braces do give me pause. > Given that the ${|var| ... } form appears to create a function-like > scope, should var perhaps be auto-declared local for that scope and the > local value be substituted? I considered that but (a) the implementation is messy, as the state of the parameter scope has to be carried around subst.c a lot longer than with the single known scalar "REPLY" (b) it diverges even farther from the idea that REPLY is a semi-special thing -- note that REPLY is automatically saved and restored around ${|... REPLY=...} (c) creating it local doesn't really add much that you can't do with ${ local value; ... } and (d) part of the point was to be able to push the variable up to the caller as a side effect, so you don't have to write value=${|value| ... value=...} although I guess you do have to declare it somewhere so that's not entirely helpful. > The quoting approach is clean and logical and is probably my preferred > option. [...] consistency with ${var} is perhaps more > important - it does word splitting based on the shwordsplit option. Thanks for the vote. > > Separately, it's definitely possible to make zsh-mode ${ ... } trim > > only one newline instead of all of them. > > Only one is probably the most useful. I would mostly associate the fact > that $(...) strips multiple with the fact that it does word splitting This is the code diff to make emulation trim all, ${ ... } trim one, "${ ... }" trim none ... not re-doing the doc diff yet. 
diff --git a/Src/subst.c b/Src/subst.c
index 49f7336bb..9d20a2d0e 100644
--- a/Src/subst.c
+++ b/Src/subst.c
@@ -1900,6 +1900,7 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
         /* The command string to be run by ${|...;} */
         char *cmdarg = NULL;
         size_t slen = 0;
+        int trim = (!EMULATION(EMULATE_ZSH)) ? 2 : !qt;
         inbrace = 1;
         s++;
 
@@ -2005,10 +2006,13 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags,
             int onoerrs = noerrs, rplylen;
             noerrs = 2;
             rplylen = zstuff(&cmdarg, rplytmp);
-            if (! EMULATION(EMULATE_ZSH)) {
+            if (trim) {
                 /* bash and ksh strip trailing newlines here */
-                while (rplylen > 0 && cmdarg[rplylen-1] == '\n')
+                while (rplylen > 0 && cmdarg[rplylen-1] == '\n') {
                     rplylen--;
+                    if (trim == 1)
+                        break;
+                }
                 cmdarg[rplylen] = 0;
             }
             noerrs = onoerrs;

^ permalink raw reply	[flat|nested] 29+ messages in thread
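The three trimming modes in the diff above (emulation trims all, unquoted zsh-mode ${ ... } trims one, quoted "${ ... }" trims none) can be mimicked in plain POSIX shell. This is a rough sketch, not the zsh implementation: trim_trailing_newlines, the mode numbers as a shell argument, and the variables trimmed and nl are all hypothetical names chosen to mirror the patch's "trim" value.

```shell
nl='
'
# Hypothetical helper mirroring the patch's "trim" variable:
#   mode 0: keep every trailing newline   (quoted, zsh mode)
#   mode 1: strip exactly one newline     (unquoted, zsh mode)
#   mode 2: strip all trailing newlines   (sh/ksh/bash emulation)
# The result is left in $trimmed, because handing it back through
# $(...) would itself strip the newlines being preserved.
trim_trailing_newlines() {
    mode=$1
    trimmed=$2
    [ "$mode" -eq 0 ] && return 0
    while [ "${trimmed%"$nl"}" != "$trimmed" ]; do
        trimmed=${trimmed%"$nl"}
        [ "$mode" -eq 1 ] && break
    done
    return 0
}

trim_trailing_newlines 1 "text${nl}${nl}"
printf '<%s>\n' "$trimmed"
```

As in the C loop, modes 1 and 2 share one loop and differ only in whether it stops after the first newline.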
* Re: [PATCH?] Nofork and removing newlines 2024-03-13 4:13 ` Bart Schaefer @ 2024-03-14 22:15 ` Oliver Kiddle 2024-03-15 8:42 ` Stephane Chazelas 2024-03-27 7:05 ` Bart Schaefer 0 siblings, 2 replies; 29+ messages in thread From: Oliver Kiddle @ 2024-03-14 22:15 UTC (permalink / raw) To: Bart Schaefer; +Cc: zsh-workers Bart Schaefer wrote: > like ${|REPLY=...} when of course piping to "read" etc. are also valid > ways to assign to REPLY. How often would there be a command name with > no arguments in that position? Probably not all that often. > I hesitate in suggesting this, but ... is there any existing case in > which "${{" is valid? If not, I think I can change ${|var|...} to be > ${{var}...} without too much violence (except to the doc, bleah). Inner `$' in nested parameter expansions are fairly superfluous in general. ${|var|...} is closer to the REPLY default with ${|...} but other than that, I marginally prefer ${{var}...} Certainly if it does involve much violence, what we currently have is working. > > That might look like a default value assignment to someone > > Would ${{=var}...} look better? The doubled braces do give me pause. Not as good as ${={var}...} but probably better. > This is the code diff to make emulation trim all, ${ ... } trim one, > "${ ... }" trim none ... not re-doing the doc diff yet. Looks good to me. Oliver ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines 2024-03-14 22:15 ` Oliver Kiddle @ 2024-03-15 8:42 ` Stephane Chazelas 2024-03-27 1:16 ` Bart Schaefer 2024-03-27 7:05 ` Bart Schaefer 1 sibling, 1 reply; 29+ messages in thread From: Stephane Chazelas @ 2024-03-15 8:42 UTC (permalink / raw) To: Oliver Kiddle; +Cc: Bart Schaefer, zsh-workers I don't know if that could be done and it's probably too late anyway, but I thought I might throw in the idea anyway. What about, instead of adding ksh93's ${ cmd;} and mksh's ${|cmd} (in slightly diverging ways), we added just a | expansion flag whereby: ${(||)any zsh code} would expand to the output of the code without the fork and without the newline trimming. ${(|var|)any zsh code} would expand to the value of var as set by the zsh code Some advantages: - the flags can be cumulated as usual. So you can have ${(||.s[:])getconf PATH} to split the output of getconf PATH ("." to trim one newline, ".." to trim all) for example. - there's no extra rule as to how the expansion works and how it can be combined with others as it's the same syntax as other parameter expansions - as it's different syntax, it removes the potential surprises when ${ cmd;}, ${|cmd} behave differently than in ksh93/mksh/bash ============= Or (as a completely different idea), an alternative to mksh's ${|cmd} and ${|var|cmd} could be written ${REPLY<cmd} ${var<cmd}. That could be added as well as ${|cmd} if we wanted to add ${|cmd} for compatibility with mksh/bash. Or we could add neither of ${ cmd;} and ${|cmd} and have ${REPLY<cmd} as the (non-splitting, non-trimming) equivalent of ${|cmd} and ${<cmd} as the (non-splitting, non-trimming) equivalent of ${ cmd;} (though the latter would prevent adding ${ cmd; } in the future). And still allow flags there as in ${(.s[:])<getconf PATH} -- Stephane - ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines 2024-03-15 8:42 ` Stephane Chazelas @ 2024-03-27 1:16 ` Bart Schaefer 0 siblings, 0 replies; 29+ messages in thread From: Bart Schaefer @ 2024-03-27 1:16 UTC (permalink / raw) To: zsh-workers Delayed reply as I was traveling last week. On Fri, Mar 15, 2024 at 1:42 AM Stephane Chazelas <stephane@chazelas.org> wrote: > > What about, instead of adding ksh93's ${ cmd;} and mksh's > ${|cmd} (in slightly diverging ways), we added just a | > expansion flag As mentioned in a previous context, the problem with this approach is that the lexing/parsing of a parameter reference and the lexing/parsing of what amounts to a function body are very different. Upon encountering dollar-brace-pipe or dollar-brace-whitespace (or in forthcoming proposed change, dollar-brace-brace), we can immediately switch to expect a series of commands. This allows for one-character lookahead, which works with hungetc(). If required first to consume parameter flags or any string of multiple characters, the lexer can't backtrack without some serious gyrations. Even if the backtracking were worked out, the proposed flag now has semantics that the lexer has to understand in order to proceed after the close-paren, whereas current parameter flags are just swept up uninterpreted at lexing and left to paramsubst() to decode. On top of this the lexer has to maintain the PS2 context stack, which was one of the most difficult bits of implementing the switch to/from expecting commands vs. expecting (possibly nested) parameter substitutions. > Some advantages: > - the flags can be cumulated as usual. So you can have ${(||.s[:])getconf PATH} That would make this entirely impractical for lexing purposes. 
> - there's no extra rule as to how the expansion works and how it > can be combined with others as it's the same syntax as other > parameter expansions Except it's still not, because the syntax after the flags and up to the matching close brace is nothing like identifiers / subscripts / nested parameters. > - as it's different syntax, it removes the potential surprises > when ${ cmd;}, ${|cmd} behave differently than in > ksh93/mksh/bash Possibly, but since they'll work very similarly when in emulation modes, I think this is minor. > Or (as a completely different idea), an alternative to > mksh's ${|cmd} and ${|var|cmd} could be written ${REPLY<cmd} > ${var<cmd}. I suspect that wouldn't interact as well with nested substitutions (although I guess it wouldn't differ that much from ${REPLY=...} in that respect), and it has the appearance of reading from a file. I don't especially like ${|...} that way either as it looks more like writing than reading, but we're not setting the precedent there. Given druthers, I'd have done something with $(...) instead of ${...}, more like recognizing the "function" keyword so $(function { ... }) skips forking [ shorthand $(() { ... }) ] which could be done with zero changes to the lexer/parser, but that already has conflicting semantics with respect to [not] altering values in the current shell. Patch to use ${{param} cmd} instead of ${|param| cmd} to follow in a bit. ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines 2024-03-14 22:15 ` Oliver Kiddle 2024-03-15 8:42 ` Stephane Chazelas @ 2024-03-27 7:05 ` Bart Schaefer 1 sibling, 0 replies; 29+ messages in thread From: Bart Schaefer @ 2024-03-27 7:05 UTC (permalink / raw) To: zsh-workers [-- Attachment #1: Type: text/plain, Size: 716 bytes --] On Thu, Mar 14, 2024 at 3:15 PM Oliver Kiddle <opk@zsh.org> wrote: > > Bart Schaefer wrote: > > I hesitate in suggesting this, but ... is there any existing case in > > which "${{" is valid? If not, I think I can change ${|var|...} to be > > ${{var}...} without too much violence (except to the doc, bleah). > > [...] I marginally prefer ${{var}...} > Certainly if it does involve much violence, what we currently have is > working. It was slightly more violent than I expected, and consequently there is probably some room for optimization, but the attached has it working (minus Doc update as yet). Following workers/52635 the extra "TEST COMPLETE" test in D10 is not really needed any more. [-- Attachment #2: nofork-doublecurly.txt --] [-- Type: text/plain, Size: 6414 bytes --] diff --git a/Src/lex.c b/Src/lex.c index 31b130b07..700af2da1 100644 --- a/Src/lex.c +++ b/Src/lex.c @@ -1423,7 +1423,7 @@ gettokstr(int c, int sub) if (lexstop) break; if (!cmdsubst && in_brace_param && act == LX2_STRING && - (c == '|' || c == Bar || inblank(c))) { + (c == '|' || c == Bar || c == '{' || c == Inbrace || inblank(c))) { cmdsubst = in_brace_param; cmdpush(CS_CURSH); } else if (in_pattern == 2 && c != '/') diff --git a/Src/subst.c b/Src/subst.c index 9d20a2d0e..3764ed786 100644 --- a/Src/subst.c +++ b/Src/subst.c @@ -1898,11 +1898,10 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags, */ if (c == Inbrace) { /* The command string to be run by ${|...;} */ - char *cmdarg = NULL; + char *cmdarg = NULL, *endvar = NULL, inchar = *++s; size_t slen = 0; int trim = (!EMULATION(EMULATE_ZSH)) ? 
2 : !qt; inbrace = 1; - s++; /* Short-path for the nofork command substitution ${|cmd;} * See other comments about kludges for why this is here. @@ -1913,43 +1912,74 @@ paramsubst(LinkList l, LinkNode n, char **str, int qt, int pf_flags, * should not be part of command substitution in any case. * Use ${(U)${|cmd;}} as you would for ${(U)$(cmd;)}. */ - if (*s == '|' || *s == Bar || inblank(*s)) { + if (inchar == '|' || inchar == Bar || inblank(inchar)) { char *outbracep = s; char sav = *s; *s = Inbrace; if (skipparens(Inbrace, Outbrace, &outbracep) == 0) { slen = outbracep - s - 1; if ((*s = sav) != Bar) { + /* This tokenize() is important */ sav = *outbracep; *outbracep = '\0'; tokenize(s); *outbracep = sav; } } + } else if (inchar == '{' || inchar == Inbrace) { + char *outbracep; + *s = Inbrace; + + if ((outbracep = itype_end(s+1, INAMESPC, 0))) { + if (*outbracep == Inbrack && + (outbracep = parse_subscript(++outbracep, 1, ']'))) + ++outbracep; + } + /* True for valid substitution, or we messed up in lex.c */ + if (outbracep && *outbracep == Outbrace) { + char outchar = inchar == Inbrace ? Outbrace : '}'; + endvar = outbracep++; + + /* Reached the first close brace, find the last */ + *endvar = '|'; /* Almost anything but braces/brackets */ + outbracep = s; + if (skipparens(Inbrace, outchar, &outbracep) == 0) + *endvar = Outbrace; + else { /* Never happens? */ + *endvar = outchar; + outbracep = endvar + 1; + } + slen = outbracep - s - 1; + if (inchar != Inbrace) { + char sav = *outbracep; + *outbracep = '\0'; + tokenize(s); + *outbracep = sav; + outbracep[-1] = Outbrace; + } + } else { + zerr("bad substitution"); + return NULL; + } } if (slen > 1) { char *outbracep = s + slen; if (*outbracep == Outbrace) { - if ((rplyvar = itype_end(s+1, INAMESPC, 0))) { - if (*rplyvar == Inbrack && - (rplyvar = parse_subscript(++rplyvar, 1, ']'))) - ++rplyvar; - } - if (rplyvar == s+1 && *rplyvar == Bar) { - /* Is ${||...} a subtitution error or a syntax error? 
+ if (endvar == s+1 && !inblank(*endvar)) { + /* Is ${{}...} a substitution error or a syntax error? zerr("bad substitution"); return NULL; */ rplyvar = NULL; } - if (rplyvar && *rplyvar == Bar) { - cmdarg = dupstrpfx(rplyvar+1, outbracep-rplyvar-1); - rplyvar = dupstrpfx(s+1,rplyvar-s-1); + if (endvar && *endvar == Outbrace) { + cmdarg = dupstrpfx(endvar+1, outbracep-endvar-1); + rplyvar = dupstrpfx(s+1,endvar-s-1); } else { cmdarg = dupstrpfx(s+1, outbracep-s-1); rplyvar = "REPLY"; } - if (inblank(*s)) { + if (inblank(inchar)) { /* * Admittedly a hack. Take advantage of the enforced * locality of REPLY and the semantics of $(<file) to diff --git a/Test/D10nofork.ztst b/Test/D10nofork.ztst index fc6b84613..0616cf9e9 100644 --- a/Test/D10nofork.ztst +++ b/Test/D10nofork.ztst @@ -14,6 +14,28 @@ 0:Basic substitution and REPLY scoping >INNER OUTER + reply=(x OUTER x) + purl ${{reply}reply=(\{ INNER \})} $reply +0:Basic substitution, brace quoting, and array result +>{ +>INNER +>} +>{ +>INNER +>} + + () { + setopt localoptions ignorebraces + purl ${{reply} reply=({ INNER })} $reply + } +0:Basic substitution, ignorebraces, and array result +>{ +>INNER +>} +>{ +>INNER +>} + purr ${| REPLY=first}:${| REPLY=second}:$REPLY 0:re-scoping of REPLY in one statement >first:second:OUTER @@ -229,7 +251,7 @@ F:Why not use this error in the previous case as well? >26 unset reply - purl ${|reply| reply=(1 2 ${| REPLY=3 } 4) } + purl ${{reply} reply=(1 2 ${| REPLY=3 } 4) } typeset -p reply 0:array behavior with global assignment >1 @@ -315,7 +337,7 @@ F:status of "print" should hide return unset zz outer=GLOBAL - purr "${|zz| + purr "${{zz} local outer=LOCAL zz=NONLOCAL } $outer $?" 
@@ -453,6 +475,7 @@ F:must do this before evaluating the next test block 1:ignored braces, part 4 ?(eval):3: parse error near `}' + unsetopt ignorebraces # "break" blocks function calls in outer loop # Could use print, but that might get fixed repeat 3 do purr ${ @@ -467,11 +490,6 @@ F:must do this before evaluating the next test block ?1 ?2 - print -u $ZTST_fd ${ZTST_testname}: TEST COMPLETE -0:make sure we got to the end -F:some tests might silently break the test harness - %clean unfunction purr purl - unsetopt ignorebraces diff --git a/Test/V10private.ztst b/Test/V10private.ztst index ed51316f3..26004a2dc 100644 --- a/Test/V10private.ztst +++ b/Test/V10private.ztst @@ -497,7 +497,7 @@ F:Better if caught in checkclobberparam() but exec.c doesn't know scope () { private z=outer print ${(t)z} $z - print ${| REPLY=${|z| z=nofork} } + print ${| REPLY=${{z} z=nofork} } print ${(t)z} $z } 0:nofork may write to private in calling function @@ -518,9 +518,9 @@ F:Better if caught in checkclobberparam() but exec.c doesn't know scope () { private z=outer print ${(t)z} $z - print ${|z| + print ${{z} private q - z=${|q| q=nofork} + z=${{q} q=nofork} } print ${(t)z} $z } @@ -533,7 +533,7 @@ F:Better if caught in checkclobberparam() but exec.c doesn't know scope print ${| () { REPLY="{$q}" } } - print ${|q| + print ${{q} () { q=nofork } } } ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines
  2024-03-07  4:53 ` Bart Schaefer
  2024-03-07  7:02   ` Lawrence Velázquez
@ 2024-03-07  7:10   ` Stephane Chazelas
  2024-03-08  0:37     ` Bart Schaefer
  1 sibling, 1 reply; 29+ messages in thread
From: Stephane Chazelas @ 2024-03-07  7:10 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Mikael Magnusson, Zsh hackers list

2024-03-06 20:53:28 -0800, Bart Schaefer:
[...]
> > "${ foo}" and ${ foo} having the same wordsplitting behavior but only
> > differing in stripping newlines feels a bit magical and weird.
>
> One question (and sort of the point) is whether anyone would really
> notice. If you put it in quotes you're expecting a literal result,
> and if you (for example) assign it unquoted to a scalar you're
> expecting it to "just work" the way assigning $(foo) would. It's a
> bit unusual but it seems to preserve the principle of least surprise,
> and it uses the least amount of extra syntax.
>
> On the other hand I'm not highly invested in this. In the absence of
> this (no)quoting behavior, I've found I nearly always want ${=${ foo
> }} or ${(f)${ foo }}, each of which gives exactly the same result with
> or without trimming.
[...]

For ${=${ foo }} that depends on whether $IFS contains a
(non-doubled) newline or not.

Without trimming:

$ IFS=:
$ printf '<%s>\n' ${=${ getconf PATH }}
</bin>
</usr/bin
>
$ IFS=$'\n\n'
$ printf '<%s>\n' ${=${ seq 3 }}
<1>
<2>
<3>
<>

For (f), see also:

$ printf '<%s>\n' "${(f@)${ print -l 'a b' '' 'c d' }}"
<a b>
<>
<c d>
<>

Like with IFS=$'\n\n', those are typically the cases where you
do want to preserve empty lines.

In both cases, trimming one (and only one) newline character
would lead to a better behaviour.

One exception would be in:

lines=( "${(f@)${ print -l '' }}" )

Where you'd get no line instead of one empty line.

Though at the moment, you get:

$ lines=( "${(f@)${ print -l '' }}" )
$ typeset -p lines
typeset -a lines=( '' '' )

(2 empty lines) which is not better.

We'd need to have a way to treat the separator as *delimiter*
instead (as POSIX requires for IFS splitting despite the S in
IFS; both "delimiting" and "separating" have their use).

--
Stephane

^ permalink raw reply	[flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines
  2024-03-07  7:10 ` Stephane Chazelas
@ 2024-03-08  0:37 ` Bart Schaefer
  0 siblings, 0 replies; 29+ messages in thread
From: Bart Schaefer @ 2024-03-08  0:37 UTC (permalink / raw)
  To: Zsh hackers list

On Wed, Mar 6, 2024 at 11:10 PM Stephane Chazelas <stephane@chazelas.org> wrote:
>
> For ${=${ foo }} that depends on whether $IFS contains a
> (non-doubled) newline or not.

True, but I think not really relevant, because nobody is (I hope)
going to globally set a strange IFS in their dotfiles and still expect
any normal behavior.

> For (f), see also:
>
> $ printf '<%s>\n' "${(f@)${ print -l 'a b' '' 'c d' }}"
> <a b>
> <>
> <c d>
> <>

That's because of the historic behavior of the (s::) flag where (f) is
(ps:\n:). But as was pointed out elsewhere if you're not invoking an
external command you can control this from inside the substitution:

% printf '<%s>\n' "${(f@)${ print -nl 'a b' '' 'c d' }}"
<a b>
<>
<c d>
%

Which leans a little in the direction of never trimming rather than of
choosing how many to trim. It does however reveal a drawback in the
quoting proposal, in that when nesting ${ ... } inside another quoted
expansion there would be no way to disable newline retention.

> We'd need to have a way to treat the separator as *delimiter*

That would be a useful choice for (T) or some other new flag -- as in,
do NOT "trim" the separator when splitting -- but I don't see how it
helps decide whether to trim trailing newline(s) from ${ cmd } in the
first place, because in the delimiter case you'd want to keep them?

Just for grins ...

% : ${|reply|
  typeset -ga reply
  local -i i=1 MBEGIN MEND
  local -n MATCH='reply[i]'
  local pat=$'[^\n]#\n'
  : ${(*S)"${ print -l 'a b' '' 'c d' }"//(#m)($~pat)/$((i++))}
}
% typeset -p reply
typeset -a reply=( $'a b\n' $'\n' $'c d\n' )

^ permalink raw reply	[flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines
  2024-03-06 22:22 ` Mikael Magnusson
  2024-03-06 22:42   ` Bart Schaefer
  2024-03-07  4:53   ` Bart Schaefer
@ 2024-03-07  6:52   ` Lawrence Velázquez
  2024-03-07  8:26     ` Mikael Magnusson
  2 siblings, 1 reply; 29+ messages in thread
From: Lawrence Velázquez @ 2024-03-07  6:52 UTC (permalink / raw)
  To: Mikael Magnusson, Bart Schaefer; +Cc: zsh-workers

On Wed, Mar 6, 2024, at 5:22 PM, Mikael Magnusson wrote:
> "${ foo}" and ${ foo} having the same wordsplitting behavior but only
> differing in stripping newlines feels a bit magical and weird.

I agree. Personally, I'm always surprised when quoting does anything
other than suppress splitting, globbing, and special characters in
patterns. For instance, I can never remember this pitfall mentioned
in workers/52666, even though (I think) I understand why it happens:

% print ${:-{}x}
{}x
% print "${:-{}x}"
{x}

--
vq

^ permalink raw reply	[flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines
  2024-03-07  6:52 ` Lawrence Velázquez
@ 2024-03-07  8:26 ` Mikael Magnusson
  2024-03-07 19:02   ` Bart Schaefer
  0 siblings, 1 reply; 29+ messages in thread
From: Mikael Magnusson @ 2024-03-07  8:26 UTC (permalink / raw)
  To: Lawrence Velázquez; +Cc: zsh-workers

On 3/7/24, Lawrence Velázquez <larryv@zsh.org> wrote:
> On Wed, Mar 6, 2024, at 5:22 PM, Mikael Magnusson wrote:
>> "${ foo}" and ${ foo} having the same wordsplitting behavior but only
>> differing in stripping newlines feels a bit magical and weird.
>
> I agree. Personally, I'm always surprised when quoting does anything
> other than suppress splitting, globbing, and special characters in
> patterns. For instance, I can never remember this pitfall mentioned
> in workers/52666, even though (I think) I understand why it happens:
>
> % print ${:-{}x}
> {}x
> % print "${:-{}x}"
> {x}

This is not really an effect of quoting per se, really it's just luck
that the unquoted form works. You'll notice that if you try
print "${:-}x}" without the quotes it will simply fail. Your example only
happens to pass the parsing stage because the braces are balanced
which they have no inherent reason to do in what is supposedly a
string literal. Because the parser "knows" about the balanced braces
in the unquoted case, it skips past the first } for closing the ${,
but in the quoted form the { is not special in any way, so the first }
does match the ${, and then the second } is just a literal } which is
then printed after the x. The correct way to write it in both cases
would be:

% print ${:-\{\}x}
{}x
% print "${:-{\}x}"
{}x

(you can \escape the { inside the quotes too if you want, but it has
no effect on the result).

--
Mikael Magnusson

^ permalink raw reply	[flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines
  2024-03-07  8:26 ` Mikael Magnusson
@ 2024-03-07 19:02 ` Bart Schaefer
  2024-04-02  6:45   ` Lawrence Velázquez
  0 siblings, 1 reply; 29+ messages in thread
From: Bart Schaefer @ 2024-03-07 19:02 UTC (permalink / raw)
  To: Mikael Magnusson; +Cc: Lawrence Velázquez, zsh-workers

On Thu, Mar 7, 2024 at 12:26 AM Mikael Magnusson <mikachu@gmail.com> wrote:
>
> On 3/7/24, Lawrence Velázquez <larryv@zsh.org> wrote:
> >
> > % print ${:-{}x}
> > {}x
> > % print "${:-{}x}"
> > {x}
>
> This is not really an effect of quoting per se, really it's just luck
> that the unquoted form works. [...] Your example only
> happens to pass the parsing stage because the braces are balanced
> which they have no inherent reason to do in what is supposedly a
> string literal.

It passes the balanced braces because this:

% print ${:-{a,b,c}x}
ax bx cx

And because this:

% print {}
{}

I'm leaving this in the same discussion thread because I just noticed
that ${|...} and ${ cmd } do not really respect the
IGNORE_CLOSE_BRACES option. Setting that option changes handling of
unbalanced braces (and I'm not yet sure if it does so in a sensible
way) but does not force use of the semicolon e.g. in ${ cmd; } which
theoretically it should. Is this worth trying to work in?

^ permalink raw reply	[flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines 2024-03-07 19:02 ` Bart Schaefer @ 2024-04-02 6:45 ` Lawrence Velázquez 0 siblings, 0 replies; 29+ messages in thread From: Lawrence Velázquez @ 2024-04-02 6:45 UTC (permalink / raw) To: Bart Schaefer; +Cc: zsh-workers On Thu, Mar 7, 2024, at 2:02 PM, Bart Schaefer wrote: > I'm leaving this in the same discussion thread because I just noticed > that ${|...} and ${ cmd } do not really respect the > IGNORE_CLOSE_BRACES option. Setting that option changes handling of > unbalanced braces (and I'm not yet sure if it does so in a sensible > way) but does not force use of the semicolon e.g. in ${ cmd; } which > theoretically it should. Is this worth trying to work in? It would be greatly preferable if IGNORE_CLOSE_BRACES were respected, so that we don't have yet another exception that has to be documented and watched out for. However, I can't opine on whether it'd be worth doing, since I don't know how hard it'd be and won't be working on it in any case. (Sorry if this has already been settled; I'm just now catching up on some older threads.) -- vq ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH?] Nofork and removing newlines
  2024-03-05 22:48 ` Bart Schaefer
  2024-03-06 17:57   ` Stephane Chazelas
@ 2024-03-06 19:43   ` Stephane Chazelas
  1 sibling, 0 replies; 29+ messages in thread
From: Stephane Chazelas @ 2024-03-06 19:43 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh hackers list

2024-03-05 14:48:00 -0800, Bart Schaefer:
[...]
> The suggested change would provide
> $(...)-like behavior for the usual case and a simple way to keep the
> newline(s) in the less-usual cases.
[...]

For reference, some other shells that can keep trailing newlines
in command substitution:

rc:

  whole_output = ``(){cmd and its args}

fish:

  set whole_output (cmd and its args | string collect -aN)

(-aN short for --allow-empty --no-trim-newlines).

fish's command substitution ( (...) and also $(...) including
inside "..." in recent versions) doesn't fork so is closer to
ksh93's ${ cmd; } than ksh86's $(...)

POSIX/Korn-like shells: I'm sure everyone will have their own
variant, but

  get_whole_output() {
    eval "
      $1"'=$(shift; "$@"; ret=$?; echo .; exit "$ret")
      set -- "$1" "$?"
      '"$1"'=${'"$1"'%.}
      return "$2"'
  }
  get_whole_output whole_output cmd and its args

(bearing in mind that there aren't many characters beside . that
you can use safely there as it's important its encoding can't be
found in the encoding of other characters).

--
Stephane

^ permalink raw reply	[flat|nested] 29+ messages in thread
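The sentinel trick used by get_whole_output can be spelled out without the eval indirection. This is a minimal sketch under the same idea: the sample command (printf 'one\n\n') and the variable name rc are illustrative choices, not part of the message above.

```shell
# Print a sentinel "." after the command's output. $(...) then strips
# only the newline written by "echo .", and the sentinel is removed by
# hand afterwards. The subshell exits with the command's own status.
whole_output=$(printf 'one\n\n'; ret=$?; echo .; exit "$ret")
rc=$?
whole_output=${whole_output%.}

# Both trailing newlines of the original output are still present.
printf '<%s>\n' "$whole_output"
```

The function form in the message does the same thing but takes the target variable name as its first argument, which is why it needs eval.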