zsh-workers
* [PATCH v2 0/1] Run pipeline command in subshell in sh mode
@ 2020-06-05  1:53 brian m. carlson
  2020-06-05  1:53 ` [PATCH v2] exec: run final pipeline command in a " brian m. carlson
  0 siblings, 1 reply; 14+ messages in thread
From: brian m. carlson @ 2020-06-05  1:53 UTC (permalink / raw)
  To: zsh-workers

POSIX sh implementations run each command in a pipeline in a subshell,
although zsh (and AT&T ksh) do not: instead, they run the final command
in the main shell.  This leads to very different behavior when the final
command is a shell function which modifies variables.
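
As a minimal sketch of that difference (not taken from the patch; the names
count_lines and total are hypothetical), consider a function at the end of a
pipeline that records a result in a global variable:

```shell
#!/bin/sh
# Hypothetical helper: reads stdin and records the line count in a global.
total=unset

count_lines() {
    total=0
    while IFS= read -r _line; do
        total=$((total + 1))
    done
}

printf 'a\nb\nc\n' | count_lines

# Shells that run every pipeline stage in a subshell (dash, and bash in its
# default configuration) print "unset": the assignment happened in a child
# process and was lost.  Native zsh and AT&T ksh print "3", because the
# function ran in the main shell.
echo "$total"
```

Which of the two values appears depends only on where the last stage runs, so
the same script can silently behave differently under different shells.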

zsh is starting to be used in some cases as /bin/sh, such as on macOS
Catalina.  Consequently, it makes sense to emulate the POSIX behavior as
much as possible when emulating sh, since that's the least surprising
behavior.  This patch does exactly that.

With this patch, "zsh --emulate sh" passes the Git test suite.  I
expect that it will also be fully functional as /bin/sh on Debian,
although I have not tested that.

This patch was sent before, but didn't get picked up.  In hopes of
aiding reviewers, I've resent it with a significantly expanded commit
message so that it is easier to reason about.

I'm not subscribed to the list, so please CC me if you have questions or
comments.

brian m. carlson (1):
  exec: run final pipeline command in a subshell in sh mode

 Src/exec.c           | 10 ++++++----
 Test/B07emulate.ztst | 22 ++++++++++++++++++++++
 2 files changed, 28 insertions(+), 4 deletions(-)



* [PATCH v2] exec: run final pipeline command in a subshell in sh mode
  2020-06-05  1:53 [PATCH v2 0/1] Run pipeline command in subshell in sh mode brian m. carlson
@ 2020-06-05  1:53 ` brian m. carlson
  2020-06-05 10:21   ` Mikael Magnusson
  0 siblings, 1 reply; 14+ messages in thread
From: brian m. carlson @ 2020-06-05  1:53 UTC (permalink / raw)
  To: zsh-workers

zsh typically runs the final command in a pipeline in the main shell
instead of a subshell.  However, POSIX requires that all commands in a
pipeline run in a subshell, but permits zsh's behavior as an extension.

Since zsh may be used as /bin/sh in some cases (such as macOS Catalina),
it makes sense to have the POSIX behavior when emulating sh, so do that
by checking for being the final item of a multi-item pipeline and
creating a subshell in that case.

From the comment above execpline(), we know the following:

  last1 is a flag that this command is the last command in a shell that
  is about to exit, so we can exec instead of forking.  It gets passed
  all the way down to execcmd() which actually makes the decision.  A 0
  is always passed if the command is not the last in the pipeline. […]
  If last1 is zero but the command is at the end of a pipeline, we pass
  2 down to execcmd().

So there are three cases to consider in this code:

• last1 is 0, which means we are not at the end of a pipeline, in which
  case we should not change behavior.
• last1 is 1, which means we are effectively running in a subshell,
  because nothing that happens due to the exec is going to affect the
  actual shell, since it will have been replaced.  So there is nothing
  to do here.
• last1 is 2, which means our command is at the end of the pipeline, so
  in sh mode we should create a subshell by forking.

input is nonzero if the input to this process is a pipe that we've
opened.  At the end of a multi-stage pipeline, it will necessarily be
nonzero.

Note that several of the tests may appear bizarre, since most developers
do not place useless variable assignments directly at the end of a
pipeline.  However, as the function tests demonstrate, there are cases
where assignments may occur when a shell function is used at the end of
a command.  The remaining assignment tests simply test additional cases,
such as the use of local, that would otherwise be untested.
---
 Src/exec.c           | 10 ++++++----
 Test/B07emulate.ztst | 22 ++++++++++++++++++++++
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/Src/exec.c b/Src/exec.c
index 29f4fc5ca..f2650f311 100644
--- a/Src/exec.c
+++ b/Src/exec.c
@@ -2866,11 +2866,13 @@ execcmd_exec(Estate state, Execcmd_params eparams,
 	    pushnode(args, dupstring("fg"));
     }
 
-    if ((how & Z_ASYNC) || output) {
+    if ((how & Z_ASYNC) || output ||
+	(last1 == 2 && input && EMULATION(EMULATE_SH))) {
 	/*
-	 * If running in the background, or not the last command in a
-	 * pipeline, we don't need any of the rest of this function to
-	 * affect the state in the main shell, so fork immediately.
+	 * If running in the background, not the last command in a
+	 * pipeline, or the last command in a multi-stage pipeline
+	 * in sh mode, we don't need any of the rest of this function
+	 * to affect the state in the main shell, so fork immediately.
 	 *
 	 * In other cases we may need to process the command line
 	 * a bit further before we make the decision.
diff --git a/Test/B07emulate.ztst b/Test/B07emulate.ztst
index 7b1592fa9..45c39b51d 100644
--- a/Test/B07emulate.ztst
+++ b/Test/B07emulate.ztst
@@ -276,3 +276,25 @@ F:Some reserved tokens are handled in alias expansion
 0:--emulate followed by other options
 >yes
 >no
+
+  emulate sh -c '
+  foo () {
+    VAR=foo &&
+    echo $VAR | bar &&
+    echo "$VAR"
+  }
+  bar () {
+    tr f b &&
+    VAR="$(echo bar | tr r z)" &&
+    echo "$VAR"
+  }
+  foo
+  '
+  emulate sh -c 'func() { echo | local def="abc"; echo $def;}; func'
+  emulate sh -c 'abc="def"; echo | abc="ghi"; echo $abc'
+0:emulate sh uses subshell for last pipe entry
+>boo
+>baz
+>foo
+>
+>def


* Re: [PATCH v2] exec: run final pipeline command in a subshell in sh mode
  2020-06-05  1:53 ` [PATCH v2] exec: run final pipeline command in a " brian m. carlson
@ 2020-06-05 10:21   ` Mikael Magnusson
  2020-06-05 20:41     ` brian m. carlson
  0 siblings, 1 reply; 14+ messages in thread
From: Mikael Magnusson @ 2020-06-05 10:21 UTC (permalink / raw)
  To: brian m. carlson; +Cc: zsh-workers

On 6/5/20, brian m. carlson <sandals@crustytoothpaste.net> wrote:
> zsh typically runs the final command in a pipeline in the main shell
> instead of a subshell.  However, POSIX requires that all commands in a
> pipeline run in a subshell, but permits zsh's behavior as an extension.

What POSIX actually says is:
"each command of a multi-command pipeline is in a subshell
environment; as an extension, however, any or all commands in a
pipeline may be executed in the current environment"
I.e., it does not say "shall", so it doesn't require a subshell at all, in
fact it explicitly does permit not using one as you also say. The
patch is possibly useful (seems unlikely to me), but to say it is
required by POSIX is not true. If someone depends on every command in
a pipeline being a subshell, they should fix their code, for example
by adding ( ) around it (the command(s) or the whole pipeline).
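
A minimal sketch of that workaround, assuming a hypothetical function
upcase_and_mark that modifies a global as a side effect; with the explicit
parentheses the result should not depend on which shell runs the script:

```shell
#!/bin/sh
# Hypothetical helper that clobbers a global while filtering its input.
state=original

upcase_and_mark() {
    state=clobbered
    tr 'a-z' 'A-Z'
}

# Explicit subshell around the last stage: the assignment inside the
# function cannot leak out, whichever shell interprets the script.
echo hello | (upcase_and_mark)

# Prints "original" whether or not the shell would otherwise have run
# the last pipeline stage in the current environment.
echo "$state"
```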

-- 
Mikael Magnusson


* Re: [PATCH v2] exec: run final pipeline command in a subshell in sh mode
  2020-06-05 10:21   ` Mikael Magnusson
@ 2020-06-05 20:41     ` brian m. carlson
  2020-06-06  4:33       ` [PATCH v2] exec: run final pipeline command in a subshell in sh modeZZ Daniel Shahaf
  2020-06-07 16:55       ` [PATCH v2] exec: run final pipeline command in a subshell in sh mode Bart Schaefer
  0 siblings, 2 replies; 14+ messages in thread
From: brian m. carlson @ 2020-06-05 20:41 UTC (permalink / raw)
  To: Mikael Magnusson; +Cc: zsh-workers

On 2020-06-05 at 10:21:41, Mikael Magnusson wrote:
> On 6/5/20, brian m. carlson <sandals@crustytoothpaste.net> wrote:
> > zsh typically runs the final command in a pipeline in the main shell
> > instead of a subshell.  However, POSIX requires that all commands in a
> > pipeline run in a subshell, but permits zsh's behavior as an extension.
> 
> What POSIX actually says is:
> "each command of a multi-command pipeline is in a subshell
> environment; as an extension, however, any or all commands in a
> pipeline may be executed in the current environment"
> I.e., it does not say "shall", so it doesn't require a subshell at all, in
> fact it explicitly does permit not using one as you also say. The
> patch is possibly useful (seems unlikely to me), but to say it is
> required by POSIX is not true. If someone depends on every command in
> a pipeline being a subshell, they should fix their code, for example
> by adding ( ) around it (the command(s) or the whole pipeline).

POSIX makes a declarative statement about the behavior of a pipeline.
It is true that it doesn't explicitly use the word "shall" in this case,
since such a statement would explicitly prohibit the inclusion of an
extension at all and make it explicitly non-conforming.

What POSIX does say is that one “shall define an environment in which an
application can be run with the behavior specified by POSIX.1-2017.”
I'm proposing that "zsh --emulate sh" implement the POSIX behavior for
that reason.

I will tell you that as a practical matter, nobody writing code for sh
expects the last command not to be run in a subshell and consequently
lots of code is practically broken in this case with zsh as /bin/sh.
The Git Project is very fastidious about writing portable shell, as is
Debian, and I can tell you from experience that both are broken with zsh
as sh with the current behavior, even if they should not have made that
assumption.

zsh is a very popular interactive shell, and allowing it to be used as a
portable sh on systems where the system sh is less capable would be
really beneficial.  I would also like to see macOS users who decide to
use zsh as /bin/sh have a good experience with existing code that
overwhelmingly does make this assumption.

If your objection is to the wording, I'm happy to revise it to remove
the word "requires", but I do think this provides a lot of benefits for
the sh scripting case while not impacting users who are expecting
different behavior for the zsh case.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


* Re: [PATCH v2] exec: run final pipeline command in a subshell in sh modeZZ
  2020-06-05 20:41     ` brian m. carlson
@ 2020-06-06  4:33       ` Daniel Shahaf
  2020-06-06 16:28         ` brian m. carlson
  2020-06-07 16:55       ` [PATCH v2] exec: run final pipeline command in a subshell in sh mode Bart Schaefer
  1 sibling, 1 reply; 14+ messages in thread
From: Daniel Shahaf @ 2020-06-06  4:33 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Mikael Magnusson, zsh-workers

brian m. carlson wrote on Fri, 05 Jun 2020 20:41 +0000:
> On 2020-06-05 at 10:21:41, Mikael Magnusson wrote:
> > On 6/5/20, brian m. carlson <sandals@crustytoothpaste.net> wrote:  
> > > zsh typically runs the final command in a pipeline in the main shell
> > > instead of a subshell.  However, POSIX requires that all commands in a
> > > pipeline run in a subshell, but permits zsh's behavior as an extension.  
> > 
> > What POSIX actually says is:
> > "each command of a multi-command pipeline is in a subshell
> > environment; as an extension, however, any or all commands in a
> > pipeline may be executed in the current environment"

That's quoted from https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_12.

The part Brian quotes below is from https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap02.html#tag_02_01_01.

> > I.e., it does not say "shall", so it doesn't require a subshell at all, in
> > fact it explicitly does permit not using one as you also say. The

This interpretation is analogous to how conforming C programs must
assume neither that «char» is signed nor that it is unsigned.

> > patch is possibly useful (seems unlikely to me), but to say it is
> > required by POSIX is not true. If someone depends on every command in
> > a pipeline being a subshell, they should fix their code, for example
> > by adding ( ) around it (the command(s) or the whole pipeline).  
> 
> POSIX makes a declarative statement about the behavior of a pipeline.
> It is true that it doesn't explicitly use the word "shall" in this case,
> since such a statement would explicitly prohibit the inclusion of an
> extension at all and make it explicitly non-conforming.
> 

The sentence preceding the one you quoted reads:
.
    Non-standard extensions, when used, may change the behavior of
    utilities, functions, or facilities defined by POSIX.1-2017.

I take this to mean non-standard extensions aren't bound by "shall"s.

As to why the passage Mikael quoted doesn't use the word "shall"… well,
presumably it doesn't use the word "shall" because it doesn't describe
"a feature or behavior that is mandatory"¹.

> What POSIX does say is that one “shall define an environment in which an
> application can be run with the behavior specified by POSIX.1-2017.”
> I'm proposing that "zsh --emulate sh" implement the POSIX behavior for
> that reason.

What Mikael's saying is that zsh's incumbent behaviour is already
POSIX-conforming, but POSIX-conforming implementations have some leeway:
they have a range of possible behaviours to choose from, just like conforming
C compilers can choose what signedness to give to «char».

The passage Mikael quoted specifies that running the last command in
a pipeline in a subshell by default is permitted in certain cases,
outlined by the phrases "as an extension" and "may".

The definition of "may"¹ says it's used to describe "optional" behaviours,
and that conforming applications should tolerate both presence and
absence of that behaviour.

As to that behaviour's being an extension, the sentence you quoted, once
put back in context (link given above), requires conforming implementations
to document how to disable non-standard extensions where they change
POSIX-documented behaviour.  I don't think the passage Mikael quoted is
covered by that: it _is_ an extension, of course, but I question whether
it's a "non-standard extension"².  After all, it is specified in the
standard.

To summarize, I don't see why behaviour specified with the phrases "as
an extension" and "may" should be off by default in a POSIX-conforming
mode.  Would you elaborate on this?

(On the other hand, I'm not sure why they bothered to write the words
"as an extension" there.  They don't seem to change the meaning one way
or the other.)

> I will tell you that as a practical matter, nobody writing code for sh
> expects the last command not to be run in a subshell and consequently
> lots of code is practically broken in this case with zsh as /bin/sh.
> The Git Project is very fastidious about writing portable shell, as is
> Debian, and I can tell you from experience that both are broken with zsh
> as sh with the current behavior, even if they should not have made that
> assumption.
> 
> I would also like to see macOS users who decide to use zsh as /bin/sh
> have a good experience with existing code that overwhelmingly does
> make this assumption.

Well, perhaps there is something we can do to make their lives easier.

Continuing the analogy to C, gcc(1) has -fsigned-char/-funsigned-char
flags to help unportable programs.  However, I hesitate to propose
adding an option just for this: adding options is always easy to
suggest, but not always a good idea.

Since zsh already incorporates a parser for sh scripts, perhaps we could
write a tool that automatically adds parentheses to the last element in
every pipeline.  That's not such a crazy idea: it already exists (in a
much more general form) for C: http://coccinelle.lip6.fr/

[Between these two, an extra option is definitely the lower-hanging
fruit, of course.]

> zsh is a very popular interactive shell, and allowing it to be used as a
> portable sh on systems where the system sh is less capable would be
> really beneficial.

How would it be beneficial?

> If your objection is to the wording, I'm happy to revise it to remove
> the word "requires", but I do think this provides a lot of benefits for
> the sh scripting case while not impacting users who are expecting
> different behavior for the zsh case.

The patch would constitute a backwards-incompatible change to anyone who
uses zsh as sh today and relies on the current behaviour of pipelines.

This might have been acceptable if it were a question of changing
a non-conforming behaviour to a conforming behaviour.  However, the
current behaviour does appear to be conforming.

Furthermore, if the patch is accepted, those who rely on the incumbent
behaviour won't have an easy workaround to get that behaviour back,
something comparable to the "add parentheses around every element of
every pipeline" strategy that can be used given the incumbent
implementation to get the patch's semantics.

Cheers,

Daniel

¹ https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap01.html

² If POSIX didn't want to make this distinction, it would have written
just "extension", unqualified.


* Re: [PATCH v2] exec: run final pipeline command in a subshell in sh modeZZ
  2020-06-06  4:33       ` [PATCH v2] exec: run final pipeline command in a subshell in sh modeZZ Daniel Shahaf
@ 2020-06-06 16:28         ` brian m. carlson
  2020-06-07 11:29           ` Daniel Shahaf
  0 siblings, 1 reply; 14+ messages in thread
From: brian m. carlson @ 2020-06-06 16:28 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Mikael Magnusson, zsh-workers

On 2020-06-06 at 04:33:50, Daniel Shahaf wrote:
> brian m. carlson wrote on Fri, 05 Jun 2020 20:41 +0000:
> > On 2020-06-05 at 10:21:41, Mikael Magnusson wrote:
> > > On 6/5/20, brian m. carlson <sandals@crustytoothpaste.net> wrote:
> > > > zsh typically runs the final command in a pipeline in the main shell
> > > > instead of a subshell.  However, POSIX requires that all commands in a
> > > > pipeline run in a subshell, but permits zsh's behavior as an extension.
> > > 
> > > What POSIX actually says is:
> > > "each command of a multi-command pipeline is in a subshell
> > > environment; as an extension, however, any or all commands in a
> > > pipeline may be executed in the current environment"
> 
> That's quoted from https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_12.
> 
> The part Brian quotes below is from https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap02.html#tag_02_01_01.
> 
> > > I.e., it does not say "shall", so it doesn't require a subshell at all, in
> > > fact it explicitly does permit not using one as you also say. The
> 
> This interpretation is analogous to how conforming C programs must
> assume neither that «char» is signed nor that it is unsigned.

Right.  That term in C is "implementation defined."  POSIX has that term
as well, and it is not used here.  That term means that the
implementation may pick a behavior, but must document its choice.

> The sentence preceding the one you quoted reads:
> .
>     Non-standard extensions, when used, may change the behavior of
>     utilities, functions, or facilities defined by POSIX.1-2017.
> 
> I take this to mean non-standard extensions aren't bound by "shall"s.
> 
> As to why the passage Mikael quoted doesn't use the word "shall"… well,
> presumably it doesn't use the word "shall" because it doesn't describe
> "a feature or behavior that is mandatory"¹.

Sure, but if the standard didn't want that behavior to be specified
somehow, then it wouldn't have mentioned it.  Why wouldn't POSIX have
just omitted that statement and said nothing about it?

POSIX also says[0] that "[w]hen data is transmitted over the network, it
is sent as a sequence of octets (8-bit unsigned values)" and "16 and
32-bit values can be converted using the htonl(), htons(), ntohl(), and
ntohs() functions."  I don't think we can argue that POSIX permits one
to use 8-bit signed values or 9-bit values or that the implementation
can fail to make those functions work this way just because they didn't
use "shall".  The word "shall" is omitted (and "is" used) all over the
shell definitions to describe syntax forms, and one isn't permitted to
substitute some other syntax form in place of the standard one.

> > What POSIX does say is that one “shall define an environment in which an
> > application can be run with the behavior specified by POSIX.1-2017.”
> > I'm proposing that "zsh --emulate sh" implement the POSIX behavior for
> > that reason.
> 
> What Mikael's saying is that zsh's incumbent behaviour is already
> POSIX-conforming, but POSIX-conforming implementations have some leeway:
> they have a range of possible behaviours to choose from, just like conforming
> C compilers can choose what signedness to give to «char».

I don't agree.  That behavior is implementation defined, and that has a
specific meaning.  Certainly implementations can implement additional
extensions, provided they don't conflict with the behavior specified in
POSIX.

> The passage Mikael quoted specifies that running the last command in
> a pipeline in a subshell by default is permitted in certain cases,
> outlined by the phrases "as an extension" and "may".
> 
> The definition of "may"¹ says it's used to describe "optional" behaviours,
> and that conforming applications should tolerate both presence and
> absence of that behaviour.

It says that an "application should not rely on the existence of the
feature or behavior."  It doesn't say that we can't rely on the absence
of that feature in a conforming environment.

> To summarize, I don't see why behaviour specified with the phrases "as
> an extension" and "may" should be off by default in a POSIX-conforming
> mode.  Would you elaborate on this?

Because the behavior materially differs between the behavior specified
declaratively (albeit without "shall") and the extension.  If we were
talking about situations where the behavior was a choice between
producing an error (that is, just failing) and producing a useful
output, then clearly nobody would care: just don't rely on the program
failing if you give it the syntax specified in an extension.

For example, the shell is permitted to recognize additional arithmetic
expressions as an extension.  It would be permissible for the shell to
understand the legacy C-style expressions like =* (instead of *=), but
when in POSIX mode, the following would need to print -4:

  sh -c 'x=2; : $((x =- 4)); echo $x'

For behaviors where there is no conflict, such as =*, then we could
always print 8 here, even in a POSIX mode:

  sh -c 'x=2; : $((x =* 4)); echo $x'

> (On the other hand, I'm not sure why they bothered to write the words
> "as an extension" there.  They don't seem to change the meaning one way
> or the other.)

In general, we have to assume standards authors (and legislators) wrote
the text for a reason and not to be wasteful with words.  Therefore, we
should assume there is a relevant difference in meaning.

> Well, perhaps there is something we can do to make their lives easier.
> 
> Continuing the analogy to C, gcc(1) has -fsigned-char/-funsigned-char
> flags to help unportable programs.  However, I hesitate to propose
> adding an option just for this: adding options is always easy to
> suggest, but not always a good idea.
> 
> Since zsh already incorporates a parser for sh scripts, perhaps we could
> write a tool that automatically adds parentheses to the last element in
> every pipeline.  That's not such a crazy idea: it already exists (in a
> much more general form) for C: http://coccinelle.lip6.fr/

I think if your goal is for people to change their code to work around
this when zsh is sh, they will simply not do so, even if that's an
option, because it doesn't work by default.  In Git alone, there are
over 240,000 lines of shell between code and tests.  Debian must contain
tens of millions more.  It's just not going to be achievable to get all
of those lines changed to work this way.

If I were to add an option that were off by default for sh and on for
zsh, then that would meet my needs, and I'd be happy to implement that.
You seem to be unexcited about that possibility, though.

> > zsh is a very popular interactive shell, and allowing it to be used as a
> > portable sh on systems where the system sh is less capable would be
> > really beneficial.
> 
> How would it be beneficial?

It's already present on a lot of those systems and it avoids the need to
build one shell for interactive use and another for portable scripting.
zsh is also appealing as a portable sh because it has a pleasant
interactive mode, whereas many sh implementations (e.g., dash) do not.

> > If your objection is to the wording, I'm happy to revise it to remove
> > the word "requires", but I do think this provides a lot of benefits for
> > the sh scripting case while not impacting users who are expecting
> > different behavior for the zsh case.
> 
> The patch would constitute a backwards-incompatible change to anyone who
> uses zsh as sh today and relies on the current behaviour of pipelines.

The thing is, I don't believe anyone does, except for the possibility of
macOS[1].  I have tried zsh as sh on Debian and many things are broken
(including debconf).  I'm not aware of any other supported operating
systems[2] where using zsh as /bin/sh is offered as an option.

I should also point out that when people write "emulate sh", they
probably very much want to emulate the behavior of /bin/sh on their
system.  I'm not aware of any supported system in existence where the
default /bin/sh (or the default POSIX sh, when /bin/sh is not
POSIX-compatible) has the zsh behavior; they all run all pipeline stages
in a subshell.
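
For anyone who wants to check a particular /bin/sh, a small probe along these
lines might help.  This is only a sketch: the function name last_stage_env
and the variable probe are made up for illustration.

```shell
#!/bin/sh
# Hypothetical probe: reports where the last stage of a pipeline ran.
last_stage_env() {
    probe=parent
    # If "read" runs in the current shell, it overwrites probe with "child";
    # if it runs in a subshell, the parent's value is left untouched.
    echo child | read -r probe
    if [ "$probe" = child ]; then
        echo "current shell"
    else
        echo "subshell"
    fi
}

last_stage_env
```

Running it with the shell in question (for example by passing the script file
to that shell) reports "subshell" when every stage is forked and "current
shell" when the last stage runs in the main shell, as native zsh does.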

I want to be clear that I don't want to change the behavior of the zsh
mode, where I agree a change would be undesirable and people are almost
certainly relying on the current behavior.

> This might have been acceptable if it were a question of changing
> a non-conforming behaviour to a conforming behaviour.  However, the
> current behaviour does appear to be conforming.

I'm not in agreement that a shell which provides only zsh's behavior is
conforming in this case.

[0] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html
[1] And macOS users are not relying on this behavior from zsh as sh
    because bash and dash are also valid sh options.
[2] That is, operating systems in versions which still receive security
    support from their vendor.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


* Re: [PATCH v2] exec: run final pipeline command in a subshell in sh modeZZ
  2020-06-06 16:28         ` brian m. carlson
@ 2020-06-07 11:29           ` Daniel Shahaf
  0 siblings, 0 replies; 14+ messages in thread
From: Daniel Shahaf @ 2020-06-07 11:29 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Mikael Magnusson, zsh-workers

brian m. carlson wrote on Sat, 06 Jun 2020 16:28 +0000:
> On 2020-06-06 at 04:33:50, Daniel Shahaf wrote:
> > brian m. carlson wrote on Fri, 05 Jun 2020 20:41 +0000:  
> > > On 2020-06-05 at 10:21:41, Mikael Magnusson wrote:  
> > > > On 6/5/20, brian m. carlson <sandals@crustytoothpaste.net> wrote:  
> > > > > zsh typically runs the final command in a pipeline in the main shell
> > > > > instead of a subshell.  However, POSIX requires that all commands in a
> > > > > pipeline run in a subshell, but permits zsh's behavior as an extension.  
> > > > 
> > > > What POSIX actually says is:
> > > > "each command of a multi-command pipeline is in a subshell
> > > > environment; as an extension, however, any or all commands in a
> > > > pipeline may be executed in the current environment"  
> > 
> > That's quoted from https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_12.
> > 
> > The part Brian quotes below is from https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap02.html#tag_02_01_01.
> >   
> > > > I.e., it does not say "shall", so it doesn't require a subshell at all, in
> > > > fact it explicitly does permit not using one as you also say. The  
> > 
> > This interpretation is analogous to how conforming C programs must
> > assume neither that «char» is signed nor that it is unsigned.  
> 
> Right.  That term in C is "implementation defined."  POSIX has that term
> as well, and it is not used here.  That term means that the
> implementation may pick a behavior, but must document its choice.
> 
> > The sentence preceding the one you quoted reads:
> > .
> >     Non-standard extensions, when used, may change the behavior of
> >     utilities, functions, or facilities defined by POSIX.1-2017.
> > 
> > I take this to mean non-standard extensions aren't bound by "shall"s.
> > 
> > As to why the passage Mikael quoted doesn't use the word "shall"… well,
> > presumably it doesn't use the word "shall" because it doesn't describe
> > "a feature or behavior that is mandatory"¹.  
> 
> Sure, but if the standard didn't want that behavior to be specified
> somehow, then it wouldn't have mentioned it.  Why wouldn't POSIX have
> just omitted that statement and said nothing about it?

Perhaps because POSIX tries to first describe how an abstract or common
implementation behaves, and then proceeds to describe a set of
alternative behaviours known to be used by some implementations.

For example, IIRC C doesn't specify the signedness of «char» because, at
the time C was standardized, some platforms used signed chars and others
used unsigned chars, and it was desired to make both kinds of platforms
conformant.

> POSIX also says[0] that "[w]hen data is transmitted over the network, it
> is sent as a sequence of octets (8-bit unsigned values)" and "16 and
> 32-bit values can be converted using the htonl(), htons(), ntohl(), and
> ntohs() functions."  I don't think we can argue that POSIX permits one
> to use 8-bit signed values or 9-bit values or that the implementation
> can fail to make those functions work this way just because they didn't
> use "shall".  The word "shall" is omitted (and "is" used) all over the
> shell definitions to describe syntax forms, and one isn't permitted to
> substitute some other syntax form in place of the standard one.

So you're saying that wherever POSIX says "is" it is to be read as
"shall", if I understand correctly?  That's a fair argument, but I'm not
sure whether I agree.

> > > What POSIX does say is that one “shall define an environment in which an
> > > application can be run with the behavior specified by POSIX.1-2017.”
> > > I'm proposing that "zsh --emulate sh" implement the POSIX behavior for
> > > that reason.  
> > 
> > What Mikael's saying is that zsh's incumbent behaviour is already
> > POSIX-conforming, but POSIX-conforming implementations have some leeway:
> > they have a range of possible behaviours to choose from, just like conforming
> > C compilers can choose what signedness to give to «char».  
> 
> I don't agree.  That behavior is implementation defined, and that has a
> specific meaning.  Certainly implementations can implement additional
> extensions, provided they don't conflict with the behavior specified in
> POSIX.
> 

Could you please clarify what exactly is implementation-defined here,
according to your reading?  What decision in this are implementors
supposed to make for themselves and document for their users?

In any case, our readings of the standards differ.  How can we figure
out what the correct interpretation is?  Is there background information
on Austin Group's bug tracker or mailing lists, for example?  Or can we
just ask them?

> > The passage Mikael quoted specifies that running the last command in
> > a pipeline in a subshell by default is permitted in certain cases,
> > outlined by the phrases "as an extension" and "may".
> > 
> > The definition of "may"¹ says it's used to describe "optional" behaviours,
> > and that conforming applications should tolerate both presence and
> > absence of that behaviour.  
> 
> It says that an "application should not rely on the existence of the
> feature or behavior."  It doesn't say that we can't rely on the absence
> of that feature in a conforming environment.

If "may" describes a feature on whose _absence_ conforming applications
may rely, then what's the difference between "may" and "shall not"?  And
between their respective opposites, "need not" and "shall"?

For example, consider this bit from [2.3.1]: "Implementations also may
provide predefined valid aliases that are in effect when the shell is
invoked."  If conforming applications can rely on the absence of
predefined aliases, that would imply that conforming implementations
must not predefine aliases.

[2.3.1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03_01

> > To summarize, I don't see why behaviour specified with the phrases "as
> > an extension" and "may" should be off by default in a POSIX-conforming
> > mode.  Would you elaborate on this?  
> 
> Because the behavior materially differs between the behavior specified
> declaratively (albeit without "shall") and the extension.  If we were
> talking about situations where the behavior was a choice between
> producing an error (that is, just failing) and producing a useful
> output, then clearly nobody would care: just don't rely on the program
> failing if you give it the syntax specified in an extension.
> 
> For example, the shell is permitted to recognize additional arithmetic
> expressions as an extension.  It would be permissible for the shell to
> understand the legacy C-style expressions like =* (instead of *=), but
> when in POSIX mode, the following would need to print -4:
> 
>   sh -c 'x=2; : $((x =- 4)); echo $x'
> 
> For behaviors where there is no conflict, such as =*, then we could
> always print 8 here, even in a POSIX mode:
> 
>   sh -c 'x=2; : $((x =* 4)); echo $x'

Thanks.  I understand your argument; not sure yet whether I agree with it.

> > (On the other hand, I'm not sure why they bothered to write the words
> > "as an extension" there.  They don't seem to change the meaning one way
> > or the other.)  
> 
> In general, we have to assume standards authors (and legislators) wrote
> the text for a reason and not to be wasteful with words.  Therefore, we
> should assume there is a relevant difference in meaning.

Agreed.

That's exactly why I questioned whether the behaviour in question was
a "non-standard extension": I was trying to interpret the term
'non-standard' within the phrase 'non-standard extension' as
non-superfluous.

> > Well, perhaps there is something we can do to make their lives easier.
> > 
> > Continuing the analogy to C, gcc(1) has -fsigned-char/-funsigned-char
> > flags to help unportable programs.  However, I hesitate to propose
> > adding an option just for this: adding options is always easy to
> > suggest, but not always a good idea.
> > 
> > Since zsh already incorporates a parser for sh scripts, perhaps we could
> > write a tool that automatically adds parentheses to the last element in
> > every pipeline.  That's not such a crazy idea: it already exists (in a
> > much more general form) for C: http://coccinelle.lip6.fr/  
> 
> I think if your goal is for people to change their code to work around
> this when zsh is sh, they will simply not do so, even if that's an
> option, because it doesn't work by default.  In Git alone, there are
> over 240,000 lines of shell between code and tests.  Debian must contain
> tens of millions more.  It's just not going to be achievable to get all
> of those lines changed to work this way.
> 
> If I were to add an option that were off by default for sh and on for
> zsh, then that would meet my needs, and I'd be happy to implement that.
> You seem to be unexcited about that possibility, though.

I'm not sure you understood my point of view precisely.

What I was saying [in my previous message, based on my understanding at
the time, not taking into account your latest reply] was:

- The long-term solution is for people to add parentheses around their
  pipeline elements.

- That solution can be implemented mechanically.

- As a stopgap measure, we can consider enabling the patch's behaviour
  in sh mode _as an opt-in_.
  
  Notwithstanding the opt-in aspect, I'm sure we can figure out a way to
  arrange things so random third party code that runs /bin/sh will be
  served by zsh in sh emulation mode with the patch's behaviour already
  on, if that's what the sysadmin or third-party maintainer wants.
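
A minimal sketch of the "add parentheses" rewrite from the first point,
assuming a loop as the last pipeline element; the variable names are
illustrative only.  With the explicit subshell the result no longer
depends on where the shell runs the final command:

```shell
count=0

# Explicit subshell around the last pipeline element: the loop's
# assignments stay inside the parentheses in every shell.
printf 'a\nb\n' | (while read -r line; do count=$((count + 1)); done)
echo "count in caller: $count"

# If the value is needed, use it inside the same subshell.
printf 'a\nb\n' | (n=0; while read -r line; do n=$((n + 1)); done; echo "lines: $n")
```

Without the parentheses, the first pipeline would leave count at 0 in
shells that fork the last element and at 2 in shells that run it in the
main shell.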

> > > zsh is a very popular interactive shell, and allowing it to be used as a
> > > portable sh on systems where the system sh is less capable would be
> > > really beneficial.  
> > 
> > How would it be beneficial?  
> 
> It's already present on a lot of those systems and it avoids the need to
> build one shell for interactive use and another for portable scripting.
> zsh is also appealing as a portable sh because it has a pleasant
> interactive mode, whereas many sh implementations (e.g., dash) do not.
> 

Thanks.

> > > If your objection is to the wording, I'm happy to revise it to remove
> > > the word "requires", but I do think this provides a lot of benefits for
> > > the sh scripting case while not impacting users who are expecting
> > > different behavior for the zsh case.  
> > 
> > The patch would constitute a backwards-incompatible change to anyone who
> > uses zsh as sh today and relies on the current behaviour of pipelines.  
> 
> The thing is, I don't believe anyone does, except for the possibility of
> macOS[1].

https://en.wikipedia.org/wiki/No_true_Scotsman

> I have tried zsh as sh on Debian and many things are broken
> (including debconf).  I'm not aware of any other supported operating
> systems[2] where a user using zsh as /bin/sh is permitted as an option.

And I'm not aware of any regulars on this list who have symlinked
/bin/sh to zsh independently of their OS vendor's configuration options.

> I should also point out that when people write "emulate sh" that they
> probably very much want to emulate the behavior of /bin/sh on their
> system.

Personally, when I write «emulate sh» I would expect to get, not what
bash does as sh or what dash does as sh, but what POSIX specifies sh
should do.

> I'm not aware of any supported system in existence where the
> default /bin/sh (or the default POSIX sh, when /bin/sh is not
> POSIX-compatible) has the zsh behavior; they all run all pipeline stages
> in a subshell.
> 
> I want to be clear that I don't want to change the behavior of the zsh
> mode, where I agree a change would be undesirable and people are almost
> certainly relying on the current behavior.

Thanks for clarifying this.

> > This might have been acceptable if it were a question of changing
> > a non-conforming behaviour to a conforming behaviour.  However, the
> > current behaviour does appear to be conforming.  
> 
> I'm not in agreement that a shell which provides only zsh's behavior is
> conforming in this case.

Okay, so see above re how to resolve our differing interpretations.

Cheers,

Daniel

> [0] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html
> [1] And macOS users are not relying on this behavior from zsh as sh
>     because bash and dash are also valid sh options.
> [2] That is, operating systems in versions which still receive security
>     support from their vendor.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] exec: run final pipeline command in a subshell in sh mode
  2020-06-05 20:41     ` brian m. carlson
  2020-06-06  4:33       ` [PATCH v2] exec: run final pipeline command in a subshell in sh modeZZ Daniel Shahaf
@ 2020-06-07 16:55       ` Bart Schaefer
  2020-06-07 17:24         ` Peter Stephenson
  2020-06-11  0:24         ` brian m. carlson
  1 sibling, 2 replies; 14+ messages in thread
From: Bart Schaefer @ 2020-06-07 16:55 UTC (permalink / raw)
  To: brian m. carlson; +Cc: zsh-workers

On Fri, Jun 5, 2020 at 1:42 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> I will tell you that as a practical matter, nobody writing code for sh
> expects the last command not to be run in a subshell and consequently
> lots of code is practically broken in this case with zsh as /bin/sh.

I believe you, but would be curious to see an example.

For what it's worth, I'm not opposed to this patch.  I think it's
pretty unlikely that anyone is invoking zsh as sh and still expecting
to be able to (for example) pipe into read to set variables in the
current shell.
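
A short sketch of the "pipe into read" idiom referred to above, assuming
nothing beyond standard sh builtins; the variable names are made up for
illustration.  The first form behaves the same everywhere, the second is
the one whose result depends on the shell:

```shell
# Portable: read and use the value inside one compound command, so it
# does not matter whether that command runs in a subshell.
result=$(printf 'hello\n' | { read -r word; printf '%s\n' "$word"; })
echo "captured: $result"

# Shell-dependent: whether the caller sees the assignment depends on
# whether the last pipeline command runs in the main shell.
word=initial
printf 'hello\n' | read -r word
echo "after pipe: $word"
```

The last line prints "initial" where the final command runs in a
subshell, and "hello" where it runs in the main shell, as native zsh
does.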

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] exec: run final pipeline command in a subshell in sh mode
  2020-06-07 16:55       ` [PATCH v2] exec: run final pipeline command in a subshell in sh mode Bart Schaefer
@ 2020-06-07 17:24         ` Peter Stephenson
  2020-06-09  7:57           ` Daniel Shahaf
  2020-06-11  0:24         ` brian m. carlson
  1 sibling, 1 reply; 14+ messages in thread
From: Peter Stephenson @ 2020-06-07 17:24 UTC (permalink / raw)
  To: zsh-workers

On Sun, 2020-06-07 at 09:55 -0700, Bart Schaefer wrote:
> On Fri, Jun 5, 2020 at 1:42 PM brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
> > 
> > I will tell you that as a practical matter, nobody writing code for sh
> > expects the last command not to be run in a subshell and consequently
> > lots of code is practically broken in this case with zsh as /bin/sh.
> 
> I believe you, but would be curious to see an example.
> 
> For what it's worth, I'm not opposed to this patch.  I think it's
> pretty unlikely that anyone is invoking zsh as sh and still expecting
> to be able to (for example) pipe into read to set variables in the
> current shell.

Yes, since I'm still here, that's my position too.

Our general position on consistency is that we'll try our best to
keep native mode compatible, while with sh compatibility we'll
try to be like other shells and not worry so much about what zsh
used to do.  But it's a little bit of an odd case here for all
the reasons I won't rehash.

pws



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] exec: run final pipeline command in a subshell in sh mode
  2020-06-07 17:24         ` Peter Stephenson
@ 2020-06-09  7:57           ` Daniel Shahaf
  2020-06-09 10:54             ` Mikael Magnusson
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Shahaf @ 2020-06-09  7:57 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: zsh-workers

Peter Stephenson wrote on Sun, 07 Jun 2020 18:24 +0100:
> On Sun, 2020-06-07 at 09:55 -0700, Bart Schaefer wrote:
> > On Fri, Jun 5, 2020 at 1:42 PM brian m. carlson
> > <sandals@crustytoothpaste.net> wrote:  
> > > 
> > > I will tell you that as a practical matter, nobody writing code for sh
> > > expects the last command not to be run in a subshell and consequently
> > > lots of code is practically broken in this case with zsh as /bin/sh.  
> > 
> > I believe you, but would be curious to see an example.
> > 
> > For what it's worth, I'm not opposed to this patch.  I think it's
> > pretty unlikely that anyone is invoking zsh as sh and still expecting
> > to be able to (for example) pipe into read to set variables in the
> > current shell.  
> 
> Yes, since I'm still here, that's my position too.
> 
> Our general position on consistency is that we'll try our best to
> keep native mode compatible, while with sh compatibility we'll
> try to be like other shells and not worry so much about what zsh
> used to do.

This being the case, I'm happy to defer to consensus.  I won't object
to an entry in the list of incompatibilities in README, though.

I can't speak for Mikael, of course.

> But it's a little bit of an odd case here for all the reasons I won't
> rehash.

Thanks for the additional context,

Daniel

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] exec: run final pipeline command in a subshell in sh mode
  2020-06-09  7:57           ` Daniel Shahaf
@ 2020-06-09 10:54             ` Mikael Magnusson
  2020-06-17 18:26               ` Daniel Shahaf
  0 siblings, 1 reply; 14+ messages in thread
From: Mikael Magnusson @ 2020-06-09 10:54 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: Peter Stephenson, zsh-workers

On 6/9/20, Daniel Shahaf <d.s@daniel.shahaf.name> wrote:
> Peter Stephenson wrote on Sun, 07 Jun 2020 18:24 +0100:
>> On Sun, 2020-06-07 at 09:55 -0700, Bart Schaefer wrote:
>> > On Fri, Jun 5, 2020 at 1:42 PM brian m. carlson
>> > <sandals@crustytoothpaste.net> wrote:
>> > >
>> > > I will tell you that as a practical matter, nobody writing code for
>> > > sh
>> > > expects the last command not to be run in a subshell and consequently
>> > > lots of code is practically broken in this case with zsh as /bin/sh.
>> > >
>> >
>> > I believe you, but would be curious to see an example.
>> >
>> > For what it's worth, I'm not opposed to this patch.  I think it's
>> > pretty unlikely that anyone is invoking zsh as sh and still expecting
>> > to be able to (for example) pipe into read to set variables in the
>> > current shell.
>>
>> Yes, since I'm still here, that's my position too.
>>
>> Our general position on consistency is that we'll try our best to
>> keep native mode compatible, while with sh compatibility we'll
>> try to be like other shells and not worry so much about what zsh
>> used to do.
>
> This being the case, I'm happy to defer to consensus.  I won't object
> to an entry in the list of incompatibilities in README, though.
>
> I can't speak for Mikael, of course.

I was only objecting to the commit message in the first place :). I
think the arguments for including it are not very convincing but I
don't really have any at all for not including it.

Surely it must be very rare to run something like
foo | shellfunction
and *depend* on shellfunction not setting any global parameters? Why
would you make shellfunction set global parameters in the first place
if you depend on them not being set? And then only when being piped
to? It seems so strange to me. Anyway, those are just arguments
against the sanity of shellscript writers, not against the inclusion
of the patch.

-- 
Mikael Magnusson

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] exec: run final pipeline command in a subshell in sh mode
  2020-06-07 16:55       ` [PATCH v2] exec: run final pipeline command in a subshell in sh mode Bart Schaefer
  2020-06-07 17:24         ` Peter Stephenson
@ 2020-06-11  0:24         ` brian m. carlson
  1 sibling, 0 replies; 14+ messages in thread
From: brian m. carlson @ 2020-06-11  0:24 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

On 2020-06-07 at 16:55:54, Bart Schaefer wrote:
> On Fri, Jun 5, 2020 at 1:42 PM brian m. carlson
> <sandals@crustytoothpaste.net> wrote:
> >
> > I will tell you that as a practical matter, nobody writing code for sh
> > expects the last command not to be run in a subshell and consequently
> > lots of code is practically broken in this case with zsh as /bin/sh.
> 
> I believe you, but would be curious to see an example.

I'll demonstrate an example from the Git testsuite (t1300), which is
necessarily incomplete, and whose behavior baffles me a bit.

	echo "[broken" | test_must_fail git config --list --file - >output 2>&1 &&
	test_i18ngrep "bad config line 1 in standard input" output

test_must_fail checks for a nonzero, non-SIGSEGV exit code, and
test_i18ngrep is a glorified grep operation which always succeeds when
LC_ALL=C doesn't work.  There is also a lot of FD redirection going on
under the hood.

This operation fails, interestingly enough, because grep complains that
the output is also the input; that is, the redirection of stdout on the
previous line applies to the grep as well.  I'm unable to reproduce this
with a simpler example, but putting it in a subshell does work.  AT&T
ksh does not have this behavior, and so this may be a legitimate bug in
zsh which my patch happens to fix.

The other case (t0410) in the Git testsuite is more straightforward; we
have this function:

pack_as_from_promisor () {
	HASH=$(git -C repo pack-objects .git/objects/pack/pack) &&
	>repo/.git/objects/pack/pack-$HASH.promisor &&
	echo $HASH
}

and that function is then called on the right end of a pipe.  The caller
was not expecting that HASH would be overwritten, and since until
recently the Git testsuite did not allow "local" and this test has not
been updated, the test gets the wrong value.
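
To make the second case concrete, here is a reduced sketch; set_hash is
a hypothetical stand-in for pack_as_from_promisor that reads its value
from stdin instead of calling git:

```shell
# Hypothetical stand-in: assigns a global variable, then prints it.
set_hash () {
    HASH=$(cat) &&
    echo "$HASH"
}

HASH=caller-value

# With an explicit subshell the assignment cannot leak into the caller,
# however the shell runs the last pipeline command.
echo from-pipe | (set_hash) >/dev/null
echo "caller sees: $HASH"
```

Calling "echo from-pipe | set_hash" without the parentheses leaves HASH
untouched only in shells that run the last command in a subshell; where
it runs in the main shell, the caller's HASH becomes "from-pipe".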
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] exec: run final pipeline command in a subshell in sh mode
  2020-06-09 10:54             ` Mikael Magnusson
@ 2020-06-17 18:26               ` Daniel Shahaf
  2020-07-03 20:16                 ` brian m. carlson
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Shahaf @ 2020-06-17 18:26 UTC (permalink / raw)
  To: brian m. carlson; +Cc: zsh-workers

Brian, ping?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v2] exec: run final pipeline command in a subshell in sh mode
  2020-06-17 18:26               ` Daniel Shahaf
@ 2020-07-03 20:16                 ` brian m. carlson
  0 siblings, 0 replies; 14+ messages in thread
From: brian m. carlson @ 2020-07-03 20:16 UTC (permalink / raw)
  To: Daniel Shahaf; +Cc: zsh-workers

On 2020-06-17 at 18:26:36, Daniel Shahaf wrote:
> Brian, ping?

Sorry, this mail got lost in one of my mailboxes.

You mentioned in the PR that you're looking for a NEWS/CHANGES update.
Is that all that's needed here, or were folks looking for other changes?
I can write one up if that's all.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-07-03 20:17 UTC | newest]

Thread overview: 14+ messages
2020-06-05  1:53 [PATCH v2 0/1] Run pipeline command in subshell in sh mode brian m. carlson
2020-06-05  1:53 ` [PATCH v2] exec: run final pipeline command in a " brian m. carlson
2020-06-05 10:21   ` Mikael Magnusson
2020-06-05 20:41     ` brian m. carlson
2020-06-06  4:33       ` [PATCH v2] exec: run final pipeline command in a subshell in sh modeZZ Daniel Shahaf
2020-06-06 16:28         ` brian m. carlson
2020-06-07 11:29           ` Daniel Shahaf
2020-06-07 16:55       ` [PATCH v2] exec: run final pipeline command in a subshell in sh mode Bart Schaefer
2020-06-07 17:24         ` Peter Stephenson
2020-06-09  7:57           ` Daniel Shahaf
2020-06-09 10:54             ` Mikael Magnusson
2020-06-17 18:26               ` Daniel Shahaf
2020-07-03 20:16                 ` brian m. carlson
2020-06-11  0:24         ` brian m. carlson
