List-Id: Zsh Workers List
X-Seq: 46013
Date: Sun, 7 Jun 2020 11:29:04 +0000
From: Daniel Shahaf
To: "brian m. carlson"
Cc: Mikael Magnusson, zsh-workers@zsh.org
Subject: Re: [PATCH v2] exec: run final pipeline command in a subshell in sh mode
Message-ID: <20200607112904.48307c0a@tarpaulin.shahaf.local2>
In-Reply-To: <20200606162843.GI6569@camp.crustytoothpaste.net>

brian m. carlson wrote on Sat, 06 Jun 2020 16:28 +0000:
> On 2020-06-06 at 04:33:50, Daniel Shahaf wrote:
> > brian m. carlson wrote on Fri, 05 Jun 2020 20:41 +0000:
> > > On 2020-06-05 at 10:21:41, Mikael Magnusson wrote:
> > > > On 6/5/20, brian m. carlson wrote:
> > > > > zsh typically runs the final command in a pipeline in the main shell
> > > > > instead of a subshell. However, POSIX requires that all commands in a
> > > > > pipeline run in a subshell, but permits zsh's behavior as an extension.
> > > >
> > > > What POSIX actually says is:
> > > > "each command of a multi-command pipeline is in a subshell
> > > > environment; as an extension, however, any or all commands in a
> > > > pipeline may be executed in the current environment"
> >
> > That's quoted from https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_12.
> >
> > The part Brian quotes below is from https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap02.html#tag_02_01_01.
> >
> > > > Ie, it does not say "shall", so it doesn't require a subshell all, in
> > > > fact it explicitly does permit not using one as you also say. The
> >
> > This interpretation is analogous to how conforming C programs must
> > assume neither that «char» is signed nor that it is unsigned.
>
> Right. That term in C is "implementation defined." POSIX has that term
> as well, and it is not used here. That term means that the
> implementation may pick a behavior, but must document its choice.
>
> > The sentence preceding the one you quoted reads:
> > .
> >     Non-standard extensions, when used, may change the behavior of
> >     utilities, functions, or facilities defined by POSIX.1-2017.
> >
> > I take this to mean non-standard extensions aren't bound by "shall"s.
> >
> > As to why the passage Mikael quoted doesn't use the word "shall"… well,
> > presumably it doesn't use the word "shall" because it doesn't describe
> > "a feature or behavior that is mandatory"¹.
>
> Sure, but if the standard didn't want that behavior to be specified
> somehow, then it wouldn't have mentioned it. Why wouldn't POSIX have
> just omitted that statement and said nothing about it?

Perhaps because POSIX tries to first describe how an abstract or common
implementation behaves, and then proceeds to describe a set of
alternative behaviours known to be used by some implementations.

For example, IIRC C doesn't specify the signedness of «char» because, at
the time C was standardized, some platforms used signed chars and others
used unsigned chars, and it was desired to make both kinds of platforms
conformant.

> POSIX also says[0] that "[w]hen data is transmitted over the network, it
> is sent as a sequence of octets (8-bit unsigned values)" and "16 and
> 32-bit values can be converted using the htonl(), htons(), ntohl(), and
> ntohs() functions." I don't think we can argue that POSIX permits one
> to use 8-bit signed values or 9-bit values or that the implementation
> can fail to make those functions work this way just because they didn't
> use "shall". The word "shall" is omitted (and "is" used) all over the
> shell definitions to describe syntax forms, and one isn't permitted to
> substitute some other syntax form in place of the standard one.

So you're saying that wherever POSIX says "is" it is to be read as
"shall", if I understand correctly? That's a fair argument, but I'm not
sure whether I agree.

> > > What POSIX does say is that one “shall define an environment in which an
> > > application can be run with the behavior specified by POSIX.1-2017.”
> > > I'm proposing that "zsh --emulate sh" implement the POSIX behavior for
> > > that reason.
> >
> > What Mikael's saying is that zsh's incumbent behaviour is already
> > POSIX-conforming, but POSIX-conforming implementations have some leeway:
> > have a range of possible behaviours to choose from, just like conforming
> > C compilers can choose what signedness to give to «char».
>
> I don't agree. That behavior is implementation defined, and that has a
> specific meaning. Certainly implementations can implement additional
> extensions, provided they don't conflict with the behavior specified in
> POSIX.
>

Could you please clarify what exactly is implementation-defined here,
according to your reading? What decision in this are implementors
supposed to make for themselves and document for their users?

In any case, our readings of the standards differ. How can we figure
out what the correct interpretation is? Is there background information
on Austin Group's bug tracker or mailing lists, for example? Or can we
just ask them?

> > The passage Mikael quoted specifies that running the last command in
> > a pipeline in a subshell by default is permitted in certain cases,
> > outlined by the phrases "as an extension" and "may".
> >
> > The definition of "may"¹ says it's used to describe "optional" behaviours,
> > and that conforming applications should tolerate both presence and
> > absence of that behaviour.
>
> It says that an "application should not rely on the existence of the
> feature or behavior." It doesn't say that we can't rely on the absence
> of that feature in a conforming environment.

If "may" describes a feature on whose _absence_ conforming applications
may rely, then what's the difference between "may" and "shall not"? And
between their respective opposites, "need not" and "shall"?

For example, consider this bit from [2.3.1]: "Implementations also may
provide predefined valid aliases that are in effect when the shell is
invoked."
If conforming applications can rely on the absence of predefined
aliases, that would imply that conforming implementations must not
predefine aliases.

[2.3.1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_03_01

> > To summarize, I don't see why behaviour specified with the phrases "as
> > an extension" and "may" should be off by default in a POSIX-conforming
> > mode. Would you elaborate on this?
>
> Because the behavior materially differs between the behavior specified
> declaratively (albeit without "shall") and the extension. If we were
> talking about situations where the behavior was a choice between
> producing an error (that is, just failing) and producing a useful
> output, then clearly nobody would care: just don't rely on the program
> failing if you give it the syntax specified in an extension.
>
> For example, the shell is permitted to recognize additional arithmetic
> expressions as an extension. It would be permissible for the shell to
> understand the legacy C-style expressions like =* (instead of *=), but
> when in POSIX mode, the following would need to print -4:
>
>   sh -c 'x=2; : $((x =- 4)); echo $x'
>
> For behaviors where there is no conflict, such as =*, then we could
> always print 8 here, even in a POSIX mode:
>
>   sh -c 'x=2; : $((x =* 4)); echo $x'

Thanks. I understand your argument; not sure yet whether I agree with
it.

> > (On the other hand, I'm not sure why they bothered to write the words
> > "as an extension" there. They don't seem to change the meaning one way
> > or the other.)
>
> In general, we have to assume standards authors (and legislators) wrote
> the text for a reason and not to be wasteful with words. Therefore, we
> should assume there is a relevant difference in meaning.

Agreed.
That's exactly why I questioned whether the behaviour in question
was a "non-standard extension": I was trying to interpret the term
'non-standard' within the phrase 'non-standard extension' as
non-superfluous.

> > Well, perhaps there is something we can do to make their lives easier.
> >
> > Continuing the analogy to C, gcc(1) has -fsigned-char/-funsigned-char
> > flags to help unportable programs. However, I hesitate to propose
> > adding an option just for this: adding options is always easy to
> > suggest, but not always a good idea.
> >
> > Since zsh already incorporates a parser for sh scripts, perhaps we could
> > write a tool that automatically adds parentheses to the last element in
> > every pipeline. That's not such a crazy idea: it already exists (in a
> > much more general form) for C: http://coccinelle.lip6.fr/
>
> I think if your goal is for people to change their code to work around
> this when zsh is sh, they will simply not do so, even if that's an
> option, because it doesn't work by default. In Git alone, there are
> over 240,000 lines of shell between code and tests. Debian must contain
> tens of millions more. It's just not going to be achievable to get all
> of those lines changed to work this way.
>
> If I were to add an option that were off by default for sh and on for
> zsh, then that would meet my needs, and I'd be happy to implement that.
> You seem to be unexcited about that possibility, though.

I'm not sure you understood my point of view precisely. What I was
saying [in my previous message, based on my understanding at the time,
not taking into account your latest reply] was:

- The long-term solution is for people to add parentheses around their
  pipeline elements.

- That solution can be implemented mechanically.

- As a stopgap measure, we can consider enabling the patch's behaviour
  in sh mode _as an opt-in_.
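
To make the first bullet concrete, here is a minimal sketch. It is an
illustration only, not taken from the patch under discussion; the
variable name «line» and the sample input are made up. With explicit
parentheses around the last pipeline element, the assignment made by
«read» stays inside the subshell, whichever choice the shell would
otherwise make for that element:

```sh
#!/bin/sh
# Illustrative sketch; "line" and the sample input are hypothetical.
line=unset

# The parentheses force the last pipeline element into a subshell, so the
# value assigned by read is visible inside the parentheses only.
printf 'hello\n' | (read -r line; printf 'inside: %s\n' "$line")

# The parent shell's variable is untouched, whether or not the shell would
# otherwise run the last element in the current environment.
printf 'outside: %s\n' "$line"
```

This should print «inside: hello» followed by «outside: unset» under
any sh-compatible shell. Without the parentheses, zsh in its native
mode would print «outside: hello» on the second line, which is the
difference this thread is about.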

Notwithstanding the opt-in aspect, I'm sure we can figure out a way to
arrange things so random third-party code that runs /bin/sh will be
served by zsh in sh emulation mode with the patch's behaviour already
on, if that's what the sysadmin or third-party maintainer wants.

> > > zsh is a very popular interactive shell, and allowing it to be used as a
> > > portable sh on systems where the system sh is less capable would be
> > > really beneficial.
> >
> > How would it be beneficial?
>
> It's already present on a lot of those systems and it avoids the need to
> build one shell for interactive use and another for portable scripting.
> zsh is also appealing as a portable sh because it has a pleasant
> interactive mode, whereas many sh implementations (e.g., dash) do not.
>

Thanks.

> > > If your objection is to the wording, I'm happy to revise it to remove
> > > the word "requires", but I do think this provides a lot of benefits for
> > > the sh scripting case while not impacting users who are expecting
> > > different behavior for the zsh case.
> >
> > The patch would constitute a backwards-incompatible change to anyone who
> > uses zsh as sh today and relies on the current behaviour of pipelines.
>
> The thing is, I don't believe anyone does, except for the possibility of
> macOS[1].

https://en.wikipedia.org/wiki/No_true_Scotsman

> I have tried zsh as sh on Debian and many things are broken
> (including debconf). I'm not aware of any other supported operating
> systems[2] where a user using zsh as /bin/sh is permitted as an option.

And I'm not aware of any regulars on this list who have symlinked
/bin/sh to zsh independently of their OS vendor's configuration options.

> I should also point out that when people write "emulate sh" that they
> probably very much want to emulate the behavior of /bin/sh on their
> system.

Personally, when I write «emulate sh» I would expect to get, not what
bash does as sh or what dash does as sh, but what POSIX specifies sh
should do.

> I'm not aware of any supported system in existence where the
> default /bin/sh (or the default POSIX sh, when /bin/sh is not
> POSIX-compatible) has the zsh behavior; they all run all pipeline stages
> in a subshell.
>
> I want to be clear that I don't want to change the behavior of the zsh
> mode, where I agree a change would be undesirable and people are almost
> certainly relying on the current behavior.

Thanks for clarifying this.

> > This might have been acceptable if it were a question of changing
> > a non-conforming behaviour to a conforming behaviour. However, the
> > current behaviour does appear to be conforming.
>
> I'm not in agreement that a shell which provides only zsh's behavior is
> conforming in this case.

Okay, so see above re how to resolve our differing interpretations.

Cheers,

Daniel

> [0] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html
> [1] And macOS users are not relying on this behavior from zsh as sh
>     because bash and dash are also valid sh options.
> [2] That is, operating systems in versions which still receive security
>     support from their vendor.