List-Id: Zsh Workers List
X-Seq: 32632
Message-ID: <5388F4C3.6070801@bbn.com>
Date: Fri, 30 May 2014 17:14:43 -0400
From: Richard Hansen
To: Bart Schaefer
CC: zsh-workers@zsh.org
Subject: Re: 'emulate sh -c' and $0
In-Reply-To: <140530100050.ZM18382@torch.brasslantern.com>

On 2014-05-30 13:00, Bart Schaefer wrote:
> Finally the entry for the emulate builtin says:
>
>   With single argument set up zsh options to emulate the specified
>   shell as much as possible.  `csh' will never be fully emulated.
>   If the argument is not one of the shells listed above, zsh will be
>   used as a default; more precisely, the tests performed on the
>   argument are the same as those used to determine the emulation at
>   startup based on the shell name, see Compatibility.

Thanks for the documentation references.  I had read "to emulate the
specified shell as much as possible" without paying enough attention to
the qualifying "set up zsh options".
So you are right:  The documentation says that emulate only toggles
options, and the behavior of $0 with FUNCTION_ARGZERO is clear, so
there's no reason to expect Zsh to reset $0 to the original $0 when in
sh emulation mode.

That being said, I still think there's value in changing Zsh's behavior.

>
>> (I am aware of the documentation for the FUNCTION_ARGZERO option.  I'm
>> more interested in what it really means to be running in sh emulation
>> mode, as that's where I think the bug is.)
>
> In general, emulation is at its most complete if and only if the shell
> is actually started as an emulator (e.g., the path name to the shell
> binary itself is not zsh, or ARGV0 is set in the environment).  The
> "emulate" builtin only changes setopts to the closest possible.

Would it add too much complexity to the code or documentation if the
emulate builtin did more than just toggle options (specifically:
temporarily change the binding of $0 to the original value)?

Perhaps the behavior of FUNCTION_ARGZERO could be altered so that $0
expands as follows:

    If option FUNCTION_ARGZERO is enabled and $0 is expanded inside the
    body of a function, $0 expands to the name of the enclosing
    function.

    Otherwise, if option FUNCTION_ARGZERO is enabled and $0 is expanded
    inside a sourced file, $0 expands to the pathname given to the
    'source' or '.' builtin command.

    Otherwise, if the shell was invoked with an argument naming a
    script containing shell commands to be executed, $0 expands to the
    value of that argument.

    Otherwise, if the shell was invoked with the '-c' flag and at least
    one non-option non-flag argument was given, $0 expands to the value
    of the first non-option non-flag argument.

    Otherwise, $0 expands to the value of the first argument passed to
    zsh from its parent (argv[0] in C).

This modification would make it possible to toggle the setting back and
forth to examine the local or original value as desired, even within the
same function.
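
For readers who find a precedence chain easier to follow as code, the
proposed expansion order can be sketched in C roughly as follows.  This
is only an illustrative model under stated assumptions: the struct, its
field names, and expand_argzero() are hypothetical and do not come from
the zsh sources, and the model tracks a single level of context rather
than nested function calls or nested sourced files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state consulted when expanding $0.
 * None of these names are taken from the zsh sources. */
struct argzero_state {
    bool function_argzero;     /* FUNCTION_ARGZERO option is enabled */
    const char *function_name; /* enclosing function, or NULL if none */
    const char *sourced_path;  /* path given to 'source' or '.', or NULL */
    const char *script_arg;    /* script named on the command line, or NULL */
    const char *dash_c_arg0;   /* first non-option argument after -c, or NULL */
    const char *c_argv0;       /* argv[0] as passed by the parent process */
};

/* Return the string $0 would expand to under the proposed rules.
 * The option is consulted at expansion time, not at function entry. */
const char *expand_argzero(const struct argzero_state *st)
{
    assert(st != NULL && st->c_argv0 != NULL);

    if (st->function_argzero && st->function_name != NULL)
        return st->function_name;   /* inside a function body */
    if (st->function_argzero && st->sourced_path != NULL)
        return st->sourced_path;    /* inside a sourced file */
    if (st->script_arg != NULL)
        return st->script_arg;      /* shell was given a script to run */
    if (st->dash_c_arg0 != NULL)
        return st->dash_c_arg0;     /* -c with an extra argument */
    return st->c_argv0;             /* fall back to argv[0] */
}
```

Because the option is read on every call, flipping function_argzero
between two calls with the same state switches the result between the
local and the original value, which is the property described above.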
I wouldn't expect this change to break many scripts, but maybe any
backward incompatibility is unacceptable.

>>> I don't find those examples particularly compelling,
>>
>> Here's the real-world problem that motivated my bug report; perhaps it
>> is a more compelling example (or perhaps you'll think of a better way to
>> solve the problem I was addressing):
>>
>>     http://article.gmane.org/gmane.comp.version-control.git/250409
>
> Instead of "compelling" I perhaps should have said "likely to come up
> in common usage."  You have a fairly rare special case there.

Good point.  :)

> In that example,
>
>     ARGV0=sh exec zsh "$0" "$@"
>
> might do what you want, but I'm not entirely following from the diff
> context what's intended.

Some more context if you're curious:  The Git distribution comes with
t/test-lib.sh, a file containing POSIX shell code implementing common
test infrastructure (print error messages, declare and run test cases,
etc.).  The test scripts are POSIX shell scripts that source this shared
file, with two exceptions:

  * t/t9903-bash-prompt.sh starts off running under /bin/sh, but it
    does the following early on:

        exec bash "$0" "$@"

    so that it can run and test Bash-specific shell code.  After
    reinvoking itself under Bash, the code sources test-lib.sh in
    order to reuse the shared test infrastructure code.  (The code in
    test-lib.sh is interpreted as Bash code, not POSIX shell code, but
    that doesn't really matter because the code is compatible with
    both shells.)

  * t/t9904-zsh-prompt.sh (new in that linked patch series) is similar
    to t9903, except it restarts itself under Zsh instead of Bash.
    Like t9903, it sources test-lib.sh, but because the code in
    test-lib.sh is incompatible with Zsh, it uses Zsh's sh emulation
    to source test-lib.sh.

The point of these two test scripts is to run Bash and Zsh in their
native modes as much as possible -- emulation is explicitly avoided
except as necessary to run the shared test infrastructure.
So 'ARGV0=sh exec zsh "$0" "$@"' doesn't work for two reasons:

  * at the time that line is executed, the script is being interpreted
    by /bin/sh and not Zsh, so the ARGV0 assignment won't have the
    desired effect

  * we want as little as possible to run in sh emulation mode so that
    we can test Zsh-specific code

>
>>> but the original
>>> value of $0 is already stashed; what would need to change is that the
>>> *local* value of $0 gets temporarily replaced by the global one.
>>
>> That's good news; that should make it easier to write a patch that
>> temporarily replaces the local value with the global value.
>
> Unfortunately the way the local value is implemented is usually to use
> C local variable scoping to stash and restore the contents of a C global
> pointer, so this would mean at least one additional C global.
>
>> Would you (or anyone else in the community) be opposed to such a patch?
>
> The use cases in both directions seem pretty unusual to me.  Losing the
> ability to "localize" $0 for scripts feels almost as likely to create
> questions as does your situation.

I'm not sure what you mean by losing the ability to localize $0.  I see
a few OK options:

  * Option #1:

      1. Add a new global variable 'orig_argzero' to hold the original
         value of $0.  This variable is never modified once set.
      2. The existing global variable 'argzero' continues to serve its
         current role of holding the "localized" value of $0 (it is
         updated when executing functions or sourcing files if
         FUNCTION_ARGZERO is enabled).
      3. When 'emulate sh' starts, temporarily set argzero to
         orig_argzero.  Restore argzero when 'emulate sh' returns.

    This would result in behavior that is identical to the current
    behavior except $0 would match the POSIX spec when in sh emulation
    mode (and only in sh emulation mode).

  * Option #2:

      1. Add a new global variable 'orig_argzero' to hold the original
         value of $0.  This variable is never modified once set.
      2. The existing global variable 'argzero' continues to serve its
         current role of holding the "localized" value of $0 (it is
         updated when executing functions or sourcing files if
         FUNCTION_ARGZERO is enabled).
      3. Add a new option; let's call it LOCALIZE_ARGZERO for now.  If
         LOCALIZE_ARGZERO is enabled, use argzero to expand $0.  If
         LOCALIZE_ARGZERO is disabled, use orig_argzero to expand $0.
      4. Enable LOCALIZE_ARGZERO by default, but disable it in sh
         emulation mode.
      5. Stop disabling FUNCTION_ARGZERO by default in sh emulation
         mode.

  * Option #3:

      1. Add a new global variable 'orig_argzero' to hold the original
         value of $0.  This variable is never modified once set.
      2. Whenever a function is called or a file sourced, update the
         global variable holding the "localized" $0 ('argzero'), even
         if FUNCTION_ARGZERO is disabled.
      3. Modify the expansion rules for $0 as follows:  If
         FUNCTION_ARGZERO is enabled, use argzero to expand $0.  If
         FUNCTION_ARGZERO is disabled, use orig_argzero to expand $0.

Pros and cons:

    Option #1 is simplest to implement, simple for users, and (mostly)
    backward compatible, but less powerful than options #2 and #3 and
    'emulate' no longer just sets options.

    Option #2 is complex but powerful (scripts can read both the
    original $0 and the localized $0 in the same chunk of code) and
    (mostly) backward compatible.  Note that option #1 can be used as
    a stepping stone to option #2.

    Option #3 is simple for users but not backward compatible.

I think my preference is to go with option #1 with a possible future
step to option #2 (at which time FUNCTION_ARGZERO can be deprecated in
favor of LOCALIZE_ARGZERO).

> I suppose if both values were in the
> C global state, it would be possible to have the "correct" one appear
> at the instant functionargzero changes, instead of being determined by
> the setting at the time the function is entered.  OTOH that would be a
> larger behavior difference / lack of backward compatibility.
Oops, I should have thoroughly read your email before proposing the same
thing but with more words.  :)

>
>> If not, can you point me to the relevant bits of code to help me get
>> started?
>
> Search Src/*.c for references to "argzero", with particular attention to
> builtin.c:bin_emulate.

Thanks.  No promises that I'll have the time to submit a patch soon (or
even at all), but I plan on taking a crack at it this weekend.

-Richard