From: Bart Schaefer
Message-id: <111211143704.ZM32652@torch.brasslantern.com>
Date: Sun, 11 Dec 2011 14:37:04 -0800
List-Id: Zsh Workers List
X-Seq: 30003
References: <111209184747.ZM5000@torch.brasslantern.com>
    <111209194044.ZM5067@torch.brasslantern.com>
    <20111210181503.12e4b2ab@pws-pc.ntlworld.com>
    <20111210194022.5051f91c@pws-pc.ntlworld.com>
    <20111210232801.7dc8fef2@pws-pc.ntlworld.com>
    <20111211193949.2d58062b@pws-pc.ntlworld.com>
To: zsh-workers@zsh.org
Subject: Re: Bug in sh emulation

On Dec 10, 6:15pm, Peter Stephenson wrote:
}
} I think it's probably the set of changes around 27100 to 27109 that
} allowed MONITOR to remain on in a subshell.
} In particular,
}
}     if (!isset(POSIXJOBS))
}         opts[MONITOR] = 0;
}
} My session / process group / controlling terminal understanding is
} distinctly ropey. Is the fix that we simply don't execute that code if
} we've already executed it, i.e. subsh is already 1 when we get there
} (see ultra-ultra-tentative patch --- we set subsh to 1 just after that)?
} That fixes the problem, but that may be simply because we don't execute
} the code that fails. But what should actually happen if we have a
} subshell of a subshell, i.e. a nested "( ... )"? And what should the
} group leader be set to at this point after a second fork?

With the caveat that it's been a really long time since I (a) studied
this or (b) did any real work that relied on it ... two factors are at
work here.

(1) There's no limit on the number of process groups you can have. A
process group is an organizational unit for signal delivery, so that a
whole set of processes can get the same signal at [conceptually] the
same time. Any process can declare itself to be the leader of its own
process group; any process which does not do so is part of the process
group of its parent (which may actually be the process group of its
grandparent, etc.).

(2) A TTY device can only be attached to one process group at a time.
A process can take control of a TTY if no other process already has and
if it is able to open a file descriptor on that TTY; but once some
process controls it, only the leader of that process group can "give
away" control of the TTY to some other process group. If another
process group tries to "take" the TTY, it gets a TTOU signal (unless
that's been disabled in the TTY driver); it can ignore that signal, but
it can't grab the terminal.

Exactly how "give away" happens is a little goofy.
Wikipedia says that usual shell behavior is for both the parent and the
child to try to set the tty process group to the child when that is the
desired effect, to avoid a race condition [the parent isn't allowed to
give away the tty after the child has called exec(), for security]. It
appears only a direct child can inherit the tty, not a grandchild, which
may be the issue in our current dilemma.

http://www.cs.ucsb.edu/~almeroth/classes/W99.276/assignment1/signals.html#Pgrps

Thus control of the terminal can be passed down an arbitrary number of
subshell levels, but only if each ancestor hands it off before the
descendant attempts to do the same. I have forgotten what happens to
control of the TTY if the group leader exits before its descendants, but
I don't think that's an issue here?

Given that the reason for associating a TTY with a process group is so
that signals from the TTY will be delivered to all the descendants of
the group leader, it's very likely the case that if attachtty() is not
able to succeed, then we didn't want to create a new process group
[declare a new group leader] in the first place -- UNLESS the job IS
being run asynchronously (not true here) so that it is intended NOT to
get signals from the TTY.

On Dec 10, 7:40pm, Peter Stephenson wrote:
}
} What's confusing me, possibly based on ignorance, is that I naively
} expect something like the following to happen (without my
} ultra-ultra-tentative patch).
}
} - Shell forks. This is treated pretty much like any other fork to
} create a new foreground process.
}
} - Job control is active, so the forked shell takes over the TTY and sets
} itself as the group leader. That's the first time through the code
} under discussion, in entersubsh() for the case where MONITOR is still
} set.

In fact in this case job control in the subshell doesn't matter. It's
the new foreground process, so the parent shell (or both the parent and
child, see above) should already want to attach the subshell to the TTY.
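For illustration, here is a minimal sketch in C of the "both the parent
and the child" idea, restricted to the process-group half of it. This is
not zsh source: spawn_in_own_group() is a hypothetical name, and the
tcsetpgrp() step that a job-control shell would add for a foreground job
is only noted in a comment, since it needs a controlling terminal.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from zsh): fork a child and place it in its
 * own process group.  setpgid() is called on BOTH sides of the fork, so
 * the outcome does not depend on which process is scheduled first.
 * Returns 1 if the parent sees the child leading its own group,
 * 0 if not, and -1 if fork() fails. */
int spawn_in_own_group(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child side: become leader of a new process group. */
        setpgid(0, 0);
        /* For a foreground job, a job-control shell would also try
         * tcsetpgrp(tty_fd, getpgrp()) here, before any exec(). */
        pause();            /* stay alive until the parent has checked */
        _exit(0);
    }

    /* Parent side: make the same request.  If the child already did
     * it, this changes nothing. */
    setpgid(pid, pid);

    int ok = (getpgid(pid) == pid);

    kill(pid, SIGKILL);     /* clean up the demonstration child */
    waitpid(pid, NULL, 0);
    return ok;
}
```

Calling spawn_in_own_group() from a small main() should return 1 on a
POSIX system; the point is only that neither side has to wait for the
other before relying on the child's process group.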
You got to this yourself in the next message in the thread.

} - Shell forks again. This is a pipeline, so part of the same process
} group. I would expect this to find there is already a group leader from
} the previous fork, so this process doesn't try to make itself group
} leader and grab the pipeline. Evidently this isn't happening,
} however.

On Dec 10, 11:28pm, Peter Stephenson wrote:
}
} - We entersubsh() again. MONITOR is still set (this is where the
} difference really kicks in) so we execute the code for handling the
} tty again. We're now job 2, so no group leader for this. We create a
} new group leader, and try to attach this to the tty, but we can't
} because job 1 is attached to it.
}
} I suppose the problem is we're doing the attach in the wrong way.

If your step-through analysis is correct then the problem is with
"we're now job 2, so no group leader for this." The group leader should
still be the same as from the first entersubsh(). We want this new
pipeline to remain part of the original group that already has control
of the terminal.

} So what's with the stuff in entersubsh() that we always execute
} anyway, and successfully does attachtty() the first time? Should we
} really have done that in the parent shell?

I think this means the MONITOR option is covering too many cases -- it
is being used both as a flag that the (sub)shell should track status of
its children, and also as a flag that the (sub)shell is THE shell that
governs control of the TTY. The latter is no longer true; if we want to
leave MONITOR set then we need separate knowledge of which process is
the ultimate ancestor. Compare the current pid to the pid of the
original shell, for example, although then we need to be sure to update
that in the event of certain kinds of "exec".

Put a different way, the concept of the "foreground" job in a subshell
with MONITOR set is not the same as the "foreground" job started from
the ancestral shell.
Only in the latter case does foreground also mean TTY process group
leader.

} I wonder if this is related to the business with the "superjob" in the
} case of a non-subshell that I never understood? There's code in
} handle_sub() in jobs.c line 278 that looks like it might be trying to do
} something like what we need to do here.

Not directly, AFAICT -- the superjob is a construct invented because of
the way zsh runs the right side of pipelines in the current shell when
the right side is a builtin. If you suspend the right side, the current
shell needs to wake up again, so it does two things: (A) creates a new
process to finish the right side [which immediately suspends], and (B)
creates a "superjob" in the job table to track both sides of the now
fully-forked-off pipeline. Where "creates" may mean "promotes another
already existing entry" or something.

Of course you knew about (A).

On Dec 11, 7:39pm, Peter Stephenson wrote:
} Subject: Re: Bug in sh emulation
}
} If we're letting the subshell do job control (not resetting MONITOR in
} entersubsh() because POSIXJOBS is set), then presumably we shouldn't be
} resetting the signals that are special to shells that do job control?
}
} This actually makes the issue go away, but I'm not at all sure it's
} the basic issue; it's part of the stuff I'm hoping Mystified of Marin
} County might know a little more about.

This goes back to "if attachtty() is not able to succeed, then we didn't
want to create a new process group" -- ignoring the signal may be the
right thing for other reasons, but for this specific problem it's just
masking the fact that [I *think* you'll find that] signals from the
terminal are unable to affect the pipeline spawned from the subshell
because that pipeline has incorrectly been put in a new process group.

-- 
Barton E. Schaefer