From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 27262 invoked by alias); 8 Aug 2011 04:05:27 -0000
Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm
Precedence: bulk
X-No-Archive: yes
List-Id: Zsh Workers List
List-Post:
List-Help:
X-Seq: 29654
Received: (qmail 17143 invoked from network); 8 Aug 2011 04:05:25 -0000
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on f.primenet.com.au
X-Spam-Level:
X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_NONE
	autolearn=ham version=3.3.1
Received-SPF: none (ns1.primenet.com.au: domain at closedmail.com does not
	designate permitted sender hosts)
From: Bart Schaefer
Message-id: <110807210507.ZM28821@torch.brasslantern.com>
Date: Sun, 07 Aug 2011 21:05:07 -0700
In-reply-to: <110807144359.ZM27903@torch.brasslantern.com>
Comments: In reply to Bart Schaefer
	"Re: How to misplace an entire pipeline" (Aug 7, 2:43pm)
References: <110805203111.ZM32508@torch.brasslantern.com>
	<20110807185002.6a042cab@pws-pc.ntlworld.com>
	<110807144359.ZM27903@torch.brasslantern.com>
X-Mailer: OpenZMail Classic (0.9.2 24April2005)
To: zsh-workers@zsh.org
Subject: Re: How to misplace an entire pipeline
MIME-version: 1.0
Content-type: text/plain; charset=us-ascii

Two patches inline with other discussion below.

On Aug 7, 2:43pm, Bart Schaefer wrote:
}
} (By the way, arguably "wait" ought to continue the job if it's stopped,
} which it does if you wait for it by job number but does not if you wait
} for it by PID.)

Patch for that follows; I'm not sure the makerunning() is correct:

--- ../zsh-forge/current/Src/jobs.c 2011-06-14 19:54:57.000000000 -0700
+++ Src/jobs.c 2011-08-07 18:47:58.000000000 -0700
@@ -1933,12 +1933,19 @@
                 Process p;
 
                 if (findproc(pid, &j, &p, 0)) {
-                    /*
-                     * returns 0 for normal exit, else signal+128
-                     * in which case we should return that status.
-                     */
-                    retval = waitforpid(pid, 1);
-                    if (!retval)
+                    if (j->stat & STAT_STOPPED) {
+                        retval = (killjb(j, SIGCONT) != 0);
+                        if (retval == 0)
+                            makerunning(j);
+                    }
+                    if (retval == 0) {
+                        /*
+                         * returns 0 for normal exit, else signal+128
+                         * in which case we should return that status.
+                         */
+                        retval = waitforpid(pid, 1);
+                    }
+                    if (retval == 0)
                         retval = lastval2;
                 } else if (isset(POSIXJOBS) &&
                            pid == lastpid && lastpid_status >= 0L) {

} Now, here's an interesting tidbit:
}
} torch% jobs %?foo
} jobs: job not found: ?foo
} torch% jobs %?sleep
} jobs: %?sleep: no such job
}
} Note the difference?  The latter message means that getjob() found the
} pipeline, but either it's _not_ STAT_INUSE or it _is_ STAT_NOPRINT.  So
} I think what we have here is a simple failure to communicate.

The following is clearly not a complete fix and maybe is even wrong if
a different problem is fixed elsewhere, but this at least allows the
suspended pipeline to be manipulated with jobs/fg/bg/wait.

--- ../zsh-forge/current/Src/exec.c 2011-07-27 01:13:48.000000000 -0700
+++ Src/exec.c 2011-08-07 19:07:59.000000000 -0700
@@ -2845,7 +2845,9 @@
             /* This is a current shell procedure that didn't need to fork.    *
              * This includes current shell procedures that are being exec'ed, *
              * as well as null execs.                                         */
-            jobtab[thisjob].stat |= STAT_CURSH|STAT_NOPRINT;
+            jobtab[thisjob].stat |= STAT_CURSH;
+            if (!jobtab[thisjob].procs)
+                jobtab[thisjob].stat |= STAT_NOPRINT;
         } else {
             /* This is an exec (real or fake) for an external command.    *
              * Note that any form of exec means that the subshell is fake *

Even with this patch, "... | read" followed by ^Z results in "read" left
waiting for a pipeline that has been stopped.

What seems to be going on here is that a job table entry was created for
the pipeline and the forked-off left side made the group leader, but
then that job table entry is overloaded to represent the current shell
builtin that is also running as the right-hand-side.
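
For anyone who wants to see the "left waiting for something that has
been stopped" situation outside the shell, here is a minimal sketch in
plain POSIX C.  It is not zsh code: wait_continuing_if_stopped() and
demo_stopped_child() are hypothetical names made up for illustration,
and the job table, killjb() and makerunning() are deliberately left out.
It only shows the idea behind the jobs.c patch, namely sending SIGCONT
to a stopped process before blocking on it.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from zsh): wait for a child by PID, sending
 * SIGCONT whenever it is reported as stopped.  Returns the exit status,
 * signal+128 if the child was killed by a signal, or -1 on error.
 *
 * Caveat: a child that stops again immediately after every SIGCONT
 * would make this loop spin.
 */
int wait_continuing_if_stopped(pid_t pid)
{
    int status;

    for (;;) {
        /* WUNTRACED makes waitpid() report stops as well as exits. */
        if (waitpid(pid, &status, WUNTRACED) < 0)
            return -1;
        if (!WIFSTOPPED(status))
            break;
        /* Without this, a plain waitpid(pid, &status, 0) would hang. */
        if (kill(pid, SIGCONT) != 0)
            return -1;
    }

    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return -1;
}

/* Fork a child that stops itself, then exits with status 7. */
int demo_stopped_child(void)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        raise(SIGSTOP);     /* stand-in for the ^Z hitting the child */
        _exit(7);
    }
    return wait_continuing_if_stopped(pid);
}
```

Calling demo_stopped_child() from a small main() should return the
child's exit status once the child has been continued.  The caveat in
the comment is the same worry as with TTIN/TTOU in the shell: blindly
continuing a process that stops again right away turns into a busy
loop.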

When the terminal driver generates a TSTP it hits the forked-off left
side (via the group leader).  This comes through zhandler() into
update_job() which does notice that the job is STAT_CURSH, but in this
case list_pipe == 0 because "read" is an ordinary builtin, not a loop
or other compound construct; so STAT_CURSH is ignored, and the group
leader is attached to the tty (I believe that's a no-op as it already
is attached).

The other side-effect of this snippet:

            if (jn->stat & STAT_CURSH)
                inforeground = 1;
            else if (job == thisjob) {
                lastval = val;
                inforeground = 2;
            }

is that inforeground != 2, so at the end of update_job() when checking
to see if the current shell should pretend to have been signaled, the
test fails; but that doesn't matter for TSTP because only INT and QUIT
are handled specially.

When "read" is the tail of the pipe, the above all happens behind the
scenes and then the I/O system call gets restarted, which is how the
shell ends up stuck.  I'm not sure how to escape from that, except maybe
to have zhandler() kill the shell with a different signal from which the
system call will not recover.

When something like "true" is the tail of the pipe, we return into
execpline at line 1500 (from waitjobs()), where list_pipe_job is set but
list_pipe and list_pipe_child are not.  If all three were nonzero, a
dummy shell would be forked off as PWS described to act as the suspended
job, but instead execpline() simply returns because the last job in the
pipeline has exited.

The only obvious thing I can think to do here is to note in zhandler()
that we have STAT_CURSH but not list_pipe, and therefore SIGCONT the
left-hand-side immediately and return as if no signal had occurred
(possibly printing a warning about not being able to suspend the job,
which is what happens elsewhere if pipe() or fork() fails).  However,
that could lead to a serious busy-loop if somehow TTIN or TTOU was the
signal instead of TSTP.