From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 12315 invoked by alias); 15 Aug 2015 01:56:40 -0000 Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm Precedence: bulk X-No-Archive: yes List-Id: Zsh Workers List List-Post: List-Help: X-Seq: 36180 Received: (qmail 8238 invoked from network); 15 Aug 2015 01:56:39 -0000 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on f.primenet.com.au X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.0 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:message-id:date:in-reply-to:comments :references:to:subject:mime-version:content-type; bh=3HF9UOeynE98oQEqQ7KT9NZUK9aUwUF768GmXYpI8+g=; b=Aef7IoF9esylnJeZ+DHVoiYIEN+neqDbx19ZaVh00IG5AGSp6QtG36RbnTZq64csFE ZcV8HV7hx6agHEyVEj2SG3eJpiMxGLs9gqS9v2OpAd+JlHxZK1emRQblxjRtfpIBtBKF sHnrIBmaVYB4cblO4FC2BrIusNi8hyjol9FbtpDZXvryhTpA+hJeu7vcSimU0E1RLtDh TOo3aB4+Jh5Vz3t9J5nzglwlQwZF3bOAZbBFBKWyxKpupWqG7ZnC25c1/sYHTzONgAk9 kc4IvD/1bcKzkCGVdtj2MRaLwfZP6H39jEEMk2taM0TTy09jQT7D6Ai6fjHXTKrPb2WZ 5yLQ== X-Gm-Message-State: ALoCoQnbiOPNyOkYDzuB1N7mPhgywCJHy1caAM9PcRhPH2YY9l39a2N4cCAGRcq3EluneDGPh5xO X-Received: by 10.60.79.193 with SMTP id l1mr41468631oex.60.1439603795519; Fri, 14 Aug 2015 18:56:35 -0700 (PDT) From: Bart Schaefer Message-Id: <150814185632.ZM16733@torch.brasslantern.com> Date: Fri, 14 Aug 2015 18:56:31 -0700 In-Reply-To: <150812103100.ZM13899@torch.brasslantern.com> Comments: In reply to Bart Schaefer "Re: 5.0.8 regression when waiting for suspended jobs" (Aug 12, 10:31am) References: <87wpxhk970.fsf@gmail.com> <150730123904.ZM11774@torch.brasslantern.com> <87si84k9uf.fsf@gmail.com> <150731085638.ZM15733@torch.brasslantern.com> <150811165655.ZM31504@torch.brasslantern.com> <20150812104351.65a4cbea@pwslap01u.europe.root.pri> <150812075858.ZM32741@torch.brasslantern.com> <20150812170904.2c0e78de@pwslap01u.europe.root.pri> <150812103100.ZM13899@torch.brasslantern.com> X-Mailer: OpenZMail Classic (0.9.2 24April2005) To: zsh-workers@zsh.org Subject: Re: 5.0.8 regression when waiting for suspended jobs MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Long ramble with a question at the end. On Aug 12, 10:31am, Bart Schaefer wrote: } } This is a bit ugly, suggestions welcome. In the "is a child but has no } job table entry" case, this won't SIGCONT the child where previous it } would have, so there's that. I've been trying to clean this up and have some ideas, but it occurs to me that kill(pid, SIGCONT) might not be harmless if the process belongs to the same user but is not a child of the current shell, or even if it is a child of the current shell but for some reason has trapped CONT. Practically speaking "wait" never gets as far as calling waitforpid() on a process that is not a child, so we shouldn't need to worry about that except in the abstract (e.g., some future programmer decides to call waitforpid() in some other context). However waitforpid() is used by getoutput() and getoutputfile() which do not create job table entries, so we DO need to deal with that, and if one of those is stopped it seems wise to continue it. Problem is we can't tell whether one of those is stopped or not, so it's a toss- up whether to send the SIGCONT. Doing so only on the second+ time around the loop (the old behavior) seems to be making the assumption that we got SIGCHLD as a result of the job being stopped. If we got SIGCHLD because the job actually exited, we'd have failed the kill(pid, 0) on the next attempt and left the loop. However, if the job is already stopped before we go into waitpid(), that SIGCHLD has likely already been processed, so we'll block in signal_suspend() without sending SIGCONT. [Another bash aside -- "wait $pid" for a stopped job returns immediately with no message, but: $ wait %1 bash: warning: wait_for_job: job 1 is stopped This warning doesn't appear if the wait is already running at the time the child is stopped; in that case you get "[1]+ Stopped ..." as if the job had been running in the foreground.] Now consider this pathological case: % ( while : ; do kill -STOP $sysparams[pid]; done ) & Even with the SIGTT* test in place, waiting for that is an infinite loop if SIGCONT is being sent. Finally, and here's the kicker, bin_fg() ALREADY attempts to SIGCONT the job before calling waitforpid(). Thus I think we have two choices: (1) Go for a minimal change of sending the SIGCONT only when !wait_cmd, which ends up with us blocking forever as with 5.0.7 and before (but blocked on signal_suspend() rather than busy-waiting, which is good). (2) Actually determine whether the job is WIFSTOPPED() and break the loop; the job status is printed when returning to the prompt. Patch here for the minimal change. Man that's a lot of analysis to come up with a two-word edit. diff --git a/Src/jobs.c b/Src/jobs.c index ed647b8..b47ba8c 100644 --- a/Src/jobs.c +++ b/Src/jobs.c @@ -1384,10 +1384,17 @@ waitforpid(pid_t pid, int wait_cmd) dont_queue_signals(); child_block(); /* unblocked in signal_suspend() */ queue_traps(wait_cmd); + + /* This function should never be called with a pid that is not a + * child of the current shell. Consequently, if kill(0, pid) + * fails here with ESRCH, the child has already been reaped. In + * the loop body, we expect this to happen in signal_suspend() + * via zhandler(), after which this test terminates the loop. + */ while (!errflag && (kill(pid, 0) >= 0 || errno != ESRCH)) { if (first) first = 0; - else + else if (!wait_cmd) kill(pid, SIGCONT); last_signal = -1;