From: Bart Schaefer
Message-Id: <150811165655.ZM31504@torch.brasslantern.com>
Date: Tue, 11 Aug 2015 16:56:55 -0700
In-Reply-To: <150731085638.ZM15733@torch.brasslantern.com>
Comments: In reply to Bart Schaefer "Re: 5.0.8 regression when waiting for suspended jobs" (Jul 31, 8:56am)
References: <87wpxhk970.fsf@gmail.com> <150730123904.ZM11774@torch.brasslantern.com> <87si84k9uf.fsf@gmail.com> <150731085638.ZM15733@torch.brasslantern.com>
List-Id: Zsh Workers List
X-Seq: 36112
To: zsh-workers@zsh.org
Subject: Re: 5.0.8 regression when waiting for suspended jobs
On Jul 31, 8:56am, Bart Schaefer wrote:
}
} zsh-5.0.7
} - "wait $!" blocks (looping on repeated wait3() nonzero)
} - "wait %1" returns immediately
} - "wait" returns immediately
}
} zsh-5.0.8
} - "wait $!" loops but also printing status every time
} - "wait %1" returns immediately
} - "wait" returns immediately

I still only suspect what changed to make 5.0.8 different from 5.0.7 in
this regard, but here's what's going on:

- "wait $!" - bin_fg() calls waitforpid(), which discovers the job is
  stopped and goes into a loop calling kill(pid, SIGCONT) to try to get
  the job to run again. In the 5.0.8 case, each time this happens the
  job briefly wakes up and gets stopped with SIGTTIN, which causes
  another SIGCHLD to go to the parent zsh, which then prints the
  "suspended" message and loops right back to kill(pid, SIGCONT) again.

  All of this is exactly the same as in 5.0.7, except that because of
  the SIGCONT change in workers/35032 we notice the stopped -> continued
  -> stopped again status change and therefore print the new status,
  even though it's actually the same as the last time we printed the
  status, because we skipped printing the "continued" status. Or so I
  surmise.

- "wait %1" - bin_fg() calls zwaitjob(), which does NOT do
  kill(pid, SIGCONT), instead simply blocking forever waiting for a
  SIGCHLD that will never arrive. If a signal *is* received and the
  waiting shell is a subshell, *then* the awaited job is SIGCONT'd, but
  I don't recall why and it doesn't matter for this bug anyway.

  This does however raise the question of why zwaitjob() is not calling
  waitforpid(). If it did so, we'd have the ksh behavior for all three
  cases of "wait", and we could even add the bit where interrupting the
  wait sends the signal through to the waited-for job.

- "wait" - bin_fg() goes into a loop calling zwaitjob() on every entry
  in the job table; i.e., identical to "wait %1" repeated for every job
  number.

======

So what do we do about this?

Skip the SIGCONT in waitforpid()?

Only try SIGCONT once in waitforpid() rather than every time around the
loop?

Some other thing involving the WIFCONTINUED() test?

Assuming we work that out, should zwaitjob() be changed to use
waitforpid(), or do we think someone is relying on the bash-like
immediate return?