From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI, RCVD_IN_DNSWL_NONE autolearn=ham autolearn_force=no version=3.4.2 Received: from primenet.com.au (ns1.primenet.com.au [203.24.36.2]) by inbox.vuxu.org (OpenSMTPD) with ESMTP id 57252687 for ; Sun, 23 Feb 2020 20:19:30 +0000 (UTC) Received: (qmail 2670 invoked by alias); 23 Feb 2020 20:19:23 -0000 Mailing-List: contact zsh-users-help@zsh.org; run by ezmlm Precedence: bulk X-No-Archive: yes List-Id: Zsh Users List List-Post: List-Help: List-Unsubscribe: X-Seq: 24710 Received: (qmail 4319 invoked by uid 1010); 23 Feb 2020 20:19:22 -0000 X-Qmail-Scanner-Diagnostics: from know-smtprelay-omc-8.server.virginmedia.net by f.primenet.com.au (envelope-from , uid 7791) with qmail-scanner-2.11 (clamdscan: 0.102.2/25731. spamassassin: 3.4.2. Clear:RC:0(80.0.253.72):SA:0(-2.0/5.0):. Processed in 2.554606 secs); 23 Feb 2020 20:19:22 -0000 X-Envelope-From: p.w.stephenson@ntlworld.com X-Qmail-Scanner-Mime-Attachments: | X-Qmail-Scanner-Zip-Files: | Received-SPF: pass (ns1.primenet.com.au: SPF record at _smtprelay.virginmedia.com designates 80.0.253.72 as permitted sender) X-Originating-IP: [86.16.88.158] X-Authenticated-User: p.w.stephenson@ntlworld.com X-Spam: 0 X-Authority: v=2.3 cv=br5i+nSi c=1 sm=1 tr=0 a=MiHCjVqLJ44lE3bxSlffFQ==:117 a=MiHCjVqLJ44lE3bxSlffFQ==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=IkcTkHD0fZMA:10 a=-KLejTsmHKmlMGT6W1wA:9 a=QEXdDO2ut3YA:10 Message-ID: Subject: Re: zsh mysteriously suspending job with sudo From: Peter Stephenson To: zsh-users@zsh.org Date: Sun, 23 Feb 2020 20:18:41 +0000 In-Reply-To: <20200223140337.ol24h5ioisdulusw@chazelas.org> References: <20200223140337.ol24h5ioisdulusw@chazelas.org> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.28.5-0ubuntu0.18.04.1 Mime-Version: 1.0 Content-Transfer-Encoding: 7bit X-CMAE-Envelope: MS4wfLsJmKC9nHOQznu+L3JCsKq70+1htLOIb8rnZU0b+Ua1EgAWQTiTDxmYZ2M4Q2ak1RPDjD6jjkt2n1R2Yh3npt3zJyLu7SwWGnnk/6v9f7HOrn73GrAL F44AYeB0ZtW6wE+EpYuONFf4D0FQQ2jk97ZrFeiWl4TTtKwdNAbIa5gV On Sun, 2020-02-23 at 14:03 +0000, Stephane Chazelas wrote: > 2020-02-22 18:09:13 -0700, Ronan Pigott: > Can be reproduced with: > > $ sleep 1 | sudo sh -c 'sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty' > UID PID PPID PGID SID C STIME TTY TIME CMD > chazelas 25430 8308 25430 25430 0 13:44 pts/1 00:00:00 /bin/zsh > root 26867 25430 26866 25430 0 13:46 pts/1 00:00:00 sudo sh -c sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty > root 26868 26867 26866 25430 0 13:46 pts/1 00:00:00 sh -c sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty > root 26871 26868 26866 25430 0 13:46 pts/1 00:00:00 ps -jfHt /dev/pts/1 > 25430 > zsh: done sleep 1 | > zsh: suspended (tty input) sudo sh -c > > As seen above, at the time "ps" is run (and awk later), the > foreground process group of the terminal is 25430 which is the > pgid of the main shell, not the pgid of the foreground job > (26866), which is why that job gets a SIGTTIN when awk tries to > read from the terminal (or SIGTTOU when pacman does an ioctl to > the terminal). > > From "strace", it seems it's because when "sleep 1" (the process > group leader) finishes, zsh does a kill(-26866,0), presumably to > check that the process group is still alive, but that fails with > EPERM as there are processes running as root in that group, and > then zsh changes the foreground process group back to the main > shell's. > > So it seems indeed to be a bug in zsh. > > I suppose an easy fix would be to check for an errno of ESRCH > when kill(-pgid,0) fails to make sure it's because the process > group is gone. Thanks, this is probably something like the following --- it hits a number of other places doing something similar which look like they need the same treatment, though I'm happy to be told otherwise, but the one in signals.c is the key one here. I don't know if we really need the different cases of killpg(pgrp, 0) and kill(-pgprp, 0)? > But, here the shell should be able to know that > the job is not gone as sudo, a direct children of the shell has > not returned yet, so there's probably something wrong with the > logic in the first place. I wouldn't have the first clue how to begin checking that there is some appropriate set of processes associated with the process group still there that would allow this in a sufficiently race-free manner, but if anyone does have a clue they are welcome to try. pws diff --git a/Src/exec.c b/Src/exec.c index 50027654a..0d321b757 100644 --- a/Src/exec.c +++ b/Src/exec.c @@ -1036,7 +1036,8 @@ entersubsh(int flags, struct entersubsh_ret *retp) } else if (thisjob != -1 && (flags & ESUB_PGRP)) { if (jobtab[list_pipe_job].gleader && (list_pipe || list_pipe_child)) { if (setpgrp(0L, jobtab[list_pipe_job].gleader) == -1 || - killpg(jobtab[list_pipe_job].gleader, 0) == -1) { + (killpg(jobtab[list_pipe_job].gleader, 0) == -1 && + errno == ESRCH)) { jobtab[list_pipe_job].gleader = jobtab[thisjob].gleader = (list_pipe_child ? mypgrp : getpid()); setpgrp(0L, jobtab[list_pipe_job].gleader); diff --git a/Src/jobs.c b/Src/jobs.c index e7438251e..182e65f39 100644 --- a/Src/jobs.c +++ b/Src/jobs.c @@ -283,7 +283,8 @@ handle_sub(int job, int fg) if ((cp = ((WIFEXITED(jn->procs->status) || WIFSIGNALED(jn->procs->status)) && - killpg(jn->gleader, 0) == -1))) { + (killpg(jn->gleader, 0) == -1 && + errno == ESRCH)))) { Process p; for (p = jn->procs; p->next; p = p->next); jn->gleader = p->pid; @@ -541,9 +542,13 @@ update_job(Job jn) /* is this job in the foreground of an interactive shell? */ if (mypgrp != pgrp && inforeground && - (jn->gleader == pgrp || (pgrp > 1 && kill(-pgrp, 0) == -1))) { + (jn->gleader == pgrp || + (pgrp > 1 && + (kill(-pgrp, 0) == -1 && errno == ESRCH)))) { if (list_pipe) { - if (somestopped || (pgrp > 1 && kill(-pgrp, 0) == -1)) { + if (somestopped || (pgrp > 1 && + kill(-pgrp, 0) == -1 && + errno == ESRCH)) { attachtty(mypgrp); /* check window size and adjust if necessary */ adjustwinsize(0); @@ -2469,7 +2474,8 @@ bin_fg(char *name, char **argv, Options ops, int func) if ((jobtab[job].stat & STAT_SUPERJOB) && ((!jobtab[job].procs->next || (jobtab[job].stat & STAT_SUBLEADER) || - killpg(jobtab[job].gleader, 0) == -1)) && + (killpg(jobtab[job].gleader, 0) == -1 && + errno == ESRCH))) && jobtab[jobtab[job].other].gleader) attachtty(jobtab[jobtab[job].other].gleader); else diff --git a/Src/signals.c b/Src/signals.c index 96ff9e9b3..4adf03202 100644 --- a/Src/signals.c +++ b/Src/signals.c @@ -539,7 +539,8 @@ wait_for_processes(void) #endif if (WIFEXITED(status) && pn->pid == jn->gleader && - killpg(pn->pid, 0) == -1) { + killpg(pn->pid, 0) == -1 && + errno == ESRCH) { if (last_attached_pgrp == jn->gleader && !(jn->stat & STAT_NOSTTY)) { /*