* Re: zsh mysteriously suspending job with sudo [not found] <CAG4d5Bg=vkj6PYYGVyNO3pWTh=RkT=2rTFdE_hPBj1xxz6YTSw__29611.299622205$1582420251$gmane$org@mail.gmail.com> @ 2020-02-23 14:03 ` Stephane Chazelas 2020-02-23 18:53 ` Ronan Pigott 2020-02-23 20:18 ` Peter Stephenson 0 siblings, 2 replies; 8+ messages in thread From: Stephane Chazelas @ 2020-02-23 14:03 UTC (permalink / raw) To: Ronan Pigott; +Cc: Zsh Users 2020-02-22 18:09:13 -0700, Ronan Pigott: [...] > $ sudo true > [sudo] password for ronan: > $ pacman -Qttdq | sudo pacman -Rns - > [...] > :: Do you want to remove these packages? [Y/n] zsh: done > pacman -Qttdq | > zsh: suspended (tty output) sudo pacman -Rns - [...] Can be reproduced with: $ sleep 1 | sudo sh -c 'sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty' UID PID PPID PGID SID C STIME TTY TIME CMD chazelas 25430 8308 25430 25430 0 13:44 pts/1 00:00:00 /bin/zsh root 26867 25430 26866 25430 0 13:46 pts/1 00:00:00 sudo sh -c sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty root 26868 26867 26866 25430 0 13:46 pts/1 00:00:00 sh -c sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty root 26871 26868 26866 25430 0 13:46 pts/1 00:00:00 ps -jfHt /dev/pts/1 25430 zsh: done sleep 1 | zsh: suspended (tty input) sudo sh -c As seen above, at the time "ps" is run (and awk later), the foreground process group of the terminal is 25430 which is the pgid of the main shell, not the pgid of the foreground job (26866), which is why that job gets a SIGTTIN when awk tries to read from the terminal (or SIGTTOU when pacman does an ioctl to the terminal). From "strace", it seems it's because when "sleep 1" (the process group leader) finishes, zsh does a kill(-26866,0), presumably to check that the process group is still alive, but that fails with EPERM as there are processes running as root in that group, and then zsh changes the foreground process group back to the main shell's. So it seems indeed to be a bug in zsh. I suppose an easy fix would be to check for an errno of ESRCH when kill(-pgid,0) fails to make sure it's because the process group is gone. But, here the shell should be able to know that the job is not gone as sudo, a direct children of the shell has not returned yet, so there's probably something wrong with the logic in the first place. -- Stephane ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: zsh mysteriously suspending job with sudo 2020-02-23 14:03 ` zsh mysteriously suspending job with sudo Stephane Chazelas @ 2020-02-23 18:53 ` Ronan Pigott 2020-02-23 23:57 ` Ronan Pigott 2020-02-23 20:18 ` Peter Stephenson 1 sibling, 1 reply; 8+ messages in thread From: Ronan Pigott @ 2020-02-23 18:53 UTC (permalink / raw) To: stephane, Zsh Users [-- Attachment #1: Type: text/plain, Size: 2571 bytes --] Wow! Thank you Stephane for your analysis. So you think that a patch checking for ESRCH is not suitable? I tried poking around in Src/jobs.c in update_job where it sounds like this issue originates, but I'm not quite sure how the logic should be revised. Is there somewhere else where this bug should be reported? Thanks again for your help! On Sun, Feb 23, 2020 at 7:03 AM Stephane Chazelas <stephane@chazelas.org> wrote: > 2020-02-22 18:09:13 -0700, Ronan Pigott: > [...] > > $ sudo true > > [sudo] password for ronan: > > $ pacman -Qttdq | sudo pacman -Rns - > > [...] > > :: Do you want to remove these packages? [Y/n] zsh: done > > pacman -Qttdq | > > zsh: suspended (tty output) sudo pacman -Rns - > [...] > > Can be reproduced with: > > $ sleep 1 | sudo sh -c 'sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" > /proc/self/stat /dev/tty' > UID PID PPID PGID SID C STIME TTY TIME CMD > chazelas 25430 8308 25430 25430 0 13:44 pts/1 00:00:00 /bin/zsh > root 26867 25430 26866 25430 0 13:46 pts/1 00:00:00 sudo sh -c > sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty > root 26868 26867 26866 25430 0 13:46 pts/1 00:00:00 sh -c > sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty > root 26871 26868 26866 25430 0 13:46 pts/1 00:00:00 ps -jfHt > /dev/pts/1 > 25430 > zsh: done sleep 1 | > zsh: suspended (tty input) sudo sh -c > > As seen above, at the time "ps" is run (and awk later), the > foreground process group of the terminal is 25430 which is the > pgid of the main shell, not the pgid of the foreground job > (26866), which is why that job gets a SIGTTIN when awk tries to > read from the terminal (or SIGTTOU when pacman does an ioctl to > the terminal). > > From "strace", it seems it's because when "sleep 1" (the process > group leader) finishes, zsh does a kill(-26866,0), presumably to > check that the process group is still alive, but that fails with > EPERM as there are processes running as root in that group, and > then zsh changes the foreground process group back to the main > shell's. > > So it seems indeed to be a bug in zsh. > > I suppose an easy fix would be to check for an errno of ESRCH > when kill(-pgid,0) fails to make sure it's because the process > group is gone. But, here the shell should be able to know that > the job is not gone as sudo, a direct children of the shell has > not returned yet, so there's probably something wrong with the > logic in the first place. > > -- > Stephane > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: zsh mysteriously suspending job with sudo 2020-02-23 18:53 ` Ronan Pigott @ 2020-02-23 23:57 ` Ronan Pigott 2020-05-14 21:15 ` Ronan Pigott 0 siblings, 1 reply; 8+ messages in thread From: Ronan Pigott @ 2020-02-23 23:57 UTC (permalink / raw) To: Zsh Users [-- Attachment #1: Type: text/plain, Size: 2829 bytes --] Brilliant! Thank you Peter for the patch. I have applied it to my zsh and pacman is working again. On Sun, Feb 23, 2020 at 11:53 AM Ronan Pigott <rpigott314@gmail.com> wrote: > Wow! Thank you Stephane for your analysis. > > So you think that a patch checking for ESRCH is not suitable? > > I tried poking around in Src/jobs.c in update_job where > it sounds like this issue originates, but I'm not quite sure > how the logic should be revised. > > Is there somewhere else where this bug should be reported? > > Thanks again for your help! > > > On Sun, Feb 23, 2020 at 7:03 AM Stephane Chazelas <stephane@chazelas.org> > wrote: > >> 2020-02-22 18:09:13 -0700, Ronan Pigott: >> [...] >> > $ sudo true >> > [sudo] password for ronan: >> > $ pacman -Qttdq | sudo pacman -Rns - >> > [...] >> > :: Do you want to remove these packages? [Y/n] zsh: done >> > pacman -Qttdq | >> > zsh: suspended (tty output) sudo pacman -Rns - >> [...] >> >> Can be reproduced with: >> >> $ sleep 1 | sudo sh -c 'sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" >> /proc/self/stat /dev/tty' >> UID PID PPID PGID SID C STIME TTY TIME CMD >> chazelas 25430 8308 25430 25430 0 13:44 pts/1 00:00:00 /bin/zsh >> root 26867 25430 26866 25430 0 13:46 pts/1 00:00:00 sudo sh -c >> sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty >> root 26868 26867 26866 25430 0 13:46 pts/1 00:00:00 sh -c >> sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty >> root 26871 26868 26866 25430 0 13:46 pts/1 00:00:00 ps >> -jfHt /dev/pts/1 >> 25430 >> zsh: done sleep 1 | >> zsh: suspended (tty input) sudo sh -c >> >> As seen above, at the time "ps" is run (and awk later), the >> foreground process group of the terminal is 25430 which is the >> pgid of the main shell, not the pgid of the foreground job >> (26866), which is why that job gets a SIGTTIN when awk tries to >> read from the terminal (or SIGTTOU when pacman does an ioctl to >> the terminal). >> >> From "strace", it seems it's because when "sleep 1" (the process >> group leader) finishes, zsh does a kill(-26866,0), presumably to >> check that the process group is still alive, but that fails with >> EPERM as there are processes running as root in that group, and >> then zsh changes the foreground process group back to the main >> shell's. >> >> So it seems indeed to be a bug in zsh. >> >> I suppose an easy fix would be to check for an errno of ESRCH >> when kill(-pgid,0) fails to make sure it's because the process >> group is gone. But, here the shell should be able to know that >> the job is not gone as sudo, a direct children of the shell has >> not returned yet, so there's probably something wrong with the >> logic in the first place. >> >> -- >> Stephane >> > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: zsh mysteriously suspending job with sudo 2020-02-23 23:57 ` Ronan Pigott @ 2020-05-14 21:15 ` Ronan Pigott 2020-05-15 9:08 ` Peter Stephenson 0 siblings, 1 reply; 8+ messages in thread From: Ronan Pigott @ 2020-05-14 21:15 UTC (permalink / raw) To: Zsh Users, stephane [-- Attachment #1: Type: text/plain, Size: 779 bytes --] Is it possible to apply Peter's ESRCH patch to master? I've run this patch for a while and I think many pacman users could benefit. On Sun, Feb 23, 2020 at 4:57 PM Ronan Pigott <rpigott314@gmail.com> wrote: > Brilliant! Thank you Peter for the patch. > > I have applied it to my zsh and pacman is working again. > > On Sun, Feb 23, 2020 at 11:53 AM Ronan Pigott <rpigott314@gmail.com> > wrote: > >> Wow! Thank you Stephane for your analysis. >> >> So you think that a patch checking for ESRCH is not suitable? >> >> I tried poking around in Src/jobs.c in update_job where >> it sounds like this issue originates, but I'm not quite sure >> how the logic should be revised. >> >> Is there somewhere else where this bug should be reported? >> >> Thanks again for your help! >> > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: zsh mysteriously suspending job with sudo 2020-05-14 21:15 ` Ronan Pigott @ 2020-05-15 9:08 ` Peter Stephenson 2020-05-15 9:20 ` Ronan Pigott 0 siblings, 1 reply; 8+ messages in thread From: Peter Stephenson @ 2020-05-15 9:08 UTC (permalink / raw) To: Ronan Pigott, Zsh Users, stephane > On 14 May 2020 at 22:15 Ronan Pigott <rpigott314@gmail.com> wrote: > Is it possible to apply Peter's ESRCH patch to master? It's been there for a while now. pws ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: zsh mysteriously suspending job with sudo 2020-05-15 9:08 ` Peter Stephenson @ 2020-05-15 9:20 ` Ronan Pigott 0 siblings, 0 replies; 8+ messages in thread From: Ronan Pigott @ 2020-05-15 9:20 UTC (permalink / raw) To: Peter Stephenson, Zsh Users [-- Attachment #1: Type: text/plain, Size: 321 bytes --] Oh, indeed it is. Somehow I missed that. Thanks again! On Fri, May 15, 2020 at 2:08 AM Peter Stephenson < p.w.stephenson@ntlworld.com> wrote: > > On 14 May 2020 at 22:15 Ronan Pigott <rpigott314@gmail.com> wrote: > > Is it possible to apply Peter's ESRCH patch to master? > > It's been there for a while now. > > pws > ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: zsh mysteriously suspending job with sudo 2020-02-23 14:03 ` zsh mysteriously suspending job with sudo Stephane Chazelas 2020-02-23 18:53 ` Ronan Pigott @ 2020-02-23 20:18 ` Peter Stephenson 1 sibling, 0 replies; 8+ messages in thread From: Peter Stephenson @ 2020-02-23 20:18 UTC (permalink / raw) To: zsh-users On Sun, 2020-02-23 at 14:03 +0000, Stephane Chazelas wrote: > 2020-02-22 18:09:13 -0700, Ronan Pigott: > Can be reproduced with: > > $ sleep 1 | sudo sh -c 'sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty' > UID PID PPID PGID SID C STIME TTY TIME CMD > chazelas 25430 8308 25430 25430 0 13:44 pts/1 00:00:00 /bin/zsh > root 26867 25430 26866 25430 0 13:46 pts/1 00:00:00 sudo sh -c sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty > root 26868 26867 26866 25430 0 13:46 pts/1 00:00:00 sh -c sleep 2; ps -jfHt "$(tty<&2)"; awk "{print \$8}" /proc/self/stat /dev/tty > root 26871 26868 26866 25430 0 13:46 pts/1 00:00:00 ps -jfHt /dev/pts/1 > 25430 > zsh: done sleep 1 | > zsh: suspended (tty input) sudo sh -c > > As seen above, at the time "ps" is run (and awk later), the > foreground process group of the terminal is 25430 which is the > pgid of the main shell, not the pgid of the foreground job > (26866), which is why that job gets a SIGTTIN when awk tries to > read from the terminal (or SIGTTOU when pacman does an ioctl to > the terminal). > > From "strace", it seems it's because when "sleep 1" (the process > group leader) finishes, zsh does a kill(-26866,0), presumably to > check that the process group is still alive, but that fails with > EPERM as there are processes running as root in that group, and > then zsh changes the foreground process group back to the main > shell's. > > So it seems indeed to be a bug in zsh. > > I suppose an easy fix would be to check for an errno of ESRCH > when kill(-pgid,0) fails to make sure it's because the process > group is gone. Thanks, this is probably something like the following --- it hits a number of other places doing something similar which look like they need the same treatment, though I'm happy to be told otherwise, but the one in signals.c is the key one here. I don't know if we really need the different cases of killpg(pgrp, 0) and kill(-pgprp, 0)? > But, here the shell should be able to know that > the job is not gone as sudo, a direct children of the shell has > not returned yet, so there's probably something wrong with the > logic in the first place. I wouldn't have the first clue how to begin checking that there is some appropriate set of processes associated with the process group still there that would allow this in a sufficiently race-free manner, but if anyone does have a clue they are welcome to try. pws diff --git a/Src/exec.c b/Src/exec.c index 50027654a..0d321b757 100644 --- a/Src/exec.c +++ b/Src/exec.c @@ -1036,7 +1036,8 @@ entersubsh(int flags, struct entersubsh_ret *retp) } else if (thisjob != -1 && (flags & ESUB_PGRP)) { if (jobtab[list_pipe_job].gleader && (list_pipe || list_pipe_child)) { if (setpgrp(0L, jobtab[list_pipe_job].gleader) == -1 || - killpg(jobtab[list_pipe_job].gleader, 0) == -1) { + (killpg(jobtab[list_pipe_job].gleader, 0) == -1 && + errno == ESRCH)) { jobtab[list_pipe_job].gleader = jobtab[thisjob].gleader = (list_pipe_child ? mypgrp : getpid()); setpgrp(0L, jobtab[list_pipe_job].gleader); diff --git a/Src/jobs.c b/Src/jobs.c index e7438251e..182e65f39 100644 --- a/Src/jobs.c +++ b/Src/jobs.c @@ -283,7 +283,8 @@ handle_sub(int job, int fg) if ((cp = ((WIFEXITED(jn->procs->status) || WIFSIGNALED(jn->procs->status)) && - killpg(jn->gleader, 0) == -1))) { + (killpg(jn->gleader, 0) == -1 && + errno == ESRCH)))) { Process p; for (p = jn->procs; p->next; p = p->next); jn->gleader = p->pid; @@ -541,9 +542,13 @@ update_job(Job jn) /* is this job in the foreground of an interactive shell? */ if (mypgrp != pgrp && inforeground && - (jn->gleader == pgrp || (pgrp > 1 && kill(-pgrp, 0) == -1))) { + (jn->gleader == pgrp || + (pgrp > 1 && + (kill(-pgrp, 0) == -1 && errno == ESRCH)))) { if (list_pipe) { - if (somestopped || (pgrp > 1 && kill(-pgrp, 0) == -1)) { + if (somestopped || (pgrp > 1 && + kill(-pgrp, 0) == -1 && + errno == ESRCH)) { attachtty(mypgrp); /* check window size and adjust if necessary */ adjustwinsize(0); @@ -2469,7 +2474,8 @@ bin_fg(char *name, char **argv, Options ops, int func) if ((jobtab[job].stat & STAT_SUPERJOB) && ((!jobtab[job].procs->next || (jobtab[job].stat & STAT_SUBLEADER) || - killpg(jobtab[job].gleader, 0) == -1)) && + (killpg(jobtab[job].gleader, 0) == -1 && + errno == ESRCH))) && jobtab[jobtab[job].other].gleader) attachtty(jobtab[jobtab[job].other].gleader); else diff --git a/Src/signals.c b/Src/signals.c index 96ff9e9b3..4adf03202 100644 --- a/Src/signals.c +++ b/Src/signals.c @@ -539,7 +539,8 @@ wait_for_processes(void) #endif if (WIFEXITED(status) && pn->pid == jn->gleader && - killpg(pn->pid, 0) == -1) { + killpg(pn->pid, 0) == -1 && + errno == ESRCH) { if (last_attached_pgrp == jn->gleader && !(jn->stat & STAT_NOSTTY)) { /* ^ permalink raw reply [flat|nested] 8+ messages in thread
* zsh mysteriously suspending job with sudo @ 2020-02-23 1:09 Ronan Pigott 0 siblings, 0 replies; 8+ messages in thread From: Ronan Pigott @ 2020-02-23 1:09 UTC (permalink / raw) To: Zsh Users [-- Attachment #1: Type: text/plain, Size: 2405 bytes --] Hello zsh users, I've encountered an unusual bug using zsh and I finally have a reliable reproduction so I'm looking for some insight. The issue is as follows: === $ sudo true [sudo] password for ronan: $ pacman -Qttdq | sudo pacman -Rns - [...] :: Do you want to remove these packages? [Y/n] zsh: done pacman -Qttdq | zsh: suspended (tty output) sudo pacman -Rns - === When I try to pipe a package list for pacman to read from stdin, the job is mysteriously suspended by zsh when "pacman -Rns -" prompts [Y/n] for confirmation to continue, before I have a chance to input anything. I have not input the SUSP character. The issue is also reproducible from a 'zsh -f' in a clean environment. Trying to 'fg' the suspended job always results in the following error: === $ fg[1] + done pacman -Qttdq | continued sudo pacman -Rns - zsh: can't set tty pgrp: no such process === I can only reproduce in zsh, not bash. The issue also only occurs when sudo *does not* prompt for a password, which is why I use a "sudo true" to temporarily authorize my user before demonstrating the issue. Bizarrely, the following examples all work as intended: === # with an extra file $ pacman -Qttdq > packages.txt; cat packages.txt | sudo pacman -Rns - [...] :: Do you want to remove these packages? [Y/n] # with an extra cat $ pacman -Qttdq | cat - | sudo pacman -Rns - [...] :: Do you want to remove these packages? [Y/n] # with process substitution $ sudo pacman -Rns - < <(pacman -Qttdq) [...] :: Do you want to remove these packages? [Y/n] === To exonerate pacman, a user from #archlinux helped create a script that exhibits similar behavior: === $ cat read-and-prompt.sh#!/bin/sh -- exec 3<&1 while IFS= read -r input; do printf '%s\n' "$input" >&3 done exec < /dev/tty printf 'prompt: ' >&2 read -r _ exec "$@" <&3 $ for a in a b c; do print "$a"; sleep 1; done | sudo ./read-and-prompt.sh cat a b c prompt:zsh: done for a in a b c; do; print "$a"; sleep 1; done | zsh: suspended (tty input) sudo ./read-and-prompt.sh cat === They also report that they cannot reproduce using doas in place of sudo. Considering the above, I believe my issue is a bug in either sudo or zsh, hence why I am asking here. Why is zsh suspending this job? My understanding is that jobs are suspended only in response to a signal, but I do not know where it could be generated in this case. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2020-05-15 9:21 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <CAG4d5Bg=vkj6PYYGVyNO3pWTh=RkT=2rTFdE_hPBj1xxz6YTSw__29611.299622205$1582420251$gmane$org@mail.gmail.com> 2020-02-23 14:03 ` zsh mysteriously suspending job with sudo Stephane Chazelas 2020-02-23 18:53 ` Ronan Pigott 2020-02-23 23:57 ` Ronan Pigott 2020-05-14 21:15 ` Ronan Pigott 2020-05-15 9:08 ` Peter Stephenson 2020-05-15 9:20 ` Ronan Pigott 2020-02-23 20:18 ` Peter Stephenson 2020-02-23 1:09 Ronan Pigott
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/zsh/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).