* Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze
@ 2013-12-20 19:24 Axel Beckert
2013-12-20 20:27 ` Bart Schaefer
0 siblings, 1 reply; 17+ messages in thread
From: Axel Beckert @ 2013-12-20 19:24 UTC (permalink / raw)
To: zsh-workers
Hi,
this has been reported[1] against zsh 5.0.3 in Debian today. Looks like
some regression in 5.0.3 on a first glance. Maybe related to the
test-suite freezes which have been reported on a few architectures.
[1] http://bugs.debian.org/732726
Since 5.0.4 seems close, I thought I'd better forward it soon despite
I haven't looked at it closer yet.
----- Forwarded message from Vincent Lefevre <vincent@vinc17.net> -----
Date: Fri, 20 Dec 2013 19:14:36 +0100
From: Vincent Lefevre <vincent@vinc17.net>
Subject: [Pkg-zsh-devel] Bug#732726: zsh function freeze
Reply-To: Vincent Lefevre <vincent@vinc17.net>, 732726@bugs.debian.org
Package: zsh
Version: 5.0.3-1
Severity: important
This is an important regression (no problems after downgrading to
5.0.2-6), always reproducible.
With a private SVN repository, "svncdiff -c65935 ~/wd | head" freezes, where
svncdiff is the following zsh function (I have "autoload -U svncdiff"):
------------------------------------------------------------------------
#!/usr/bin/env zsh
# Wrapper to "svn diff", written by Vincent Lefevre <vincent@vinc17.org>
# Needs my tdiff utility to process the diff output; otherwise you need
# to remove the "| tdiff" at the end.
# Example of svncdiff usage:
# svncdiff -5 -x -p file
# for 5 lines of unified context and function information.
emulate -LR zsh
local -a args xopt
setopt EXTENDED_GLOB
while [[ $# -ge 1 ]]
do
if [[ "x$1" == x-[0-9]# ]] then
args=($args --diff-cmd diff)
xopt=($xopt -U${1[2,-1]})
elif [[ $# -ge 2 && "x$1" == x-x ]] then
shift
xopt=($xopt $1)
else
args=($args $1)
fi
shift
done
[[ $#xopt -ge 1 ]] && args=(-x "$xopt" $args)
svnwrapper diff "$args[@]" | tdiff
# $Id: svncdiff 38442 2010-08-05 11:41:16Z vinc17/ypig $
------------------------------------------------------------------------
The dependencies can be found on <https://www.vinc17.net/unix/>.
When svncdiff is called as a script, this is no such problem.
With my tps utility, I can observe:
10987 sshd: vlefevre@pts/2
└─> 10988 -zsh
└─> 11262 -zsh
├─> 11264 zsh -f -- /home/vlefevre/bin/svnwrapper diff -c65935 /home/vlefevre/wd
│ └─> 11269 svn diff -c65935 /home/vlefevre/wd
│ ├─> 11273 zsh /home/vlefevre/scripts/ssh mysvn svnserve -t
│ │ ├─> 11295 cat
│ │ └─> 11298 ssh -F /home/vlefevre/.ssh/config -C mysvn svnserve -t
│ └─> 11299 zsh /home/vlefevre/scripts/ssh mysvn svnserve -t
│ ├─> 11317 cat
│ └─> 11320 ssh -F /home/vlefevre/.ssh/config -C mysvn svnserve -t
└─> 11265 perl /home/vlefevre/bin/tdiff
I think that this occurs on big changesets.
$ svncdiff -c65935 ~/wd | wc
2166 14746 212680
I'll try to investigate, but any idea about which zsh change could
trigger the problem?
-- System Information:
Debian Release: jessie/sid
APT prefers unstable
APT policy: (500, 'unstable'), (500, 'testing'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 3.11-2-amd64 (SMP w/8 CPU cores)
Locale: LANG=POSIX, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages zsh depends on:
ii libc6 2.17-97
ii libcap2 1:2.22-1.2
ii libtinfo5 5.9+20130608-1
ii zsh-common 5.0.3-1
Versions of packages zsh recommends:
ii libncursesw5 5.9+20130608-1
ii libpcre3 1:8.31-2
Versions of packages zsh suggests:
ii zsh-doc 5.0.3-1
-- no debconf information
----- End forwarded message -----
----- Forwarded message from Vincent Lefevre <vincent@vinc17.net> -----
Date: Fri, 20 Dec 2013 19:19:18 +0100
From: Vincent Lefevre <vincent@vinc17.net>
To: 732726@bugs.debian.org
Subject: [Pkg-zsh-devel] Bug#732726: zsh function freeze
Reply-To: Vincent Lefevre <vincent@vinc17.net>, 732726@bugs.debian.org
On 2013-12-20 19:14:36 +0100, Vincent Lefevre wrote:
> I think that this occurs on big changesets.
Indeed I can reproduce it with:
svncdiff -c 8540 svn://scm.gforge.inria.fr/svn/mpfr/trunk | head
----- End forwarded message -----
Kind regards, Axel
--
/~\ Plain Text Ribbon Campaign | Axel Beckert
\ / Say No to HTML in E-Mail and News | abe@deuxchevaux.org (Mail)
X See http://www.asciiribbon.org/ | abe@noone.org (Mail+Jabber)
/ \ I love long mails: http://email.is-not-s.ms/ | http://noone.org/abe/ (Web)
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2013-12-20 19:24 Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze Axel Beckert @ 2013-12-20 20:27 ` Bart Schaefer 2013-12-20 20:49 ` Axel Beckert 2013-12-20 23:51 ` Vincent Lefevre 0 siblings, 2 replies; 17+ messages in thread From: Bart Schaefer @ 2013-12-20 20:27 UTC (permalink / raw) To: Axel Beckert, zsh-workers On Dec 20, 8:24pm, Axel Beckert wrote: } } Since 5.0.4 seems close, I thought I'd better forward it soon despite } I haven't looked at it closer yet. You're just a tiny bit too late -- PWS announced zsh-5.0.4 about 25 minutes before your email was sent. :-( } this has been reported[1] against zsh 5.0.3 in Debian today. Looks like } some regression in 5.0.3 on a first glance. Maybe related to the } test-suite freezes which have been reported on a few architectures. I think that's unlikely unless svnwrapper is using zpty. I'm not a user of SVN at this point so I don't have a reference. Best at this point to see if you can still reproduce it with the current zsh source from git. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2013-12-20 20:27 ` Bart Schaefer @ 2013-12-20 20:49 ` Axel Beckert 2013-12-20 23:51 ` Vincent Lefevre 1 sibling, 0 replies; 17+ messages in thread From: Axel Beckert @ 2013-12-20 20:49 UTC (permalink / raw) To: zsh-workers Hi, On Fri, Dec 20, 2013 at 12:27:00PM -0800, Bart Schaefer wrote: > On Dec 20, 8:24pm, Axel Beckert wrote: > } Since 5.0.4 seems close, I thought I'd better forward it soon despite > } I haven't looked at it closer yet. > > You're just a tiny bit too late -- PWS announced zsh-5.0.4 about 25 > minutes before your email was sent. :-( Hrm, I just saw the git tags so far, but neither me nor Frank Terbeck have received such an announcement via e-mail yet. Well, maybe it's hanging in Greylisting or such. Thanks for the information anyway. :-) Kind regards, Axel -- /~\ Plain Text Ribbon Campaign | Axel Beckert \ / Say No to HTML in E-Mail and News | abe@deuxchevaux.org (Mail) X See http://www.asciiribbon.org/ | abe@noone.org (Mail+Jabber) / \ I love long mails: http://email.is-not-s.ms/ | http://noone.org/abe/ (Web) ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2013-12-20 20:27 ` Bart Schaefer 2013-12-20 20:49 ` Axel Beckert @ 2013-12-20 23:51 ` Vincent Lefevre 2013-12-20 23:59 ` Vincent Lefevre 1 sibling, 1 reply; 17+ messages in thread From: Vincent Lefevre @ 2013-12-20 23:51 UTC (permalink / raw) To: zsh-workers; +Cc: 732726 On 2013-12-20 12:27:00 -0800, Bart Schaefer wrote: > I think that's unlikely unless svnwrapper is using zpty. I'm not a user > of SVN at this point so I don't have a reference. That's the pipe itself that is completely broken. The problem can be reproduced with "zsh -f" (version 5.0.3) and: ypig% foo() { ls -R / } ypig% foo | head -- Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2013-12-20 23:51 ` Vincent Lefevre @ 2013-12-20 23:59 ` Vincent Lefevre 2013-12-21 0:12 ` Vincent Lefevre 0 siblings, 1 reply; 17+ messages in thread From: Vincent Lefevre @ 2013-12-20 23:59 UTC (permalink / raw) To: zsh-workers On 2013-12-21 00:51:49 +0100, Vincent Lefevre wrote: > The problem can be reproduced with "zsh -f" (version 5.0.3) and: > > ypig% foo() { ls -R / } > ypig% foo | head And also with zsh 5.0.4, which has just been released. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2013-12-20 23:59 ` Vincent Lefevre @ 2013-12-21 0:12 ` Vincent Lefevre 2013-12-21 2:19 ` Bart Schaefer 0 siblings, 1 reply; 17+ messages in thread From: Vincent Lefevre @ 2013-12-21 0:12 UTC (permalink / raw) To: zsh-workers A better example: foo() { printf "%d\n" {1..20000} } "foo > /dev/null" terminates immediately. "foo" terminates after a few seconds (output takes a bit time via ssh). "foo | head" freezes. -- Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/> 100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/> Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon) ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2013-12-21 0:12 ` Vincent Lefevre @ 2013-12-21 2:19 ` Bart Schaefer 2013-12-21 3:42 ` Bart Schaefer 0 siblings, 1 reply; 17+ messages in thread From: Bart Schaefer @ 2013-12-21 2:19 UTC (permalink / raw) To: Vincent Lefevre, zsh-workers On Dec 21, 1:12am, Vincent Lefevre wrote: } Subject: Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function } } A better example: } } foo() { printf "%d\n" {1..20000} } Actually the "ls -R /" example was better because it ruled out problems with SIGPIPE handling and the "print" builtin. This has to be result of either + * 31549: Src/exec,c, Src/zsh.h: replace ad-hoc subsh_close file + descriptor for pipes with new addfilelist() job-based mechanism. (which is the tail of a whole thread starting with 31528) or else it's + * 31919: Src/exec.c, Src/init.c: fix deadlock when a shell builtin + with a multio redirection is used on the left side of a pipeline, + by making sure stdin/out/err file descriptors are closed for the + multio copy process, which means not re-using those descriptors + after they are closed and marked FDT_UNUSED in fdtable[]. For + completeness, initialize their fdtable[] state to FDT_EXTERNAL. because the problem is that the parent shell still holds open the file descriptor that is the standard output of the shell function, so the write is blocked rather than getting SIGPIPE or other failure. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2013-12-21 2:19 ` Bart Schaefer @ 2013-12-21 3:42 ` Bart Schaefer 2013-12-21 7:16 ` Bart Schaefer 2013-12-21 18:08 ` Peter Stephenson 0 siblings, 2 replies; 17+ messages in thread From: Bart Schaefer @ 2013-12-21 3:42 UTC (permalink / raw) To: Vincent Lefevre, zsh-workers On Dec 20, 6:19pm, Bart Schaefer wrote: } } + * 31549: Src/exec,c, Src/zsh.h: replace ad-hoc subsh_close file } + descriptor for pipes with new addfilelist() job-based mechanism. Yep, that's the one. I compiled commit 709dbbbda82efde2020d9d67a19687c101b91570 and was able to reproduce the problem. The immediately preceding commit 39ab9952e8255cb99e9c0abcc8bbec43158a55d7 does not show the bug. Non-interactive shells also do not show the bug, which may be why "make check" didn't flag anything. The problem seems to be with this hunk specifically: @@ -1744,8 +1742,6 @@ execpline2(Estate state, wordcode pcode, execpline2(state, *state->pc++, how, pipes[0], output, last1); list_pipe = old_list_pipe; cmdpop(); - zclose(pipes[0]); - subsh_close = -1; } } The problem is that pipes[0] isn't always added to the list of files for the job; sometimes it really does need to be closed there. diff --git a/Src/exec.c b/Src/exec.c index dccdc2b..4480033 100644 --- a/Src/exec.c +++ b/Src/exec.c @@ -1691,6 +1691,7 @@ execpline2(Estate state, wordcode pcode, execcmd(state, input, output, how, last1 ? 1 : 2); else { int old_list_pipe = list_pipe; + int subsh_close = -1; Wordcode next = state->pc + (*state->pc), pc; wordcode code; @@ -1738,6 +1739,7 @@ execpline2(Estate state, wordcode pcode, } else { /* otherwise just do the pipeline normally. */ addfilelist(NULL, pipes[0]); + subsh_close = pipes[0]; execcmd(state, input, pipes[1], how, 0); } zclose(pipes[1]); @@ -1750,6 +1752,8 @@ execpline2(Estate state, wordcode pcode, execpline2(state, *state->pc++, how, pipes[0], output, last1); list_pipe = old_list_pipe; cmdpop(); + if (subsh_close != pipes[0]) + zclose(pipes[0]); } } ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2013-12-21 3:42 ` Bart Schaefer @ 2013-12-21 7:16 ` Bart Schaefer 2013-12-21 18:08 ` Peter Stephenson 1 sibling, 0 replies; 17+ messages in thread From: Bart Schaefer @ 2013-12-21 7:16 UTC (permalink / raw) To: zsh-workers On Dec 20, 7:42pm, Bart Schaefer wrote: } } The problem is that pipes[0] isn't always added to the list of files } for the job; sometimes it really does need to be closed there. Here's a regression test. I assume it's OK to do this coproc trick, since if it were not tests A01 and A04 would already have failed. I verified that this test *fails* with 5.0.3 and 5.0.4 (sigh). diff --git a/Test/A05execution.ztst b/Test/A05execution.ztst index c8320a1..61d24fe 100644 --- a/Test/A05execution.ztst +++ b/Test/A05execution.ztst @@ -202,3 +202,15 @@ F:the bug is still there or it reappeared. See workers-29973 for details. 0:Check $pipestatus with a known difficult case >1 0 1 0 0 F:This similar test was triggering a reproducible failure with pipestatus. + + { unsetopt MONITOR } 2>/dev/null + coproc { read -Et 5 || kill -INT $$ } + print -u $ZTST_fd 'This test takes 5 seconds to fail...' + { printf "%d\n" {1..20000} } | ( read -E ) + print -p done + read -Ep +0:Bug regression: piping a shell construct to an external process may hang +>1 +>done +F:This test checks for a file descriptor leak that could cause the left +F:side of a pipe to block on write after the right side has exited ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2013-12-21 3:42 ` Bart Schaefer 2013-12-21 7:16 ` Bart Schaefer @ 2013-12-21 18:08 ` Peter Stephenson 2013-12-21 20:57 ` Bart Schaefer 1 sibling, 1 reply; 17+ messages in thread From: Peter Stephenson @ 2013-12-21 18:08 UTC (permalink / raw) To: zsh-workers On Fri, 20 Dec 2013 19:42:23 -0800 Bart Schaefer <schaefer@brasslantern.com> wrote: > The problem is that pipes[0] isn't always added to the list of files > for the job; sometimes it really does need to be closed there. Thanks for the quick work. Are we happy this fixes the particular problem (the regression test looks plausible, don't see why we shouldn't use coproc)? If so I'll make a new version immediately. This evening is my last opportunity for a while so if this is a definite improvement I'd like to get on with it. -- Peter Stephenson <p.w.stephenson@ntlworld.com> Web page now at http://homepage.ntlworld.com/p.w.stephenson/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2013-12-21 18:08 ` Peter Stephenson @ 2013-12-21 20:57 ` Bart Schaefer 2013-12-21 22:34 ` Peter Stephenson 0 siblings, 1 reply; 17+ messages in thread From: Bart Schaefer @ 2013-12-21 20:57 UTC (permalink / raw) To: zsh-workers On Dec 21, 6:08pm, Peter Stephenson wrote: } } Thanks for the quick work. The timing was convenient and I was afraid it might have been my fault. } Are we happy this fixes the particular problem (the regression test looks } plausible, don't see why we shouldn't use coproc)? If so I'll make a } new version immediately. As far as I can tell this fixes the problem. It'd be nice to hear from Vincent for confirmation but as that may not happen in time I'd say you should go ahead. -- Barton E. Schaefer ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2013-12-21 20:57 ` Bart Schaefer @ 2013-12-21 22:34 ` Peter Stephenson 2013-12-22 1:34 ` Bart Schaefer 0 siblings, 1 reply; 17+ messages in thread From: Peter Stephenson @ 2013-12-21 22:34 UTC (permalink / raw) To: zsh-workers On Sat, 21 Dec 2013 12:57:18 -0800 Bart Schaefer <schaefer@brasslantern.com> wrote: > As far as I can tell this fixes the problem. It'd be nice to hear from > Vincent for confirmation but as that may not happen in time I'd say you > should go ahead. Unfortunately it (the patch in 32171) doesn't fix the problem for me. With foo() { ls -R / } foo | head I'm still getting a hang. I think we're forking inside execcmd() after adding pipes[0] to the filelist for thisjob. This subshell is what's going to form the LHS of the pipeline --- and we entersubsh() which will clear the job table. So I think we need to salvage the filelist from the job table and remove the pipe file descriptors in the danger cases, which I take to be the places where we were handling subsh_close in the old version of the code (where we are handling nested shell constructs of some sort). The following does seem to fix the hang here and not cause any new test failures. Note it includes your code change, but not your regression test (you hadn't pushed either yet last I looked). The change to zwaitjob() is just pulling out the common code shared with the two new cases. I'm not going to commit this and make a new release just on the basis of my guesses, however. diff --git a/Src/exec.c b/Src/exec.c index dccdc2b..f16cfd3 100644 --- a/Src/exec.c +++ b/Src/exec.c @@ -1691,6 +1691,7 @@ execpline2(Estate state, wordcode pcode, execcmd(state, input, output, how, last1 ? 1 : 2); else { int old_list_pipe = list_pipe; + int subsh_close = -1; Wordcode next = state->pc + (*state->pc), pc; wordcode code; @@ -1738,6 +1739,7 @@ execpline2(Estate state, wordcode pcode, } else { /* otherwise just do the pipeline normally. */ addfilelist(NULL, pipes[0]); + subsh_close = pipes[0]; execcmd(state, input, pipes[1], how, 0); } zclose(pipes[1]); @@ -1750,6 +1752,8 @@ execpline2(Estate state, wordcode pcode, execpline2(state, *state->pc++, how, pipes[0], output, last1); list_pipe = old_list_pipe; cmdpop(); + if (subsh_close != pipes[0]) + zclose(pipes[0]); } } @@ -2385,7 +2389,7 @@ static void execcmd(Estate state, int input, int output, int how, int last1) { HashNode hn = NULL; - LinkList args; + LinkList args, filelist = NULL; LinkNode node; Redir fn; struct multio *mfds[10]; @@ -2907,6 +2911,7 @@ execcmd(Estate state, int input, int output, int how, int last1) flags |= ESUB_KEEPTRAP; if (type == WC_SUBSH && !(how & Z_ASYNC)) flags |= ESUB_JOB_CONTROL; + filelist = jobtab[thisjob].filelist; entersubsh(flags); close(synch[1]); forked = 1; @@ -3260,6 +3265,7 @@ execcmd(Estate state, int input, int output, int how, int last1) if (is_shfunc) { /* It's a shell function */ + pipecleanfilelist(filelist); execshfunc((Shfunc) hn, args); } else { /* It's a builtin */ @@ -3338,6 +3344,7 @@ execcmd(Estate state, int input, int output, int how, int last1) DPUTS(varspc, "BUG: assignment before complex command"); list_pipe = 0; + pipecleanfilelist(filelist); /* If we're forked (and we should be), no need to return */ DPUTS(last1 != 1 && !forked, "BUG: not exiting?"); DPUTS(type != WC_SUBSH, "Not sure what we're doing."); diff --git a/Src/jobs.c b/Src/jobs.c index 371b8eb..a321172 100644 --- a/Src/jobs.c +++ b/Src/jobs.c @@ -1173,6 +1173,30 @@ addfilelist(const char *name, int fd) zaddlinknode(ll, jf); } +/* Clean up pipes no longer needed associated with a job */ + +/**/ +void +pipecleanfilelist(LinkList filelist) +{ + LinkNode node; + + if (!filelist) + return; + node = firstnode(filelist); + while (node) { + Jobfile jf = (Jobfile)getdata(node); + if (jf->is_fd) { + LinkNode next = nextnode(node); + zclose(jf->u.fd); + (void)remnode(filelist, node); + zfree(jf, sizeof(*jf)); + node = next; + } else + incnode(node); + } +} + /* Finished with list of files for a job */ /**/ @@ -1415,19 +1439,7 @@ zwaitjob(int job, int wait_cmd) * we can't deadlock on the fact that those still exist, so * that's not a problem. */ - LinkNode node = firstnode(jn->filelist); - while (node) { - Jobfile jf = (Jobfile)getdata(node); - if (jf->is_fd) { - LinkNode next = nextnode(node); - (void)remnode(jn->filelist, node); - zclose(jf->u.fd); - zfree(jf, sizeof(*jf)); - node = next; - } else { - incnode(node); - } - } + pipecleanfilelist(jn->filelist); } while (!errflag && jn->stat && !(jn->stat & STAT_DONE) && -- Peter Stephenson <p.w.stephenson@ntlworld.com> Web page now at http://homepage.ntlworld.com/p.w.stephenson/ ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2013-12-21 22:34 ` Peter Stephenson @ 2013-12-22 1:34 ` Bart Schaefer 2013-12-23 2:06 ` (potential regression in 5.0.3) Paul Ackersviller 2014-01-02 18:06 ` Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze Peter Stephenson 0 siblings, 2 replies; 17+ messages in thread From: Bart Schaefer @ 2013-12-22 1:34 UTC (permalink / raw) To: zsh-workers On Dec 21, 10:34pm, Peter Stephenson wrote: } } Unfortunately it (the patch in 32171) doesn't fix the problem for me. Drat. } I think we're forking inside execcmd() after adding pipes[0] to the } filelist for thisjob. This subshell is what's going to form the LHS of } the pipeline --- and we entersubsh() which will clear the job table. } So I think we need to salvage the filelist from the job table and remove } the pipe file descriptors in the danger cases, which I take to be the } places where we were handling subsh_close in the old version of the code } (where we are handling nested shell constructs of some sort). I wondered if that would be necessary, but couldn't ever manage to get DPUTS() statements in those two places to print anything, so concluded that the issue was in the place that I did patch. What concerns me is whether we might be closing too many file descriptors if we remove all is_fd entries from filelist at that point, but if that's in the parent that's going to do nothing but wait for the child it should be OK. } The following does seem to fix the hang here and not cause any new test } failures. Note it includes your code change, but not your regression } test (you hadn't pushed either yet last I looked). Argh! Did "git commit" but forgot "git push". Did that now. Here's your new bit of 32175 plus an additional regression test (fails with just 32171 but succeeds after 32175). diff --git a/Src/exec.c b/Src/exec.c index 4480033..f16cfd3 100644 --- a/Src/exec.c +++ b/Src/exec.c @@ -2389,7 +2389,7 @@ static void execcmd(Estate state, int input, int output, int how, int last1) { HashNode hn = NULL; - LinkList args; + LinkList args, filelist = NULL; LinkNode node; Redir fn; struct multio *mfds[10]; @@ -2911,6 +2911,7 @@ execcmd(Estate state, int input, int output, int how, int last1) flags |= ESUB_KEEPTRAP; if (type == WC_SUBSH && !(how & Z_ASYNC)) flags |= ESUB_JOB_CONTROL; + filelist = jobtab[thisjob].filelist; entersubsh(flags); close(synch[1]); forked = 1; @@ -3264,6 +3265,7 @@ execcmd(Estate state, int input, int output, int how, int last1) if (is_shfunc) { /* It's a shell function */ + pipecleanfilelist(filelist); execshfunc((Shfunc) hn, args); } else { /* It's a builtin */ @@ -3342,6 +3344,7 @@ execcmd(Estate state, int input, int output, int how, int last1) DPUTS(varspc, "BUG: assignment before complex command"); list_pipe = 0; + pipecleanfilelist(filelist); /* If we're forked (and we should be), no need to return */ DPUTS(last1 != 1 && !forked, "BUG: not exiting?"); DPUTS(type != WC_SUBSH, "Not sure what we're doing."); diff --git a/Src/jobs.c b/Src/jobs.c index 371b8eb..a321172 100644 --- a/Src/jobs.c +++ b/Src/jobs.c @@ -1173,6 +1173,30 @@ addfilelist(const char *name, int fd) zaddlinknode(ll, jf); } +/* Clean up pipes no longer needed associated with a job */ + +/**/ +void +pipecleanfilelist(LinkList filelist) +{ + LinkNode node; + + if (!filelist) + return; + node = firstnode(filelist); + while (node) { + Jobfile jf = (Jobfile)getdata(node); + if (jf->is_fd) { + LinkNode next = nextnode(node); + zclose(jf->u.fd); + (void)remnode(filelist, node); + zfree(jf, sizeof(*jf)); + node = next; + } else + incnode(node); + } +} + /* Finished with list of files for a job */ /**/ @@ -1415,19 +1439,7 @@ zwaitjob(int job, int wait_cmd) * we can't deadlock on the fact that those still exist, so * that's not a problem. */ - LinkNode node = firstnode(jn->filelist); - while (node) { - Jobfile jf = (Jobfile)getdata(node); - if (jf->is_fd) { - LinkNode next = nextnode(node); - (void)remnode(jn->filelist, node); - zclose(jf->u.fd); - zfree(jf, sizeof(*jf)); - node = next; - } else { - incnode(node); - } - } + pipecleanfilelist(jn->filelist); } while (!errflag && jn->stat && !(jn->stat & STAT_DONE) && diff --git a/Test/A05execution.ztst b/Test/A05execution.ztst index 61d24fe..6abfd8b 100644 --- a/Test/A05execution.ztst +++ b/Test/A05execution.ztst @@ -207,10 +207,12 @@ F:This similar test was triggering a reproducible failure with pipestatus. coproc { read -Et 5 || kill -INT $$ } print -u $ZTST_fd 'This test takes 5 seconds to fail...' { printf "%d\n" {1..20000} } | ( read -E ) + hang(){ printf "%d\n" {2..20000} | cat }; hang | ( read -E ) print -p done read -Ep 0:Bug regression: piping a shell construct to an external process may hang >1 +>2 >done F:This test checks for a file descriptor leak that could cause the left F:side of a pipe to block on write after the right side has exited ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: (potential regression in 5.0.3) 2013-12-22 1:34 ` Bart Schaefer @ 2013-12-23 2:06 ` Paul Ackersviller 2013-12-23 5:47 ` Bart Schaefer 2014-01-02 18:06 ` Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze Peter Stephenson 1 sibling, 1 reply; 17+ messages in thread From: Paul Ackersviller @ 2013-12-23 2:06 UTC (permalink / raw) To: zsh-workers I've ran across something else that first looked like a regression, but upon further inspection Test/A05execution.ztst was doing this for a couple of months. Strangely, it goes into an infinite loop, but only when run in the background. It does this for me on both Linux and FreeBSD, starting from the commit c3114a7735c85b79771e08bd156470bde1a36950, but on 5.0.2 too. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: (potential regression in 5.0.3) 2013-12-23 2:06 ` (potential regression in 5.0.3) Paul Ackersviller @ 2013-12-23 5:47 ` Bart Schaefer 0 siblings, 0 replies; 17+ messages in thread From: Bart Schaefer @ 2013-12-23 5:47 UTC (permalink / raw) To: zsh-workers On Dec 23, 2:06am, Paul Ackersviller wrote: } } I've ran across something else that first looked like a regression, } but upon further inspection Test/A05execution.ztst was doing this } for a couple of months. Strangely, it goes into an infinite loop, } but only when run in the background. It does this for me on both } Linux and FreeBSD, starting from the commit } c3114a7735c85b79771e08bd156470bde1a36950, but on 5.0.2 too. It's a known issue that changing the MONITOR option from off to on can cause the acquire_pgrp() routine to go into an infinite loop, depending on the shape of the process tree from the foreground. The jobs.c diff in c3114a7 partly fixed that, but the new A05execution test needs setopt MONITOR to create the conditions it is regressing, so it is still possible for that to trigger a loop. Here's an attempt to resolve the remaining infinite-loop condition: diff --git a/Src/jobs.c b/Src/jobs.c index a321172..8719465 100644 --- a/Src/jobs.c +++ b/Src/jobs.c @@ -2619,6 +2619,7 @@ acquire_pgrp(void) sigset_t blockset, oldset; if ((mypgrp = GETPGRP()) > 0) { + long lastpgrp = mypgrp; sigemptyset(&blockset); sigaddset(&blockset, SIGTTIN); sigaddset(&blockset, SIGTTOU); @@ -2639,6 +2640,9 @@ acquire_pgrp(void) if (read(0, NULL, 0) != 0) {} /* Might generate SIGT* */ signal_block(blockset); mypgrp = GETPGRP(); + if (mypgrp == lastpgrp && !interact) + break; /* Unlikely that pgrp will ever change */ + lastpgrp = mypgrp; } if (mypgrp != mypid) { if (setpgrp(0, 0) == 0) { ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2013-12-22 1:34 ` Bart Schaefer 2013-12-23 2:06 ` (potential regression in 5.0.3) Paul Ackersviller @ 2014-01-02 18:06 ` Peter Stephenson 2014-01-02 20:40 ` Bart Schaefer 1 sibling, 1 reply; 17+ messages in thread From: Peter Stephenson @ 2014-01-02 18:06 UTC (permalink / raw) To: zsh-workers I'm now back and looking through the mail, most of which I've already seen as I managed to get it onto my phone while the network was networking (it was not just that it was 2G, I think the backhaul was a woodpecker with a dodgy beak and a morse key it kept missing). It looks like it was either a good or a bad time for me to be away, depending whether you're me or Bart. As far as I can see most of the stuff has been sorted out, by Bart with assistance. If there's nothing to indicate otherwise over the next couple of days I'll make a 5.0.5 with the fixes in at the weekend. First... On Sat, 21 Dec 2013 17:34:59 -0800 Bart Schaefer <schaefer@brasslantern.com> wrote: > } I think we're forking inside execcmd() after adding pipes[0] to the > } filelist for thisjob. This subshell is what's going to form the LHS of > } the pipeline --- and we entersubsh() which will clear the job table. > } So I think we need to salvage the filelist from the job table and remove > } the pipe file descriptors in the danger cases, which I take to be the > } places where we were handling subsh_close in the old version of the code > } (where we are handling nested shell constructs of some sort). > > I wondered if that would be necessary, but couldn't ever manage to get > DPUTS() statements in those two places to print anything, so concluded > that the issue was in the place that I did patch. > > What concerns me is whether we might be closing too many file descriptors > if we remove all is_fd entries from filelist at that point, but if that's > in the parent that's going to do nothing but wait for the child it should > be OK. Yes, I don't think there should be too much problem with what we *are* doing, and the regression test seems fine. I'm a little bit worried about what we're *not* doing. The closure is tied to thisjob. That's probably better than the previous version where it was simply signalled by a static, neither associated with the current job nor with the call hierarchy, but I wonder if we should be closing even more in the subshell (only); should we, in fact, be closing all f.d.s associated with pipes now that we're no longer in a shell that has any interest in them? Or doesn't this actually cause a problem in practice? (We clearly shouldn't be removing files, the other purpose of the filelist, since that's needed at exactly one point after the final use --- it might be cleaner having a separate list to emphasise the different behaviour of files and f.d.s but it's probably not worth the extra pointer now.) Anyway, that's for future study (FFS, as a colleague slightly ambiguously puts it). pws ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze 2014-01-02 18:06 ` Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze Peter Stephenson @ 2014-01-02 20:40 ` Bart Schaefer 0 siblings, 0 replies; 17+ messages in thread From: Bart Schaefer @ 2014-01-02 20:40 UTC (permalink / raw) To: zsh-workers On Jan 2, 6:06pm, Peter Stephenson wrote: } Subject: Re: Fwd (potential regression in 5.0.3): Bug#732726: zsh function } } It looks like it was either a good or a bad time for me to be away, } depending whether you're me or Bart. I was at home with a few days off, so it really wasn't a problem. } If there's nothing to indicate otherwise over the next couple of days } I'll make a 5.0.5 with the fixes in at the weekend. There was one thing I left for your attention: Carl's patch in 32196 for copy-prev-shell-word. } > } So I think we need to salvage the filelist from the job table and remove } > } the pipe file descriptors in the danger cases, which I take to be the } > } places where we were handling subsh_close in the old version of the code } > } (where we are handling nested shell constructs of some sort). [...] } > What concerns me is whether we might be closing too many file descriptors } > if we remove all is_fd entries from filelist at that point } } Yes, I don't think there should be too much problem with what we *are* } doing, and the regression test seems fine. Is it only pipes in that filelist? I was concerned that there might be descriptors for real files there, that shouldn't be closed. But again I think it's OK because in that context all the shell is going to do is wait. } I wonder if we should be closing even more in the subshell (only); } should we, in fact, be closing all f.d.s associated with pipes now } that we're no longer in a shell that has any interest in them? Probably yes, which might in fact fix some of the nits with coprocs (though zsh's coproc doesn't seem to have any nits of that particular flavor that e.g. ksh's does not also have). To do that I think we'd need to start keeping track of what kind of fd is in the filelist, and whether it's read or write, not just that it's a descriptor associated with a particular job. } Or doesn't this actually cause a problem in practice? As you pointed out, we're doing better now than we were, and even the old way it hasn't been especially noticable for many years so far. ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2014-01-02 20:40 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-12-20 19:24 Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze Axel Beckert 2013-12-20 20:27 ` Bart Schaefer 2013-12-20 20:49 ` Axel Beckert 2013-12-20 23:51 ` Vincent Lefevre 2013-12-20 23:59 ` Vincent Lefevre 2013-12-21 0:12 ` Vincent Lefevre 2013-12-21 2:19 ` Bart Schaefer 2013-12-21 3:42 ` Bart Schaefer 2013-12-21 7:16 ` Bart Schaefer 2013-12-21 18:08 ` Peter Stephenson 2013-12-21 20:57 ` Bart Schaefer 2013-12-21 22:34 ` Peter Stephenson 2013-12-22 1:34 ` Bart Schaefer 2013-12-23 2:06 ` (potential regression in 5.0.3) Paul Ackersviller 2013-12-23 5:47 ` Bart Schaefer 2014-01-02 18:06 ` Fwd (potential regression in 5.0.3): Bug#732726: zsh function freeze Peter Stephenson 2014-01-02 20:40 ` Bart Schaefer
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/zsh/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).