* Re: multios doesn't work with 2>&1 [not found] ` <131027100137.ZM4100@torch.brasslantern.com> @ 2013-10-27 17:46 ` Peter Stephenson 2013-10-27 18:27 ` Bart Schaefer 2013-10-27 18:11 ` multios doesn't work with 2>&1 Bart Schaefer 1 sibling, 1 reply; 9+ messages in thread From: Peter Stephenson @ 2013-10-27 17:46 UTC (permalink / raw) To: Zsh Hackers' List Bart Schaefer wrote: > If I back out workers/20666 (Jan 2005), then this example works again. (This is really a zsh-workers thing, too, so I've moved it.) Crikey, I'd worked out it was already broken in 2007 and I thought I was doing well... Playing around here, the other observation I have before looking at the thread you're talking about is that echo foo 2>&1 >/dev/null > >(sed 's/foo/bar/') echo foo >/dev/null 2>&1 > >(sed 's/foo/bar/') echo foo >/dev/null > >(sed 's/foo/bar/') 2>&1 all work as expected. This seems inconsistent --- the synchronous / asynchronous and run-in-shell / fork behaviour is clearly different in that case, but it's not clear that should affect redirection. (I'd also worked out that echo foo >/dev/null 2>&1 | sed 's/foo/bar/' gives a different bad effect, namely you get the output you want but the shell hangs, but that maybe another natural consequence of the thread I haven't read yet.) pws ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: multios doesn't work with 2>&1 2013-10-27 17:46 ` multios doesn't work with 2>&1 Peter Stephenson @ 2013-10-27 18:27 ` Bart Schaefer 2013-10-27 19:39 ` Multio deadlock (Re: multios doesn't work with 2>&1) Bart Schaefer 0 siblings, 1 reply; 9+ messages in thread From: Bart Schaefer @ 2013-10-27 18:27 UTC (permalink / raw) To: Zsh Hackers' List On Oct 27, 5:46pm, Peter Stephenson wrote: } Subject: Re: multios doesn't work with 2>&1 } } Bart Schaefer wrote: } > If I back out workers/20666 (Jan 2005), then this example works again. } } (This is really a zsh-workers thing, too, so I've moved it.) } } Crikey, I'd worked out it was already broken in 2007 and I thought I was } doing well... I went through "git whatchanged Src/exec.c" searching for the string "multio" and then cross-referenced with Etc/ChangeLog-4.3. } echo foo >/dev/null 2>&1 | sed 's/foo/bar/' } } gives a different bad effect, namely you get the output you want but the } shell hangs This problem is still there after 31912. The parent shell is in zwaitjob(), as is the shell that spawned sed. The multio copy-input-to-output job is blocked on read of the pipe, which is never going to close because the other two jobs are waiting. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Multio deadlock (Re: multios doesn't work with 2>&1) 2013-10-27 18:27 ` Bart Schaefer @ 2013-10-27 19:39 ` Bart Schaefer 2013-10-27 20:24 ` Bart Schaefer 2013-10-27 20:33 ` Peter Stephenson 0 siblings, 2 replies; 9+ messages in thread From: Bart Schaefer @ 2013-10-27 19:39 UTC (permalink / raw) To: Zsh Hackers' List On Oct 27, 11:27am, Bart Schaefer wrote: } } } echo foo >/dev/null 2>&1 | sed 's/foo/bar/' } } } } gives a different bad effect, namely you get the output you want but the } } shell hangs } } The parent shell is in zwaitjob(), as is the shell that spawned sed. That should actually say "as is the shell that was forked for echo". The parent shell is waiting for "sed". Here's the spot where the "echo" shell is stopped: /* * So what's going on here then? Well, I'm glad you asked. * * If we create multios for use in a subshell we do * this after forking, in this function above. That * means that the current (sub)process is responsible * for clearing them up. However, the processes won't * go away until we have closed the fd's talking to them. * Since we're about to exit the shell there's nothing * to stop us closing all fd's (including the ones 0 to 9 * that we usually leave alone). * * Then we wait for any processes. When we forked, * we cleared the jobtable and started a new job just for * any oddments like this, so if there aren't any we won't * need to wait. The result of not waiting is that * the multios haven't flushed the fd's properly, leading * to obscure missing data. * * It would probably be cleaner to ensure that the * parent shell handled multios, but that requires * some architectural changes which are likely to be * hairy. */ for (i = 0; i < 10; i++) if (fdtable[i] != FDT_UNUSED) close(i); closem(FDT_UNUSED); if (thisjob != -1) waitjobs(); _exit(lastval); Obviously we've not succeeded in closing all the necessary descriptors. Here's what's still open (PID 16361 == parent, 16383 == echo, 16384 == /* * So what's going on here then? Well, I'm glad you asked. * * If we create multios for use in a subshell we do * this after forking, in this function above. That * means that the current (sub)process is responsible * for clearing them up. However, the processes won't * go away until we have closed the fd's talking to them. * Since we're about to exit the shell there's nothing * to stop us closing all fd's (including the ones 0 to 9 * that we usually leave alone). * * Then we wait for any processes. When we forked, * we cleared the jobtable and started a new job just for * any oddments like this, so if there aren't any we won't * need to wait. The result of not waiting is that * the multios haven't flushed the fd's properly, leading * to obscure missing data. * * It would probably be cleaner to ensure that the * parent shell handled multios, but that requires * some architectural changes which are likely to be * hairy. */ for (i = 0; i < 10; i++) if (fdtable[i] != FDT_UNUSED) close(i); closem(FDT_UNUSED); if (thisjob != -1) waitjobs(); _exit(lastval); Obviously we've not succeeded in closing all the necessary descriptors. Here's what's still open: zsh 16361 16361 schaefer 0u CHR 136,3 5 /dev/pts/3 zsh 16361 16361 schaefer 1u CHR 136,3 5 /dev/pts/3 zsh 16361 16361 schaefer 2u CHR 136,3 5 /dev/pts/3 zsh 16361 16361 schaefer 10u CHR 136,3 5 /dev/pts/3 zsh 16383 16383 schaefer 2w FIFO 0,7 1227018 pipe zsh 16384 16383 schaefer 12w FIFO 0,7 1227016 pipe zsh 16384 16383 schaefer 13w CHR 1,3 2056 /dev/null zsh 16384 16383 schaefer 14r FIFO 0,7 1227018 pipe sed 16385 16383 schaefer 0r FIFO 0,7 1227016 pipe sed 16385 16383 schaefer 1u CHR 136,3 5 /dev/pts/3 sed 16385 16383 schaefer 2u CHR 136,3 5 /dev/pts/3 16361 is the parent, it's clean. 16383 is echo and 16384 is the multio. The multio is blocked reading fd 14 (1227018 pipe), which it's parent still has open as stderr because fdtable[2] == FDT_UNUSED. Does the following look right? It does fix the deadlock, but we might call close() on an already closed fd, which it appears this is trying to avoid (maybe so as not to change errno?). diff --git a/Src/exec.c b/Src/exec.c index 99c7eaa..7ac1ad5 100644 --- a/Src/exec.c +++ b/Src/exec.c @@ -3372,7 +3372,7 @@ execcmd(Estate state, int input, int output, int how, int last1) * hairy. */ for (i = 0; i < 10; i++) - if (fdtable[i] != FDT_UNUSED) + if (i < 3 || fdtable[i] != FDT_UNUSED) close(i); closem(FDT_UNUSED); if (thisjob != -1) ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Multio deadlock (Re: multios doesn't work with 2>&1) 2013-10-27 19:39 ` Multio deadlock (Re: multios doesn't work with 2>&1) Bart Schaefer @ 2013-10-27 20:24 ` Bart Schaefer 2013-10-27 20:33 ` Peter Stephenson 1 sibling, 0 replies; 9+ messages in thread From: Bart Schaefer @ 2013-10-27 20:24 UTC (permalink / raw) To: Zsh Hackers' List Sorry about the duplication in that last message, something obviously got pasted twice by mistake. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Multio deadlock (Re: multios doesn't work with 2>&1) 2013-10-27 19:39 ` Multio deadlock (Re: multios doesn't work with 2>&1) Bart Schaefer 2013-10-27 20:24 ` Bart Schaefer @ 2013-10-27 20:33 ` Peter Stephenson 2013-10-27 21:16 ` Peter Stephenson 2013-10-27 22:18 ` Bart Schaefer 1 sibling, 2 replies; 9+ messages in thread From: Peter Stephenson @ 2013-10-27 20:33 UTC (permalink / raw) To: Zsh Hackers' List On Sun, 27 Oct 2013 12:39:17 -0700 Bart Schaefer <schaefer@brasslantern.com> wrote: > Does the following look right? It does fix the deadlock, but we might > call close() on an already closed fd, which it appears this is trying > to avoid (maybe so as not to change errno?). > > diff --git a/Src/exec.c b/Src/exec.c > index 99c7eaa..7ac1ad5 100644 > --- a/Src/exec.c > +++ b/Src/exec.c > @@ -3372,7 +3372,7 @@ execcmd(Estate state, int input, int output, int how, int last1) > * hairy. > */ > for (i = 0; i < 10; i++) > - if (fdtable[i] != FDT_UNUSED) > + if (i < 3 || fdtable[i] != FDT_UNUSED) > close(i); > closem(FDT_UNUSED); > if (thisjob != -1) Hmm... why don't we initialise fdtable[0..2] to FDT_INTERNAL at boot? I can see we don't, since from a shell soon after start I get (gdb) p fdtable[0]@11 $6 = "\000\000\000\000\000\000\000\000\000\000\001" (FDT_UNUSED = 0, FDT_INTERNAL = 1). However, I've alsoe looked later and seen all three set to FDT_INTERNAL; not sure where that happens but presumably something to do with redir. Shouldn't we just initialise them properly? We could in principle detect if they're open by attempting to dup() and seeing what fd we get. Not sure if that's worth it, but it might be cleaner. pws ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Multio deadlock (Re: multios doesn't work with 2>&1) 2013-10-27 20:33 ` Peter Stephenson @ 2013-10-27 21:16 ` Peter Stephenson 2013-10-27 22:31 ` Bart Schaefer 2013-10-27 22:18 ` Bart Schaefer 1 sibling, 1 reply; 9+ messages in thread From: Peter Stephenson @ 2013-10-27 21:16 UTC (permalink / raw) To: Zsh Hackers' List On Sun, 27 Oct 2013 20:33:47 +0000 Peter Stephenson <p.w.stephenson@ntlworld.com> wrote: > Hmm... why don't we initialise fdtable[0..2] to FDT_INTERNAL at boot? Er, because it doesn't mean that, FDT_INTERNAL means "users can't monkey with this". We need something to mean "open and users can monkey with it". Maybe something with a similar function to FDT_EXTERNAL would work... but we'd need to be careful about the points where things got closed. We'd probably have to special case it in closem() when that gets passed FDT_UNUSED. Not sure how much this is worth it to avoid a bad close that's ignored anyway. pws ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Multio deadlock (Re: multios doesn't work with 2>&1) 2013-10-27 21:16 ` Peter Stephenson @ 2013-10-27 22:31 ` Bart Schaefer 0 siblings, 0 replies; 9+ messages in thread From: Bart Schaefer @ 2013-10-27 22:31 UTC (permalink / raw) To: Zsh Hackers' List On Oct 27, 9:16pm, Peter Stephenson wrote: } Subject: Re: Multio deadlock (Re: multios doesn't work with 2>&1) } } Er, because it doesn't mean that, FDT_INTERNAL means "users can't } monkey with this". See my other message RE asymmetry of movefd/redup. } We need something to mean "open and users can monkey with it". Except that obviously FDT_INTERNAL doesn't really prevent user monkeying, because fdtable[0..2] spend a lot of time in FDT_INTERNAL state without breaking any redirections. So either users could in fact monkey with an FDT_INTERNAL fd if they were able to guess the FD number, or the monkeying is barred some other way. In any case this patch finally does seem to resolve the deadlock in a safe manner; it can't have been correct that those dup() calls could be allowed to return an FD in 0..2 range. I threw in the init.c bit just for completeness, as mentioned that state is rapidly stomped on if any redirections of 0..2 are done. diff --git a/Src/exec.c b/Src/exec.c index 99c7eaa..df915e1 100644 --- a/Src/exec.c +++ b/Src/exec.c @@ -3123,7 +3123,7 @@ execcmd(Estate state, int input, int output, int how, int last1) int fd = fn->fd2; if(fd == -2) fd = (fn->type == REDIR_MERGEOUT) ? coprocout : coprocin; - fil = dup(fd); + fil = movefd(dup(fd)); } if (fil == -1) { char fdstr[4]; @@ -3151,7 +3151,7 @@ execcmd(Estate state, int input, int output, int how, int last1) else fil = clobber_open(fn); if(fil != -1 && IS_ERROR_REDIR(fn->type)) - dfil = dup(fil); + dfil = movefd(dup(fil)); else dfil = 0; if (fil == -1 || dfil == -1) { diff --git a/Src/init.c b/Src/init.c index 01a969d..7032ff8 100644 --- a/Src/init.c +++ b/Src/init.c @@ -1584,6 +1584,7 @@ zsh_main(UNUSED(int argc), char **argv) fdtable_size = zopenmax(); fdtable = zshcalloc(fdtable_size*sizeof(*fdtable)); + fdtable[0] = fdtable[1] = fdtable[2] = FDT_EXTERNAL; createoptiontable(); emulate(zsh_name, 1, &emulation, opts); /* initialises most options */ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: Multio deadlock (Re: multios doesn't work with 2>&1) 2013-10-27 20:33 ` Peter Stephenson 2013-10-27 21:16 ` Peter Stephenson @ 2013-10-27 22:18 ` Bart Schaefer 1 sibling, 0 replies; 9+ messages in thread From: Bart Schaefer @ 2013-10-27 22:18 UTC (permalink / raw) To: Zsh Hackers' List On Oct 27, 8:33pm, Peter Stephenson wrote: } Subject: Re: Multio deadlock (Re: multios doesn't work with 2>&1) } } > for (i = 0; i < 10; i++) } > - if (fdtable[i] != FDT_UNUSED) } > + if (i < 3 || fdtable[i] != FDT_UNUSED) } > close(i); } } Hmm... why don't we initialise fdtable[0..2] to FDT_INTERNAL at boot? I don't know. Is FDT_INTERNAL the correct flag? It supposedly means that those descriptors are "not visible to other processes" which does not sound right to me. I think they should be FDT_EXTERNAL by default? } (FDT_UNUSED = 0, FDT_INTERNAL = 1). However, I've alsoe looked later } and seen all three set to FDT_INTERNAL; not sure where that happens but } presumably something to do with redir. It happens in utils.c:redup() but I'm not sure it's strictly correct. FDT_INTERNAL means they'll be "closed on exec" in execcmd() which is not right in the general case, but maybe it is when they've been dup'd. } Shouldn't we just initialise them properly? That would be better, yes, but they get toggled between FDT_UNUSED and FDT_INTERNAL every time there's a redirection, first when the original fd is closed [zclose()] and then when restored again afterward [redup()]. So the question is, in the deadlock case why is fd 2 open but still marked FDT_UNUSED? Who dup'd onto it without going through redup()? Or is that not what happened? (Its really difficult to trace this with gdb because all of this happens in one of the child processes.) See patch below, done essentially on a guess. Interesting aside: If fdtable[x] starts out FDT_EXTERNAL and is then saved with movefd() and restored with redup(), its state changes to FDT_INTERNAL because movefd() does not preserve the original state [it marks everything internal], but redup() copies the state. } We could in principle detect if they're open by attempting to dup() and } seeing what fd we get. Not sure if that's worth it, but it might be } cleaner. You mean, do that in order to determine whether they are already "unused" when the shell starts up? I don't think that's necessary. If you mean do it to avoid close(), it won't help, errno will still get frobbed. The following does appear to prevent the deadlock, but probably is not yet sufficient because it doesn't check that fdtable[] is big enough. diff --git a/Src/exec.c b/Src/exec.c index 99c7eaa..2e57443 100644 --- a/Src/exec.c +++ b/Src/exec.c @@ -3124,6 +3124,7 @@ execcmd(Estate state, int input, int output, int how, int last1) if(fd == -2) fd = (fn->type == REDIR_MERGEOUT) ? coprocout : coprocin; fil = dup(fd); + fdtable[fil] = FDT_INTERNAL; } if (fil == -1) { char fdstr[4]; @@ -3150,9 +3151,10 @@ execcmd(Estate state, int input, int output, int how, int last1) O_WRONLY | O_APPEND | O_CREAT | O_NOCTTY, 0666); else fil = clobber_open(fn); - if(fil != -1 && IS_ERROR_REDIR(fn->type)) + if(fil != -1 && IS_ERROR_REDIR(fn->type)) { dfil = dup(fil); - else + fdtable[dfil] = FDT_INTERNAL; + } else dfil = 0; if (fil == -1 || dfil == -1) { if(fil != -1) ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: multios doesn't work with 2>&1 [not found] ` <131027100137.ZM4100@torch.brasslantern.com> 2013-10-27 17:46 ` multios doesn't work with 2>&1 Peter Stephenson @ 2013-10-27 18:11 ` Bart Schaefer 1 sibling, 0 replies; 9+ messages in thread From: Bart Schaefer @ 2013-10-27 18:11 UTC (permalink / raw) To: zsh-workers [> workers] On Oct 27, 10:01am, Bart Schaefer wrote: } } } % echo foo 2>&1 >/dev/null | sed 's/foo/bar/' } } } } should output "bar" if multios has been set. But it outputs nothing } } If there's a way to fix the original complaint (that >&- doesn't really } close the descriptor when a multio is involved, so unnecessary multios } result) without the above example remaining broken, I may need some help } to find it. Turns out I didn't need help after all. As usual, it's just a matter of passing around enough information. diff --git a/Src/exec.c b/Src/exec.c index d5fe69e..99c7eaa 100644 --- a/Src/exec.c +++ b/Src/exec.c @@ -1967,7 +1967,7 @@ clobber_open(struct redir *f) /**/ static void -closemn(struct multio **mfds, int fd) +closemn(struct multio **mfds, int fd, int type) { if (fd >= 0 && mfds[fd] && mfds[fd]->ct >= 2) { struct multio *mn = mfds[fd]; @@ -2026,7 +2026,7 @@ closemn(struct multio **mfds, int fd) } } _exit(0); - } else if (fd >= 0) + } else if (fd >= 0 && type == REDIR_CLOSE) mfds[fd] = NULL; } @@ -3093,7 +3093,7 @@ execcmd(Estate state, int input, int output, int how, int last1) } } if (fn->fd1 < 10) - closemn(mfds, fn->fd1); + closemn(mfds, fn->fd1, REDIR_CLOSE); if (!closed && zclose(fn->fd1) < 0) { zwarn("failed to close file descriptor %d: %e", fn->fd1, errno); @@ -3102,7 +3102,7 @@ execcmd(Estate state, int input, int output, int how, int last1) case REDIR_MERGEIN: case REDIR_MERGEOUT: if (fn->fd2 < 10) - closemn(mfds, fn->fd2); + closemn(mfds, fn->fd2, fn->type); if (!checkclobberparam(fn)) fil = -1; else if (fn->fd2 > 9 && @@ -3181,7 +3181,7 @@ execcmd(Estate state, int input, int output, int how, int last1) * spawning tee/cat processes as necessary. */ for (i = 0; i < 10; i++) if (mfds[i] && mfds[i]->ct >= 2) - closemn(mfds, i); + closemn(mfds, i, REDIR_CLOSE); if (nullexec) { if (nullexec == 1) { ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2013-10-27 22:31 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <20131027145917.GA5509@localhost.localdomain> [not found] ` <131027100137.ZM4100@torch.brasslantern.com> 2013-10-27 17:46 ` multios doesn't work with 2>&1 Peter Stephenson 2013-10-27 18:27 ` Bart Schaefer 2013-10-27 19:39 ` Multio deadlock (Re: multios doesn't work with 2>&1) Bart Schaefer 2013-10-27 20:24 ` Bart Schaefer 2013-10-27 20:33 ` Peter Stephenson 2013-10-27 21:16 ` Peter Stephenson 2013-10-27 22:31 ` Bart Schaefer 2013-10-27 22:18 ` Bart Schaefer 2013-10-27 18:11 ` multios doesn't work with 2>&1 Bart Schaefer
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/zsh/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).