zsh-workers
 help / color / mirror / code / Atom feed
* Strange function/pipestatus behavior, maybe a scope bug?
@ 2013-10-22 18:03 Ian F
  2013-10-23  8:21 ` Peter Stephenson
  0 siblings, 1 reply; 15+ messages in thread
From: Ian F @ 2013-10-22 18:03 UTC (permalink / raw)
  To: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 525 bytes --]

Hi,

First time mailer, just subscribed.  Thanks for taking the time to read
this.

Test cases:

trueFunc() { true; }
falseFunc() { false; }

% false | true | false | true; echo $pipestatus
1 0 1 0
% false | true | false | true | trueFunc; echo $pipestatus
1 0 1 0 0
% false | true | false | true | falseFunc; echo $pipestatus
1 0 1 0 1

Potential bug case:

awkFunc() { awk; }

% false | true | false | true | awkFunc; echo $pipestatus
0

Thanks.  Also, this does not happen in bash.

%Ian


-- 
Ian F

Big hitter, the Lama

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-22 18:03 Strange function/pipestatus behavior, maybe a scope bug? Ian F
@ 2013-10-23  8:21 ` Peter Stephenson
  2013-10-23 19:49   ` Ian F
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Stephenson @ 2013-10-23  8:21 UTC (permalink / raw)
  To: Ian F, zsh-workers

On Tue, 22 Oct 2013 11:03:07 -0700
Ian F <in4mer@gmail.com> wrote:
> Potential bug case:
> 
> awkFunc() { awk; }
> 
> % false | true | false | true | awkFunc; echo $pipestatus
> 0

Yes, that's a bug owing to the way the shell executes internal shell
commands within the right hand side of a pipeline.  There was a thread
on this starting at zsh-workers/29973:

http://www.zsh.org/mla/workers/2011/msg01394.html

pws


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-23  8:21 ` Peter Stephenson
@ 2013-10-23 19:49   ` Ian F
  2013-10-23 21:14     ` Peter Stephenson
  0 siblings, 1 reply; 15+ messages in thread
From: Ian F @ 2013-10-23 19:49 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: zsh-workers

[-- Attachment #1: Type: text/plain, Size: 1820 bytes --]

Peter, all,

Thank you for taking the time to provide your research and experience on
this issue.

Absent any further material communication on the subject (more googling has
not yielded anything of substance), I appear to have provided a test case
that exhibits the error behavior 100% of the time.  Based on the referenced
thread and my further search results, this is a material development.

After reading, AFAICT the main hurdle to identification of this bug has
been related to its previously extremely low probability of occurrence in
provided test cases.

I have used my test case to verify 100% occurrence on another, very
different hardware, kernel, and zsh (4.2.1 vs. 5.0.0) version, using
precisely the same test case (different version of awk; forced to provide
'-f -' in awkFunc).

I suggest this to be a significant progression in identification of this
bug, as well as indicating a greatly heightened severity, owing to the 100%
incorrect handling of pipeline exit values in the provided, very simple
test case (again, substantially different than previous reports I've been
able to find).

Sadly, my C fu is not up to par, otherwise I would delve into it myself.
 Please let me know if I can be of any further assistance.

Best,



%Ian


On Wed, Oct 23, 2013 at 1:21 AM, Peter Stephenson
<p.stephenson@samsung.com>wrote:

> On Tue, 22 Oct 2013 11:03:07 -0700
> Ian F <in4mer@gmail.com> wrote:
> > Potential bug case:
> >
> > awkFunc() { awk; }
> >
> > % false | true | false | true | awkFunc; echo $pipestatus
> > 0
>
> Yes, that's a bug owing to the way the shell executes internal shell
> commands within the right hand side of a pipeline.  There was a thread
> on this starting at zsh-workers/29973:
>
> http://www.zsh.org/mla/workers/2011/msg01394.html
>
> pws
>



-- 
Ian F

Big hitter, the Lama

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-23 19:49   ` Ian F
@ 2013-10-23 21:14     ` Peter Stephenson
  2013-10-24  5:15       ` Bart Schaefer
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Stephenson @ 2013-10-23 21:14 UTC (permalink / raw)
  To: zsh-workers

On Wed, 23 Oct 2013 12:49:51 -0700
Ian F <in4mer@gmail.com> wrote:
> I suggest this to be a significant progression in identification of this
> bug, as well as indicating a greatly heightened severity, owing to the 100%
> incorrect handling of pipeline exit values in the provided, very simple
> test case (again, substantially different than previous reports I've been
> able to find).

The problem isn't any lack of severity; the problem is the shell's job
control code isn't structured to handle this well.  Short of a hack to
fix some subset of cases, which it would be best to avoid as it's likely
to make the real problem worse, it probably means a major rewrite of job
handling: so don't hold your breath.

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-23 21:14     ` Peter Stephenson
@ 2013-10-24  5:15       ` Bart Schaefer
  2013-10-24  8:57         ` Peter Stephenson
  2013-10-24 15:17         ` Jun T.
  0 siblings, 2 replies; 15+ messages in thread
From: Bart Schaefer @ 2013-10-24  5:15 UTC (permalink / raw)
  To: zsh-workers

On Oct 23, 10:14pm, Peter Stephenson wrote:
} Subject: Re: Strange function/pipestatus behavior, maybe a scope bug?
}
} On Wed, 23 Oct 2013 12:49:51 -0700
} Ian F <in4mer@gmail.com> wrote:
} > I suggest this to be a significant progression in identification of this
} > bug, as well as indicating a greatly heightened severity, owing to the 100%
} > incorrect handling of pipeline exit values in the provided, very simple
} > test case (again, substantially different than previous reports I've been
} > able to find).
} 
} The problem isn't any lack of severity; the problem is the shell's job
} control code isn't structured to handle this well.  Short of a hack to
} fix some subset of cases, which it would be best to avoid as it's likely
} to make the real problem worse, it probably means a major rewrite of job
} handling: so don't hold your breath.

So I was thinking about this and it occurred to me that we already do
have a hierarchical job structure -- a Job object contains one or more
Process objects, and may point (by job number, or PID for the proccess
group leader) to other jobs in the flat job table, etc.

Thus if the problem really was as I analyzed it back in December 2012,
it should be resolvable by storing a copy of the pipestatus in the Job
object and then copying it to the global pipestats [that is not a typo]
when the job is reaped.

Which in turn led me by a circuitous route to discover that although my
earlier analysis wasn't entirely wrong, it was incomplete.  The real
problem behind all the SIGCHLD-handling mumbo-jumbo turns out to be that
the task of deleting the Job object is actually delegated to printjob(), 
which was destroying the linked list of Process objects from the Job
object before their status had been recorded in pipestats.

It therefore suffices to lift the pipestatus-updating code out of the
update_job() routine and call it from printjob() just before the Job
object is deleted.

There's a minor redundancy still here in that update_job() sometimes
calls printjob() which will result in storepipestats() being called
twice; but update_job() needs the return status, sometimes has a
different idea of whether the job is in the foreground, and doesn't
always call printjob(), so I don't see any obvious way around this.

One remaining bug which is neither new to this nor fixed by it:  The
very first time one runs a pipeline ending in a shell function,
something goes wrong and the whole thing is treated as if it is in
the background.  Here's zsh-5.0.2-248-ge05261f without my patch:

% true | false | true | false | (){ true }; echo $pipestatus
[1]  + done       true | 
       exit 1     false | 
       done       true | 
       running    false
1
% 

In this revisions of the shell, repeating this command produces this
same bad output every time.

Here's zsh-5.0.2-225-g225ee4c which includes my patch:

% true | false | true | false | (){ true }; echo $pipestatus
[1]  + done       true | 
       exit 1     false | 
       done       true | 
       exit 1     false
(anon): can't set tty pgrp: operation not permitted
1
% 

The tty pgrp error may be OS-related rather than shell-version, I'm
testing on two different hosts.

Now here's the same command executed again immediately in g225ee4c:

% true | false | true | false | (){ true }; echo $pipestatus
0 1 0 1 0
% 

I don't see anything in my patch that ought to cause this change in
behavior.  It would seem something is failing to be initialized in the
first instance, to make the pipeline behave as if backgrounded, and in
ge05261f it remains in the same state.  But what would that be and how
did my patch affect it?

Oh, one other bug which I don't believe is new:  the PIPEFAIL option
doesn't work right if the last command is a builtin.  Maybe that
test should be moved into storepipestats() as well, in which case
the redundancy I mentioned above might also be eliminated.

diff --git a/Src/jobs.c b/Src/jobs.c
index b9d7a84..82ffdf2 100644
--- a/Src/jobs.c
+++ b/Src/jobs.c
@@ -376,6 +376,30 @@ check_cursh_sig(int sig)
     }
 }
 
+/**/
+int
+storepipestats(Job jn, int inforeground)
+{
+    int i, pipefail = 0, jpipestats[MAX_PIPESTATS];
+    Process p;
+
+    for (p = jn->procs, i = 0; p && i < MAX_PIPESTATS; p = p->next, i++) {
+	jpipestats[i] = ((WIFSIGNALED(p->status)) ?
+			 0200 | WTERMSIG(p->status) :
+			 WEXITSTATUS(p->status));
+	if (jpipestats[i])
+	    pipefail = jpipestats[i];
+    }
+    if (inforeground) {
+	memcpy(pipestats, jpipestats, sizeof(int)*i);
+	if ((jn->stat & STAT_CURSH) && i < MAX_PIPESTATS)
+	    pipestats[i++] = lastval;
+	numpipestats = i;
+    }
+
+    return pipefail;
+}
+
 /* Update status of job, possibly printing it */
 
 /**/
@@ -507,24 +531,16 @@ update_job(Job jn)
 	return;
     jn->stat |= (somestopped) ? STAT_CHANGED | STAT_STOPPED :
 	STAT_CHANGED | STAT_DONE;
-    if (job == thisjob && (jn->stat & STAT_DONE)) {
-	int i, newlastval = 0;
-	Process p;
-
-	for (p = jn->procs, i = 0; p && i < MAX_PIPESTATS; p = p->next, i++) {
-	    pipestats[i] = ((WIFSIGNALED(p->status)) ?
-			    0200 | WTERMSIG(p->status) :
-			    WEXITSTATUS(p->status));
-	    if (pipestats[i])
-		newlastval = pipestats[i];
-	}
-	if ((jn->stat & STAT_CURSH) && i < MAX_PIPESTATS) {
-	    pipestats[i++] = lastval;
-	    if (!lastval && isset(PIPEFAIL))
+    if (jn->stat & STAT_DONE) {
+	int newlastval = storepipestats(jn, inforeground);
+
+	if (job == thisjob) {
+	    if (jn->stat & STAT_CURSH) {
+		if (!lastval && isset(PIPEFAIL))
+		    lastval = newlastval;
+	    } else if (isset(PIPEFAIL))
 		lastval = newlastval;
-	} else if (isset(PIPEFAIL))
-	    lastval= newlastval;
-	numpipestats = i;
+	}
     }
     if (!inforeground &&
 	(jn->stat & (STAT_SUBJOB | STAT_DONE)) == (STAT_SUBJOB | STAT_DONE)) {
@@ -975,6 +991,7 @@ printjob(Job jn, int lng, int synch)
 
     if (skip_print) {
 	if (jn->stat & STAT_DONE) {
+	  (void) storepipestats(jn, job == thisjob);
 	    if (should_report_time(jn))
 		dumptime(jn);
 	    deletejob(jn, 0);
@@ -1105,6 +1122,7 @@ printjob(Job jn, int lng, int synch)
 /* delete job if done */
 
     if (jn->stat & STAT_DONE) {
+	(void) storepipestats(jn, job == thisjob);
 	if (should_report_time(jn))
 	    dumptime(jn);
 	deletejob(jn, 0);


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-24  5:15       ` Bart Schaefer
@ 2013-10-24  8:57         ` Peter Stephenson
  2013-10-24 15:48           ` Peter Stephenson
  2013-10-24 16:26           ` Bart Schaefer
  2013-10-24 15:17         ` Jun T.
  1 sibling, 2 replies; 15+ messages in thread
From: Peter Stephenson @ 2013-10-24  8:57 UTC (permalink / raw)
  To: zsh-workers

On Wed, 23 Oct 2013 22:15:48 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> It therefore suffices to lift the pipestatus-updating code out of the
> update_job() routine and call it from printjob() just before the Job
> object is deleted.
> 
> There's a minor redundancy still here in that update_job() sometimes
> calls printjob() which will result in storepipestats() being called
> twice; but update_job() needs the return status, sometimes has a
> different idea of whether the job is in the foreground, and doesn't
> always call printjob(), so I don't see any obvious way around this.

This doesn't seem serious.

> One remaining bug which is neither new to this nor fixed by it:  The
> very first time one runs a pipeline ending in a shell function,
> something goes wrong and the whole thing is treated as if it is in
> the background.  Here's zsh-5.0.2-248-ge05261f without my patch:
> 
> % true | false | true | false | (){ true }; echo $pipestatus
> [1]  + done       true | 
>        exit 1     false | 
>        done       true | 
>        running    false
> 1
> % 

Sounds like a race setting up the job, maybe related to the fact that
"thisjob" isn't initialised in a particularly obvious place when the
code is executing within the shell.  Are you seeing this with any
shell function, or just anonymous functions?

> Oh, one other bug which I don't believe is new:  the PIPEFAIL option
> doesn't work right if the last command is a builtin.  Maybe that
> test should be moved into storepipestats() as well, in which case
> the redundancy I mentioned above might also be eliminated.

Are you saying that $pipestatus is correct but the overall status is
wrong if you have something like "true | true | true" or "true | true |
false"?  It certainly sounds like they ought to run at the same point.
The last command ought to be the easy one --- that's where the status
has always come from, so if those two pipelines are giving different
answers with and without PIPEFAIL something is definitely odd.

pws


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-24  5:15       ` Bart Schaefer
  2013-10-24  8:57         ` Peter Stephenson
@ 2013-10-24 15:17         ` Jun T.
  2013-10-24 16:25           ` Peter Stephenson
  1 sibling, 1 reply; 15+ messages in thread
From: Jun T. @ 2013-10-24 15:17 UTC (permalink / raw)
  To: zsh-workers

On Mac OS X, A05execution.ztst fails as

*** 1 ****
!    2048 0 0 0
--- 1 ----
! 2048 0 0 0
Test ./A05execution.ztst failed: output differs from expected as shown above for:
  repeat 2048; do (: | : | while false; do
                             break
                           done;
                   print "${pipestatus[@]}")
    ZTST_hashmark
  done | sort | uniq -c

i.e., number of spaces output by 'uniq -c' before the count is different.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-24  8:57         ` Peter Stephenson
@ 2013-10-24 15:48           ` Peter Stephenson
  2013-10-24 16:03             ` Peter Stephenson
  2013-10-24 16:39             ` Bart Schaefer
  2013-10-24 16:26           ` Bart Schaefer
  1 sibling, 2 replies; 15+ messages in thread
From: Peter Stephenson @ 2013-10-24 15:48 UTC (permalink / raw)
  To: zsh-workers

On Thu, 24 Oct 2013 09:57:13 +0100
Peter Stephenson <p.stephenson@samsung.com> wrote:
> > Oh, one other bug which I don't believe is new:  the PIPEFAIL option
> > doesn't work right if the last command is a builtin.  Maybe that
> > test should be moved into storepipestats() as well, in which case
> > the redundancy I mentioned above might also be eliminated.
> 
> Are you saying that $pipestatus is correct but the overall status is
> wrong if you have something like "true | true | true" or "true | true |
> false"?  It certainly sounds like they ought to run at the same point.
> The last command ought to be the easy one --- that's where the status
> has always come from, so if those two pipelines are giving different
> answers with and without PIPEFAIL something is definitely odd.

I can't see any problem here, which might not be too surprising if it's
a race, but I don't see how the following fix can be wrong if pipestatus
is now working: the most complicated case appears to be if it gets
called multiple times and STAT_CURSH is set, but even there it looks
like it's idempotent with respect to lastval (assuming the information
hasn't changed, of course).

I don't know if this potentially makes one of the calls removable.

There is some bad interaction with completion, at least on my setup
(happens from a vanilla setup, too):

% true | true | false
% print $pipest<TAB>
0

This doesn't seem to happen with hooks, so completion is failing to save
something it should.

diff --git a/Src/jobs.c b/Src/jobs.c
index 82ffdf2..200e38e 100644
--- a/Src/jobs.c
+++ b/Src/jobs.c
@@ -377,7 +377,7 @@ check_cursh_sig(int sig)
 }
 
 /**/
-int
+static void
 storepipestats(Job jn, int inforeground)
 {
     int i, pipefail = 0, jpipestats[MAX_PIPESTATS];
@@ -397,7 +397,13 @@ storepipestats(Job jn, int inforeground)
 	numpipestats = i;
     }
 
-    return pipefail;
+    if (jn == jobtab+thisjob) {
+	if (jn->stat & STAT_CURSH) {
+	    if (!lastval && isset(PIPEFAIL))
+		lastval = pipefail;
+	} else if (isset(PIPEFAIL))
+	    lastval = pipefail;
+    }
 }
 
 /* Update status of job, possibly printing it */
@@ -531,17 +537,8 @@ update_job(Job jn)
 	return;
     jn->stat |= (somestopped) ? STAT_CHANGED | STAT_STOPPED :
 	STAT_CHANGED | STAT_DONE;
-    if (jn->stat & STAT_DONE) {
-	int newlastval = storepipestats(jn, inforeground);
-
-	if (job == thisjob) {
-	    if (jn->stat & STAT_CURSH) {
-		if (!lastval && isset(PIPEFAIL))
-		    lastval = newlastval;
-	    } else if (isset(PIPEFAIL))
-		lastval = newlastval;
-	}
-    }
+    if (jn->stat & STAT_DONE)
+	storepipestats(jn, inforeground);
     if (!inforeground &&
 	(jn->stat & (STAT_SUBJOB | STAT_DONE)) == (STAT_SUBJOB | STAT_DONE)) {
 	int su;
@@ -991,7 +988,7 @@ printjob(Job jn, int lng, int synch)
 
     if (skip_print) {
 	if (jn->stat & STAT_DONE) {
-	  (void) storepipestats(jn, job == thisjob);
+	    storepipestats(jn, job == thisjob);
 	    if (should_report_time(jn))
 		dumptime(jn);
 	    deletejob(jn, 0);
@@ -1122,7 +1119,7 @@ printjob(Job jn, int lng, int synch)
 /* delete job if done */
 
     if (jn->stat & STAT_DONE) {
-	(void) storepipestats(jn, job == thisjob);
+	storepipestats(jn, job == thisjob);
 	if (should_report_time(jn))
 	    dumptime(jn);
 	deletejob(jn, 0);

pws


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-24 15:48           ` Peter Stephenson
@ 2013-10-24 16:03             ` Peter Stephenson
  2013-10-24 16:44               ` Bart Schaefer
  2013-10-24 16:39             ` Bart Schaefer
  1 sibling, 1 reply; 15+ messages in thread
From: Peter Stephenson @ 2013-10-24 16:03 UTC (permalink / raw)
  To: zsh-workers

On Thu, 24 Oct 2013 16:48:29 +0100
Peter Stephenson <p.stephenson@samsung.com> wrote:
> There is some bad interaction with completion, at least on my setup
> (happens from a vanilla setup, too):
> 
> % true | true | false
> % print $pipest<TAB>
> 0
> 
> This doesn't seem to happen with hooks, so completion is failing to save
> something it should.

Hmm... does the following have any bad effects?  If we're really only
after the return value of the function, I don't think it should...

diff --git a/Src/Zle/compcore.c b/Src/Zle/compcore.c
index 39d41bd..5c5628a 100644
--- a/Src/Zle/compcore.c
+++ b/Src/Zle/compcore.c
@@ -826,7 +826,6 @@ callcompfunc(char *s, char *fn)
 	sfcontext = SFC_CWIDGET;
 	NEWHEAPS(compheap) {
 	    LinkList largs = NULL;
-	    int olv = lastval;
 
 	    if (*cfargs) {
 		char **p = cfargs;
@@ -836,9 +835,7 @@ callcompfunc(char *s, char *fn)
 		while (*p)
 		    addlinknode(largs, dupstring(*p++));
 	    }
-	    doshfunc(shfunc, largs, 0);
-	    cfret = lastval;
-	    lastval = olv;
+	    cfret = doshfunc(shfunc, largs, 1);
 	} OLDHEAPS;
 	sfcontext = osc;
 	endparamscope();

pws


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-24 15:17         ` Jun T.
@ 2013-10-24 16:25           ` Peter Stephenson
  0 siblings, 0 replies; 15+ messages in thread
From: Peter Stephenson @ 2013-10-24 16:25 UTC (permalink / raw)
  To: zsh-workers

On Fri, 25 Oct 2013 00:17:38 +0900
"Jun T." <takimoto-j@kba.biglobe.ne.jp> wrote:
> On Mac OS X, A05execution.ztst fails as
> 
> *** 1 ****
> !    2048 0 0 0
> --- 1 ----
> ! 2048 0 0 0
> Test ./A05execution.ztst failed: output differs from expected as shown above for:
>   repeat 2048; do (: | : | while false; do
>                              break
>                            done;
>                    print "${pipestatus[@]}")
>     ZTST_hashmark
>   done | sort | uniq -c
> 
> i.e., number of spaces output by 'uniq -c' before the count is different.

diff --git a/Test/A05execution.ztst b/Test/A05execution.ztst
index c42e454..fd13e91 100644
--- a/Test/A05execution.ztst
+++ b/Test/A05execution.ztst
@@ -184,8 +184,8 @@
                            done;
                    print "${pipestatus[@]}")
 	ZTST_hashmark
-  done | sort | uniq -c
+  done | sort | uniq -c | sed 's/^ *//'
 0:Check whether `$pipestatus[]' behaves.
->   2048 0 0 0
+>2048 0 0 0
 F:This test checks for a bug in `$pipestatus[]' handling.  If it breaks then
 F:the bug is still there or it reappeared. See workers-29973 for details.

pws


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-24  8:57         ` Peter Stephenson
  2013-10-24 15:48           ` Peter Stephenson
@ 2013-10-24 16:26           ` Bart Schaefer
  2013-10-24 16:46             ` Peter Stephenson
  1 sibling, 1 reply; 15+ messages in thread
From: Bart Schaefer @ 2013-10-24 16:26 UTC (permalink / raw)
  To: zsh-workers

On Oct 24,  9:57am, Peter Stephenson wrote:
} Subject: Re: Strange function/pipestatus behavior, maybe a scope bug?
}
} On Wed, 23 Oct 2013 22:15:48 -0700
} Bart Schaefer <schaefer@brasslantern.com> wrote:
} > something goes wrong and the whole thing is treated as if it is in
} > the background.  Here's zsh-5.0.2-248-ge05261f without my patch:
} > 
} > % true | false | true | false | (){ true }; echo $pipestatus
} > [1]  + done       true | 
} >        exit 1     false | 
} >        done       true | 
} >        running    false
} > 1
} > % 
} 
} Sounds like a race setting up the job, maybe related to the fact that
} "thisjob" isn't initialised in a particularly obvious place when the
} code is executing within the shell.  Are you seeing this with any
} shell function, or just anonymous functions?

Any shell function will do it, except one that runs an external command.

The odd thing is that this doesn't require "zsh -f", it happens for the
first interactive command in a new shell even after loading the whole
of compinit etc.

It does NOT happen if the MONITOR option is not set.

Here's a different clue:

% true | false | (){ sleep 10; true }
zsh: done       true | 
zsh: exit 1     false | 
zsh: suspended  () { ... }
% echo $? : $pipestatus
20 : 0 1 0
% fg
[1]  + done       true | 
       exit 1     false | 
       continued  () { ... }
(anon):fg: no job control in this shell.
% 

This one is repeatable (happens every time, not just the first, and
with any function, not just an anonymous one).
 
} > Oh, one other bug which I don't believe is new:  the PIPEFAIL option
} > doesn't work right if the last command is a builtin.  Maybe that
} > test should be moved into storepipestats() as well, in which case
} > the redundancy I mentioned above might also be eliminated.
} 
} Are you saying that $pipestatus is correct but the overall status is
} wrong if you have something like "true | true | true" or "true | true |
} false"?

Not exactly.  PIPEFAIL means that if any command anywhere in the pipeline
fails, then the entire pipeline fails.  *Without* PIPEFAIL, the command

	true | (return 3) | true

shoule have pipefail=(0 3 0) and exit status 0.   *With* PIPEFAIL, it
should have pipefail=(0 3 0) and exit status 3.  But it does not,
becuase "true" is a builtin.  Replace with /bin/true and the PIPEFAIL
exit status is correct.

} It certainly sounds like they ought to run at the same point.

Yeah, I'm becoming convinced that the PIPEFAIL tests also need to be
moved into storepipestats().

It's a little messy that you can't tell from the Job pointer whether
the job *number* is equal to thisjob.  Since inforground can be true
in update_job even when job != thisjob, but PIPEFAIL only applies
when job == thisjob, it probably means passing an extra parameter.

Having implemented this, I can't force a case where it's the right
thing to call storepipestats() from update_job().  On the other hand,
I'm also getting signal handling race conditions again, particularly
when sending TSTP (^Z) to a pipeline that ends with (){ sleep N } --
the value of pipestatus is correct, but the job gets treated as done
when it should be stopped, so the stopped sleep is orphaned.

Somebody else please try this patch and see if it works consistently
for you.

Aside:  I did think of another reason we might want to keep a copy of
pipestatus in the Job object:  so that we can restore it when a job
that has been stopped or backgrounded is brought into the foreground
again.  Right now the exit status of "fg" seems to always be zero
even after suspending/resuming (){ sleep 10; false }, and pipestatus
is only valid right after suspending the job, not upon resuming it.

diff --git a/Src/jobs.c b/Src/jobs.c
index 82ffdf2..c218743 100644
--- a/Src/jobs.c
+++ b/Src/jobs.c
@@ -377,8 +377,8 @@ check_cursh_sig(int sig)
 }
 
 /**/
-int
-storepipestats(Job jn, int inforeground)
+void
+storepipestats(Job jn, int inforeground, int fixlastval)
 {
     int i, pipefail = 0, jpipestats[MAX_PIPESTATS];
     Process p;
@@ -397,7 +397,13 @@ storepipestats(Job jn, int inforeground)
 	numpipestats = i;
     }
 
-    return pipefail;
+    if (fixlastval) {
+      if (jn->stat & STAT_CURSH) {
+	if (!lastval && isset(PIPEFAIL))
+	  lastval = pipefail;
+      } else if (isset(PIPEFAIL))
+	lastval = pipefail;
+    }
 }
 
 /* Update status of job, possibly printing it */
@@ -532,15 +538,12 @@ update_job(Job jn)
     jn->stat |= (somestopped) ? STAT_CHANGED | STAT_STOPPED :
 	STAT_CHANGED | STAT_DONE;
     if (jn->stat & STAT_DONE) {
-	int newlastval = storepipestats(jn, inforeground);
-
-	if (job == thisjob) {
-	    if (jn->stat & STAT_CURSH) {
-		if (!lastval && isset(PIPEFAIL))
-		    lastval = newlastval;
-	    } else if (isset(PIPEFAIL))
-		lastval = newlastval;
-	}
+	/* This may be redundant with printjob() but note that inforeground
+	 * is true here for STAT_CURSH jobs even when job != thisjob, most
+	 * likely because thisjob = -1 from exec.c:execsimple() trickery.
+	 * However, if we reset lastval here we break it for printjob().
+	 */
+	storepipestats(jn, inforeground, 0);
     }
     if (!inforeground &&
 	(jn->stat & (STAT_SUBJOB | STAT_DONE)) == (STAT_SUBJOB | STAT_DONE)) {
@@ -991,7 +994,8 @@ printjob(Job jn, int lng, int synch)
 
     if (skip_print) {
 	if (jn->stat & STAT_DONE) {
-	  (void) storepipestats(jn, job == thisjob);
+	    /* This looks silly, but see update_job() */
+	    storepipestats(jn, job == thisjob, job == thisjob);
 	    if (should_report_time(jn))
 		dumptime(jn);
 	    deletejob(jn, 0);
@@ -1107,9 +1111,9 @@ printjob(Job jn, int lng, int synch)
 	fflush(fout);
     }
 
-/* print "(pwd now: foo)" messages: with (lng & 4) we are printing
- * the directory where the job is running, otherwise the current directory
- */
+    /* print "(pwd now: foo)" messages: with (lng & 4) we are printing
+     * the directory where the job is running, otherwise the current directory
+     */
 
     if ((lng & 4) || (interact && job == thisjob &&
 		      jn->pwd && strcmp(jn->pwd, pwd))) {
@@ -1119,10 +1123,12 @@ printjob(Job jn, int lng, int synch)
 	fprintf(fout, ")\n");
 	fflush(fout);
     }
-/* delete job if done */
+
+    /* delete job if done */
 
     if (jn->stat & STAT_DONE) {
-	(void) storepipestats(jn, job == thisjob);
+	/* This looks silly, but see update_job() */
+	storepipestats(jn, job == thisjob, job == thisjob);
 	if (should_report_time(jn))
 	    dumptime(jn);
 	deletejob(jn, 0);


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-24 15:48           ` Peter Stephenson
  2013-10-24 16:03             ` Peter Stephenson
@ 2013-10-24 16:39             ` Bart Schaefer
  1 sibling, 0 replies; 15+ messages in thread
From: Bart Schaefer @ 2013-10-24 16:39 UTC (permalink / raw)
  To: zsh-workers

On Oct 24,  4:48pm, Peter Stephenson wrote:
} Subject: Re: Strange function/pipestatus behavior, maybe a scope bug?
}
} I can't see any problem here, which might not be too surprising if it's
} a race, but I don't see how the following fix can be wrong if pipestatus
} is now working: the most complicated case appears to be if it gets
} called multiple times and STAT_CURSH is set, but even there it looks
} like it's idempotent with respect to lastval (assuming the information
} hasn't changed, of course).

Unfortunately (see my patch which crossed in the mail with yours) there
are cases where updating lastval both from update_job() and then again
from print_job() will in fact end up with the wrong lastval.  You have
to run some pipelines both with and without PIPEFAIL set to see this.
 
} I don't know if this potentially makes one of the calls removable.

I *suspect* that the call in update_job() is removable, but I got
inconsistent results from testing it both ways (hence my conclusion
that there's still a signal handling race).


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-24 16:03             ` Peter Stephenson
@ 2013-10-24 16:44               ` Bart Schaefer
  0 siblings, 0 replies; 15+ messages in thread
From: Bart Schaefer @ 2013-10-24 16:44 UTC (permalink / raw)
  To: zsh-workers

On Oct 24,  5:03pm, Peter Stephenson wrote:
} Subject: Re: Strange function/pipestatus behavior, maybe a scope bug?
}
} On Thu, 24 Oct 2013 16:48:29 +0100
} Peter Stephenson <p.stephenson@samsung.com> wrote:
} > There is some bad interaction with completion, at least on my setup
} > (happens from a vanilla setup, too):
} > 
} > % true | true | false
} > % print $pipest<TAB>
} > 0
} > 
} > This doesn't seem to happen with hooks, so completion is failing to save
} > something it should.
} 
} Hmm... does the following have any bad effects?  If we're really only
} after the return value of the function, I don't think it should...

I don't see any reason why that wouldn't be correct.  Possibly the
save/restore of lastval there predates doshfunc() having similar code?


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-24 16:26           ` Bart Schaefer
@ 2013-10-24 16:46             ` Peter Stephenson
  2013-10-25  0:38               ` Bart Schaefer
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Stephenson @ 2013-10-24 16:46 UTC (permalink / raw)
  To: zsh-workers

On Thu, 24 Oct 2013 09:26:33 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> Not exactly.  PIPEFAIL means that if any command anywhere in the pipeline
> fails, then the entire pipeline fails.  *Without* PIPEFAIL, the command
> 
> 	true | (return 3) | true
> 
> shoule have pipefail=(0 3 0) and exit status 0.   *With* PIPEFAIL, it
> should have pipefail=(0 3 0) and exit status 3.  But it does not,
> becuase "true" is a builtin.  Replace with /bin/true and the PIPEFAIL
> exit status is correct.

With your patch (removing mine, which isn't so careful about thisjob),
this seems to work here.

The following adds Ian's test that consistently failed before and no
longer does; any external command probably ought todo the trick, but I
suppose there might still be a small race; maybe the function should
even have a sleep in it.  I've kept awk in case the timing hits a sweet
(or rather sour) spot, but sanitised it so awk can't flail around doing
different things.

Amusingly (or otherwise), running this from the command line was for a
time consistently giving me the "no job control in this shell message",
then it stopped, so this definitely looks like a race.

diff --git a/Test/A05execution.ztst b/Test/A05execution.ztst
index fd13e91..8578016 100644
--- a/Test/A05execution.ztst
+++ b/Test/A05execution.ztst
@@ -189,3 +189,10 @@
 >2048 0 0 0
 F:This test checks for a bug in `$pipestatus[]' handling.  If it breaks then
 F:the bug is still there or it reappeared. See workers-29973 for details.
+
+  externFunc() { awk >/dev/null 2>&1; true; }
+  false | true | false | true | externFunc
+  echo $pipestatus
+0:Check $pipestatus with a known difficult case
+>1 0 1 0 0
+F:This similar test was triggering a reproducible failure with pipestatus.

pws


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Strange function/pipestatus behavior, maybe a scope bug?
  2013-10-24 16:46             ` Peter Stephenson
@ 2013-10-25  0:38               ` Bart Schaefer
  0 siblings, 0 replies; 15+ messages in thread
From: Bart Schaefer @ 2013-10-25  0:38 UTC (permalink / raw)
  To: zsh-workers

On Oct 24,  5:46pm, Peter Stephenson wrote:
}
} Amusingly (or otherwise), running this from the command line was for a
} time consistently giving me the "no job control in this shell message",
} then it stopped, so this definitely looks like a race.

Yes, I fully expect Eitan Adler to pop up in a few weeks having managed
to crash the shell again, which will tell us what function needs to
be made signal safe now ...


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2013-10-25  0:39 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-10-22 18:03 Strange function/pipestatus behavior, maybe a scope bug? Ian F
2013-10-23  8:21 ` Peter Stephenson
2013-10-23 19:49   ` Ian F
2013-10-23 21:14     ` Peter Stephenson
2013-10-24  5:15       ` Bart Schaefer
2013-10-24  8:57         ` Peter Stephenson
2013-10-24 15:48           ` Peter Stephenson
2013-10-24 16:03             ` Peter Stephenson
2013-10-24 16:44               ` Bart Schaefer
2013-10-24 16:39             ` Bart Schaefer
2013-10-24 16:26           ` Bart Schaefer
2013-10-24 16:46             ` Peter Stephenson
2013-10-25  0:38               ` Bart Schaefer
2013-10-24 15:17         ` Jun T.
2013-10-24 16:25           ` Peter Stephenson

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).