X-Seq: 31879
From: Bart Schaefer
Message-id: <131023221548.ZM2352@torch.brasslantern.com>
Date: Wed, 23 Oct 2013 22:15:48 -0700
In-reply-to: <20131023221412.5cdecd76@pws-pc.ntlworld.com>
Comments: In reply to Peter Stephenson "Re: Strange function/pipestatus behavior, maybe a scope bug?" (Oct 23, 10:14pm)
References: <20131023092155.5ba8a54b@pwslap01u.europe.root.pri> <20131023221412.5cdecd76@pws-pc.ntlworld.com>
To: zsh-workers@zsh.org
Subject: Re: Strange function/pipestatus behavior, maybe a scope bug?

On Oct 23, 10:14pm, Peter Stephenson wrote:
} Subject: Re: Strange function/pipestatus behavior, maybe a scope bug?
}
} On Wed, 23 Oct 2013 12:49:51 -0700
} Ian F wrote:
} > I suggest this to be a significant progression in identification of this
} > bug, as well as indicating a greatly heightened severity, owing to the 100%
} > incorrect handling of pipeline exit values in the provided, very simple
} > test case (again, substantially different than previous reports I've been
} > able to find).
}
} The problem isn't any lack of severity; the problem is the shell's job
} control code isn't structured to handle this well.  Short of a hack to
} fix some subset of cases, which it would be best to avoid as it's likely
} to make the real problem worse, it probably means a major rewrite of job
} handling: so don't hold your breath.

So I was thinking about this, and it occurred to me that we already do
have a hierarchical job structure -- a Job object contains one or more
Process objects, and may point (by job number, or by PID for the process
group leader) to other jobs in the flat job table, etc.  Thus, if the
problem really was as I analyzed it back in December 2012, it should be
resolvable by storing a copy of the pipestatus in the Job object and
then copying it to the global pipestats [that is not a typo] when the
job is reaped.

Which in turn led me by a circuitous route to discover that although my
earlier analysis wasn't entirely wrong, it was incomplete.  The real
problem behind all the SIGCHLD-handling mumbo-jumbo turns out to be that
the task of deleting the Job object is actually delegated to printjob(),
which was destroying the linked list of Process objects from the Job
object before their statuses had been recorded in pipestats.

It therefore suffices to lift the pipestatus-updating code out of the
update_job() routine and call it from printjob() just before the Job
object is deleted.  There is still a minor redundancy here, in that
update_job() sometimes calls printjob(), which results in
storepipestats() being called twice; but update_job() needs the return
status, sometimes has a different idea of whether the job is in the
foreground, and doesn't always call printjob(), so I don't see any
obvious way around this.
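For reference, the relevant data structures look roughly like this
(simplified from Src/zsh.h from memory -- only the fields that matter
here are shown; the real structs have many more members):

/* Sketch only: abbreviated versions of the structures discussed above. */

struct process {
    struct process *next;       /* next process in the pipeline */
    pid_t pid;
    int status;                 /* raw status from wait(); decoded into
                                 * pipestats[] via the W*() macros */
};

struct job {
    pid_t gleader;              /* process group leader */
    pid_t other;                /* subjob id or subshell pid -- the link
                                 * to other entries in the flat job table */
    int stat;                   /* STAT_DONE, STAT_CURSH, etc. */
    struct process *procs;      /* the list printjob() was freeing before
                                 * the statuses reached pipestats[] */
};

/* Globals behind the $pipestatus parameter: */
int pipestats[MAX_PIPESTATS];   /* per-process exit statuses */
int numpipestats;               /* number of valid entries */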
One remaining bug, which is neither introduced by this patch nor fixed
by it: the very first time one runs a pipeline ending in a shell
function, something goes wrong and the whole pipeline is treated as if
it were in the background.  Here's zsh-5.0.2-248-ge05261f without my
patch:

% true | false | true | false | (){ true }; echo $pipestatus
[1]  + done       true |
       exit 1     false |
       done       true |
       running    false
1
%

In this revision of the shell, repeating the command produces the same
bad output every time.  Here's zsh-5.0.2-225-g225ee4c, which includes
my patch:

% true | false | true | false | (){ true }; echo $pipestatus
[1]  + done       true |
       exit 1     false |
       done       true |
       exit 1     false
(anon): can't set tty pgrp: operation not permitted
1
%

The tty pgrp error may be OS-related rather than a difference between
shell versions; I'm testing on two different hosts.  Now here's the
same command executed again immediately in g225ee4c:

% true | false | true | false | (){ true }; echo $pipestatus
0 1 0 1 0
%

I don't see anything in my patch that ought to cause this change in
behavior.  It would seem that something is failing to be initialized in
the first instance, making the pipeline behave as if backgrounded, and
that in ge05261f it remains in that state.  But what would that be, and
how did my patch affect it?

Oh, one other bug which I don't believe is new: the PIPEFAIL option
doesn't work right if the last command in the pipeline is a builtin.
Maybe that test should be moved into storepipestats() as well (see the
sketch after the patch), in which case the redundancy I mentioned above
might also be eliminated.

diff --git a/Src/jobs.c b/Src/jobs.c
index b9d7a84..82ffdf2 100644
--- a/Src/jobs.c
+++ b/Src/jobs.c
@@ -376,6 +376,30 @@ check_cursh_sig(int sig)
     }
 }
 
+/**/
+int
+storepipestats(Job jn, int inforeground)
+{
+    int i, pipefail = 0, jpipestats[MAX_PIPESTATS];
+    Process p;
+
+    for (p = jn->procs, i = 0; p && i < MAX_PIPESTATS; p = p->next, i++) {
+	jpipestats[i] = ((WIFSIGNALED(p->status)) ?
+			 0200 | WTERMSIG(p->status) :
+			 WEXITSTATUS(p->status));
+	if (jpipestats[i])
+	    pipefail = jpipestats[i];
+    }
+    if (inforeground) {
+	memcpy(pipestats, jpipestats, sizeof(int)*i);
+	if ((jn->stat & STAT_CURSH) && i < MAX_PIPESTATS)
+	    pipestats[i++] = lastval;
+	numpipestats = i;
+    }
+
+    return pipefail;
+}
+
 /* Update status of job, possibly printing it */
 
 /**/
@@ -507,24 +531,16 @@ update_job(Job jn)
 	    return;
 	jn->stat |= (somestopped) ? STAT_CHANGED | STAT_STOPPED :
 	    STAT_CHANGED | STAT_DONE;
-	if (job == thisjob && (jn->stat & STAT_DONE)) {
-	    int i, newlastval = 0;
-	    Process p;
-
-	    for (p = jn->procs, i = 0; p && i < MAX_PIPESTATS; p = p->next, i++) {
-		pipestats[i] = ((WIFSIGNALED(p->status)) ?
-				0200 | WTERMSIG(p->status) :
-				WEXITSTATUS(p->status));
-		if (pipestats[i])
-		    newlastval = pipestats[i];
-	    }
-	    if ((jn->stat & STAT_CURSH) && i < MAX_PIPESTATS) {
-		pipestats[i++] = lastval;
-		if (!lastval && isset(PIPEFAIL))
+	if (jn->stat & STAT_DONE) {
+	    int newlastval = storepipestats(jn, inforeground);
+
+	    if (job == thisjob) {
+		if (jn->stat & STAT_CURSH) {
+		    if (!lastval && isset(PIPEFAIL))
+			lastval = newlastval;
+		} else if (isset(PIPEFAIL))
 		    lastval = newlastval;
-	    } else if (isset(PIPEFAIL))
-		lastval= newlastval;
-	    numpipestats = i;
+	    }
 	}
 	if (!inforeground &&
 	    (jn->stat & (STAT_SUBJOB | STAT_DONE)) == (STAT_SUBJOB | STAT_DONE)) {
@@ -975,6 +991,7 @@ printjob(Job jn, int lng, int synch)
 
     if (skip_print) {
 	if (jn->stat & STAT_DONE) {
+	    (void) storepipestats(jn, job == thisjob);
 	    if (should_report_time(jn))
 		dumptime(jn);
 	    deletejob(jn, 0);
@@ -1105,6 +1122,7 @@ printjob(Job jn, int lng, int synch)
 
     /* delete job if done */
     if (jn->stat & STAT_DONE) {
+	(void) storepipestats(jn, job == thisjob);
 	if (should_report_time(jn))
 	    dumptime(jn);
 	deletejob(jn, 0);
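To make that last idea concrete, here is an untested sketch of what
moving the PIPEFAIL test into storepipestats() might look like.  The
third parameter, fixlastval, is invented for this sketch and is not
part of the patch above:

/* Hypothetical follow-up, NOT part of the patch: fold the PIPEFAIL
 * handling into storepipestats() so update_job() and printjob() can
 * share it.  Untested. */
/**/
int
storepipestats(Job jn, int inforeground, int fixlastval)
{
    int i, pipefail = 0, jpipestats[MAX_PIPESTATS];
    Process p;

    for (p = jn->procs, i = 0; p && i < MAX_PIPESTATS; p = p->next, i++) {
	jpipestats[i] = ((WIFSIGNALED(p->status)) ?
			 0200 | WTERMSIG(p->status) :
			 WEXITSTATUS(p->status));
	if (jpipestats[i])
	    pipefail = jpipestats[i];
    }
    if (inforeground) {
	memcpy(pipestats, jpipestats, sizeof(int)*i);
	if ((jn->stat & STAT_CURSH) && i < MAX_PIPESTATS)
	    pipestats[i++] = lastval;
	numpipestats = i;
    }
    if (fixlastval) {
	/* For a current-shell job, lastval already holds the status of
	 * the final (builtin or function) stage, so only replace a zero
	 * status; otherwise PIPEFAIL takes the rightmost failure. */
	if (jn->stat & STAT_CURSH) {
	    if (!lastval && isset(PIPEFAIL))
		lastval = pipefail;
	} else if (isset(PIPEFAIL))
	    lastval = pipefail;
    }

    return pipefail;
}

update_job() could then call storepipestats(jn, inforeground, job ==
thisjob) and drop its own copy of the PIPEFAIL logic, and the two
printjob() call sites might pass a nonzero fixlastval as well, which is
where the builtin case would presumably get fixed -- but whether that
is actually correct needs testing.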