From: Bart Schaefer <schaefer@brasslantern.com>
To: zsh-workers@zsh.org
Subject: Re: $pipestatus broken?
Date: Sat, 24 Dec 2011 10:23:33 -0800
Message-ID: <111224102333.ZM18731@torch.brasslantern.com>
In-Reply-To: <878vm2uw6i.fsf@ft.bewatermyfriend.org>

On Dec 24, 10:59am, Frank Terbeck wrote:
} Subject: Re: $pipestatus broken?
}
} Bart Schaefer wrote:
} >
} > In the loop case { echo foo | repeat 1; read -E } there is a job table
} > entry for the loop which is the group leader, but a new entry is
} > created for "read -E".  execpline() remembers the previous thisjob as
} > the local "pj" and restores thisjob = pj at line 1619, but by that
} > time it is too late -- waitjobs() has set thisjob = -1 for just long
} > enough for zhandler() to call update_job(), which fails to update the
} > pipestats because thisjob = -1 tells it there is no current job.
} >
} > The following seems to fix it, by telling waitjobs() what the previous
} > job number was so it can be reset immediately.  There may still be a
} > race condition that requires fiddling with signal blocks to make sure
} > thisjob is correct at the time the zhandler() catches the signal, but
} > if so this should at least allow the block/unblock to be localized.
} 
} Hm. I'm having a hard time following what's going on...

A bit more explanation, then; let's use your Test/A04 example:

    : | while read a; do :; done

In execpline(), zsh wants to keep the right side ("while ...") in the
current shell.  So it creates a job entry jobtab[1] for the pipeline,
forks to run ":" on the left side which becomes jobtab[1]->procs, and
enters execwhile() for the right side.  At this point thisjob = 1.

Now it needs to run "read a", so execpline temporarily creates a new
entry jobtab[2], saves pj = thisjob, sets thisjob = 2 and enters
execbuiltin() [whether it's a builtin isn't important to the bug].
When the builtin completes, execpline restores thisjob = pj to make
the loop the current job again.
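
As a rough illustration of that save/restore, here is a toy model in C.
It is not the zsh source: run_inner_command() is a hypothetical stand-in
for the part of execpline() that runs "read a", and only the thisjob
bookkeeping is modelled.

```c
#include <assert.h>

/* Toy model, NOT the zsh source: only the thisjob bookkeeping. */
static int thisjob = -1;        /* index of the "current" job, -1 = none */

/*
 * Hypothetical stand-in for running "read a" inside the loop body.
 * Returns the value thisjob had while the inner command was running,
 * which is what a signal handler would have seen in that window.
 */
static int run_inner_command(int newjob)
{
    int pj;
    int seen;

    assert(newjob >= 0);
    pj = thisjob;               /* remember the loop's job */
    thisjob = newjob;           /* inner command becomes the current job */
    seen = thisjob;             /* ... the builtin would execute here ... */
    thisjob = pj;               /* make the loop the current job again */
    return seen;
}
```

With thisjob = 1 on entry, run_inner_command(2) returns 2 and leaves
thisjob back at 1, so anything that looks at thisjob in between sees
the inner command's job rather than the pipeline's.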

EXCEPT ... at various times including during execbuiltin(), the child
forked off to run ":" may exit and hit the parent with SIGCHLD.  This
invokes zhandler() which reaps the process and calls update_job() to
change status in the job table, including $pipestatus.  update_job()
compares the job that just exited (a process linked to jobtab[1]) with
the current foreground job (which thisjob says is jobtab[2]) and
concludes that a background job has exited.  Therefore it skips the
update of $pipestatus and instead resets it as if there were no pipe.
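
A toy model of that decision, again in C and again not the real code:
toy_update_job(), TOY_MAX_PIPESTATS and numpipestats are names made up
for this sketch, and the "reset as if there were no pipe" step is left
out so that only the skipped update is visible.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_PIPESTATS 8     /* arbitrary size for the sketch */

static int thisjob = -1;        /* index of the "current" job */
static int pipestats[TOY_MAX_PIPESTATS];
static int numpipestats = 0;

/*
 * Hypothetical, simplified stand-in for the check described above.
 * exited_job is the job whose process was just reaped; statuses holds
 * the exit statuses of its n processes.  Returns true if the pipe
 * statuses were recorded, false if the job was taken for a background
 * job and the update was skipped.
 */
static bool toy_update_job(int exited_job, const int *statuses, int n)
{
    int i;

    assert(n >= 0 && n <= TOY_MAX_PIPESTATS);
    if (exited_job != thisjob)
        return false;           /* looks like a background job: skip */
    for (i = 0; i < n; i++)
        pipestats[i] = statuses[i];
    numpipestats = n;
    return true;
}
```

If the SIGCHLD for jobtab[1] arrives while thisjob = 2, the call
toy_update_job(1, ...) returns false and the left-hand side's status is
never recorded; the same call with thisjob = 1 records it.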

When the shell then gets around to waiting for jobtab[1] at the end
of the loop, it has lost the reaped left-hand-side and behaves as if
there is only one job in the pipeline.

What we have is a case where the shell is juggling two "current" jobs
(the loop itself, and the command executed from within the loop) and
it loses track of one of them at a crucial instant.

} With this change, the test I posted in workers-30047 changes a bit.
} Before, there were only lines that either looked like "1" or "0 0". Now
} I'm getting "0 1", too.

Yes, when I actually put your test into my patched sandbox and run
"make check" I also get all three results.  Obviously waitjobs() is
not the only place where the SIGCHLD can sneak in, it's just the only
one I was able to catch with the debugger.  So that patch is not
sufficient (and also probably not necessary if we resolve the race).

I don't know exactly where the "0 1" is coming from -- or rather, it
must be coming from this in update_job() but I'm not sure why:

        if ((jn->stat & STAT_CURSH) && i < MAX_PIPESTATS)
            pipestats[i++] = lastval;

In this case I'm *guessing* lastval = 1 because we're catching the
signal after thisjob = pj but before the actual wait for jobtab[1],
so lastval reflects that "read a" has failed.
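
To make that guess concrete, here is a toy version of the fill-in step.
Everything except the final "if", which mirrors the two lines quoted
above, is an assumption for illustration: toy_fill_pipestats(), the
cursh flag standing in for (jn->stat & STAT_CURSH), and the loop that
copies one status per forked process.

```c
#include <assert.h>

#define TOY_MAX_PIPESTATS 8     /* arbitrary size for the sketch */

/*
 * Hypothetical sketch: copy the exit status of each forked process
 * into out[], then append lastval when part of the pipeline ran in
 * the current shell.  Returns the number of entries written.
 */
static int toy_fill_pipestats(const int *proc_status, int nprocs,
                              int cursh, int lastval, int *out)
{
    int i = 0;

    assert(nprocs >= 0);
    for (int p = 0; p < nprocs && i < TOY_MAX_PIPESTATS; p++)
        out[i++] = proc_status[p];
    /* Same shape as the quoted lines from update_job(). */
    if (cursh && i < TOY_MAX_PIPESTATS)
        out[i++] = lastval;
    return i;
}
```

With one forked process that exited 0, this yields "0 0" when lastval
is 0 and "0 1" when lastval is 1, which would match the new result if
lastval is still holding the failed "read a" at that moment.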


