From: Bart Schaefer <schaefer@brasslantern.com>
To: zsh-workers@zsh.org
Subject: Re: 5.0.8 regression when waiting for suspended jobs
Date: Wed, 12 Aug 2015 07:58:58 -0700
Message-ID: <150812075858.ZM32741@torch.brasslantern.com>
In-Reply-To: <20150812104351.65a4cbea@pwslap01u.europe.root.pri>

Preface:  I think I've figured out why zwaitjob() does not use the same
logic as waitforpid(); zwaitjob() may be waiting for an entire pipeline,
needing to record status of multiple actual processes which may exit in
any order, and only finish when all the processes are complete.

On Aug 12, 10:43am, Peter Stephenson wrote:
} Subject: Re: 5.0.8 regression when waiting for suspended jobs
}
} On Tue, 11 Aug 2015 16:56:55 -0700
} Bart Schaefer <schaefer@brasslantern.com> wrote:
} > } zsh-5.0.7
} > }  - "wait $!" blocks (looping on repeated wait3() nonzero)
} > } zsh-5.0.8
} > }  - "wait $!" loops but also printing status every time
} > 
} > bin_fg() calls waitforpid() which discovers the job is stopped and goes
} > into a loop calling kill(pid, SIGCONT) to try to get the job to run
} > again.
} > 
} > All of this is exactly the same as in 5.0.7 except that because of the
} > SIGCONT change in workers/35032 we notice the stopped -> continued ->
} > stopped again status change and therefore print the new status
} 
} So you might have thought the right thing to do was note it had been
} stopped immediately, possibly warn the user, and not try to continue it
} again without further user action?  Is that easy?

No, not really.  I suppose we could do something baroque like examine
the rusage cputime, but otherwise the SIGCHLD could arrive at any point.

We could special-case the SIGTT* signals; we obviously know (from the
status that's printed) which signal stopped the job.

} Clearly there's a race in the real world
} where the programme could get SIGTTIN at any time, but in the general
} case (i.e. where a background process got SIGTTIN when the foreground
} was doing something irrelevant) you clearly *don't* want it to continue
} every time.

This only happens for the "wait" command, not for handling the signal
while something else is in the foreground.  There might be some weird
edge case where you could cause it to happen with command substitution
(the only other place waitforpid() is used) but I can't come up with it.

} Do we even understand what the loop with SIGCONT is doing for us?  Under
} what circumstances would this help?

It would seem that it's trying very hard not to have "wait" either fail
immediately (bash behavior) or block forever (ksh behavior).  Doing the
ksh thing makes a bit of sense when "wait" will propagate the signals
(so interrupting wait also interrupts the stopped job).

} Some (other sort of) race where something else (what? Not zsh and
} not the process that's suspended) takes a while to get going, so the
} SIGCONT only succeeds after a few attempts?

Reasoning lost to history, I fear (predates source code control).

} > - "wait %1" -
} > 
} > bin_fg() calls zwaitjob() which does NOT do kill(pid, SIGCONT) instead
} > simply blocking forever waiting for a SIGCHLD that will never arrive.

I actually got this one wrong -- yes, zwaitjob() would block forever if
it reached that signal_suspend() call, but in fact it won't even try if
the job status is STAT_STOPPED.  It just silently returns.

-- 
Barton E. Schaefer


