zsh-workers
 help / color / mirror / code / Atom feed
* long-standing tty related issue: wrapped Emacs not suspended
@ 2018-09-20 12:30 Vincent Lefevre
  2018-09-20 15:43 ` Joey Pabalinas
  2018-09-21 16:57 ` Peter Stephenson
  0 siblings, 2 replies; 17+ messages in thread
From: Vincent Lefevre @ 2018-09-20 12:30 UTC (permalink / raw)
  To: zsh-workers

A long-standing tty related issue...

I often run Emacs in background:

  emacs &

This makes sense as it has its own X interface. But I sometimes do
this when I forgot that I was in a SSH session with no X forwarding,
and this makes the terminal unusable since Emacs is not suspended
and both zsh and Emacs try to get terminal input.

Now I've noticed that when I run the Emacs binary directly, Emacs
is suspended as expected. But when Emacs is wrapped in a function,
it is not suspended. After "zsh -f":

zira% e() { emacs -nw "$@"; }
zira% e &

I cannot quit Emacs or get the zsh prompt. I need to kill the
terminal.

I've tested the same thing with other shells: dash behaves like
zsh, and bash, ksh93 and mksh immediately terminate.

The same thing happens when using a sh script instead of a function.

Is this a bug? In any case, can this be solved?

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-20 12:30 long-standing tty related issue: wrapped Emacs not suspended Vincent Lefevre
@ 2018-09-20 15:43 ` Joey Pabalinas
  2018-09-20 23:10   ` Vincent Lefevre
  2018-09-21 16:57 ` Peter Stephenson
  1 sibling, 1 reply; 17+ messages in thread
From: Joey Pabalinas @ 2018-09-20 15:43 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: zsh-workers, Joey Pabalinas

[-- Attachment #1: Type: text/plain, Size: 2494 bytes --]

On Thu, Sep 20, 2018 at 02:30:05PM +0200, Vincent Lefevre wrote:
> A long-standing tty related issue...
> 
> I often run Emacs in background:
> 
>   emacs &
> 
> This makes sense as it has its own X interface. But I sometimes do
> this when I forgot that I was in a SSH session with no X forwarding,
> and this makes the terminal unusable since Emacs is not suspended
> and both zsh and Emacs try to get terminal input.
> 
> Now I've noticed that when I run the Emacs binary directly, Emacs
> is suspended as expected. But when Emacs is wrapped in a function,
> it is not suspended. After "zsh -f":
> 
> zira% e() { emacs -nw "$@"; }
> zira% e &
> 
> I cannot quit Emacs or get the zsh prompt. I need to kill the
> terminal.
> 
> I've tested the same thing with other shells: dash behaves like
> zsh, and bash, ksh93 and mksh immediately terminate.
> 
> The same thing happens when using a sh script instead of a function.
> 
> Is this a bug? In any case, can this be solved?

So there is a bit of subtlety at work here cause your issues.

Note the difference between these two `pstree $$` outputs:

The broken case:
$ f() { emacs -nw "$@"; }; f &
zsh─┬─pstree
    ├─zsh───emacs───{emacs}

A working alternative:
$ f() { emacs -nw "$@" & }; f
zsh─┬─pstree
    ├─emacs───{emacs}

In the broken one, your terminal control commands like fg are being sent
to the parent zsh instance, which resumes the zsh subshell. I don't know
the specifics of how emacs' internals work, but usually editors tend to
suspend themselves when they can no longer read/write to or from the
foreground terminal, and Long story short, emacs is stuck suspended and
nested in a subshell, unable to receive your resume commands as the
subshell is intercepting them.

I don't actually know if you can resume emacs in that state, but anyway the
simpler workaround is to use something like `f() { emacs -nw "$@" & }; f`
where having the & inside the function itself avoids the extra subshell and
gives you expected behavior.

I don't know if I would really call this a bug, as pretty much all
common shells these have this same behavior or some other similar
way of handling like simply exiting when put in this odd situation.

I am unsure if this due to historical cruft with how terminals
are implemented these days, so maybe someone else can shed some light
on the reason this is the case.

-- 
Cheers,
Joey Pabalinas

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-20 15:43 ` Joey Pabalinas
@ 2018-09-20 23:10   ` Vincent Lefevre
  0 siblings, 0 replies; 17+ messages in thread
From: Vincent Lefevre @ 2018-09-20 23:10 UTC (permalink / raw)
  To: zsh-workers; +Cc: Joey Pabalinas

On 2018-09-20 05:43:42 -1000, Joey Pabalinas wrote:
> I don't actually know if you can resume emacs in that state, but anyway the
> simpler workaround is to use something like `f() { emacs -nw "$@" & }; f`
> where having the & inside the function itself avoids the extra subshell and
> gives you expected behavior.

Well, this is not a workaround in my case, since I do not necessarily
want the "-nw" and the "&" (that was just for testing). But thanks to
your explanations, I could find a workaround:

1. Instead of using a shell function, use a script.

2. And in the script, replacing

  emacs "$@"

by

  exec emacs "$@"

seems to avoid the problem, because the subshell is replaced by
emacs (I can do this because emacs is the last command).

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-20 12:30 long-standing tty related issue: wrapped Emacs not suspended Vincent Lefevre
  2018-09-20 15:43 ` Joey Pabalinas
@ 2018-09-21 16:57 ` Peter Stephenson
  2018-09-21 23:14   ` Joey Pabalinas
                     ` (2 more replies)
  1 sibling, 3 replies; 17+ messages in thread
From: Peter Stephenson @ 2018-09-21 16:57 UTC (permalink / raw)
  To: zsh-workers

On Thu, 20 Sep 2018 14:30:05 +0200
Vincent Lefevre <vincent@vinc17.net> wrote:
> is suspended as expected. But when Emacs is wrapped in a function,
> it is not suspended. After "zsh -f":
> 
> zira% e() { emacs -nw "$@"; }
> zira% e &
> 
> I cannot quit Emacs or get the zsh prompt. I need to kill the
> terminal.

I think this one *is* the same basic issue that Thilo was talking about.
As you note, it's not new.  (With the latest code I can only get this
the way Thilo did, by suspending, putting it into the background, then
into the foreground, then I can only get things to work again with kill
-CONT from elsewhere and a bit of massaging to kill of the remaining
goo.   The code shown above actually works OK for me with a bg and fg.)

Joey's explanation --- that there's another copy of the shell in play
that's getting in the way of job control --- is the basic issue, and the
"exec" workaround is what I would have recommended trying.

However, there are a couple of interesting zsh-specific issues here.


The fact that another instance of the shell is created on the ^Z is
actually a special feature for allowing you to suspend builtins,
designed for functions and pipelines with more processing in than this
--- typically loops over external processes.  The bog-standard shell
thing to do is simply suspend the external process, and let the rest of
the shell code take care of itself, which almost certainly isn't what
you want if the function is doing something sophsiticated.

Unfortunately zsh's enhancement is not doing you any favours in this
case since you don't actually want any more shell code to run.  The
forked shell and the emacs go into different process groups, and when
you pass them around between background and foreground they lose track
of who's doing what where.  It's not clear if this is a bug because this
is entirely untrodden ground but it might be possible to get this to
work with enough fiddling.  My first simple-minded go didn't work
(though was at least having effects on the right bits of the code, just
not the effects I wanted).


The other point is that in a lot of cases zsh will exec the final
external programme anyway.  However, it doesn't do it in the case of
functions since it dosn't have enough knowledge of what needs to happen
when the programme called by the function exits.  With enough safeguards
this could probably be got to work --- though this is definitely an
optimisation rather than a bug fix and there is the danger of breaking
things.

Passing through the flag saying if this is the last chunk of code in the
list being executed by the function and its parents is the most obvious
necessary test, but I think most of the infrastructure for this already
exists.  For example, if you run

{ :; sleep 20; } &

you should find there's no additional shell instance when the sleep
is running (up to shell state that might get in the way --- I get an
extra shell when running from the top-level shell in the window but not
if I start a new one, for reasons that escape me completely).

The other thing I can think of is we'd need to test for the presence of
traps that might get called on function exit, including EXIT and DEBUG.
This is interesting because they might not actually be present when the
function starts, so we'd need a special version of the flag saying only
exec if there are no relevant traps at the last command to be executed.

I can't think of any other gotchas but there could well be some.

It's not clear this is worth the bother.

pws

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-21 16:57 ` Peter Stephenson
@ 2018-09-21 23:14   ` Joey Pabalinas
  2018-09-22  1:17   ` Bart Schaefer
  2018-09-24 19:51   ` Peter Stephenson
  2 siblings, 0 replies; 17+ messages in thread
From: Joey Pabalinas @ 2018-09-21 23:14 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: zsh-workers, Joey Pabalinas

[-- Attachment #1: Type: text/plain, Size: 885 bytes --]

On Fri, Sep 21, 2018 at 05:57:40PM +0100, Peter Stephenson wrote:
> Passing through the flag saying if this is the last chunk of code in the
> list being executed by the function and its parents is the most obvious
> necessary test, but I think most of the infrastructure for this already
> exists.  For example, if you run
> 
> { :; sleep 20; } &
> 
> you should find there's no additional shell instance when the sleep
> is running (up to shell state that might get in the way --- I get an
> extra shell when running from the top-level shell in the window but not
> if I start a new one, for reasons that escape me completely).

I found this statement very weird, and dug through the code for a couple hours
myself; sadly, without any luck figuring out why, either.

Definitely the strangest part of this entire thread so far for me.

-- 
Cheers,
Joey Pabalinas

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-21 16:57 ` Peter Stephenson
  2018-09-21 23:14   ` Joey Pabalinas
@ 2018-09-22  1:17   ` Bart Schaefer
  2018-09-22  5:51     ` Joey Pabalinas
  2018-09-24 19:51   ` Peter Stephenson
  2 siblings, 1 reply; 17+ messages in thread
From: Bart Schaefer @ 2018-09-22  1:17 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: zsh-workers

On Fri, Sep 21, 2018 at 9:57 AM, Peter Stephenson
<p.w.stephenson@ntlworld.com> wrote:
> On Thu, 20 Sep 2018 14:30:05 +0200
> Vincent Lefevre <vincent@vinc17.net> wrote:
>> is suspended as expected. But when Emacs is wrapped in a function,
>> it is not suspended. After "zsh -f":
>>
>> zira% e() { emacs -nw "$@"; }
>> zira% e &
>>
>> I cannot quit Emacs or get the zsh prompt. I need to kill the
>> terminal.
>
> I think this one *is* the same basic issue that Thilo was talking about.

I think maybe you're mis-reading.  Note that in Vincent's case
backgrounding the function causes emacs to FAIL to stop (or so he
says), not to stop forever and need a SIGCONT.

When Vincent says:
>> I've tested the same thing with other shells: dash behaves like
>> zsh, and bash, ksh93 and mksh immediately terminate.

I'm fairly sure this is because bash, ksh93, and mksh are ignoring the
TTIN signal.  In that case the shell receives a read error on standard
input instead of being stopped, which causes those shells to exit.

Zsh and dash on the other hand allow the TTIN to stop the shell, which
deadlocks everything; either emacs is running but not paying attention
to the terminal because it is trying to be an X application, or it is
also stopped.

I have a vague recollection that this is an emacs-specific problem, in
that it actively attempts to grab control of the terminal and
sometimes succeeds (race with the parent shell?).

FWIW I can't get what Vincent describes to happen at all with the
above example and the current git checkout, even after assorted
bg/fg/^Z/etc.  But the system on which I can currently test doesn't
have an X-enabled emacs.

I *can* get both the shell and emacs to be suspended even though the
parent thinks they are running in the background, by starting the "e"
function in the foreground and then doing ^Z.

% jobs
[1]  - running    e
% ps x
  PID TTY      STAT   TIME COMMAND
21333 ?        S      0:00 sshd: barts@pts/5
21334 pts/5    Ss     0:00 Src/zsh
21446 pts/5    T      0:00 emacs -nw
21453 pts/5    T      0:00 Src/zsh
21569 pts/5    R+     0:00 ps x

At this point if I bring "e" into the foreground explicitly, the shell
ends up blocked:

% fg
fg: no current job
% jobs
[1]  - running    e
% fg %1
[1]  - running    e
(deadlock)

If next I "kill -CONT 21446", emacs briefly grabs the terminal, then
goes back to being stopped and the parent shell regains control.  It
still believes "e" is running.

If instead I "kill -CONT 21453", that subshell exits, the parent shell
gets back control and deletes the "e" job, and emacs is orphaned
(still stopped, and unable to continue).

> However, there are a couple of interesting zsh-specific issues here.
>
> The fact that another instance of the shell is created on the ^Z is
> actually a special feature for allowing you to suspend builtins,
> designed for functions and pipelines with more processing

Indeed, a potential problem is that emacs is a child of the parent
shell, not a child of the subshell spawned to suspend the "e"
function, so the subshell can't wait() for it.  The parent shell now
has to keep track of two jobs for the backgrounded function, and I
think it's losing track of one of them.

Also, the parent shell should probably be sending a SIGCONT to all
those jobs whenever they are fg'd, even if it believes they are
already running.  SIGCONT is harmless for a job that is not stopped,
and trusting the job table idea of the process state is an unnecessary
optimization.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-22  1:17   ` Bart Schaefer
@ 2018-09-22  5:51     ` Joey Pabalinas
  2018-09-22 18:54       ` Vincent Lefevre
  0 siblings, 1 reply; 17+ messages in thread
From: Joey Pabalinas @ 2018-09-22  5:51 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Peter Stephenson, zsh-workers, Joey Pabalinas

[-- Attachment #1: Type: text/plain, Size: 1391 bytes --]

On Fri, Sep 21, 2018 at 06:17:47PM -0700, Bart Schaefer wrote:
> FWIW I can't get what Vincent describes to happen at all with the
> above example and the current git checkout, even after assorted
> bg/fg/^Z/etc.  But the system on which I can currently test doesn't
> have an X-enabled emacs.

This did confuse me a bit earlier in Peter's email, as I also had this
exact behavior on 5.5.1.r143.g225b35c9070f94cf79 where -CONT didn't even
seem to work regardless of which child process I sent it to. I had just
chalked it up to one of my weird configs, but I guess this is pretty
much happening to everyone running the development versions?

> Also, the parent shell should probably be sending a SIGCONT to all
> those jobs whenever they are fg'd, even if it believes they are
> already running.  SIGCONT is harmless for a job that is not stopped,
> and trusting the job table idea of the process state is an unnecessary
> optimization.

Interestingly enough I did once run into a program that used a SIGCONT
handler, reset to default disposition in SIGTSTP handler, for some internal
callback-ish stuff. My guess was it assumed SIGCONT to be a "safe" sentinel
signal (while completely forgetting about its uncatchable brother SIGSTOP),
but cases like that are mostly just broken curiosities and definitely not
worth worrying about :)

-- 
Cheers,
Joey Pabalinas

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-22  5:51     ` Joey Pabalinas
@ 2018-09-22 18:54       ` Vincent Lefevre
  2018-09-22 21:27         ` Joey Pabalinas
  0 siblings, 1 reply; 17+ messages in thread
From: Vincent Lefevre @ 2018-09-22 18:54 UTC (permalink / raw)
  To: zsh-workers; +Cc: Joey Pabalinas, Bart Schaefer, Peter Stephenson

On 2018-09-21 19:51:48 -1000, Joey Pabalinas wrote:
> Interestingly enough I did once run into a program that used a SIGCONT
> handler, reset to default disposition in SIGTSTP handler, for some internal
> callback-ish stuff. My guess was it assumed SIGCONT to be a "safe" sentinel
> signal (while completely forgetting about its uncatchable brother SIGSTOP),
> but cases like that are mostly just broken curiosities and definitely not
> worth worrying about :)

A SIGCONT handler can be useful for programs running under
Grid Engine (together with a SIGUSR1 handler, as this signal
is sent shortly before SIGSTOP).

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-22 18:54       ` Vincent Lefevre
@ 2018-09-22 21:27         ` Joey Pabalinas
  2018-09-23  7:21           ` Vincent Lefevre
  0 siblings, 1 reply; 17+ messages in thread
From: Joey Pabalinas @ 2018-09-22 21:27 UTC (permalink / raw)
  To: zsh-workers, Joey Pabalinas, Bart Schaefer, Peter Stephenson

[-- Attachment #1: Type: text/plain, Size: 697 bytes --]

On Sat, Sep 22, 2018 at 08:54:48PM +0200, Vincent Lefevre wrote:
> A SIGCONT handler can be useful for programs running under
> Grid Engine (together with a SIGUSR1 handler, as this signal
> is sent shortly before SIGSTOP).

This assumes that you will always receive USR1 before every STOP, but
since STOP is uncatchable like KILL, this may not always happen. Imagine
an external process sendind it without sending USR1 first, and suddenly your
assumption is no longer true.

Granted, edge cases like this are rare, but they do happen. Signals
are things that should be given a bit more consideration than usual when
dealing with things like handlers.

-- 
Cheers,
Joey Pabalinas

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-22 21:27         ` Joey Pabalinas
@ 2018-09-23  7:21           ` Vincent Lefevre
  0 siblings, 0 replies; 17+ messages in thread
From: Vincent Lefevre @ 2018-09-23  7:21 UTC (permalink / raw)
  To: zsh-workers; +Cc: Joey Pabalinas, Bart Schaefer, Peter Stephenson

On 2018-09-22 11:27:32 -1000, Joey Pabalinas wrote:
> On Sat, Sep 22, 2018 at 08:54:48PM +0200, Vincent Lefevre wrote:
> > A SIGCONT handler can be useful for programs running under
> > Grid Engine (together with a SIGUSR1 handler, as this signal
> > is sent shortly before SIGSTOP).
> 
> This assumes that you will always receive USR1 before every STOP, but
> since STOP is uncatchable like KILL, this may not always happen. Imagine
> an external process sendind it without sending USR1 first, and suddenly your
> assumption is no longer true.

Under the Grid Engine usage, the user should make sure that this never
happens, or this could be seen as a usage error (just like sending a
SIGKILL for no reasons would also be an error).

Now, if the program was started by an interactive shell, I assume
that a SIGCONT can be fine: the handler should be able to detect
that it is not under the specific SIGUSR1 / SIGCONT usage.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-21 16:57 ` Peter Stephenson
  2018-09-21 23:14   ` Joey Pabalinas
  2018-09-22  1:17   ` Bart Schaefer
@ 2018-09-24 19:51   ` Peter Stephenson
  2018-09-25 10:37     ` Peter Stephenson
  2 siblings, 1 reply; 17+ messages in thread
From: Peter Stephenson @ 2018-09-24 19:51 UTC (permalink / raw)
  To: zsh-workers

On Fri, 21 Sep 2018 17:57:40 +0100
Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> On Thu, 20 Sep 2018 14:30:05 +0200
> Vincent Lefevre <vincent@vinc17.net> wrote:
> > is suspended as expected. But when Emacs is wrapped in a function,
> > it is not suspended. After "zsh -f":
> > 
> > zira% e() { emacs -nw "$@"; }
> > zira% e &
> > 
> > I cannot quit Emacs or get the zsh prompt. I need to kill the
> > terminal.  
> 
> I think this one *is* the same basic issue that Thilo was talking about.
> As you note, it's not new.  (With the latest code I can only get this
> the way Thilo did, by suspending, putting it into the background, then
> into the foreground, then I can only get things to work again with kill
> -CONT from elsewhere and a bit of massaging to kill of the remaining
> goo.   The code shown above actually works OK for me with a bg and fg.)

This fixes this variant of the bug for me.  However, there is a lot of
confusion over who is seeing what variant, so it probably doesn't fix
everything --- I have mostly modified code associated with bg and fg
plus some quite well-hidden stuff to do with superjobs.  If you don't
know what a superjob is you should seriously consider keeping it that
way.

Although the changes look individually sane, it's perfectly possible
there are undesirable knock-on effects.  I intend to commit this anyway
as otherwise we won't find out.  If we are going to find out, it needs
to be subjected to some interactive job control usage *now*, *NOT* after
the next release, which doesn't work as a procedure.

All three parts were needed:

- Send SIGCONT when attaching a subjob to the TTY.  This is because
its stoppedness is invisible to the user, who is (unknowlingly) only
directly manipulating the superjob, so the shell has to give a helping
hand.  Comment added.

- Wait for subjob of superjob if waiting for the superjob, else the "fg"
makes the top-level shell continue immediately, so both jobs are running
at once.  (The wait for the superjob returns when that exits, even
though the only process, the forked shell, is stopped at this point.
It's not clear to me this is correct --- see note on makerunning()
below.  I think with this behaviour waiting for the subjob may be
superfluous, but I think it's correct anyway as the whole lot is
considered to be in the forground.)

- Mark a job as no longer STAT_STOPPED if we send it SIGCONT --- do this
generically from within the killjb() where it's already specialised for
SIGCONT and also in the non-superjob branch, hence removing a couple of
cases that did this outside klllb().  Why in goodness name did we not do
this already?

Note I'm not 100% convinced that makerunning() is consistent with the
way killjb() doesn't SIGCONT the forked subshell --- so the superjob
isn't actually fully running and I'm not sure how it should be flagged.
I'm assuming that any change to this needs to go in makerunning(), which
has been made aware of superjobs, so I've used that consistently to
update the status regardless of this query.

pws

diff --git a/Src/jobs.c b/Src/jobs.c
index 2d58319a8..c399f1d3e 100644
--- a/Src/jobs.c
+++ b/Src/jobs.c
@@ -1569,10 +1569,8 @@ zwaitjob(int job, int wait_cmd)
 
            errflag = 0; */
 
-	    if (subsh) {
+	    if (subsh)
 		killjb(jn, SIGCONT);
-		jn->stat &= ~STAT_STOPPED;
-	    }
 	    if (jn->stat & STAT_SUPERJOB)
 		if (handle_sub(jn - jobtab, 1))
 		    break;
@@ -1590,6 +1588,17 @@ zwaitjob(int job, int wait_cmd)
     return 0;
 }
 
+static void waitonejob(Job jn)
+{
+    if (jn->procs || jn->auxprocs)
+	zwaitjob(jn - jobtab, 0);
+    else {
+	deletejob(jn, 0);
+	pipestats[0] = lastval;
+	numpipestats = 1;
+    }
+}
+
 /* wait for running job to finish */
 
 /**/
@@ -1599,13 +1608,10 @@ waitjobs(void)
     Job jn = jobtab + thisjob;
     DPUTS(thisjob == -1, "No valid job in waitjobs.");
 
-    if (jn->procs || jn->auxprocs)
-	zwaitjob(thisjob, 0);
-    else {
-	deletejob(jn, 0);
-	pipestats[0] = lastval;
-	numpipestats = 1;
-    }
+    waitonejob(jn);
+    if (jn->stat & STAT_SUPERJOB)
+	waitonejob(jobtab + jn->other);
+	
     thisjob = -1;
 }
 
@@ -2294,11 +2300,8 @@ bin_fg(char *name, char **argv, Options ops, int func)
 	    Process p;
 
 	    if (findproc(pid, &j, &p, 0)) {
-		if (j->stat & STAT_STOPPED) {
+		if (j->stat & STAT_STOPPED)
 		    retval = (killjb(j, SIGCONT) != 0);
-		    if (retval == 0)
-			makerunning(j);
-		}
 		if (retval == 0) {
 		    /*
 		     * returns 0 for normal exit, else signal+128
@@ -2404,9 +2407,17 @@ bin_fg(char *name, char **argv, Options ops, int func)
 			((!jobtab[job].procs->next ||
 			  (jobtab[job].stat & STAT_SUBLEADER) ||
 			  killpg(jobtab[job].gleader, 0) == -1)) &&
-			jobtab[jobtab[job].other].gleader)
+			jobtab[jobtab[job].other].gleader) {
 			attachtty(jobtab[jobtab[job].other].gleader);
-		    else
+			/*
+			 * In case stopped by putting in background.
+			 * Usually that's visible to the user, who
+			 * can restart, but with a superjob this is done
+			 * behind the scenes, so do it here.  Should
+			 * be harmless if not needed.
+			 */
+			stopped = 1;
+		    } else
 			attachtty(jobtab[job].gleader);
 		}
 	    }
diff --git a/Src/signals.c b/Src/signals.c
index 26d88abc2..f294049c2 100644
--- a/Src/signals.c
+++ b/Src/signals.c
@@ -782,7 +782,20 @@ killjb(Job jn, int sig)
 		    if (kill(pn->pid, sig) == -1 && errno != ESRCH)
 			err = -1;
 
-                return err;
+		/*
+		 * The following marks both the superjob and subjob
+		 * as running, as done elsewhere.
+		 *
+		 * It's not entirely clear to me what the right way
+		 * to flag the status of a still-pausd final process,
+		 * as handled above, but we should be cnsistent about
+		 * this inside makerunning() rather than doing anything
+		 * special here.
+		 */
+		if (err != -1)
+		    makerunning(jn);
+
+		return err;
             }
             if (killpg(jobtab[jn->other].gleader, sig) == -1 && errno != ESRCH)
 		err = -1;
@@ -792,8 +805,11 @@ killjb(Job jn, int sig)
 
 	    return err;
         }
-        else
-	    return killpg(jn->gleader, sig);
+        else {
+	    err = killpg(jn->gleader, sig);
+	    if (sig == SIGCONT && err != -1)
+		makerunning(jn);
+	}
     }
     for (pn = jn->procs; pn; pn = pn->next) {
 	/*

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-24 19:51   ` Peter Stephenson
@ 2018-09-25 10:37     ` Peter Stephenson
  2018-09-25 17:04       ` Daniel Shahaf
  2018-09-26  5:47       ` Bart Schaefer
  0 siblings, 2 replies; 17+ messages in thread
From: Peter Stephenson @ 2018-09-25 10:37 UTC (permalink / raw)
  To: zsh-workers

Just for clarity, what I'm actually testing is the following.  I'm using
vi as a simpler alternative to emacs --- as Bart noted, emacs may have
additional ishoos of its own.

% foo() { vi; }
% foo
^Z
% bg
# Not sure why bg loses the current job number but assuming this is
# standard behaviour...
% fg %1

This should cleanly restore a full-window vi, which it does after the
last patch, but not before.

Now read on.

Or not.

On Mon, 24 Sep 2018 20:51:06 +0100
Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> - Send SIGCONT when attaching a subjob to the TTY.  This is because
> its stoppedness is invisible to the user, who is (unknowlingly) only
> directly manipulating the superjob, so the shell has to give a helping
> hand.  Comment added.

After a bit more research: we have code to SIGTSTP the superjob if the
subjob does get stopped (as it does here because of the SIGTTOU), but it
doesn't trigger because it expects the job to be STAT_STOPPED already,
whereas that's only going to happen further down the file.  As it *is*
going to happen I think the use of the somestopped flag is good enough,
but as we return immediately we now need to change the job state here.
(The superjob has some code to put itself back to sleep but I hope the
interaction is benign.)

Also, we need to mark the superjob (not just the subjob) as stopped at
this point because if it's just the forked shell it's already suspended
so there's no other place to hang that on.

With that in place, the "stopped = 1" I added to fg/bg to send a
possibly unnecessary SIGCONT isn't needed as the records are correct.

(Hope you're enjoying this.)

One remaining glitch is there's no message that the backgrounded job as
been stopped again because we suppress messages for the job.  This isn't
particularly consistent given we do display the effect of the bg and the
fg.  Not sure what a consistent way forward would be here.

> - Wait for subjob of superjob if waiting for the superjob, else the "fg"
> makes the top-level shell continue immediately, so both jobs are running
> at once.  (The wait for the superjob returns when that exits, even
> though the only process, the forked shell, is stopped at this point.
> It's not clear to me this is correct --- see note on makerunning()
> below.  I think with this behaviour waiting for the subjob may be
> superfluous, but I think it's correct anyway as the whole lot is
> considered to be in the forground.)

Logically, we should wait for the subjob first --- its exit should
provoke the superjob into further action.  So I've swapped it round.

As before, the only way of teasing out problems will be to try this out,
so I'll commit it.

pws

diff --git a/Src/jobs.c b/Src/jobs.c
index c399f1d..f64b46b 100644
--- a/Src/jobs.c
+++ b/Src/jobs.c
@@ -459,19 +459,29 @@ update_job(Job jn)
 	    jn->ty = (struct ttyinfo *) zalloc(sizeof(struct ttyinfo));
 	    gettyinfo(jn->ty);
 	}
-	if (jn->stat & STAT_STOPPED) {
-	    if (jn->stat & STAT_SUBJOB) {
-		/* If we have `cat foo|while read a; grep $a bar;done'
-		 * and have hit ^Z, the sub-job is stopped, but the
-		 * super-job may still be running, waiting to be stopped
-		 * or to exit. So we have to send it a SIGTSTP. */
-		int i;
-
-		if ((i = super_job(job)))
-		    killpg(jobtab[i].gleader, SIGTSTP);
+	if (jn->stat & STAT_SUBJOB) {
+	    /* If we have `cat foo|while read a; grep $a bar;done'
+	     * and have hit ^Z, the sub-job is stopped, but the
+	     * super-job may still be running, waiting to be stopped
+	     * or to exit. So we have to send it a SIGTSTP. */
+	    int i;
+
+	    if ((i = super_job(job))) {
+		killpg(jobtab[i].gleader, SIGTSTP);
+		/*
+		 * Job may already be stopped if it consists of only the
+		 * forked shell waiting for the subjob -- so mark as
+		 * stopped immediately.  This ensures we send it (and,
+		 * crucially, the subjob, as the visible job used with
+		 * fg/bg is the superjob) a SIGCONT if we need it.
+		 */
+		jobtab[i].stat |= STAT_CHANGED | STAT_STOPPED;
 	    }
+	    jn->stat |= STAT_CHANGED | STAT_STOPPED;
 	    return;
 	}
+	if (jn->stat & STAT_STOPPED)
+	    return;
     }
     {                   /* job is done or stopped, remember return value */
 	lastval2 = val;
@@ -1608,10 +1618,11 @@ waitjobs(void)
     Job jn = jobtab + thisjob;
     DPUTS(thisjob == -1, "No valid job in waitjobs.");
 
-    waitonejob(jn);
+    /* If there's a subjob, it should finish first. */
     if (jn->stat & STAT_SUPERJOB)
 	waitonejob(jobtab + jn->other);
-	
+    waitonejob(jn);
+
     thisjob = -1;
 }
 
@@ -2407,17 +2418,9 @@ bin_fg(char *name, char **argv, Options ops, int func)
 			((!jobtab[job].procs->next ||
 			  (jobtab[job].stat & STAT_SUBLEADER) ||
 			  killpg(jobtab[job].gleader, 0) == -1)) &&
-			jobtab[jobtab[job].other].gleader) {
+			jobtab[jobtab[job].other].gleader)
 			attachtty(jobtab[jobtab[job].other].gleader);
-			/*
-			 * In case stopped by putting in background.
-			 * Usually that's visible to the user, who
-			 * can restart, but with a superjob this is done
-			 * behind the scenes, so do it here.  Should
-			 * be harmless if not needed.
-			 */
-			stopped = 1;
-		    } else
+		    else
 			attachtty(jobtab[job].gleader);
 		}
 	    }

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-25 10:37     ` Peter Stephenson
@ 2018-09-25 17:04       ` Daniel Shahaf
  2018-09-26  5:47       ` Bart Schaefer
  1 sibling, 0 replies; 17+ messages in thread
From: Daniel Shahaf @ 2018-09-25 17:04 UTC (permalink / raw)
  To: Peter Stephenson, zsh-workers

Peter Stephenson wrote on Tue, 25 Sep 2018 11:37 +0100:
> As before, the only way of teasing out problems will be to try this out,
> so I'll commit it.

I rebuilt + installed HEAD yesterday and will do so again once you
commit this.  (I see you've pushed 43535 but I'll wait for 43543.)

Incidentally, I saw some issues with 'svn commit' spawning gpg-agent
spawning pinentry in the terminal.  The symptom was that pinentry
wouldn't start and had to be killed from another shell, or if it did
start, the terminal would become unusable, even sequences of ^\
(SIGQUIT) + blind 'reset\n' wouldn't work, only half the keystrokes
would seem to reach the shell.

That was with 5.6.2; I'll report in if I see this post 43543.

I wasn't able to reproduce this by running 'pinentry' at the prompt and
typing 'getpin\n' into its stdin.  (Typing, not echoing; I suppose it
fails unless isatty(stdin).)

This was with svn trunk, but I don't recall any changes on svn's side.

Cheers,

Daniel

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-25 10:37     ` Peter Stephenson
  2018-09-25 17:04       ` Daniel Shahaf
@ 2018-09-26  5:47       ` Bart Schaefer
  2018-09-26 13:41         ` Peter Stephenson
                           ` (2 more replies)
  1 sibling, 3 replies; 17+ messages in thread
From: Bart Schaefer @ 2018-09-26  5:47 UTC (permalink / raw)
  To: zsh-workers

On Tue, Sep 25, 2018 at 3:38 AM Peter Stephenson
<p.stephenson@samsung.com> wrote:
>
> As before, the only way of teasing out problems will be to try this out,
> so I'll commit it.

Minor nit -- if a normal background job gets stopped by a signal, the
shell reports it:

% e &
[1] 2936
%
[1]  + suspended (tty output)  e

But if you ^Z a function and then background it, the parent still
believes it to be running even though it has instantly stopped:

% e
zsh: suspended  e
% bg
[1]  + continued  e
% jobs
[1]  + running    e
% ps a
  PID   TT  STAT      TIME COMMAND
 2813 s003  S      0:00.09 Src/zsh -f
 2906 s003  T      0:00.21 emacs
 2908 s003  T      0:00.00 Src/zsh -f
 2917 s003  R+     0:00.00 ps a

The right thing happens if you "fg" it again (it gets sent a SIGCONT,
and the parent waits) so this isn't critical, but it's potentially
confusing.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-26  5:47       ` Bart Schaefer
@ 2018-09-26 13:41         ` Peter Stephenson
  2018-09-26 15:17         ` Peter Stephenson
       [not found]         ` <20180926161708.39be6402@camnpupstephen.cam.scsc.local>
  2 siblings, 0 replies; 17+ messages in thread
From: Peter Stephenson @ 2018-09-26 13:41 UTC (permalink / raw)
  To: zsh-workers

On Tue, 25 Sep 2018 22:47:44 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Tue, Sep 25, 2018 at 3:38 AM Peter Stephenson
> But if you ^Z a function and then background it, the parent still
> believes it to be running even though it has instantly stopped:

Yes, I mentioned that problem --- it's not new, at least 5.5 behaved
that way, and it's related to the fact that we have STAT_NOPRINT
associated with the subjob, which is what's actually being suspended.
I'm not sure how to fix this without unleashing a torrent of new
messages, but it's possible the key is ensuring the superjob and subjob
are aligned and printing a message if we update both --- since the point
here is to pretend to the user that they're acting as a single job.
(They bloody aren't, I can tell you...)

pws


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
  2018-09-26  5:47       ` Bart Schaefer
  2018-09-26 13:41         ` Peter Stephenson
@ 2018-09-26 15:17         ` Peter Stephenson
       [not found]         ` <20180926161708.39be6402@camnpupstephen.cam.scsc.local>
  2 siblings, 0 replies; 17+ messages in thread
From: Peter Stephenson @ 2018-09-26 15:17 UTC (permalink / raw)
  To: zsh-workers

On Tue, 25 Sep 2018 22:47:44 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
But if you ^Z a function and then background it, the parent still
> believes it to be running even though it has instantly stopped:
> 
> % e
> zsh: suspended  e
> % bg
> [1]  + continued  e
> % jobs
> [1]  + running    e

To get this to work, I had to hack printjobs().  There's another hack in
printjobs to do pretty much the reverse --- if you're in the superjob
and the last job is suspended, treat it as running, because that's the
subshell that's waiting for the subjob so it's suspension isn't
relevant.  However, in this case I'm actually printing the subjob state
as the most relevant so that works around that --- this also means
"jobs" lists the correct process as suspended.

I've also added a printjob() to the point where the subjob is suspended
to make it reflect the fact that, as far as the user is concerned,
everything is associated with the superjob.

I would say we are in danger of disappearing up our own backsides but I
think that happened long ago.

pws

diff --git a/Src/jobs.c b/Src/jobs.c
index f64b46b..8103f5c 100644
--- a/Src/jobs.c
+++ b/Src/jobs.c
@@ -466,8 +466,10 @@ update_job(Job jn)
 	     * or to exit. So we have to send it a SIGTSTP. */
 	    int i;
 
+	    jn->stat |= STAT_CHANGED | STAT_STOPPED;
 	    if ((i = super_job(job))) {
-		killpg(jobtab[i].gleader, SIGTSTP);
+		Job sjn = &jobtab[i];
+		killpg(sjn->gleader, SIGTSTP);
 		/*
 		 * Job may already be stopped if it consists of only the
 		 * forked shell waiting for the subjob -- so mark as
@@ -475,9 +477,20 @@ update_job(Job jn)
 		 * crucially, the subjob, as the visible job used with
 		 * fg/bg is the superjob) a SIGCONT if we need it.
 		 */
-		jobtab[i].stat |= STAT_CHANGED | STAT_STOPPED;
+		sjn->stat |= STAT_CHANGED | STAT_STOPPED;
+		if (isset(NOTIFY) && (sjn->stat & STAT_LOCKED) &&
+		    !(sjn->stat & STAT_NOPRINT)) {
+		    /*
+		     * Print the subjob state, which we don't usually
+		     * do, so the user knows something has stopped.
+		     * So as not to be confusing, we actually output
+		     * the user-visible superjob.
+		     */
+		    if (printjob(sjn, !!isset(LONGLISTJOBS), 0) &&
+			zleactive)
+			zleentry(ZLE_CMD_REFRESH);
+		}
 	    }
-	    jn->stat |= STAT_CHANGED | STAT_STOPPED;
 	    return;
 	}
 	if (jn->stat & STAT_STOPPED)
@@ -1035,15 +1048,34 @@ printjob(Job jn, int lng, int synch)
 	   "bogus job number, jn = %L, jobtab = %L, oldjobtab = %L",
 	   (long)jn, (long)jobtab, (long)oldjobtab);
 
-    if (jn->stat & STAT_NOPRINT) {
+    if (jn->stat & STAT_NOPRINT)
 	skip_print = 1;
-    }
 
     if (lng < 0) {
 	conted = 1;
 	lng = !!isset(LONGLISTJOBS);
     }
 
+    if (jn->stat & STAT_SUPERJOB &&
+	jn->other)
+    {
+	Job sjn = &jobtab[jn->other];
+	if (sjn->stat & STAT_STOPPED)
+	{
+	    /*
+	     * A subjob is stopped, which will prevent further excution
+	     * of the superjob, which the user wants to know about.  So
+	     * report the status of the subjob as if it were the
+	     * user-visible superjob.
+	     *
+	     * TBD: there may be other times we want to do this
+	     * which would, for example, remove the need for the
+	     * hack at the top of the loop over processes just below.
+	     */
+	    jn = sjn;
+	}
+    }
+
 /* find length of longest signame, check to see */
 /* if we really need to print this job          */
 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: long-standing tty related issue: wrapped Emacs not suspended
       [not found]         ` <20180926161708.39be6402@camnpupstephen.cam.scsc.local>
@ 2018-10-02 15:32           ` Peter Stephenson
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Stephenson @ 2018-10-02 15:32 UTC (permalink / raw)
  To: Zsh Hackers List

I've been thinking about this tweak for a week and I'll forget about it
if I don't do it now.

Instead of reporting the superjob state when we have a subjob only when
the subjob is suspended, report it any time the subjob still has processes, as these are the things that the user is interested in until the subjob exits.
So if a superjob is suspended but a subjob isn't, as far as the user is
concerned everything is ticking along.

  "It's hard to see how this can be wrong."
      --- me

pws

diff --git a/Src/jobs.c b/Src/jobs.c
index 8103f5c..ec6d629 100644
--- a/Src/jobs.c
+++ b/Src/jobs.c
@@ -1060,17 +1060,13 @@ printjob(Job jn, int lng, int synch)
 	jn->other)
     {
 	Job sjn = &jobtab[jn->other];
-	if (sjn->stat & STAT_STOPPED)
+	if (sjn->procs || sjn->auxprocs)
 	{
 	    /*
-	     * A subjob is stopped, which will prevent further excution
-	     * of the superjob, which the user wants to know about.  So
-	     * report the status of the subjob as if it were the
-	     * user-visible superjob.
-	     *
-	     * TBD: there may be other times we want to do this
-	     * which would, for example, remove the need for the
-	     * hack at the top of the loop over processes just below.
+	     * A subjob still has process, which must finish before
+	     * further excution of the superjob, which the user wants to
+	     * know about.  So report the status of the subjob as if it
+	     * were the user-visible superjob.
 	     */
 	    jn = sjn;
 	}

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2018-10-02 15:32 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-20 12:30 long-standing tty related issue: wrapped Emacs not suspended Vincent Lefevre
2018-09-20 15:43 ` Joey Pabalinas
2018-09-20 23:10   ` Vincent Lefevre
2018-09-21 16:57 ` Peter Stephenson
2018-09-21 23:14   ` Joey Pabalinas
2018-09-22  1:17   ` Bart Schaefer
2018-09-22  5:51     ` Joey Pabalinas
2018-09-22 18:54       ` Vincent Lefevre
2018-09-22 21:27         ` Joey Pabalinas
2018-09-23  7:21           ` Vincent Lefevre
2018-09-24 19:51   ` Peter Stephenson
2018-09-25 10:37     ` Peter Stephenson
2018-09-25 17:04       ` Daniel Shahaf
2018-09-26  5:47       ` Bart Schaefer
2018-09-26 13:41         ` Peter Stephenson
2018-09-26 15:17         ` Peter Stephenson
     [not found]         ` <20180926161708.39be6402@camnpupstephen.cam.scsc.local>
2018-10-02 15:32           ` Peter Stephenson

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).