zsh-workers
 help / color / mirror / code / Atom feed
* Bug in sh emulation
@ 2011-12-10  1:39 Ivan S. Freitas
  2011-12-10  2:47 ` Bart Schaefer
  0 siblings, 1 reply; 17+ messages in thread
From: Ivan S. Freitas @ 2011-12-10  1:39 UTC (permalink / raw)
  To: zsh-workers

Zsh is hanging in commands like: emulate sh -c "(echo | grep)"

Everything I've tried in the form "(command | command)" fails in the same way.

zsh 4.3.14
/configure --prefix=/usr \
 		--bindir=/bin \
 		--enable-etcdir=/etc/zsh \
 		--enable-zshenv=/etc/zsh/zshenv \
 		--enable-zlogin=/etc/zsh/zlogin \
 		--enable-zlogout=/etc/zsh/zlogout \
		--enable-zprofile=/etc/zsh/zprofile \
 		--enable-zshrc=/etc/zsh/zshrc \
 		--enable-maildir-support \
 		--with-term-lib='ncursesw' \
 		--enable-multibyte \
 		--enable-function-subdirs \
 		--enable-fndir=/usr/share/zsh/functions \
zshrc: https://github.com/ISF/dotfiles/blob/master/.zshrc
Although it doesn't seem to be affected by settings in the startup files.

Origin: last update of zsh in archlinux uses /etc/zsh/zprofile in the
startup, which runs emulate sh -c "/etc/profile".
In /etc/profile.d/GNUstep.sh there is the following:

if [ -n "$ZSH_VERSION" ]; then
  # If -y is not already set, set it and remember that we  # need to
set it back to what it was at the end.  if ( setopt | grep shwordsplit
> /dev/null ); then :; else    set -y
GS_ZSH_NEED_TO_RESTORE_SET=yes  fi
fi

The most inner if makes the initialization of every zsh (as a login
shell) hangs until a SIGINT is sent.

-- 
Ivan Sichmann Freitas
GNU/Linux user #509059


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-10  1:39 Bug in sh emulation Ivan S. Freitas
@ 2011-12-10  2:47 ` Bart Schaefer
  2011-12-10  3:40   ` Bart Schaefer
  2011-12-10 19:40   ` Peter Stephenson
  0 siblings, 2 replies; 17+ messages in thread
From: Bart Schaefer @ 2011-12-10  2:47 UTC (permalink / raw)
  To: zsh-workers

On Dec 9, 11:39pm, Ivan S. Freitas wrote:
} Subject: Bug in sh emulation
}
} Zsh is hanging in commands like: emulate sh -c "(echo | grep)"

Well, this is interesting.

#0  0x007067a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x007e4029 in ioctl () from /lib/tls/libc.so.6
#2  0x007e35cd in tcsetpgrp () from /lib/tls/libc.so.6
#3  0x080c5424 in attachtty (pgrp=4940) at ../../zsh-4.0/Src/utils.c:3876
#4  0x08060b66 in entersubsh (flags=6) at ../../zsh-4.0/Src/exec.c:947
#5  0x080655f0 in execcmd (state=0xbfe8e630, input=0, output=12, how=18, 
    last1=0) at ../../zsh-4.0/Src/exec.c:2832
#6  0x080628b1 in execpline2 (state=0xbfe8e630, pcode=163, how=18, input=0, 
    output=0, last1=1) at ../../zsh-4.0/Src/exec.c:1686
#7  0x08061a3e in execpline (state=0xbfe8e630, slcode=8194, how=18, last1=1)
    at ../../zsh-4.0/Src/exec.c:1424
#8  0x08061316 in execlist (state=0xbfe8e630, dont_change_job=0, exiting=1)
    at ../../zsh-4.0/Src/exec.c:1207
#9  0x08066d5c in execcmd (state=0xbfe8e630, input=0, output=0, how=18, 
    last1=2) at ../../zsh-4.0/Src/exec.c:3285
#10 0x08062690 in execpline2 (state=0xbfe8e630, pcode=131, how=18, input=0, 
    output=0, last1=0) at ../../zsh-4.0/Src/exec.c:1640
#11 0x08061a3e in execpline (state=0xbfe8e630, slcode=14338, how=18, last1=0)
    at ../../zsh-4.0/Src/exec.c:1424
#12 0x08061316 in execlist (state=0xbfe8e630, dont_change_job=1, exiting=0)
    at ../../zsh-4.0/Src/exec.c:1207
#13 0x08060dac in execode (p=0xb7cde898, dont_change_job=1, exiting=0, 
    context=0x812deb6 "eval") at ../../zsh-4.0/Src/exec.c:1028
#14 0x0805a2b0 in eval (argv=0xbfe8e7d8) at ../../zsh-4.0/Src/builtin.c:4928
#15 0x0805a568 in bin_emulate (nam=0xb7cde788 "emulate", argv=0xbfe8e7d0, 
    ops=0xbfe8e820, func=0) at ../../zsh-4.0/Src/builtin.c:5021

(rest uninteresting)

It's stuck in an infinite TTOU/CONT loop because for some reason it
wants to attachtty(), which it can't because it's not the process leader
(the parent shell still in the foreground is).

945                 setpgrp(0L, jobtab[thisjob].gleader);
946                 if (!(flags & ESUB_ASYNC))
947                     attachtty(jobtab[thisjob].gleader);

That particular code fragment dates from 2007, so that can't directly be
the problem, but I don't see any recent changes in exec.c that would have
an effect on when a job is made ASYNC.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-10  2:47 ` Bart Schaefer
@ 2011-12-10  3:40   ` Bart Schaefer
  2011-12-10 18:15     ` Peter Stephenson
  2011-12-10 19:40   ` Peter Stephenson
  1 sibling, 1 reply; 17+ messages in thread
From: Bart Schaefer @ 2011-12-10  3:40 UTC (permalink / raw)
  To: zsh-workers

On Dec 9,  6:47pm, Bart Schaefer wrote:
}
} I don't see any recent changes in exec.c that would have
} an effect on when a job is made ASYNC.

Unless this?

revision 1.182
date: 2010/08/22 20:08:57;  author: pws;  state: Exp;  lines: +1 -1
28179, users/15314, users/15310, users/15200:
various job and process control fixes

The references to users/15310 should be 15301; I believe 15200 should be
15300?  (I fixed ChangeLog and committed.)

Anyway the change in question is really from 15301, which removes the
ESUB_ASYNC from a bunch of entersubsh() calls.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-10  3:40   ` Bart Schaefer
@ 2011-12-10 18:15     ` Peter Stephenson
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Stephenson @ 2011-12-10 18:15 UTC (permalink / raw)
  To: zsh-workers

On Fri, 09 Dec 2011 19:40:44 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Dec 9,  6:47pm, Bart Schaefer wrote:
> }
> } I don't see any recent changes in exec.c that would have
> } an effect on when a job is made ASYNC.
> 
> Unless this?
> 
> Anyway the change in question is really from 15301, which removes the
> ESUB_ASYNC from a bunch of entersubsh() calls.

Two of those have reappeared and reappearing the third doesn't seem to
help.

The problem appears to have come in between 4.3.10 and 4.3.11.  It's not
in 4.3.10 (2009-06-01) but is in a 4.3.10-dev-1 I have, which would be
just before I changed the version to 4.3.10-dev-2 (2010-07-25).

I think it's probably the set of changes around 27100 to 27109 that
allowed MONITOR to remain on in a subshell.  In particular,

    if (!isset(POSIXJOBS))
	opts[MONITOR] = 0;

(now) at line 974 of exec.c.  Commenting out the "if" line makes the
problem goes away.

Allowing MONITOR in subshells was done tentatively to see if it caused
problems.  I suppose the real surprise is it's taken over two years for
one to turn up.  So the bug is caused by a combination of having the
option set in a subshell and preexisting code that doesn't properly
support it.

> It's stuck in an infinite TTOU/CONT loop because for some reason it
> wants to attachtty(), which it can't because it's not the process leader
> (the parent shell still in the foreground is).

Note that this attachtty() is above the point where we set MONITOR to 0
if not POSIXJOBS, so this is occurring on a later fork --- indeed, we
need to remember that the problem only occurs with a pipeline in the
subshell.

My session / process group / controlling terminal understanding is
distinctly ropey.  Is the fix that we simply don't execute that code if
we've already executed it, i.e. subsh is already 1 when we get there
(see ultra-ultra-tentative patch --- we set subsh to 1 just after that)?
That fixes the problem, but that may be simply because we don't execute
the code that fails.  But what should actually happen if we have a
subshell of a subshell, i.e. a nested "( ... )"?  And what should the
group leader be set to at this point after a second fork?

P.S. somewhere high on the list of things we desperately need is a set
of interactive tests.

Index: Src/exec.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/exec.c,v
retrieving revision 1.205
diff -p -u -r1.205 exec.c
--- Src/exec.c	26 Oct 2011 18:48:13 -0000	1.205
+++ Src/exec.c	10 Dec 2011 18:09:26 -0000
@@ -925,7 +925,7 @@ entersubsh(int flags)
 		}
 	    }
 	}
-    } else if (thisjob != -1 && (flags & ESUB_PGRP)) {
+    } else if (thisjob != -1 && (flags & ESUB_PGRP) && !subsh) {
 	if (jobtab[list_pipe_job].gleader && (list_pipe || list_pipe_child)) {
 	    if (setpgrp(0L, jobtab[list_pipe_job].gleader) == -1 ||
 		killpg(jobtab[list_pipe_job].gleader, 0) == -1) {

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-10  2:47 ` Bart Schaefer
  2011-12-10  3:40   ` Bart Schaefer
@ 2011-12-10 19:40   ` Peter Stephenson
  2011-12-10 23:28     ` Peter Stephenson
  1 sibling, 1 reply; 17+ messages in thread
From: Peter Stephenson @ 2011-12-10 19:40 UTC (permalink / raw)
  To: zsh-workers

On Fri, 09 Dec 2011 18:47:47 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> It's stuck in an infinite TTOU/CONT loop because for some reason it
> wants to attachtty(), which it can't because it's not the process leader
> (the parent shell still in the foreground is).

While my minds on this, and in the knowledge that, based on past form,
only Bart is actually going to respond...

What's confusing me, possibly based on ignorance, is that I naively
expect something like the following to happen (without my
ultra-ultra-tentative patch).

- Shell forks.  This is treated pretty much like any other fork to
create a new foreground process.  If, for example, it was an editor we'd
expect the parent shell to stand back and let the editor grab the
terminal.  I don't see anything that makes the subshell a special case
here.  For example, we call addproc() to mark the newly forked process as the group leader from the main shell.

- Job control is active, so the forked shell takes over the TTY and sets
itself as the group leader.  That's the first time through the code
under discussion, in entersubsh() for the case where MONITOR is still
set.  (Have I got this wrong?  Is there actually some difference that
means the subshell, unlike any other foreground process, can't take over
the terminal?  If so, where is the logic for that?  Yet apparently the
*first* fork is happy --- it's only where we've got the pipeline with
two forks that it hangs.)

- Shell forks again.  This is a pipeline, so part of the same process
group.  I would expect this to find there is already a group leader from
the previous fork, so this process doesn't try to make itself group
leader and grab the pipeline.  Evidently this isn't happening,
however.

So what have I got wrong?  It could be anything, e.g. something in the
newly created subshell thinks it needs to create a new job, so the next
part of the pipeline doesn't know it's part of the same process group.
I'm pretty lost here.

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-10 19:40   ` Peter Stephenson
@ 2011-12-10 23:28     ` Peter Stephenson
  2011-12-11 19:39       ` Peter Stephenson
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Stephenson @ 2011-12-10 23:28 UTC (permalink / raw)
  To: zsh-workers

(Sorry if you get this twice, but it kind of fits the theme.)

On Sat, 10 Dec 2011 19:40:22 +0000
Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
>> emulate -sh '(echo | grep)'
>
> What's confusing me, possibly based on ignorance, is that I naively
> expect something like the following to happen (without my
> ultra-ultra-tentative patch).

I've examined this a bit further, and got to the point where my brain's
stopped working.

> - Shell forks.  This is treated pretty much like any other fork to
> create a new foreground process.
>
> - Job control is active, so the forked shell takes over the TTY and sets
> itself as the group leader.  That's the first time through the code
> under discussion, in entersubsh() for the case where MONITOR is still
> set.
> 
> - Shell forks again.  This is a pipeline, so part of the same process
> group.  I would expect this to find there is already a group leader from
> the previous fork, so this process doesn't try to make itself group
> leader and grab the pipeline.  Evidently this isn't happening,
> however.

I think what's happening is

- We entersubsh() for the initial fork for the subshell.  This sets job
  1 as group leader and attaches that to the TTY.   This is the normal
  case; what's different at this point is simply that we leave MONITOR
  set.

- We execlist() to run the list for the subshell.

- In execpline() we set the job to 2; execpline() always starts a new
  job.

- We get execmd() to start the pipeline.  We fork because we're
  executing the first command of a pipeline.  All basically as
  expected so far.

- We entersubsh() again.  MONITOR is still set (this is where the
  difference really kicks in) so we execute the code for handling the
  tty again.  We're now job 2, so no group leader for this.  We create a
  new group leader, and try to attach this to the tty, but we can't
  because job 1 is attached to it.

- Something goes KAPOW! loudly in some moral sense.

I suppose the problem is we're doing the attach in the wrong way.
Normally we make the parent shell hand over the terminal to the
appropriate job (er, right?  that's basically what job control is?)  In
this case we grab it from the subprocess.  That blows up when
subprocesses themselves are doing job control --- they should be handing
it on to their subprocesses.  Er... maybe?  So what's with the stuff in
entersubsh() that we always execute anyway, and successfully does
atttachtty() the first time?  Should we really have done that in the
parent shell?

I wonder if this is related to the business with the "superjob" in the
case of a non-subshell that I never understood?  There's code in
handle_sub() in jobs.c line 278 that looks like it might be trying to do
something like what we need to do here.

Confused of Cambridge

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-10 23:28     ` Peter Stephenson
@ 2011-12-11 19:39       ` Peter Stephenson
  2011-12-11 20:20         ` Peter Stephenson
  2011-12-11 22:37         ` Bart Schaefer
  0 siblings, 2 replies; 17+ messages in thread
From: Peter Stephenson @ 2011-12-11 19:39 UTC (permalink / raw)
  To: zsh-workers

Not so much a follow up as a separate point...

If we're letting the subshell do job control (not resetting MONITOR in
entersubsh() because POSIXJOBS is set), then presumably we shouldn't be
resetting the signals that are special to shells that do job control?

This actually makes the issue go away, but I'm not sure at all sure it's
the basic issue; it's part of the stuff I'm hoping Mystified of Marin
County might know a little more about.

Having said that, the shell does make strenuous efforts to ignore these
signals when doing job control, so I'm reasonable sure this is the right
thing to do here.

Index: Src/exec.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/exec.c,v
retrieving revision 1.205
diff -p -u -r1.205 exec.c
--- Src/exec.c	26 Oct 2011 18:48:13 -0000	1.205
+++ Src/exec.c	11 Dec 2011 19:32:17 -0000
@@ -959,7 +959,7 @@ entersubsh(int flags)
     if ((flags & ESUB_REVERTPGRP) && getpid() == mypgrp)
 	release_pgrp();
     shout = NULL;
-    if (isset(MONITOR)) {
+    if (isset(MONITOR) && !isset(POSIXJOBS)) {
 	signal_default(SIGTTOU);
 	signal_default(SIGTTIN);
 	signal_default(SIGTSTP);

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-11 19:39       ` Peter Stephenson
@ 2011-12-11 20:20         ` Peter Stephenson
  2011-12-11 20:56           ` Peter Stephenson
  2011-12-11 22:37         ` Bart Schaefer
  1 sibling, 1 reply; 17+ messages in thread
From: Peter Stephenson @ 2011-12-11 20:20 UTC (permalink / raw)
  To: zsh-workers

On Sun, 11 Dec 2011 19:39:49 +0000
Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> Not so much a follow up as a separate point...
> 
> If we're letting the subshell do job control (not resetting MONITOR in
> entersubsh() because POSIXJOBS is set), then presumably we shouldn't be
> resetting the signals that are special to shells that do job control?

Thinking a bit more...

Although it's not part of the problem here (we're in a real subshell),
we're leaving on job control in the case of POSIXJOBS too often... we
should only be doing it if we're entering a (...) subshell, not e.g. if
we're forking to execute an external command, and presumably also not if
we've been put into the background.  That hasn't been an issue
up to now because we've reset the signals any time MONITOR was
previously set.

So this modifies that patch only to do special things with job control
in subshells if we're about to enter a (...) subshell.

Not sure about $(...) etc.  Currently we're strict about not doing
anything MONITOR-related with those, so have left those turning MONITOR
off immediately.

I'm vaguely coming to the conclusion the SIGTTOU's are endemic because
we only do the attachtty() from the new process, which starts off in the
background, and hence the signal ignoring is a key feature, but that's a
good deal hazier.   Useful info at

http://www.gnu.org/s/hello/manual/libc/Implementing-a-Shell.html#Implementing-a-Shell

Index: Src/exec.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/exec.c,v
retrieving revision 1.205
diff -p -u -r1.205 exec.c
--- Src/exec.c	26 Oct 2011 18:48:13 -0000	1.205
+++ Src/exec.c	11 Dec 2011 20:15:32 -0000
@@ -896,14 +896,16 @@ enum {
     /* Release the process group if pid is the shell's process group */
     ESUB_REVERTPGRP = 0x10,
     /* Don't handle the MONITOR option even if previously set */
-    ESUB_NOMONITOR = 0x20
+    ESUB_NOMONITOR = 0x20,
+    /* This is a subshell where job control is allowed */
+    ESUB_JOB_CONTROL = 0x40
 };
 
 /**/
 static void
 entersubsh(int flags)
 {
-    int sig, monitor;
+    int sig, monitor, job_control_ok;
 
     if (!(flags & ESUB_KEEPTRAP))
 	for (sig = 0; sig < VSIGCOUNT; sig++)
@@ -911,6 +913,7 @@ entersubsh(int flags)
 		sig != SIGDEBUG && sig != SIGZERR)
 		unsettrap(sig);
     monitor = isset(MONITOR);
+    job_control_ok = monitor && (flags & ESUB_JOB_CONTROL) && isset(POSIXJOBS);
     if (flags & ESUB_NOMONITOR)
 	opts[MONITOR] = 0;
     if (!isset(MONITOR)) {
@@ -959,7 +962,7 @@ entersubsh(int flags)
     if ((flags & ESUB_REVERTPGRP) && getpid() == mypgrp)
 	release_pgrp();
     shout = NULL;
-    if (isset(MONITOR)) {
+    if (!job_control_ok) {
 	signal_default(SIGTTOU);
 	signal_default(SIGTTIN);
 	signal_default(SIGTSTP);
@@ -971,7 +974,7 @@ entersubsh(int flags)
     }
     if (!(sigtrapped[SIGQUIT] & ZSIG_IGNORED))
 	signal_default(SIGQUIT);
-    if (!isset(POSIXJOBS))
+    if (!job_control_ok)
 	opts[MONITOR] = 0;
     opts[USEZLE] = 0;
     zleactive = 0;
@@ -2829,6 +2832,8 @@ execcmd(Estate state, int input, int out
 	flags = ((how & Z_ASYNC) ? ESUB_ASYNC : 0) | ESUB_PGRP;
 	if ((type != WC_SUBSH) && !(how & Z_ASYNC))
 	    flags |= ESUB_KEEPTRAP;
+	if (type == WC_SUBSH && !(how & Z_ASYNC))
+	    flags |= ESUB_JOB_CONTROL;
 	entersubsh(flags);
 	close(synch[1]);
 	forked = 1;


-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-11 20:20         ` Peter Stephenson
@ 2011-12-11 20:56           ` Peter Stephenson
  2011-12-11 23:39             ` Bart Schaefer
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Stephenson @ 2011-12-11 20:56 UTC (permalink / raw)
  To: zsh-workers

(sorry! it's me again...  however, this time I think I'm really getting
there...)

> On Sun, 11 Dec 2011 19:39:49 +0000
> Peter Stephenson <p.w.stephenson@ntlworld.com> wrote:
> I'm vaguely coming to the conclusion the SIGTTOU's are endemic because
> we only do the attachtty() from the new process, which starts off in the
> background, and hence the signal ignoring is a key feature...

http://www.gnu.org/s/hello/manual/libc/Job-Control-Signals.html

— Macro: int SIGTTOU

  This is similar to SIGTTIN, but is generated when a process in a
  background job attempts to write to the terminal *or set its modes*.
                                                    ^^^^^^^^^^^^^^^^

Aha!  I think the improbability sum is now complete.

Why the bug occurred:

- POSIXJOBS and MONITOR are on initially

- Fork for subshell

- attachtty() in subshell, which is in background to begin with ---
  parent shell is foreground.  This works because SIGTTOU is being ignored.

- Shell stopped ignoring SIGTTOU (wrong because...)

- MONITOR left on because of POSIXJOBS (correct because this is a subshell)

- Fork for pipeline

- attachtty() again... SIGTTOU not being ignored, so this causes mayhem.


What should happen:

- Fork for subshell as before

- attachtty() as before

- Shell keeps ignoring SIGTTOU because...

- MONITOR is left on for POSIXJOBs as this is a (...) subshell in the
  foreground.

- Fork for pipeline

- attachtty() works because SIGTTOU is still ignored

- Now stop ignoring SIGTTOU because the pipeline element is not a (...)
  subshell, it's some random command

- MONITOR is turned off for the same reason.

So I think I've convinced myself the immediately preceding patch is
correct.

In addition to fixing the bug we get:

% emulate sh -c '(echo foo | grep foo & jobs)' 
[2]    done       echo foo | 
       running    grep foo
% foo

(some inevitable races meaning it may not look quite like that) as job
control in the subshell has stayed on because of POSIXJOBS and has
worked correctly by entering the pipeline into the job table.

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-11 19:39       ` Peter Stephenson
  2011-12-11 20:20         ` Peter Stephenson
@ 2011-12-11 22:37         ` Bart Schaefer
  1 sibling, 0 replies; 17+ messages in thread
From: Bart Schaefer @ 2011-12-11 22:37 UTC (permalink / raw)
  To: zsh-workers

On Dec 10,  6:15pm, Peter Stephenson wrote:
}
} I think it's probably the set of changes around 27100 to 27109 that
} allowed MONITOR to remain on in a subshell.  In particular,
} 
}     if (!isset(POSIXJOBS))
} 	opts[MONITOR] = 0;
} 
} My session / process group / controlling terminal understanding is
} distinctly ropey.  Is the fix that we simply don't execute that code if
} we've already executed it, i.e. subsh is already 1 when we get there
} (see ultra-ultra-tentative patch --- we set subsh to 1 just after that)?
} That fixes the problem, but that may be simply because we don't execute
} the code that fails.  But what should actually happen if we have a
} subshell of a subshell, i.e. a nested "( ... )"?  And what should the
} group leader be set to at this point after a second fork?

With the caveat that it's been a really long time since I (a) studied
this or (b) did any real work that relied on it ... two factors are at
work here.

(1) There's no limit on the number of process groups you can have.  A
process group is an organizational unit for signal delivery, so that
a whole set of processes can get the same signal at [conceptually] the
same time.  Any process can declare itself to be the leader of its own
process group; any process which does not do so is part of the process
group of its parent (which may actually be the process group of its
grandparent, etc.).

(2) A TTY device can only be attached to one process group at a time.
A process can take control of a TTY if no other process already has
and if it is able to open a file descriptor on that TTY; but once some
process controls it, only the leader of that process group can "give
away" control of the TTY to some other process group.  If another
process group tries to "take" the TTY, it gets a TTOU signal (unless
that's been disabled in the TTY driver); it can ignore that signal,
but it can't grab the terminal.

Exactly how "give away" happens is a little goofy.  Wikipedia says
that usual shell behavior is for both the parent and the child to try
to set the tty process group to the child when that is the desired
effect, to avoid a race condition [the parent isn't allowed to give
away the tty after the child has called exec(), for security].  It
appears only a direct child can inherit the tty, not a grandchild,
which may be the issue in our current dilemma.

http://www.cs.ucsb.edu/~almeroth/classes/W99.276/assignment1/signals.html#Pgrps

Thus control of the terminal can be passed down an arbitrary number
of subshell levels, but only if each ancestor hands it off before the
descendant attempts to do the same.  I have forgotten what happens to
control of the TTY if the group leader exits before its descendants,
but I don't think that's an issue here?

Given that the reason for associating a TTY with a process group is
so that signals from the TTY will be delivered to all the descendants
of the group leader, it's very likely the case that if attachtty() is
not able to succeed, then we didn't want to create a new process group
[declare a new group leader] in the first place -- UNLESS the job IS
being run asynchronously (not true here) so that it is intended NOT to
get signals from the TTY.
 
On Dec 10,  7:40pm, Peter Stephenson wrote:
}
} What's confusing me, possibly based on ignorance, is that I naively
} expect something like the following to happen (without my
} ultra-ultra-tentative patch).
} 
} - Shell forks.  This is treated pretty much like any other fork to
} create a new foreground process.
} 
} - Job control is active, so the forked shell takes over the TTY and sets
} itself as the group leader.  That's the first time through the code
} under discussion, in entersubsh() for the case where MONITOR is still
} set.

In fact in this case job control in the subshell doesn't matter.  It's
the new foreground process, so the parent shell (or both the parent and
child, see above) should already want to attach the subshell to the TTY.
You got to this yourself in the next message in the thread.

} - Shell forks again.  This is a pipeline, so part of the same process
} group.  I would expect this to find there is already a group leader from
} the previous fork, so this process doesn't try to make itself group
} leader and grab the pipeline.  Evidently this isn't happening,
} however.

On Dec 10, 11:28pm, Peter Stephenson wrote:
}
} - We entersubsh() again.  MONITOR is still set (this is where the
}   difference really kicks in) so we execute the code for handling the
}   tty again.  We're now job 2, so no group leader for this.  We create a
}   new group leader, and try to attach this to the tty, but we can't
}   because job 1 is attached to it.
} 
} I suppose the problem is we're doing the attach in the wrong way.

If your step-thorugh analysis is correct then the problem is with "we're
now job 2, so no group leader for this."  The group leader should still
be the same as from the first entersubsh().  We want this new pipeline
to remain part of the original group that already has control of the
terminal.

} So what's with the stuff in entersubsh() that we always execute
} anyway, and successfully does atttachtty() the first time? Should we
} really have done that in the parent shell?

I think this means the MONITOR option is covering too many cases -- it
is being used both as a flag that the (sub)shell should track status
of its children, and also as a flag that the (sub)shell is THE shell
that governs control of the TTY.  The latter is no longer true; if we
want to leave MONITOR set then we need separate knowledge of which
process is the ultimate ancestor.  Compare the current pid to the pid
of the original shell, for example, although then we need to be sure
to update that in the event of certain kinds of "exec".

Put a different way, the concept of the "foreground" job in a subshell
with MONITOR set is not the same as the "foreground" job started from
the ancestral shell.  Only in the latter case does forground also mean
TTY process group leader.

} I wonder if this is related to the business with the "superjob" in the
} case of a non-subshell that I never understood?  There's code in
} handle_sub() in jobs.c line 278 that looks like it might be trying to do
} something like what we need to do here.

Not directly, AFAICT -- the superjob is a construct invented because of
the way zsh runs the right side of pipelines in the current shell when
the right side is a builtin.  If you suspend the right side, the current
shell needs to wake up again, so it does two things: (A) creates a new
process to finish the right side [which immediately suspends], and (B)
creates a "superjob" in the job table to track both sides of the now
fully-forked-off pipeline.  Where "creates" may mean "promotes another
already existing entry" or something.  Of course you knew about (A).

On Dec 11,  7:39pm, Peter Stephenson wrote:
} Subject: Re: Bug in sh emulation
}
} If we're letting the subshell do job control (not resetting MONITOR in
} entersubsh() because POSIXJOBS is set), then presumably we shouldn't be
} resetting the signals that are special to shells that do job control?
} 
} This actually makes the issue go away, but I'm not sure at all sure it's
} the basic issue; it's part of the stuff I'm hoping Mystified of Marin
} County might know a little more about.

This goes back to "if attachtty() is not able to succeed, then we didn't
want to create a new process group" -- ignoring the signal may be the
right thing for other reasons, but for this specific problem it's just
masking the fact that [I *think* you'll find that] signals from the
terminal are unable to affect the pipeline spawned from the subshell
because that pipeline has incorrectly been put in a new progress group.

-- 
Barton E. Schaefer


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-11 20:56           ` Peter Stephenson
@ 2011-12-11 23:39             ` Bart Schaefer
  2011-12-12 10:01               ` Peter Stephenson
  0 siblings, 1 reply; 17+ messages in thread
From: Bart Schaefer @ 2011-12-11 23:39 UTC (permalink / raw)
  To: zsh-workers

On Dec 11,  8:56pm, Peter Stephenson wrote:
}
} What should happen:
} 
} - Fork for subshell as before
} 
} - attachtty() as before
} 
} - Shell keeps ignoring SIGTTOU because...
} 
} - MONITOR is left on for POSIXJOBs as this is a (...) subshell in the
}   foreground.
} 
} - Fork for pipeline
} 
} - attachtty() works because SIGTTOU is still ignored

I think there's a flaw here:  attachtty() doesn't *work*, in that it
won't actually associate the new process group with the TTY.  Instead
it just won't result in an infinite loop.
 
} - Now stop ignoring SIGTTOU because the pipeline element is not a (...)
}   subshell, it's some random command
} 
} - MONITOR is turned off for the same reason.

I think this will happen as a side-effect of attachtty() failing to do
any other useful work.  Does it not print the "can't set tty pgrp"
error?


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-11 23:39             ` Bart Schaefer
@ 2011-12-12 10:01               ` Peter Stephenson
  2011-12-12 10:14                 ` Peter Stephenson
  2011-12-12 15:44                 ` Bart Schaefer
  0 siblings, 2 replies; 17+ messages in thread
From: Peter Stephenson @ 2011-12-12 10:01 UTC (permalink / raw)
  To: zsh-workers

On Sun, 11 Dec 2011 15:39:18 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> } - attachtty() works because SIGTTOU is still ignored
> 
> I think there's a flaw here:  attachtty() doesn't *work*, in that it
> won't actually associate the new process group with the TTY.  Instead
> it just won't result in an infinite loop.

I'll check in more detail later, but I'm pretty sure this is how it
*always* works, not just in the case we're looking at.  In most cases
the parent shell just lets the new process (whatever it's doing, just as
long as it's in the foreground) grab the terminal --- if you search the
code for attachtty(), and ignore the cases where we attaching to mypgrp,
the shell's own process group, the only other ones in the parent shell
are special cases.  Instrumenting attachtty() didn't show any point
where the parent shell handed the tty over, and if you look at the GNU
documentation I pointed to, although it says both the parent shell and
the new process should perform the handover to minimise races, the
example code just does it from within the new process.

So I think (i) the attachtty() works in the subprocess so long as
SIGTTOU is blocked (ii) if that's not the case, we've got a far worse
problem than just shell compatibility (and I have no direct evidence we
have such a problem).

-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK


Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom
More information can be found at www.csr.com. Follow CSR on Twitter at http://twitter.com/CSR_PLC and read our blog at www.csr.com/blog


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-12 10:01               ` Peter Stephenson
@ 2011-12-12 10:14                 ` Peter Stephenson
  2011-12-12 15:44                 ` Bart Schaefer
  1 sibling, 0 replies; 17+ messages in thread
From: Peter Stephenson @ 2011-12-12 10:14 UTC (permalink / raw)
  To: zsh-workers

On Mon, 12 Dec 2011 10:01:00 +0000
Peter Stephenson <Peter.Stephenson@csr.com> wrote:
> So I think (i) the attachtty() works in the subprocess so long as
> SIGTTOU is blocked

POSIX is explicit this works:

  Attempts to use tcsetpgrp() from a process which is a member of a
  background process group on a fildes associated with its controlling
  terminal shall cause the process group to be sent a SIGTTOU signal. If
  the calling process is blocking or ignoring SIGTTOU signals, the process
  shall be allowed to perform the operation, and no signal is sent.

-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK


Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom
More information can be found at www.csr.com. Follow CSR on Twitter at http://twitter.com/CSR_PLC and read our blog at www.csr.com/blog


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-12 10:01               ` Peter Stephenson
  2011-12-12 10:14                 ` Peter Stephenson
@ 2011-12-12 15:44                 ` Bart Schaefer
  2011-12-12 16:06                   ` Peter Stephenson
  1 sibling, 1 reply; 17+ messages in thread
From: Bart Schaefer @ 2011-12-12 15:44 UTC (permalink / raw)
  To: zsh-workers

On Dec 12, 10:01am, Peter Stephenson wrote:
}
} the parent shell just lets the new process (whatever it's doing, just as
} long as it's in the foreground) grab the terminal --- if you search the
} code for attachtty(), and ignore the cases where we attaching to mypgrp,
} the shell's own process group, the only other ones in the parent shell
} are special cases.

OK, I'm good with that, but are you sure that the correct pgrp is
ending up attached?  I was concerned about the subshell pipeline not
receiving tty signals before, but now I'm concerned about signals
not going to the previously-attached group from which the subshell
is taking control.

As long as we don't have two forks fighting over who is the foreground
job, we're OK.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-12 15:44                 ` Bart Schaefer
@ 2011-12-12 16:06                   ` Peter Stephenson
  2011-12-12 18:10                     ` Bart Schaefer
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Stephenson @ 2011-12-12 16:06 UTC (permalink / raw)
  To: zsh-workers

On Mon, 12 Dec 2011 07:44:17 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Dec 12, 10:01am, Peter Stephenson wrote:
> }
> } the parent shell just lets the new process (whatever it's doing, just as
> } long as it's in the foreground) grab the terminal --- if you search the
> } code for attachtty(), and ignore the cases where we attaching to mypgrp,
> } the shell's own process group, the only other ones in the parent shell
> } are special cases.
> 
> OK, I'm good with that, but are you sure that the correct pgrp is
> ending up attached?  I was concerned about the subshell pipeline not
> receiving tty signals before, but now I'm concerned about signals
> not going to the previously-attached group from which the subshell
> is taking control.

I think this must be OK (unless, of course, something is failing where I
don't see it).  The chunk of code in entersubsh() we've been staring at
is the one that's only executed if there's no group leader, and in that
case it creates its own process group and attaches that to the tty
immediately. So if that succeeds then by definition it's the process
group we're just starting in the foreground.  We get this twice (even
after the last patch): first the subshell puts itself in the
foreground, then the pipeline being run from the subshell puts itself
into the foreground. If we weren't doing job control in the subshell,
the second one wouldn't happen: the subshell and everything run from it
would be treated as a single foreground job (this is the normal case
for zsh native mode).

-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK


Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom
More information can be found at www.csr.com. Follow CSR on Twitter at http://twitter.com/CSR_PLC and read our blog at www.csr.com/blog


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-12 16:06                   ` Peter Stephenson
@ 2011-12-12 18:10                     ` Bart Schaefer
  2011-12-12 19:16                       ` Peter Stephenson
  0 siblings, 1 reply; 17+ messages in thread
From: Bart Schaefer @ 2011-12-12 18:10 UTC (permalink / raw)
  To: zsh-workers

On Dec 12,  4:06pm, Peter Stephenson wrote:
}
} I think this must be OK (unless, of course, something is failing where I
} don't see it).  The chunk of code in entersubsh() we've been staring at
} is the one that's only executed if there's no group leader, and in that
} case it creates its own process group and attaches that to the tty

Right.  My lingering doubt is that "no group leader" doesn't necessarily
mean "no other group attached to the tty", if we've been given some wacky
structure like

(emulate sh -c "(foo | bar)" | emulate zsh -c "(baz | ding)") | read -E

However, I can't actually construct a real example that I think would
fail, so we should probably go with your patches so far.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Bug in sh emulation
  2011-12-12 18:10                     ` Bart Schaefer
@ 2011-12-12 19:16                       ` Peter Stephenson
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Stephenson @ 2011-12-12 19:16 UTC (permalink / raw)
  To: zsh-workers

On Mon, 12 Dec 2011 10:10:00 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Dec 12,  4:06pm, Peter Stephenson wrote:
> }
> } I think this must be OK (unless, of course, something is failing where I
> } don't see it).  The chunk of code in entersubsh() we've been staring at
> } is the one that's only executed if there's no group leader, and in that
> } case it creates its own process group and attaches that to the tty
> 
> Right.  My lingering doubt is that "no group leader" doesn't necessarily
> mean "no other group attached to the tty", if we've been given some wacky
> structure like
> 
> (emulate sh -c "(foo | bar)" | emulate zsh -c "(baz | ding)") | read -E

The last patch attempted to do a bit better in limiting the cases where
the MONITOR option was left on to a straightforward subshell.  In this
example, the first fork is for a pipeline rather than subshell, so
MONITOR should be turned off before the subshell is started --- it's
perfectly possible I've missed a passage through the maze that should do
that, though.

We might well get into problems if MONITOR is explicitly turned back on,
but I don't have so much sympathy there.

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2011-12-12 19:16 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-10  1:39 Bug in sh emulation Ivan S. Freitas
2011-12-10  2:47 ` Bart Schaefer
2011-12-10  3:40   ` Bart Schaefer
2011-12-10 18:15     ` Peter Stephenson
2011-12-10 19:40   ` Peter Stephenson
2011-12-10 23:28     ` Peter Stephenson
2011-12-11 19:39       ` Peter Stephenson
2011-12-11 20:20         ` Peter Stephenson
2011-12-11 20:56           ` Peter Stephenson
2011-12-11 23:39             ` Bart Schaefer
2011-12-12 10:01               ` Peter Stephenson
2011-12-12 10:14                 ` Peter Stephenson
2011-12-12 15:44                 ` Bart Schaefer
2011-12-12 16:06                   ` Peter Stephenson
2011-12-12 18:10                     ` Bart Schaefer
2011-12-12 19:16                       ` Peter Stephenson
2011-12-11 22:37         ` Bart Schaefer

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).