zsh-workers
* Re: 3.0.6-pre-5 problem
@ 1999-06-25 12:52 Sven Wischnowsky
  1999-06-25 15:56 ` Peter Stephenson
  1999-06-25 16:29 ` 3.0.6-pre-5 problem Bart Schaefer
  0 siblings, 2 replies; 6+ messages in thread
From: Sven Wischnowsky @ 1999-06-25 12:52 UTC (permalink / raw)
  To: zsh-workers


Bart Schaefer wrote:

> On Jun 25, 11:38am, Sven Wischnowsky wrote:
> } Subject: Re: 3.0.6-pre-5 problem
> }
> } I hope this fixes it.
> 
> No such luck.  Here's "pstree" output:
> 
> zsh(28198)-+-pstree(4199)
>            |-xterm(4146)---zsh(4147)-+-mutt(4149)
>            |                         `-zsh(4153)
>            `-xterm(4191)---zsh(4192)---zsh(4194)---mutt(4196)
> 
> The first xterm (4146) I ran the "mutt" function directly from the top
> shell and then hit ^Z.  4149 and 4153 are both stopped; 4146 is blocked
> in wait4() which means that 4147 can't get any keystrokes (the xterm
> isn't feeding it) which is the hang that Jos sees.
> 
> The second xterm (4191) I ran a new zsh -f (4194) and then the "mutt"
> function; there, 4194 and 4196 are stopped.
> 
> Note that in the first case zsh created an extra dummy job, but in the
> second case it didn't.  This must have something to do with which process
> is the group leader.

(To Bart: I was doing it inside an xterm, but from a bash that ran
inside the xterm.)

I could finally reproduce it when trying to look at it with strace,
which finally opened my eyes (I would have needed a `ps j' output). It 
goes like this: Someone exec()s zsh without putting it into its own
process group. Then we start the function and zsh executes external
commands in its own process group. Then the user hits ^Z and all three 
of them receive the SIGTSTP. The external command is stopped, which is 
fine, zsh ignores it, which is better, and the parent of zsh happens
to not ignore it and stops, which is deadly.

So, if we have agreed to use the kill-loop-patches, we'll have to make 
sure that every decent interactive zsh with job-control runs in its
own process group which is what the patch below does.
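
As an illustration only (this is not the zsh code; the patch below is), here
is a minimal sketch of a process moving itself into its own process group.
The helper name child_gets_own_pgrp() is hypothetical, and it uses POSIX
setpgid() where the patch uses zsh's setpgrp()/attachtty() wrappers. The
forked child starts out in its parent's group, which is the situation
described above, and then becomes leader of a new group.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of zsh: fork a child that moves itself
 * into a new process group and report whether that worked.
 * Returns 1 if the child became leader of its own group, 0 if it did
 * not, and -1 if fork() or waitpid() failed. */
int child_gets_own_pgrp(void)
{
    pid_t parent_pgrp = getpgrp();
    pid_t pid = fork();
    int status;

    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Child: still in the parent's group, like a shell started by
         * a program that did not set up a group for it. */
        if (getpgrp() != parent_pgrp)
            _exit(2);
        /* Make this process the leader of a new group whose id is its
         * own pid. */
        if (setpgid(0, 0) == -1)
            _exit(3);
        _exit(getpgrp() == getpid() && getpgrp() != parent_pgrp ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A signal sent to the parent's process group afterwards no longer reaches the
child, and the reverse holds too; making the new group the terminal's
foreground group (tcsetpgrp() in POSIX terms) is a separate step that this
sketch leaves out.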

Ok. Since I still couldn't reproduce the exact original problem, I'd
be thankful for any response (*especially* if it's fixed).

Bye
 Sven

P.S.: Peter: 6838 should be superfluous, but I still like the look of 6848.

--- os/init.c	Thu Jun 24 19:00:56 1999
+++ Src/init.c	Fri Jun 25 14:41:12 1999
@@ -390,7 +390,16 @@
 #ifdef JOB_CONTROL
     /* If interactive, make the shell the foreground process */
     if (opts[MONITOR] && interact && (SHTTY != -1)) {
-	attachtty(GETPGRP());
+      /* Since we now sometimes execute programs in the process group
+       * of the parent shell even when using job-control, we have to
+       * make sure that we run in our own process group. Otherwise if
+       * we are called from a program that doesn't put us in our own
+       * group a SIGTSTP that we ignore might stop our parent process.
+       * Instead of the two calls below we once had:
+       *   attachtty(GETPGRP());
+       */
+	attachtty(getpid());
+	setpgrp(0L, 0L);
 	if ((mypgrp = GETPGRP()) > 0) {
 	    while ((ttpgrp = gettygrp()) != -1 && ttpgrp != mypgrp) {
 		sleep(1);

--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de



* Re: 3.0.6-pre-5 problem
  1999-06-25 12:52 3.0.6-pre-5 problem Sven Wischnowsky
@ 1999-06-25 15:56 ` Peter Stephenson
  1999-06-25 16:17   ` Xterm terminal settings (Re: 3.0.6-pre-5 problem) Bart Schaefer
  1999-06-25 16:29 ` 3.0.6-pre-5 problem Bart Schaefer
  1 sibling, 1 reply; 6+ messages in thread
From: Peter Stephenson @ 1999-06-25 15:56 UTC (permalink / raw)
  To: zsh-workers

(I tried to reply to this several times before but it doesn't seem to have
worked, due to the @!$*! disk being full again.)

Sven Wischnowsky wrote:
> So, if we have agreed to use the kill-loop-patches, we'll have to make 
> sure that every decent interactive zsh with job-control runs in its
> own process group which is what the patch below does.

Does this have implications for sending signals from the Ctrl-Button1 menu
of an xterm, not that that ever worked for me anyway?

By the way, although this is totally unrelated, looking at this I
discovered that
  SHELL=/usr/bin/ksh xterm &
creates an xterm with terminal settings messed up.  Without the & it's OK.

-- 
Peter Stephenson <pws@ibmth.df.unipi.it>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarroti 2, 56127 Pisa, Italy



* Xterm terminal settings (Re: 3.0.6-pre-5 problem)
  1999-06-25 15:56 ` Peter Stephenson
@ 1999-06-25 16:17   ` Bart Schaefer
  0 siblings, 0 replies; 6+ messages in thread
From: Bart Schaefer @ 1999-06-25 16:17 UTC (permalink / raw)
  To: Peter Stephenson, zsh-workers

On Jun 25,  5:56pm, Peter Stephenson wrote:
} Subject: Re: 3.0.6-pre-5 problem
}
} By the way, although this is totally unrelated, looking at this I
} discovered that
}   SHELL=/usr/bin/ksh xterm &
} creates an xterm with terminal settings messed up.  Without the & it's OK.

I believe this to be an OS pty-allocation problem rather than a zsh problem.
I've seen it before in contexts unrelated to zsh.  SunOS in particular is a
culprit; what's happening is that the tty driver for the pseudo-tty copies
its settings from the current controlling terminal of the process that opens
the master side.  When the master process (xterm in this case) doesn't have
an associated tty, the pty gets bad values -- on SunOS, it gets the values
for the system console, which are generally nothing like what you want an
xterm to have.

(It just occurred to me that this problem may show up on any system that
uses STREAMS modules instead of BSD-style TTY drivers.)

Anyway, for several releases of X11 now there's been a resource for setting
your xterm TTY values to work around this bug; it's XTerm*ttyModes: and the
value looks like an stty command line (without the command name).

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com



* Re: 3.0.6-pre-5 problem
  1999-06-25 12:52 3.0.6-pre-5 problem Sven Wischnowsky
  1999-06-25 15:56 ` Peter Stephenson
@ 1999-06-25 16:29 ` Bart Schaefer
  1999-06-25 17:14   ` Bart Schaefer
  1 sibling, 1 reply; 6+ messages in thread
From: Bart Schaefer @ 1999-06-25 16:29 UTC (permalink / raw)
  To: Sven Wischnowsky, zsh-workers

On Jun 25,  2:52pm, Sven Wischnowsky wrote:
} Subject: Re: 3.0.6-pre-5 problem
}
} I could finally reproduce it when trying to look at it with strace,
} which finally opened my eyes (I would have needed a `ps j' output).

 PPID   PID  PGID   SID TTY TPGID  STAT  UID   TIME COMMAND
 5161  5162  5162  5162  p4  5162  T     674   0:00 Src/zsh -f 
 5162  5164  5162  5162  p4  5162  T     674   0:00 mutt 
28198  5165  5165 28198  p3  5165  R     674   0:00 ps j 
28198  5161  5161 28198  p3  5169  S       0   0:00 xterm -e Src/zsh -f 

zsh(28198)-+-pstree(5173)
           `-xterm(5161)---zsh(5162)---mutt(5164)
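
For readers who want the same columns from inside a process, here is a small
sketch that collects the ids `ps j' prints. The struct and function names
are hypothetical. TPGID is the foreground process group of the terminal,
which is the group that receives SIGTSTP when ^Z is typed.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical container for the id columns of `ps j'. */
struct job_ids {
    pid_t ppid, pid, pgid, sid, tpgid;
};

/* Collect the ids of the calling process. tty_fd should refer to the
 * controlling terminal; if it does not, tpgid is set to -1. */
struct job_ids get_job_ids(int tty_fd)
{
    struct job_ids ids;

    ids.ppid = getppid();
    ids.pid = getpid();
    ids.pgid = getpgrp();
    ids.sid = getsid(0);
    ids.tpgid = tcgetpgrp(tty_fd);
    return ids;
}

/* Print the ids in the same column order as the listing above. */
void print_job_ids(const struct job_ids *ids)
{
    printf("%5s %5s %5s %5s %5s\n", "PPID", "PID", "PGID", "SID", "TPGID");
    printf("%5ld %5ld %5ld %5ld %5ld\n",
           (long)ids->ppid, (long)ids->pid, (long)ids->pgid,
           (long)ids->sid, (long)ids->tpgid);
}
```

A process whose PGID equals the TPGID of its terminal is in the foreground
group and will see terminal-generated signals such as SIGTSTP.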

} It goes like this: Someone exec()s zsh without putting it into its own
} process group. Then we start the function and zsh executes external
} commands in its own process group.

The top-level xterm case is now producing process trees exactly like
the case where there's an intermediate zsh, but the parent zsh is still
not ignoring the TSTP.

} So, if we have agreed to use the kill-loop-patches, we'll have to make 
} sure that every decent interactive zsh with job-control runs in its
} own process group which is what the patch below does.

This is a good idea in any case.

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com



* Re: 3.0.6-pre-5 problem
  1999-06-25 16:29 ` 3.0.6-pre-5 problem Bart Schaefer
@ 1999-06-25 17:14   ` Bart Schaefer
  1999-06-28  6:04     ` 3.0.6-pre-5 problem and loop killing Andrej Borsenkow
  0 siblings, 1 reply; 6+ messages in thread
From: Bart Schaefer @ 1999-06-25 17:14 UTC (permalink / raw)
  To: zsh-workers

On Jun 25,  4:29pm, Bart Schaefer wrote:
} Subject: Re: 3.0.6-pre-5 problem
}
} zsh(28198)-+-pstree(5173)
}            `-xterm(5161)---zsh(5162)---mutt(5164)
} 
} The top-level xterm case is now producing process trees exactly like
} the case where there's an intermediate zsh, but the parent zsh is still
} not ignoring the TSTP.
 
More information:  Fooling around with "gdbterm" (I posted it to zsh-users
a while back) I managed to get zsh to stop and resume again (though the
mutt process got orphaned).  When zsh came back and I typed "fg" to try to
resume mutt, I got "no job control in this shell" which is pretty strange
as there had been job control a moment before.

That indicates to me that, in this bit of code from execpline() we're going
through the third branch:

		    if ((pid = fork()) == -1) {
			/* ... */
		    }
		    else if (pid) {
			/* ... */
		    }
		    else {
			close(synch[0]);
			entersubsh(Z_ASYNC, 0, 0);
			if (jobtab[list_pipe_job].procs)
			    setpgrp(0L, mypgrp = jobtab[list_pipe_job].gleader);
			close(synch[1]);
			kill(getpid(), SIGSTOP);
			list_pipe = 0;
			list_pipe_child = 1;
			opts[INTERACTIVE] = 0;
			break;
		    }

Which of course is the one thing Sven hasn't tried patching yet ... but I'm
not sure WHY zsh is going through the third branch.  All of Sven's patches
have been to the second branch.

What's odd is that over in the 6860 "Re: PATCH: loop killing" thread, zsh
DOESN'T get the signal when it's supposed to.  Maybe Sven's got the cases
in which zsh takes the second and third branches, reversed?

Or maybe I'm just completely confused, which is entirely likely by now.
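
For reference, the shape of that third (child) branch can be tried outside
zsh with a standalone sketch. Everything here is an assumption made for
illustration: the function name is hypothetical, setpgid(0, 0) stands in for
joining the job's process group, and the parent side (reading the pipe until
EOF, then waiting with WUNTRACED) is only a guess at what a parent could do,
not what execpline() actually does.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: the child sets up its process group, signals
 * readiness by closing its end of a pipe, then stops itself.
 * Returns 1 if the parent saw the child stopped by SIGSTOP and the
 * child exited cleanly after SIGCONT, 0 if not, -1 on error. */
int stop_and_resume_child(void)
{
    int synch[2];
    int status;
    char dummy;
    pid_t pid;

    if (pipe(synch) == -1)
        return -1;
    if ((pid = fork()) == -1) {
        close(synch[0]);
        close(synch[1]);
        return -1;
    } else if (pid) {
        /* Parent: wait until the child has closed its write end. */
        close(synch[1]);
        while (read(synch[0], &dummy, 1) > 0)
            ;
        close(synch[0]);
        /* WUNTRACED makes waitpid() report stopped children too. */
        if (waitpid(pid, &status, WUNTRACED) == -1)
            return -1;
        if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGSTOP)
            return 0;
        if (kill(pid, SIGCONT) == -1)
            return -1;
        if (waitpid(pid, &status, 0) == -1)
            return -1;
        return WIFEXITED(status) && WEXITSTATUS(status) == 0;
    } else {
        /* Child: same order of steps as the excerpt above. */
        close(synch[0]);
        setpgid(0, 0);              /* stand-in for joining the job's group */
        close(synch[1]);            /* tells the parent setup is done */
        kill(getpid(), SIGSTOP);    /* stop until someone sends SIGCONT */
        _exit(0);
    }
}
```

The child stays stopped until some process sends it SIGCONT; if nobody does,
it never runs again.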

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com



* RE: 3.0.6-pre-5 problem and loop killing
  1999-06-25 17:14   ` Bart Schaefer
@ 1999-06-28  6:04     ` Andrej Borsenkow
  0 siblings, 0 replies; 6+ messages in thread
From: Andrej Borsenkow @ 1999-06-28  6:04 UTC (permalink / raw)
  To: Bart Schaefer, zsh-workers

>
> Which of course is the one thing Sven hasn't tried patching yet ... but I'm
> not sure WHY zsh is going through the third branch.  All of Sven's patches
> have been to the second branch.
>
> What's odd is that over in the 6860 "Re: PATCH: loop killing" thread, zsh
> DOESN'T get the signal when it's supposed to.  Maybe Sven's got the cases
> in which zsh takes the second and third branches, reversed?
>

Looking a bit more closely at the truss output of the two cases reveals that
they are quite different:

This is from xterm -e zsh (the case where ^Z does not work):

bor@itsrm2:~%> fgrep 'fork
execve
SIGCLD' /tmp/foo
8323:   execve("/usr/bin/X11/xterm", 0x000000FFFFFEEC30, 0x000000FFFFFEEC40)  argc = 1
8323:   fork()                                          = 8324
8324:   fork()          (returning as child ...)        = 8323
8324:   sighold(SIGCLD)                                 = SIG_DFL
8324:   fork()                                          = 8326
8323:   signal(SIGCLD, 0x00000000004068E0)              = SIG_DFL
8326:   fork()          (returning as child ...)        = 8324
8326:   execve("/usr/lib/pt_chmod", 0x000000007FFEE07C, 0x000000007FFEEDD4)  argc = 2
8324:   sigrelse(SIGCLD)                                = SIG_DFL
8324:       Received signal #18, SIGCLD [default]
8324:         siginfo: SIGCLD CLD_EXITED pid=8326 uid=1 status=0x0000
8324:   signal(SIGCLD, SIG_DFL)                         = SIG_DFL
8324:   execve("/tools/bin/zsh", 0x000000007FFEE0CC, 0x000000000049CE90)  argc = 1
8324:   sigaction(SIGCLD, 0x000000007FFEEC60, 0x0000000000000000) = 0
8324:   fork()                                          = 8331
8331:   fork()          (returning as child ...)        = 8324
8331:   execve("/usr/bin/zcat", 0x0000000000502B28, 0x0000000000505780)  argc = 1
8324:       Received signal #18, SIGCLD, in sigsuspend() [caught]
8324:         siginfo: SIGCLD CLD_KILLED pid=8331 uid=0 status=0x0002
8323:       Received signal #18, SIGCLD, in poll() [caught]
8323:         siginfo: SIGCLD CLD_EXITED pid=8324 uid=61 status=0x0082
bor@itsrm2:~%>

Note that zsh does exactly one fork/execve for zcat.

And here is the same for the simple case of zsh started from another zsh:

bor@itsrm2:/tools/src/zsh-3.1.5-pws-23%> fgrep 'fork
execve
SIGCLD' /tmp/zsh.1
12558:  execve("/tools/bin/zsh", 0x000000FFFFFEECF0, 0x000000FFFFFEED00)  argc = 1
12558:  sigaction(SIGCLD, 0x000000007FFEED00, 0x0000000000000000) = 0
12558:  fork()                                          = 12559
12559:  fork()          (returning as child ...)        = 12558
12559:  execve("/usr/bin/zcat", 0x0000000000502D60, 0x00000000004FE5A0)  argc = 1
12558:      Received signal #18, SIGCLD, in sigsuspend() [caught]
12558:        siginfo: SIGCLD CLD_STOPPED pid=12559 uid=0 status=0x0018
12558:  fork()                                          = 12561
12561:  fork()          (returning as child ...)        = 12558
12558:      Received signal #18, SIGCLD [caught]
12558:        siginfo: SIGCLD CLD_STOPPED pid=12561 uid=0 status=0x0017
12558:      Received signal #18, SIGCLD, in sigsuspend() [caught]
12558:        siginfo: SIGCLD CLD_CONTINUED pid=12559 uid=0 status=0x0019
12558:      Received signal #18, SIGCLD, in sigsuspend() [caught]
12558:        siginfo: SIGCLD CLD_KILLED pid=12559 uid=0 status=0x0002
12558:      Received signal #18, SIGCLD [caught]
12558:        siginfo: SIGCLD CLD_CONTINUED pid=12561 uid=0 status=0x0019
12558:      Received signal #18, SIGCLD [caught]
12558:        siginfo: SIGCLD CLD_KILLED pid=12561 uid=0 status=0x0002
bor@itsrm2:/tools/src/zsh-3.1.5-pws-23%>

Here zsh forks once more. I have no idea where the difference lies (apart from
the PGID/SID). Neither is a login shell, and both source the same scripts.

/andrej

