zsh-workers
* Re: 3.0.6-pre-5 problem
@ 1999-06-25  6:22 Sven Wischnowsky
  1999-06-25  9:03 ` Bart Schaefer
  1999-06-25  9:17 ` Bart Schaefer
  0 siblings, 2 replies; 14+ messages in thread
From: Sven Wischnowsky @ 1999-06-25  6:22 UTC
  To: zsh-workers


Jos Backus wrote:

> More info: I just rebuilt zsh with --enable-zsh-debug. Now, when I start an
> xterm, start mutt (using the function) and press ^Z, both mutt and the xterm
> disappear, leaving a zsh.core file behind. See stacktrace.
> 
> Interestingly, this backtrace looks quite different from the first one.

(This fact irritates me mightily.)

Anyway, I couldn't reproduce it but there is no harm in adding some
security code to execpline(). I guess this will not apply cleanly to
3.0.6, though.

And, of course, I have 6819 applied (which fixed a problem with
suspending shell functions, although that problem didn't cause a
SEGV).
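
For readers who do not have the zsh source at hand, the guard added by
the patch below can be sketched roughly as follows. This is only an
illustration under simplifying assumptions: the struct layouts, the
`stopped' field (standing in for WIFSTOPPED(pn->status)), the flag value
and the helper name stopped_subjob_process() are all hypothetical and
are not the definitions used in Src/exec.c.

```c
#include <stddef.h>

#define STAT_SUPERJOB 0x01   /* hypothetical flag value for this sketch */

/* Simplified stand-ins for zsh's process and job records (assumptions). */
struct process {
    int stopped;             /* stands in for WIFSTOPPED(status) */
    int status;
    struct process *next;
};

struct job {
    int stat;                /* flag bits such as STAT_SUPERJOB */
    int other;               /* index of the sub-job; only valid for a super-job */
    struct process *procs;
};

/* Return the first stopped process of jn's sub-job, or NULL.
 * The point of the guard: jn->other is consulted only when jn really
 * is a super-job, so a stale index is never used to look into the table. */
static struct process *stopped_subjob_process(struct job *tab,
                                              const struct job *jn)
{
    struct process *pn;

    if (!(jn->stat & STAT_SUPERJOB))
        return NULL;

    for (pn = tab[jn->other].procs; pn; pn = pn->next)
        if (pn->stopped)
            return pn;
    return NULL;
}
```

With a guard like this, a job that is not a super-job simply skips the
status copy instead of following whatever happens to be stored in its
`other' field.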


Bye
 Sven

diff -u os/exec.c Src/exec.c
--- os/exec.c	Thu Jun 24 14:03:58 1999
+++ Src/exec.c	Fri Jun 25 08:10:28 1999
@@ -904,15 +904,16 @@
 		    if (!jn->procs->next)
 			jn->gleader = mypgrp;
 
-		    for (pn = jobtab[jn->other].procs; pn; pn = pn->next)
-			if (WIFSTOPPED(pn->status))
-			    break;
+		    if (jn->stat & STAT_SUPERJOB) {
+			for (pn = jobtab[jn->other].procs; pn; pn = pn->next)
+			    if (WIFSTOPPED(pn->status))
+				break;
 
-		    if (pn) {
-			for (qn = jn->procs; qn->next; qn = qn->next);
-			qn->status = pn->status;
+			if (pn) {
+			    for (qn = jn->procs; qn->next; qn = qn->next);
+			    qn->status = pn->status;
+			}
 		    }
-
 		    jn->stat &= ~(STAT_DONE | STAT_NOPRINT);
 		    jn->stat |= STAT_STOPPED | STAT_CHANGED;
 		    printjob(jn, !!isset(LONGLISTJOBS), 1);

--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de


* Re: 3.0.6-pre-5 problem
@ 1999-06-25 12:52 Sven Wischnowsky
  1999-06-25 15:56 ` Peter Stephenson
  1999-06-25 16:29 ` Bart Schaefer
  0 siblings, 2 replies; 14+ messages in thread
From: Sven Wischnowsky @ 1999-06-25 12:52 UTC
  To: zsh-workers


Bart Schaefer wrote:

> On Jun 25, 11:38am, Sven Wischnowsky wrote:
> } Subject: Re: 3.0.6-pre-5 problem
> }
> } I hope this fixes it.
> 
> No such luck.  Here's "pstree" output:
> 
> zsh(28198)-+-pstree(4199)
>            |-xterm(4146)---zsh(4147)-+-mutt(4149)
>            |                         `-zsh(4153)
>            `-xterm(4191)---zsh(4192)---zsh(4194)---mutt(4196)
> 
> The first xterm (4146) I ran the "mutt" function directly from the top
> shell and then hit ^Z.  4149 and 4153 are both stopped; 4146 is blocked
> in wait4() which means that 4147 can't get any keystrokes (the xterm
> isn't feeding it) which is the hang that Jos sees.
> 
> The second xterm (4191) I ran a new zsh -f (4194) and then the "mutt"
> function; there, 4194 and 4196 are stopped.
> 
> Note that in the first case zsh created an extra dummy job, but in the
> second case it didn't.  This must have something to do with which process
> is the group leader.

(To Bart: I was doing it inside an xterm, but from a bash that ran
inside the xterm.)

I could finally reproduce it when I looked at it with strace, which
opened my eyes (I would have needed `ps j' output to see it sooner).
It goes like this: someone exec()s zsh without putting it into its own
process group. Then we start the function and zsh executes the external
command in the shell's own process group. Then the user hits ^Z and
all three of them (the external command, zsh, and zsh's parent)
receive the SIGTSTP. The external command is stopped, which is fine;
zsh ignores it, which is better; and the parent of zsh happens not to
ignore it and stops, which is deadly.

So, if we have agreed to use the kill-loop patches, we'll have to make
sure that every decent interactive zsh with job control runs in its
own process group, which is what the patch below does.
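
As a rough, standalone illustration of "runs in its own process group"
(not the code from the patch, which goes through zsh's own attachtty()
and setpgrp() wrappers), the following sketch uses the POSIX call
setpgid(). The helper names enter_own_process_group() and
child_can_leave_parent_group() are made up for this example, and the
terminal hand-over that a real shell also performs is left out.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Make the calling process the leader of its own process group.
 * Returns 0 on success, -1 on failure. */
static int enter_own_process_group(void)
{
    if (getpgrp() == getpid())
        return 0;               /* already a group leader */
    return setpgid(0, 0);       /* pid 0 and pgid 0 both mean "myself" */
}

/* Demo helper (hypothetical): fork a child, which starts out in our
 * process group just as zsh does when its parent does not move it,
 * and let it move into a group of its own.
 * Returns 1 if the child ended up as its own group leader, 0 otherwise. */
static int child_can_leave_parent_group(void)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return 0;
    if (pid == 0) {
        int ok = enter_own_process_group() == 0 && getpgrp() == getpid();
        _exit(ok ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) == -1)
        return 0;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Once the shell is in a group of its own, a SIGTSTP generated for that
group by ^Z no longer reaches the shell's parent, which is the failure
described above.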

Ok. Since I still couldn't reproduce the exact original problem, I'd
be thankful for any response (*especially* if it's fixed).

Bye
 Sven

P.S.: Peter: 6838 should be superfluous, but I still like the look of 6848.

--- os/init.c	Thu Jun 24 19:00:56 1999
+++ Src/init.c	Fri Jun 25 14:41:12 1999
@@ -390,7 +390,16 @@
 #ifdef JOB_CONTROL
     /* If interactive, make the shell the foreground process */
     if (opts[MONITOR] && interact && (SHTTY != -1)) {
-	attachtty(GETPGRP());
+      /* Since we now sometimes execute programs in the process group
+       * of the parent shell even when using job-control, we have to
+       * make sure that we run in our own process group. Otherwise if
+       * we are called from a program that doesn't put us in our own
+       * group a SIGTSTP that we ignore might stop our parent process.
+       * Instead of the two calls below we once had:
+       *   attachtty(GETPGRP());
+       */
+	attachtty(getpid());
+	setpgrp(0L, 0L);
 	if ((mypgrp = GETPGRP()) > 0) {
 	    while ((ttpgrp = gettygrp()) != -1 && ttpgrp != mypgrp) {
 		sleep(1);

--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de


* Re: 3.0.6-pre-5 problem
@ 1999-06-25  9:38 Sven Wischnowsky
  1999-06-25 10:10 ` Bart Schaefer
  0 siblings, 1 reply; 14+ messages in thread
From: Sven Wischnowsky @ 1999-06-25  9:38 UTC
  To: zsh-workers


Bart Schaefer wrote:

> On Jun 25,  9:17am, Bart Schaefer wrote:
> } Subject: Re: 3.0.6-pre-5 problem
> }
> } [...]  I think zsh
> } may have sent a STOP signal to the xterm that is its parent.
> 
> Yup, I can confirm this:
> 
> zagzig% Src/zsh -f
> zagzig% echo $SHLVL
> 4
> zagzig% Src/zsh -f
> zagzig% echo $SHLVL
> 5
> zagzig% mutt () {
> function>       command mutt "$@"
> function>       echotc rs
> function> }
> zagzig% mutt
> zsh: suspended (signal)  Src/zsh -f
> zagzig% echo $SHLVL
> 4
> 
> Zsh has just stopped its parent.  Naughty zsh.

Err, itself, right?

That was a hint... (I still wonder why I can't reproduce it, though).

I hope this fixes it. It's the only place where a parent shell sends a 
SIGSTOP and, yes, it's good to have that extra test there -- it means
roughly that we don't need to suspend the beginning of a pipeline if
there is no such beginning -- exactly what happens when suspending a
function.
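
The extra test can be pictured on its own as a small predicate. This is
a sketch under assumptions, not the zsh code: the struct layouts and the
name should_stop_pipeline_head() are hypothetical, and only the
condition itself mirrors the patch below.

```c
#include <stddef.h>
#include <sys/types.h>

/* Simplified stand-ins for zsh's records (assumptions for this sketch). */
struct process {
    pid_t pid;
    struct process *next;
};

struct job {
    pid_t gleader;            /* process group leader of the job */
    struct process *procs;    /* NULL when the job has no processes */
};

/* Decide whether the beginning of the pipeline should be sent SIGSTOP.
 * A job without any processes (as when only a shell function is being
 * suspended) has no "beginning", so there is nothing to stop. */
static int should_stop_pipeline_head(const struct job *head,
                                     int list_pipe, int last1)
{
    return (list_pipe || last1) && head->procs != NULL;
}
```

A caller would then signal the group only when the predicate holds, for
example `if (should_stop_pipeline_head(jn, list_pipe, last1))
killpg(jn->gleader, SIGSTOP);`, so that no signal is sent on behalf of a
job entry that has no processes in it.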


Bye
 Sven

--- os/exec.c	Fri Jun 25 10:06:02 1999
+++ Src/exec.c	Fri Jun 25 11:32:57 1999
@@ -966,7 +966,7 @@
 			close(synch[1]);
 			read(synch[0], &dummy, 1);
 			close(synch[0]);
-			/* If this job has finished, we turn this into a
+			/* If this job has finished, we leave it as a
 			 * normal (non-super-) job. */
 			if (!(jn->stat & STAT_DONE)) {
 			    jobtab[list_pipe_job].other = newjob;
@@ -974,7 +974,7 @@
 			    jn->stat |= STAT_SUBJOB | STAT_NOPRINT;
 			    jn->other = pid;
 			}
-			if (list_pipe || last1)
+			if ((list_pipe || last1) && jobtab[list_pipe_job].procs)
 			    killpg(jobtab[list_pipe_job].gleader, SIGSTOP);
 			break;
 		    }

--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de


* Re: 3.0.6-pre-5 problem
@ 1999-06-25  9:19 Sven Wischnowsky
  0 siblings, 0 replies; 14+ messages in thread
From: Sven Wischnowsky @ 1999-06-25  9:19 UTC
  To: zsh-workers


Bart Schaefer wrote:

> } Anyway, I couldn't reproduce it but there is no harm in adding some
> } security code to execpline(). I guess this will not apply cleanly to
> } 3.0.6, though.
> 
> I think this patch (6838) should NOT be applied, and we need a different
> one instead.  Look here:

Have you tried it with 6819 and 6824 (I forgot to mention the latter, 
sorry)? With them (and pws-23) I can't reproduce it and both of them
could influence it (especially 6824, which fixed a bug with suspending 
functions that garbled the job-table entry).


Bye
 Sven


--
Sven Wischnowsky                         wischnow@informatik.hu-berlin.de


* 3.0.6-pre-5 problem
@ 1999-06-24 15:05 Jos Backus
  1999-06-24 15:26 ` Jos Backus
  0 siblings, 1 reply; 14+ messages in thread
From: Jos Backus @ 1999-06-24 15:05 UTC
  To: zsh-workers

	Hi,

Because of the outdated system ncurses FreeBSD uses, I am using the following
function:

mutt () {
	command mutt "$@"
	echotc rs
}

With 3.0.6-pre-5 something odd is happening:

hal:~% xterm &!

---Mutt: =inbox [Msgs:726 New:10 Flag:1 Inc:11 5.1M]---(threads/date)---(end)---
^Z	<background mutt>
zsh: 685 suspended (signal)  mutt | 
zsh: 686 running             mutt
hal:~% fg
q	<quit mutt>
---Mutt: =inbox [Msgs:727 New:11 Flag:1 Inc:11 5.1M]---(threads/date)---(99%)---

<xterm ``hangs'', no prompt is output>

There's a core file, too:

hal:~jos# gdb /bin/zsh zsh.core
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...
(no debugging symbols found)...
Core was generated by `zsh'.
Program terminated with signal 11, Segmentation fault.
Reading symbols from /usr/lib/libtermcap.so.2...(no debugging symbols found)...
done.
Reading symbols from /usr/lib/libc.so.3...(no debugging symbols found)...done.
Reading symbols from /usr/libexec/ld-elf.so.1...(no debugging symbols found)...
done.
#0  0x8063943 in makerunning ()
(gdb) where
#0  0x8063943 in makerunning ()
#1  0x806398d in makerunning ()
#2  0x806398d in makerunning ()
#3  0x804b126 in bin_fg ()
#4  0x804a558 in execbuiltin ()
#5  0x80576fb in execcmd ()
#6  0x8055b5b in execpline2 ()
#7  0x805561f in execpline ()
#8  0x805530e in execlist ()
#9  0x8061ad8 in loop ()
#10 0x8061951 in main ()
#11 0x8049f61 in _start ()
(gdb) 

This doesn't happen when I call mutt directly (using ``command mutt'').

This looks like a zsh problem that may have been introduced with pre-5.

I can supply details if needed.

Thanks,
-- 
Jos Backus                          _/ _/_/_/  "Reliability means never
                                   _/ _/   _/   having to say you're sorry."
                                  _/ _/_/_/             -- D. J. Bernstein
                             _/  _/ _/    _/
Jos.Backus@nl.origin-it.com  _/_/  _/_/_/      use Std::Disclaimer;



end of thread, other threads:[~1999-06-25 17:15 UTC | newest]

Thread overview: 14+ messages
1999-06-25  6:22 3.0.6-pre-5 problem Sven Wischnowsky
1999-06-25  9:03 ` Bart Schaefer
1999-06-25  9:17 ` Bart Schaefer
1999-06-25  9:23   ` Bart Schaefer
  -- strict thread matches above, loose matches on Subject: below --
1999-06-25 12:52 Sven Wischnowsky
1999-06-25 15:56 ` Peter Stephenson
1999-06-25 16:29 ` Bart Schaefer
1999-06-25 17:14   ` Bart Schaefer
1999-06-25  9:38 Sven Wischnowsky
1999-06-25 10:10 ` Bart Schaefer
1999-06-25  9:19 Sven Wischnowsky
1999-06-24 15:05 Jos Backus
1999-06-24 15:26 ` Jos Backus
1999-06-24 18:54   ` Jos Backus
