From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 29083 invoked from network); 12 Jun 2001 15:01:26 -0000
Received: from sunsite.dk (130.225.51.30) by ns1.primenet.com.au with SMTP; 12 Jun 2001 15:01:26 -0000
Received: (qmail 774 invoked by alias); 12 Jun 2001 15:00:57 -0000
Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
Precedence: bulk
X-No-Archive: yes
X-Seq: 14861
Received: (qmail 744 invoked from network); 12 Jun 2001 15:00:56 -0000
To: Sven Wischnowsky
cc: zsh-workers@sunsite.dk
Subject: Re: fatal flaw zsh 4.0.1 on irix 6.3 & 6.5: suspend "ls -l|less" then resume hangs
References: <200106120815.KAA02209@beta.informatik.hu-berlin.de>
From: Timothy Miller
Date: 12 Jun 2001 11:00:15 -0400
In-Reply-To: Sven Wischnowsky's message of "Tue, 12 Jun 2001 10:15:59 +0200 (MET DST)"
Message-ID:
User-Agent: Gnus/5.0802 (Gnus v5.8.2) XEmacs/21.1 (Big Bend)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

On Tue, 12 Jun 2001 10:15:59 +0200 (MET DST), Sven Wischnowsky wrote:

> Timothy Miller wrote:
>
> > I continue to not be a subscriber to this list :-)
> >
> > If I invoke zsh as "zsh-4.0.1 -f" and then run "ls -l|less" on irix 6.3
> > or 6.5, control-z to suspend, and then "fg" to resume, the shell prints out
> >
> > [1]  + done       ls -l |
> >        continued  less
> >
> > and then hangs, unresponsive to all input (ctrl-c, ctrl-z, ctrl-\, other keys,
> > etc). I include the results of Util/reporter at the end of this email. This
> > bug does not happen on Solaris 2.7, AIX 4.3.2, or redhat 7.0 linux 2.2.16.
> > The version of less I'm using is 290 on irix 6.3 and 332 on irix 6.5,
> > solaris, and ix, and 358 on linux.
>
> Hm, that's weird -- it's not even one of the complicated cases. The
> reporter output isn't of much help here. Hence some questions:
>
> - What does the output of `ps j' (or equivalent, showing pids and parent
>   pids) show when the job hangs? (ps output with signal masks might
>   help, too.)

On irix 6.3 (where there doesn't seem to be any way to get ps to report
signal masks):

Running zsh 3.1.6 with -f, then ls -l | less with both procs left running,
ps -fjl:

 F S UID   PID  PPID  PGID   SID C PRI NI P  SZ:RSS    WCHAN    STIME    TTY TIME CMD
b0 S tsm 13041 12927 13041 12927 0  39 20 * 666:227 8039d510 09:58:18 ttyq10 0:00 zsh-beta -f
b0 S tsm 13043 13041 13042 12927 0  28 20 * 450:124 8039dc80 09:58:46 ttyq10 0:00 less

after suspend:

b0 S tsm 13041 12927 13041 12927 0  28 20 * 666:227 8039dc80 09:58:18 ttyq10 0:00 zsh-beta -f
b0 T tsm 13043 13041 13042 12927 0  60 20 * 450:124        - 09:58:46 ttyq10 0:00 less

after resume:

b0 S tsm 13041 12927 13041 12927 0  39 20 * 666:239 8039d510 09:58:18 ttyq10 0:00 zsh-beta -f
b0 S tsm 13043 13041 13042 12927 0  28 20 * 451:125 8039dc80 09:58:46 ttyq10 0:00 less

Running zsh 4.0.1 with -f, both procs still running:

b0 S tsm 13097 12927 13097 12927 4  39 20 * 722:285 8039d510 10:07:57 ttyq10 0:00 zsh-4.0.1 -f
b0 S tsm 13100 13097 13100 12927 0  28 20 * 450:124 8039dc80 10:08:00 ttyq10 0:00 less

after suspend:

b0 S tsm 13097 12927 13097 12927 0  28 20 * 722:286 8039dc80 10:07:57 ttyq10 0:00 zsh-4.0.1 -f
b0 T tsm 13100 13097 13100 12927 0  60 20 * 450:124        - 10:08:00 ttyq10 0:00 less

after resume:

b0 S tsm 13097 12927 13097 12927 0  39 20 * 722:288 8039d510 10:07:57 ttyq10 0:00 zsh-4.0.1 -f
b0 T tsm 13100 13097 13100 12927 0  60 20 * 450:124        - 10:08:00 ttyq10 0:00 less

On irix 6.5, zsh 3.1.6, both running:

 F S UID   PID  PPID  PGID   SID C PRI NI P  SZ:RSS    WCHAN    STIME    TTY TIME CMD
 0 S tsm 35634 35740 35634 35740 0  20 20 * 186:131 23f900b8 10:20:30  ttyq4 0:00 zsh-beta -f
 0 S tsm 31635 35634 36065 35740 0  20 20 *  130:84 203fe018 10:20:32  ttyq4 0:00 less

after suspend:

 0 S tsm 35634 35740 35634 35740 0  20 20 * 186:131 203fe018 10:20:30  ttyq4 0:00 zsh-beta -f
40 T tsm 31635 35634 36065 35740 0  20 20 *  131:85        - 10:20:32  ttyq4 0:00 less

after resume:

 0 S tsm 35634 35740 35634 35740 0  20 20 * 186:134 23f900b8 10:20:30  ttyq4 0:00 zsh-beta -f
 0 S tsm 31635 35634 36065 35740 0  20 20 *  131:85 203fe018 10:20:32  ttyq4 0:00 less

zsh 4.0.1 both running:

 0 S tsm 35817 35740 35817 35740 0  20 20 * 220:145 23f900b8 10:22:06  ttyq4 0:00 zsh-4.0.1 -f
 0 S tsm 36183 35817 36183 35740 0  20 20 *  130:84 203fe018 10:22:09  ttyq4 0:00 less

after suspend:

 0 S tsm 35817 35740 35817 35740 0  20 20 * 220:145 203fe018 10:22:06  ttyq4 0:00 zsh-4.0.1 -f
40 T tsm 36183 35817 36183 35740 0  20 20 *  131:85        - 10:22:09  ttyq4 0:00 less

after resume:

 0 S tsm 35817 35740 35817 35740 0  20 20 * 220:145 23f900b8 10:22:06  ttyq4 0:00 zsh-4.0.1 -f
40 T tsm 36183 35817 36183 35740 0  20 20 *  131:85        - 10:22:09  ttyq4 0:00 less

The odd thing here is the difference in flags and flags behavior between
the two machines. The documented meaning for the flags is the same on both
machines:

  F  (l) Flags (hexadecimal and additive) associated with the process:

         001  Process is a system (resident) process.
         002  Process is being traced.
         004  Stopped process has been given to parent via wait(2).
         008  Process is sleeping at a non-interruptible priority.
         010  Process is in core.
         020  Process user area is in core.
         040  Process has enabled atomic operator emulation.
         080  Process in stream poll or select.
         100  Process is a kernel thread.

Judging from the evidence, though, I suspect that this is an error for 6.5.

On both systems, zsh 4.0.1 was run as a subshell under zsh 3.1.6, hence the
different session id.

I wrote a small program to get the pending and held signal masks as well as
a bit of other information:

For irix 6.3, zsh 3.1.6 with -f,

at prompt:
  zsh:  no signals held or pending, asleep on syscall 4
while ls -l|less running:
  zsh:  asleep on syscall 166, all signals from 1 to 64 held EXCEPT
        1, 9, 18, 23.
  less: asleep on syscall 4, no signals held or pending
after suspend:
  zsh:  asleep on syscall 4, no sigs held or pending
  less: stopped, no sigs
after resume:
  zsh:  back to the "while running" state
  less: back to the "while running" state

zsh 4.0.1 with -f,

at prompt:
  zsh:  asleep on syscall 4
after ls -l|less:
  zsh:  asleep on syscall 166, all sigs from 1 to 64 held except
        1, 9, 18, 23.
  less: asleep on syscall 4
after suspend:
  zsh:  asleep on syscall 4, no sigs held or pending
  less: stopped
after resume:
  zsh:  asleep on syscall 166, all sigs 1-64 held except 1, 9, 18, 23.
  less: stopped

The only info I can find on syscall numbers seems to say that they start
with 1000, which doesn't seem to be the case, but if I subtract 1000 from
them it claims syscall 4 is write() and 166 is poll(). Signal 1 is hup, 9
is kill, 18 is chld, 23 is stop.

> - Have you tried it with earlier versions of zsh? Does it work there?

Yes, works with 3.1.6, 3.0.6 and all the earlier versions of zsh I've
installed on irix (unfortunately I can't recall exactly which).

> - Does it work if you replace `less' with another program that doesn't
>   program the terminal (so much), e.g. `more' and `cat'?

more fails as well. I can't type fast enough to suspend cat before it
exits!

Tim
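
For anyone who wants to look at held and pending signal sets themselves: the
small program mentioned in this message is not shown, so the following is
only a rough sketch of the same kind of check, not that program. It assumes
Python 3.8 or later on a Unix system and inspects only the calling process,
whereas the figures in this message describe other processes (zsh and less),
which needs a platform-specific interface. The helper names (held_signals,
pending_signals, describe, demo) are made up for illustration.

```python
import signal


def signal_name(signum):
    """Return the symbolic name of a signal number, or "unnamed" if none is known."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return "unnamed"


def held_signals():
    """Return the set of signal numbers blocked ("held") in the calling thread."""
    # Blocking an empty set changes nothing; the call just returns the current mask.
    return {int(s) for s in signal.pthread_sigmask(signal.SIG_BLOCK, [])}


def pending_signals():
    """Return the set of signal numbers raised but not yet delivered."""
    return {int(s) for s in signal.sigpending()}


def describe(signums):
    """Format a set of signal numbers as "number (NAME)" pairs, or "none"."""
    if not signums:
        return "none"
    return ", ".join(f"{n} ({signal_name(n)})" for n in sorted(signums))


def demo():
    """Hold SIGUSR1, raise it in this thread, and return the (held, pending) sets."""
    # Install a no-op handler so the signal is harmless once it is released.
    previous = signal.signal(signal.SIGUSR1, lambda signum, frame: None)
    signal.pthread_sigmask(signal.SIG_BLOCK, [signal.SIGUSR1])
    try:
        signal.raise_signal(signal.SIGUSR1)  # stays pending while it is held
        return held_signals(), pending_signals()
    finally:
        # Unblocking lets the pending signal reach the no-op handler.
        signal.pthread_sigmask(signal.SIG_UNBLOCK, [signal.SIGUSR1])
        signal.signal(
            signal.SIGUSR1,
            previous if previous is not None else signal.SIG_DFL,
        )


if __name__ == "__main__":
    held, pending = demo()
    print("held:   ", describe(held))
    print("pending:", describe(pending))
```

Run as a script, it prints the two sets for its own process. Signal numbers
differ between systems, so the numbers it prints need not match the irix
numbering quoted in this message.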