From mboxrd@z Thu Jan 1 00:00:00 1970
To: 9fans@9fans.net
Date: Mon, 20 Jul 2009 08:59:53 +0200
From: gdiaz@9grid.es
Message-ID:
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: [9fans] strange behaviour of ps under load
Topicbox-Message-UUID: 28820842-ead5-11e9-9d60-3106f5b1d025

Hello,

Today I found the 9grid Plan 9 machine under heavy load: stats reports load ~2000, syscall ~60000, context ~22000. I was trying to find out which proc has gone crazy, but I can't even complete a ps.

I can do other operations, such as sending this email over drawterm, running stats and netstat, reading the logs, etc., but I can't run ps or any other /proc-related tool, and I can't kill/Kill/slay anything.

I can ls /proc:

cpu% ls -l | wc -l
573

But something like this hangs:

cpu% for(i in `{ls}) {echo -n 'PID ' $i 'has status. . . '; cat $i/status | wc -c }
[....]
PID 1944693 has status. . . 176
PID 1944698 has status. . . 176
PID 1944699 has status. . . 176
PID 1944700 has status. . . 176
PID 1944707 has status. . .

and here it stops. I can't find out which process that is, nor kill it.
I can ls it:

cpu% ls -l /proc/1944707/
--rw-rw---- p 0 offending_user bootes   0 Dec 1 2008 /proc/1944707/args
--rw-r----- p 0 offending_user bootes   0 Dec 1 2008 /proc/1944707/ctl
--r--r--r-- p 0 offending_user bootes   0 Dec 1 2008 /proc/1944707/fd
--rw-r----- p 0 offending_user bootes 108 Dec 1 2008 /proc/1944707/fpregs
--r--r----- p 0 offending_user bootes  76 Dec 1 2008 /proc/1944707/kregs
--rw-r----- p 0 offending_user bootes   0 Dec 1 2008 /proc/1944707/mem
--rw-r----- p 0 offending_user bootes   0 Dec 1 2008 /proc/1944707/note
--rw-rw-r-- p 0 offending_user bootes   0 Dec 1 2008 /proc/1944707/noteid
--rw-r----- p 0 offending_user bootes   0 Dec 1 2008 /proc/1944707/notepg
--r--r--r-- p 0 offending_user bootes   0 Dec 1 2008 /proc/1944707/ns
--r--r----- p 0 offending_user bootes   0 Dec 1 2008 /proc/1944707/proc
--r--r----- p 0 offending_user bootes   0 Dec 1 2008 /proc/1944707/profile
--rw-r----- p 0 offending_user bootes  76 Dec 1 2008 /proc/1944707/regs
--r--r--r-- p 0 offending_user bootes   0 Dec 1 2008 /proc/1944707/segment
--r--r--r-- p 0 offending_user bootes 176 Dec 1 2008 /proc/1944707/status
--rw-r----- p 0 offending_user bootes   0 Dec 1 2008 /proc/1944707/text
--r--r----- p 0 offending_user bootes   0 Dec 1 2008 /proc/1944707/wait

I can't chmod those files either. (Is that date normal? It seems everything in /proc has that date :?)

Any tips on how to solve this without rebooting?

thanks!!

gabi