From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.comp.sysutils.supervision.general/1522
Path: news.gmane.org!not-for-mail
From: Alex Efros
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: runit not collecting zombies
Date: Sat, 15 Sep 2007 16:57:49 +0300
Organization: asdfGroup Inc., http://powerman.asdfGroup.com/
Message-ID: <20070915135749.GB30650@home.power>
References: <20070912150047.GD12043@home.power> <20070912172245.GF12043@home.power> <20070912181836.GG12043@home.power> <20070912191346.GH12043@home.power> <20070915133641.GA30650@home.power>
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: sea.gmane.org 1189864674 14125 80.91.229.12 (15 Sep 2007 13:57:54 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Sat, 15 Sep 2007 13:57:54 +0000 (UTC)
To: supervision@list.skarnet.org
Original-X-From: supervision-return-1757-gcsg-supervision=m.gmane.org@list.skarnet.org Sat Sep 15 15:57:51 2007
Envelope-to: gcsg-supervision@gmane.org
Original-Received: from antah.skarnet.org ([212.85.147.14]) by lo.gmane.org with smtp (Exim 4.50) id 1IWY9f-0004GV-4o for gcsg-supervision@gmane.org; Sat, 15 Sep 2007 15:57:51 +0200
Original-Received: (qmail 2387 invoked by uid 76); 15 Sep 2007 13:58:12 -0000
Mailing-List: contact supervision-help@list.skarnet.org; run by ezmlm
Original-Received: (qmail 2380 invoked from network); 15 Sep 2007 13:58:12 -0000
Mail-Followup-To: supervision@list.skarnet.org
Content-Disposition: inline
In-Reply-To: <20070915133641.GA30650@home.power>
User-Agent: Mutt/1.5.16 (2007-06-09)
Xref: news.gmane.org gmane.comp.sysutils.supervision.general:1522

Hi!

On Sat, Sep 15, 2007 at 04:36:42PM +0300, Alex Efros wrote:
> Full `ps -ef axf` output here: http://powerman.asdfgroup.com/tmp/ps.txt

Here is the full syslog for Sep 15:
http://powerman.asdfgroup.com/tmp/syslog.txt

According to the `ps` output, all zombies were created between 04:19 and
04:21, plus a few at 07:51. There are no records in the kernel log for that
period, and in syslog all records for Sep 15 are ssh-related.

It looks like the last zombie and the last record in the log (user mysql)
were created by my test attempt to connect. In that case we have a chance to
get clean strace output for a simple connect attempt that creates an
unreaped zombie! I'll restart strace and try to connect as the mysql user
again now ...

GOT IT!!!

    # date ; ps -ef axf | tail -n 1
    Sat Sep 15 13:51:38 GMT 2007
    sshd 14804 1 0 13:50 ? Z 0:00 [sshd]
    # date ; ps -ef axf | tail -n 1
    Sat Sep 15 13:51:53 GMT 2007
    sshd 14804 1 0 13:50 ? Z 0:00 [sshd]
    # tail -n 1 /var/log/syslog/all/current
    auth.info: Sep 15 13:50:43 sshd[14803]: User mysql not allowed because account is locked

Strace output with all details about PIDs 939 (ssh server), 14803 and 14804
(the unreaped zombie) is here:
http://powerman.asdfgroup.com/tmp/ssh_strace.txt

-- 
WBR, Alex.