From: Charlie Brady
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: runit not collecting zombies
Date: Wed, 12 Sep 2007 09:55:16 -0400 (EDT)
To: Radek Podgorny
Cc: Alex Efros, supervision@list.skarnet.org

On Wed, 12 Sep 2007, Radek Podgorny wrote:

> Hi!
> Any progress on this? Alex, have you found at least a workaround? This
> is getting really annoying as I have to reboot my servers manually ...

You can make the problem (whatever it is) a non-issue for you, as it is
for nearly everyone else, if you can fix whichever run script is
generating zombies. It's possible, believe me.

[I've still seen no evidence that openssh generates zombies.]

> (ssh can't fork for remote login)... :-(
>
> Radek P.
>
>> On Mon, 16 Jul 2007, Alex Efros wrote:
>>
>>> On Sun, Jul 15, 2007 at 07:23:13PM -0400, Charlie Brady wrote:
>>>> So there are two problems there - the processes which are outliving
>>>> their parents, and runit as process 1. Most people here seem to be
>>>> ignoring the first problem, and instead are just looking for a magic
>>>> fix by someone solving problem 2.
>>>
>>> Ohh. Okay, okay, I think we all agree with you about 'generating
>>> zombies' is a Bad Thing (tm). But real world is slightly different
>>> from ideal world.
>>> In real world we've a 'zombie processes', which are part of *NIX
>>> architecture, and which can't be solved by just stopping generating
>>> zombies - because there a lot of existing applications (like OpenSSH)
>>> which already generate zombies, and because there exists some cases
>>> when zombies may and will be generated anyway.
>>
>> Sure they will. But in every case except a daemon which was given a term
>> signal I'd say it is a bug.
>>
>> I've seen no evidence that openssh generates zombies.
>>
>>> In this situation, the Right Thing is solve this issue between runit
>>> and linux kernel.
>>
>> That's one of the right thing, yes.
>>
>>> So. If this is a race condition bug in linux kernel 2.6.20, how to
>>> debug it?
>>
>> Have a look at SystemTap.
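
As background to the point about fixing whichever run script generates zombies: a zombie is a child process that has exited but has not yet been waited for by its parent. The following is only a minimal Python sketch of that lifecycle, assuming Linux with /proc mounted. The helper names are made up for illustration, and nothing here is taken from runit, OpenSSH, or any run script in this thread.

```python
import os
import time


def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the
    process no longer exists. Linux only (assumes /proc is mounted)."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The command name is wrapped in parentheses and may contain spaces,
    # so take the first field after the last ')'.
    return data.rsplit(")", 1)[1].split()[0]


def spawn_short_lived_child():
    """Fork a child that exits at once. The parent deliberately does not
    wait for it here, so the child lingers as a zombie."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)
    return pid


def wait_for_state(pid, wanted, timeout=2.0):
    """Poll until the process reaches the wanted state or time runs out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if process_state(pid) == wanted:
            return True
        time.sleep(0.01)
    return False


def reap_all_children():
    """Collect every child that has already exited, without blocking.
    Returns the list of reaped PIDs."""
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

A parent that calls something like reap_all_children() whenever it receives SIGCHLD leaves no zombies behind. A parent that never waits keeps them in state "Z" until it exits itself, at which point they are handed to process 1.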