From: Alex Efros
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: runit not collecting zombies
Date: Thu, 12 Jul 2007 17:49:13 +0300
Message-ID: <20070712144913.GD23517@home.power>
To: supervision@list.skarnet.org

Hi!

On Thu, Jul 12, 2007 at 10:42:18AM -0400, Charlie Brady wrote:
> A common method of avoiding zombie processes is for a SIGCHLD handler in
> the parent to reap the status. I wonder whether there is possibility for
> SIGCHLD to be queued to the wrong process (due to a race during
> reparenting). Does runit as process 1 depend on SIGCHLD to reap zombies?

Yes, this is possible. Either way, it is a race in the kernel or in runit,
and it must be fixed there. With the latest patch, runit tries to reap every
5 seconds instead of depending on SIGCHLD. So far (5 days of uptime) there
are no zombies on my servers. This is an ugly workaround, of course.

> Alex, are you running an SMP system (which would allow parent and child
> to both be scheduled simultaneously)?

My home workstation is SMP (Core2Duo), while the servers are not SMP
(nowadays it's usual for a workstation to be much more powerful than the
servers :)). The issue arises on both the workstation and the servers, but
much more often on the servers (because they work much more intensively and
spawn many more processes).

-- 
WBR, Alex.
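
P.S. For readers who want to see what "try reaping every 5 seconds" can look
like, here is a minimal sketch of polling-based reaping. It is not the actual
runit patch: the function names reap_zombies() and reap_periodically() are
hypothetical and chosen only for illustration. The only system calls used are
the standard waitpid() with WNOHANG and sleep().

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: collect every child that has already exited,
 * without blocking. Returns the number of children reaped. */
int reap_zombies(void)
{
    int reaped = 0;

    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, WNOHANG);

        if (pid > 0) {          /* one zombie collected, look for more */
            reaped++;
            continue;
        }
        if (pid == -1 && errno == EINTR)
            continue;           /* interrupted by a signal, retry */

        /* pid == 0: children exist but none has exited yet.
         * pid == -1 with ECHILD: there are no children at all. */
        break;
    }
    return reaped;
}

/* Hypothetical polling loop: reap, then sleep, for a fixed number of
 * rounds. A real init process would loop forever and do other work too. */
int reap_periodically(unsigned interval_seconds, int rounds)
{
    int total = 0;

    for (int i = 0; i < rounds; i++) {
        total += reap_zombies();
        sleep(interval_seconds);
    }
    return total;
}
```

Calling reap_periodically(5, n) would correspond to the 5-second interval
mentioned above. Because the loop does not rely on SIGCHLD delivery, a lost
or misdirected signal only delays reaping until the next poll.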