From: Mike Buland
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: runit not collecting zombies
Date: Wed, 12 Sep 2007 13:38:54 -0600
Organization: Geek Gene
Message-ID: <200709121338.54750.mike@geekgene.com>
References: <35517.::ffff:77.75.72.5.1189613042.squirrel@mail.podgorny.cz> <20070912170450.GE12043@home.power>
In-Reply-To: <20070912170450.GE12043@home.power>
To: supervision@list.skarnet.org
User-Agent: KMail/1.9.6

Hello,

On Wednesday 12 September 2007 11:04:50 am Alex Efros wrote:
> Yeah, I've one server which don't have this issue. His admin made a
> mistake many months ago - he installed too new gcc (which isn't support
> hardened patches yet - SSP and PIE), and afraid to disgrade it on
> production server. He wait until hardened patches will be released for
> that gcc version to come back to hardened land. This is only noticeable
> difference between our servers.

I'm just curious, but doesn't it sound like this is the first place to
look for the trouble? Unfortunately, as you point out, there are two
differences between the two systems: the one that works isn't using two
of the hardened patches, and it is using a newer gcc.

Have you reported these facts to the maintainers of the hardened
patches? (I'm sure they know the patches don't work with gcc 4.1.1, but
zombies not being reaped is an issue.) Also, anyone who hopes to fix
this in runit should have these patches applied and be using an older
gcc.

Obviously these patches don't completely ruin the kernel's or libc's
ability to reap zombies, or this would have been found before now, but
they do seem to affect it. I think debugging efforts should probably be
focused on these modifications to the system, and not on runit in
general (I've never seen this problem on any of my machines).

I'd be happy to build out a Gentoo system and hack around with all
this...in October : ). Before then...I can only offer observation.

Good luck.

P.S. A quick scan through the patches for wait-related changes could be
a good first clue...maybe? That's where I'd start, that or gdb :)
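
For the patch scan suggested in the P.S., something along these lines might do as a first pass. It is only a rough sketch: it assumes the hardened patches are plain unified diffs, the identifier list in WAIT_PATTERN is a guess at what "wait-related" could cover, and the function name wait_related_changes and the SAMPLE_PATCH text are made up purely for illustration (they are not taken from any real patch).

```python
import re

# Assumed list of identifiers related to child reaping; extend as needed.
WAIT_PATTERN = re.compile(
    r"\b(wait|waitpid|wait3|wait4|waitid|SIGCHLD|do_wait|exit_notify|release_task)\b"
)


def wait_related_changes(patch_text):
    """Return (line_number, line) for every added or removed diff line
    that mentions one of the identifiers in WAIT_PATTERN."""
    hits = []
    for number, line in enumerate(patch_text.splitlines(), start=1):
        # Skip the "--- a/file" and "+++ b/file" headers of each diff.
        if line.startswith(("+++", "---")):
            continue
        # Only changed lines are interesting; context lines start with a space.
        if line.startswith(("+", "-")) and WAIT_PATTERN.search(line):
            hits.append((number, line))
    return hits


# Invented sample diff, used only to exercise the function.
SAMPLE_PATCH = """\
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1,4 +1,4 @@
 static int example(void)
-    return do_wait(pid);
+    return do_wait(pid, flags);
 /* SIGCHLD appears here only as context */
"""

if __name__ == "__main__":
    for number, line in wait_related_changes(SAMPLE_PATCH):
        print(f"{number}: {line}")
```

To run it against a real patch, read the file into a string and pass it to wait_related_changes. Any hits are only candidates to read by hand; an empty result does not rule the patches out, since a change could affect reaping without naming any of these identifiers.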