From: "Radek Podgorny"
To: supervision@list.skarnet.org
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: runit not collecting zombies
Date: Thu, 13 Sep 2007 10:58:10 +0200 (CEST)
Message-ID: <12087.6419511207$1189673903@news.gmane.org>
In-Reply-To: <20070912170450.GE12043@home.power>

So, my systems are listed at http://podgorny.cz/moin/RunitBug (I will
polish the list soon).

The interesting things:

* I don't use hardened at all, so I don't think it is to blame.
* It is not a 2.6.20 kernel bug, as I experience this on 2.6.19.1 and
  even on something like 2.6.18 (can't look right now).
* All my kernels are vanilla.
* The only two systems that run fine may be fucked up, too. One of them
  is a laptop, so its uptime may be too short to notice. The other one
  has an uptime of something like 160 days, so it may screw up after
  the next reboot.

What about CFLAGS? I'll get them from my machines and post them. I
suspect they are mostly -O3 with the arch set to a specific processor
(not generic i686 or so)...

Maybe it's some kind of time overflow bug. Can you find out when you
started experiencing the trouble?

Radek P.

> Hi!
>
> On Wed, Sep 12, 2007 at 06:04:02PM +0200, Radek Podgorny wrote:
>> Alex, did I get it right you use gentoo? On what architecture? Stable or
>
> Stable x86 (except a few ~x86 packages like runit and svlogd), all 32bit.
>
> I use Hardened Gentoo, and one idea is that its GrSecurity/PaX patches
> introduce that bug - this may explain why a lot of vanilla kernel users
> don't see this bug.
> Another idea - some of the other gentoo-specific kernel patches.
> To test this I would have to stop using GrSecurity/PaX on production
> servers for weeks, and I dislike this idea.
>
>> unstable? I use gentoo on all my machines (stable/unstable mix, x86/amd64
>> mix, different kernels, ...) and some machines are OK, others are not.
>
> Yeah, I have one server which doesn't have this issue. Its admin made a
> mistake many months ago - he installed a gcc that was too new (one that
> doesn't support the hardened patches yet - SSP and PIE), and he is afraid
> to downgrade it on a production server. He is waiting until hardened
> patches are released for that gcc version to come back to hardened land.
> This is the only noticeable difference between our servers.
>
>> Maybe this is gentoo specific somehow (exotic USE for glibc, wrong
>> gcc?...).
>> I'll get the versions from my machines and post them here, could
>> you please do the same? Let's find what's common...
>
> My servers and workstation use (unique lines) (all of them have this
> issue):
>   2.6.20-hardened-r6 SMP i686 Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz GenuineIntel
>   2.6.20-hardened-r6 i686 Intel(R) Pentium(R) 4 CPU 2.80GHz GenuineIntel
>   2.6.20-hardened-r6 i686 Intel(R) Pentium(R) 4 CPU 3.00GHz GenuineIntel
>   2.6.20-hardened-r6 i686 AMD Athlon(tm) 64 Processor 3500+ AuthenticAMD
> The server without the zombie issue uses:
>   2.6.20-hardened-r6 i686 Intel(R) Celeron(R) CPU 2.00GHz GenuineIntel
> Kernel configuration is 100% equal on the server without zombies and my P4
> servers.
>
> All servers use:
>   sys-libs/glibc-2.5-r4
>   sys-devel/binutils-2.17
>
> My servers use:
>   sys-devel/gcc-3.4.6-r2 (with SSP and PIE)
> The server without the zombie issue uses:
>   sys-devel/gcc-4.1.1-r3
>
> I've tried runit from 1.5.0 to 1.7.2 with patches from this mailing list
> on my servers. The server without this issue runs runit 1.5.0.
>
> USE flags on all servers are the same:
>   sys-kernel/hardened-sources-2.6.20-r6
>     USE="-build -symlink"
>   sys-libs/glibc-2.5-r4
>     USE="hardened nls nptl nptlonly -build -debug -glibc-compat20
>          -glibc-omitfp -multilib -profile (-selinux)"
>   sys-devel/binutils-2.17
>     USE="nls -multislot -multitarget -test -vanilla"
>   sys-devel/gcc-3.4.6-r2
>     USE="hardened nls (-altivec) -bootstrap -boundschecking -build -d -doc
>          -fortran -gcj -gtk -ip28 -ip32r10k -multilib -multislot (-n32) (-n64)
>          -nocxx -nopie -nossp -objc -test -vanilla"
>   sys-process/runit-1.7.2
>     USE="-static"
>
> --
> WBR, Alex.
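
P.S. To make the "find what's common" comparison less error-prone once more listings come in, here is a minimal Python sketch. The package versions are copied from the quoted listing above; the machine labels (core2, p4_280, ...) and the function name suspects are made up for illustration. It only reports attributes that are identical on every affected machine and different on every unaffected one, so it narrows down candidates and proves nothing by itself.

```python
# Versions are taken from the quoted listing; machine labels are hypothetical.
COMMON = {"kernel": "2.6.20-hardened-r6", "glibc": "2.5-r4", "binutils": "2.17"}

affected = {
    "core2": {**COMMON, "gcc": "3.4.6-r2"},
    "p4_280": {**COMMON, "gcc": "3.4.6-r2"},
    "p4_300": {**COMMON, "gcc": "3.4.6-r2"},
    "athlon64": {**COMMON, "gcc": "3.4.6-r2"},
}
unaffected = {
    "celeron": {**COMMON, "gcc": "4.1.1-r3"},
}


def suspects(affected, unaffected):
    """Return attributes shared by all affected machines and absent from all unaffected ones."""
    # Only consider attributes that every affected machine reports.
    keys = set.intersection(*(set(m) for m in affected.values()))
    result = {}
    for key in sorted(keys):
        values = {m[key] for m in affected.values()}
        if len(values) != 1:
            continue  # affected machines disagree, so this cannot be the common factor
        (value,) = values
        if all(m.get(key) != value for m in unaffected.values()):
            result[key] = value
    return result


if __name__ == "__main__":
    print(suspects(affected, unaffected))
```

With only the quoted listing as input this singles out the gcc version. My own non-hardened machines would have to be added before reading anything into that.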