From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
X-Seq: 25100
Date: Sat, 24 May 2008 16:40:02 -0700
From: Phil Pennock
To: Vincent Lefevre
Cc: zsh-workers@sunsite.dk, 482346@bugs.debian.org
Subject: Re: Bug#482346: zsh doesn't always wait for its children (-> zombie)
Message-ID: <20080524234002.GA35143@redoubt.spodhuis.org>
Mail-Followup-To: Vincent Lefevre , zsh-workers@sunsite.dk, 482346@bugs.debian.org
References: <20080521235008.GA5600@ay.vinc17.org>
 <20080521235930.GW7056@prunille.vinc17.org>
 <20080522233327.GA24953@scru.org>
 <080523073940.ZM13804@torch.brasslantern.com>
 <20080523145722.GA12096@scru.org>
 <20080523224305.GN7056@prunille.vinc17.org>
 <20080524025556.GA30511@scru.org>
 <20080524124445.GQ7056@prunille.vinc17.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20080524124445.GQ7056@prunille.vinc17.org>

On 2008-05-24 at 14:44 +0200, Vincent Lefevre wrote:
> Note: when I kill zsh, the zombie remains there and gets attached
> to init. The load average remains very high.

If the zombie is reparented to init but still stays a zombie, then
something more serious is wrong with your system. If init can't reap
its children, then it's understandable that zsh might have trouble too.

Since you're on a rarer architecture that doesn't see as much Linux
kernel debugging, I'd be inclined to look at what has changed in the
kernel's architecture-specific signal handling code. (But see below.)

Further, it's strange that zombies are contributing to the load
average; if zsh is gone (killed off and no longer even possibly stuck in
a tight loop) and only the zombie and init are left, then there
shouldn't be anything contributing to the load average.

If you use tools such as top(1), which processes do they attribute the
load to? Does vmstat confirm the high load average by reporting little
idle CPU, or is the load average really out of sync with CPU reality?

Linux is unusual in counting processes blocked on storage I/O towards
the load average, so if the underlying problem is something like a flaky
disk beneath the root filesystem, that might be complicating matters.

-Phil
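
For anyone wanting to check the points above without top(1), here is a
minimal sketch, assuming a Linux system with /proc mounted. It prints the
load average and lists processes in state Z (zombie) or D (uninterruptible
sleep, usually blocked on I/O), along with their parent PID, so you can see
whether a zombie really has been reparented to init. The function names are
illustrative and do not come from this thread or from zsh.

```python
#!/usr/bin/env python3
"""List zombie (Z) and uninterruptible (D) processes next to the load average."""
import os


def parse_stat(text):
    """Parse the contents of /proc/<pid>/stat into (pid, comm, state, ppid).

    The comm field is wrapped in parentheses and may itself contain spaces
    or parentheses, so split around the LAST ')' rather than on whitespace.
    """
    lparen = text.index("(")
    rparen = text.rindex(")")
    pid = int(text[:lparen])
    comm = text[lparen + 1:rparen]
    rest = text[rparen + 1:].split()
    return pid, comm, rest[0], int(rest[1])


def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages from /proc/loadavg text."""
    return tuple(float(x) for x in text.split()[:3])


def scan(proc="/proc"):
    """Yield (pid, comm, state, ppid) for every process visible under proc."""
    for name in os.listdir(proc):
        if not name.isdigit():
            continue
        try:
            with open(os.path.join(proc, name, "stat")) as f:
                yield parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited (or entry unreadable) during the scan


def main():
    # Assumption: Linux-style /proc; bail out quietly anywhere else.
    if not os.path.exists("/proc/loadavg"):
        print("no /proc/loadavg here; this sketch is Linux-only")
        return
    with open("/proc/loadavg") as f:
        print("load average: %.2f %.2f %.2f" % parse_loadavg(f.read()))
    for pid, comm, state, ppid in scan():
        if state in ("Z", "D"):
            print("%s pid=%d ppid=%d %s" % (state, pid, ppid, comm))


if __name__ == "__main__":
    main()
```

A Z entry with ppid=1 that persists would match the "init can't reap its
children" case; D entries that persist would point towards the storage I/O
explanation for the load average instead.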