From: Gerrit Pape
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: runsvdir killed
Date: Sun, 7 Nov 2004 13:54:44 +0000
Message-ID: <20041107135325.13303.qmail@fee8ec3a5e23da.315fe32.mid.smarden.org>
References: <20041106184216.GB4568@home.power>

On Sat, Nov 06, 2004 at 08:42:16PM +0200, Alex Efros wrote:
> Sometimes when I check `ps axf` I see no runsvdir process, and all `runsv`
> processes has no parent (or their parent is process N1: runit-init).
>
> I think I know what happens - kernel has killed runsvdir because of 'out of
> memory' error (a lot of complex perl scripts earn all memory).
> Of course, kernel has killed not only runsvdir, but also it try to kill
> that perl scripts, mysql, etc. But this isn't a problem - perl scripts will be
> restarted by cron, mysql will be restarted by runsv, etc... but who will
> restart runsv if runsvdir is killed and runsv reparented (I not sure is this
> a correct english term) by runit-init?

The runit program running as process 1 monitors stage 2, which by default
is the runsvdir process. If runsvdir, and so /etc/runit/2, crashes or
exits 111, runit restarts /etc/runit/2. If it exits 0, runit enters
stage 3 and runs /etc/runit/3; see the runit(8) man page. One of the two
should happen on your system if /etc/runit/2 is terminated.

> So, the question is: how to restore killed runsvdir without reboot? And the
> second question: I suppose killing runsvdir mean exiting stage2 and entering
> stage3 for reboot/halt... is this correct? And if this correct why this may
> not happens in my case?

You can send the runsvdir process a HUP signal to have stage 2 restarted,
but this should almost never be needed.

> P.S. Yeah, I know, perl scripts eating all memory and kernel starting
> killing processes isn't correct behaviour for server. But for now I've no
> idea why this happens, so I can't fix it. On that server I got kernel
> oops/panic every 12-72 hours, and I've not found any information about these
[...]

This sounds really broken.

Regards, Gerrit.
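
For illustration only, the stage 2 handling described in this message can be written down as a small shell sketch. It is a model of the rules stated above, not code from runit itself. The function names stage2_next and restart_stage2 are hypothetical, treating an exit status above 128 as "killed by a signal" is the usual shell convention rather than anything runit-specific, and pkill is assumed to be available (procps).

```shell
#!/bin/sh
# Hypothetical helper: given the exit status of /etc/runit/2 as a shell
# would report it, print what runit (process 1) does next according to
# the rules described in this message.
stage2_next() {
    status=$1
    if [ "$status" -eq 0 ]; then
        # Clean exit: runit leaves stage 2 and runs /etc/runit/3.
        echo "enter stage 3"
    elif [ "$status" -eq 111 ] || [ "$status" -gt 128 ]; then
        # Exit 111, or a crash (128+N is the shell convention for
        # "terminated by signal N", e.g. 137 for SIGKILL).
        echo "restart stage 2"
    else
        # Other exit codes are not covered in this message.
        echo "see runit(8)"
    fi
}

# Hypothetical helper: ask a running runsvdir to restart stage 2.
# Assumes pkill from procps; -x matches the process name exactly.
# Per runsvdir(8), on HUP runsvdir sends TERM to the runsv processes
# it monitors and exits 111, so expect the services to be restarted.
# Must be run as root, and is deliberately not called here.
restart_stage2() {
    pkill -HUP -x runsvdir
}
```

For example, `stage2_next 137` models a runsvdir that was killed with SIGKILL, which is what an out-of-memory kill would look like, and reports that stage 2 is restarted.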