supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
From: Alex Efros <powerman@sky.net.ua>
Subject: runsvdir killed
Date: Sat, 6 Nov 2004 20:42:16 +0200
Message-ID: <20041106184216.GB4568@home.power>

Hi!

Sometimes when I check `ps axf` I see no runsvdir process, and all the `runsv`
processes have no parent (or their parent is process 1: runit-init).
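
A minimal sketch of how this state could be detected automatically, instead
of by reading `ps axf` by eye. It assumes `ps -eo pid,ppid,comm` is available
and that the processes show up under the names `runsvdir` and `runsv`; the
function names `find_orphaned_runsv` and `check_live_system` are made up for
this example.

```python
import subprocess


def find_orphaned_runsv(ps_output):
    """Parse `ps -eo pid,ppid,comm` output.

    Returns (runsvdir_running, orphan_pids), where orphan_pids lists the
    runsv processes whose parent is process 1.
    """
    runsvdir_running = False
    orphans = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        # Skip the header row and anything that is not "pid ppid comm".
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        pid, ppid, comm = fields[0], fields[1], fields[2].strip()
        if comm == "runsvdir":
            runsvdir_running = True
        elif comm == "runsv" and ppid == "1":
            orphans.append(int(pid))
    return runsvdir_running, orphans


def check_live_system():
    """Run ps on the current machine and report what was found."""
    result = subprocess.run(
        ["ps", "-eo", "pid,ppid,comm"],
        capture_output=True, text=True, check=True,
    )
    running, orphans = find_orphaned_runsv(result.stdout)
    print("runsvdir running:", running)
    print("runsv processes reparented to process 1:", orphans)
    return running, orphans
```

Calling `check_live_system()` from cron would only report the condition; it
does not restart anything.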

I think I know what happens: the kernel has killed runsvdir because of an
'out of memory' error (a lot of complex perl scripts eat all the memory). Of
course, the kernel kills not only runsvdir; it also tries to kill those perl
scripts, mysql, etc. But this isn't a problem: the perl scripts will be
restarted by cron, mysql will be restarted by runsv, etc... but who will
restart runsv if runsvdir is killed and runsv is reparented (I'm not sure this
is the correct English term) to runit-init?

So, the question is: how do I restore a killed runsvdir without a reboot?
And the second question: I suppose killing runsvdir means exiting stage 2 and
entering stage 3 for reboot/halt... is this correct? And if it is correct, why
doesn't this happen in my case?


P.S. Yeah, I know, perl scripts eating all the memory and the kernel starting
to kill processes isn't correct behaviour for a server. But for now I have no
idea why this happens, so I can't fix it. On that server I get a kernel
oops/panic every 12-72 hours, and I haven't found any information about these
oopses on Google. I use a huge number of simultaneous downloads in those perl
scripts (non-blocking sockets) 24/7/365, and I suppose I've hit some unknown
race condition bug in the kernel, because the same mysterious oops/panic
happens on different servers with different kernels. 'Out of memory' errors,
for example, usually happen after dnscachex or mysql stops accepting new
connections for an unknown reason. So a perl script loads into memory (about
35-50 MB of memory used), tries to connect to the database and hangs because
mysql doesn't accept the connection and doesn't return any error... after 1
minute the next perl script is started by cron and hangs too... etc. Of course
I can add alarm() around the connect to mysql, or refuse to start a perl
script if 2/3 of memory is already used, but these are super-ugly workarounds
and don't solve anything.
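
For illustration only, the two workarounds mentioned above could look roughly
like this. This is a sketch in Python rather than the perl the scripts are
written in, and it rests on several assumptions: memory use is estimated from
the `MemTotal`, `MemFree`, `Buffers` and `Cached` fields of `/proc/meminfo`,
the timeout uses `SIGALRM` (Unix only, main thread only), and the names
`memory_used_fraction` and `call_with_alarm` are hypothetical.

```python
import signal


def memory_used_fraction(meminfo_text):
    """Estimate the fraction of RAM in use from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name: <number> [kB]" lines; ignore anything else.
        if sep and parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    total = values["MemTotal"]
    # Assumption: free memory plus buffers and page cache is reclaimable.
    reclaimable = (values.get("MemFree", 0)
                   + values.get("Buffers", 0)
                   + values.get("Cached", 0))
    return (total - reclaimable) / total


def call_with_alarm(seconds, func, *args):
    """Call func(*args), raising TimeoutError if it runs longer than seconds."""
    def on_alarm(signum, frame):
        raise TimeoutError("call did not finish within %d s" % seconds)

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(seconds)
    try:
        return func(*args)
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)
```

A script started by cron could read `/proc/meminfo`, exit early when
`memory_used_fraction(...)` exceeds 2/3, and wrap its database connect call in
`call_with_alarm(...)`. As noted above, this only limits the pile-up of hung
scripts; it does not explain why mysql stops accepting connections.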

-- 
			WBR, Alex.


Thread overview: 3+ messages
2004-11-06 18:42 Alex Efros [this message]
2004-11-07 13:54 ` Gerrit Pape
2004-11-07 19:40   ` Alex Efros
