supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
* runsvdir killed
@ 2004-11-06 18:42 Alex Efros
  2004-11-07 13:54 ` Gerrit Pape
From: Alex Efros @ 2004-11-06 18:42 UTC (permalink / raw)


Hi!

Sometimes when I check `ps axf` I see no runsvdir process, and all `runsv`
processes have no parent (or their parent is process 1: runit-init).
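
The symptom described above can be checked mechanically. What follows is only a
minimal sketch, assuming the process table is read with `ps -eo pid,ppid,comm`;
the helper names (`Proc`, `parse_ps`, `runsvdir_missing`, `orphaned_runsv`) are
hypothetical and not part of runit. It reports whether runsvdir is absent and
which `runsv` processes now have PID 1 as their parent.

```python
from typing import Iterable, List, NamedTuple


class Proc(NamedTuple):
    pid: int
    ppid: int
    comm: str


def parse_ps(output: str) -> List[Proc]:
    """Parse the text printed by `ps -eo pid,ppid,comm` (header line included)."""
    procs = []
    for line in output.strip().splitlines()[1:]:  # skip the header line
        pid, ppid, comm = line.split(None, 2)
        procs.append(Proc(int(pid), int(ppid), comm.strip()))
    return procs


def runsvdir_missing(procs: Iterable[Proc]) -> bool:
    """True if no process named runsvdir is in the table."""
    return not any(p.comm == "runsvdir" for p in procs)


def orphaned_runsv(procs: Iterable[Proc]) -> List[Proc]:
    """runsv processes whose parent is PID 1 instead of runsvdir."""
    return [p for p in procs if p.comm == "runsv" and p.ppid == 1]
```

On a live system the input could come from
`subprocess.run(["ps", "-eo", "pid,ppid,comm"], capture_output=True, text=True).stdout`.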

I think I know what happens: the kernel has killed runsvdir because of an 'out
of memory' error (a lot of complex perl scripts eat all the memory). Of course,
the kernel has killed not only runsvdir; it also tries to kill those perl
scripts, mysql, etc. But this isn't a problem: the perl scripts will be
restarted by cron, mysql will be restarted by runsv, etc. But who will restart
runsv if runsvdir is killed and runsv is reparented (I'm not sure this is the
correct English term) to runit-init?

So, the question is: how do I restore a killed runsvdir without a reboot?
And the second question: I suppose killing runsvdir means exiting stage 2 and
entering stage 3 for reboot/halt. Is this correct? And if it is correct, why
doesn't this happen in my case?


P.S. Yeah, I know, perl scripts eating all the memory and the kernel starting
to kill processes isn't correct behaviour for a server. But for now I have no
idea why this happens, so I can't fix it. On that server I get a kernel
oops/panic every 12-72 hours, and I haven't found any information about these
oopses on Google. I use a huge number of simultaneous downloads in those perl
scripts (non-blocking sockets) 24/7/365, and I suppose I've hit some unknown
race condition bug in the kernel, because the same mysterious oops/panic
happens on different servers with different kernels. The 'out of memory'
errors, for example, usually happen after dnscachex or mysql stops accepting
new connections for an unknown reason. So a perl script loads into memory
(using about 35-50 MB), tries to connect to the database and hangs because
mysql doesn't accept the connection and doesn't return any error. After 1
minute the next perl script is started by cron and hangs too, and so on. Of
course I can add alarm() around the connect to mysql, or refuse to start a perl
script if 2/3 of memory is already used, but these are super-ugly workarounds
and don't solve anything.
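
For illustration, the two workarounds mentioned above (an alarm around the
connect, and refusing to start when 2/3 of memory is in use) might look roughly
like this. It is a sketch in Python rather than perl, and it rests on several
assumptions: `time_limit`, `memory_used_fraction` and `should_start` are
hypothetical names, memory figures are taken from text in `/proc/meminfo`
format, and `Buffers` and `Cached` are counted as reclaimable, which is a
simplification.

```python
import signal
from contextlib import contextmanager


@contextmanager
def time_limit(seconds: int):
    """Raise TimeoutError if the enclosed block runs longer than `seconds`.

    Relies on SIGALRM, so it works only on Unix and only in the main thread.
    """
    def on_alarm(signum, frame):
        raise TimeoutError(f"operation exceeded {seconds}s")

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(seconds)
    try:
        yield
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


def memory_used_fraction(meminfo_text: str) -> float:
    """Fraction of memory in use, from text in /proc/meminfo format."""
    fields = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if rest.strip():
            fields[name.strip()] = int(rest.split()[0])  # value in kB
    # Assumption: buffers and page cache count as available memory.
    free = fields["MemFree"] + fields.get("Buffers", 0) + fields.get("Cached", 0)
    return 1.0 - free / fields["MemTotal"]


def should_start(meminfo_text: str, limit: float = 2 / 3) -> bool:
    """Refuse to start a new job once `limit` of memory is already used."""
    return memory_used_fraction(meminfo_text) < limit


# Hypothetical usage; connect_to_db() stands in for the real database connect:
#     with open("/proc/meminfo") as f:
#         if should_start(f.read()):
#             with time_limit(10):
#                 connect_to_db()
```

Neither guard addresses why connections stop being accepted; they only keep
hung jobs from piling up.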

-- 
			WBR, Alex.




Thread overview: 3+ messages
2004-11-06 18:42 runsvdir killed Alex Efros
2004-11-07 13:54 ` Gerrit Pape
2004-11-07 19:40   ` Alex Efros
