supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
From: Charlie Brady <charlieb-supervision@budge.apana.org.au>
Cc: supervision@list.skarnet.org
Subject: Re: duplicate processes
Date: Tue, 27 Sep 2005 14:25:17 -0400 (EDT)
Message-ID: <Pine.LNX.4.61.0509270956510.18597@e-smith.charlieb.ott.istop.com>
In-Reply-To: <43390474.5050309@ericsson.com>


On Tue, 27 Sep 2005, Jussi Ramo wrote:

>> What evidence do you have that the processes are started directly by 
>> init? Remember that a process will be inherited by init if its direct 
>> parent dies.
>
> No evidence. Just looked at the parent process.

I'm pretty sure that's misleading. A parent pid of 1 tells you who the 
parent is now, not who started the process.

> So you suggest that the "runsv ndb_mgmd" dies and the ndb_mgmd is 
> inherited by init. Then "runsv ndb_mgmd" is respawned by runsvdir (?) 
> and that starts another ndb_mgmd. This makes sense to me but now the 
> question is why runsv first dies once for certain processes.

No, I didn't mean to suggest that runsv dies. I expect that ndb_mgmd has 
forked, and the parent died. Perhaps it was designed to do that, to 
"daemonise" the child. You will have to find some way to prevent that from 
happening.
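
To illustrate what I mean, here is a rough sketch that imitates the 
effect with ordinary tools. The inner sh stands in for the original 
ndb_mgmd process and sleep stands in for the forked child; the helper 
names spawn_orphan and ppid_of are made up for this example, and it 
assumes Linux with /proc mounted.

```shell
#!/bin/sh
# Sketch only: imitate a program that "daemonises" itself.
# The inner sh backgrounds a child and exits straight away, so the
# child loses its direct parent and is inherited by init (or by
# another reaper process on some systems).

spawn_orphan() {
    # Prints the pid of the backgrounded child.
    sh -c 'sleep 30 >/dev/null 2>&1 & echo $!'
}

ppid_of() {
    # Prints the current parent pid of process $1 (Linux /proc).
    sed -n 's/^PPid:[[:space:]]*//p' "/proc/$1/status"
}

orphan=$(spawn_orphan)
echo "this shell: $$  orphan: $orphan  orphan's parent: $(ppid_of "$orphan")"
kill "$orphan"
```

The orphan's parent is typically reported as 1 even though init never 
started it. A supervisor such as runsv only watches its direct child, 
so a service that does this has to be kept in the foreground somehow.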

>>> So duplicate process will be generated and system becomes unstable. 
>>> Both of those processes (the one started by init and the one started by 
>>> runsv) react on sv command.
>> React in what ways?
>
> Do not know if this brings any extra information but if I have first the 
> following ndb_mgmd processes (one of them badly as child of init) :
>
> root      1950  1945  0 07:04 ?        00:00:00 runsv ndb_mgmd
> ais       1963     1  0 07:04 ?        00:00:00 /opt/SGC/bin/ndb_mgmd -f 
> /opt/SGC/etc/ndbconfig.ini
> ais       2276  1950  2 07:05 ?        00:00:00 /opt/SGC/bin/ndb_mgmd -f 
> /opt/SGC/etc/ndbconfig.ini
>
> Then I do like:
>
> blade_0_7:~ # /opt/SGC/bin/sv down /var/services/ndb_mgmd/
>
> and the other "right" ndb_mgmd disappears:
>
> root      1950  1945  0 07:04 ?        00:00:00 runsv ndb_mgmd
> ais       1963     1  0 07:04 ?        00:00:00 /opt/SGC/bin/ndb_mgmd -f 
> /opt/SGC/etc/ndbconfig.ini

OK.

> I then kill the "wrong" ndb_mgmd
>
> blade_0_7:~ # kill -9 1963
> root      1950  1945  0 07:04 ?        00:00:00 runsv ndb_mgmd

OK, but you shouldn't be using -9 unless there is no alternative.

> And when ndb_mgmd is put "up" there are again those two processes:
>
> blade_0_7:~ # /opt/SGC/bin/sv up /var/services/ndb_mgmd/
>
> root      1950  1945  0 07:04 ?        00:00:00 runsv ndb_mgmd
> ais       2805     1  0 07:08 ?        00:00:00 /opt/SGC/bin/ndb_mgmd -f 
> /opt/SGC/etc/ndbconfig.ini
> ais       2837  1950  4 07:08 ?        00:00:00 /opt/SGC/bin/ndb_mgmd -f 
> /opt/SGC/etc/ndbconfig.ini

OK. Assuming that you have pids allocated sequentially, look at how many 
intervening processes there are between the two invocations.
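
As a rough sketch of that arithmetic, using the pids from your listings 
(pid_gap is a made-up helper, and the count only means something if pids 
were handed out sequentially with no wraparound in between):

```shell
#!/bin/sh
# Sketch only: pid_gap is a hypothetical helper, not part of runit.
# Counts the pids strictly between two pids, assuming sequential
# allocation with no wraparound in between.
pid_gap() {
    echo $(( $2 - $1 - 1 ))
}

pid_gap 1963 2276   # first listing
pid_gap 2805 2837   # listing after "sv up"
```

That gives 312 and 31 pids in between, respectively.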

>>> The other ndb_mgmd is restarted frequently by runsv because of the 
>>> same process is started directly by init for some reason.
>> Again, what evidence do you have that there is any process started 
>> directly by init? Even if so, why would runsv restart the process it is 
>> managing? I expect that the process runsv is monitoring is exiting, and that 
>> is why runsv is starting a new process.
>
> Right. The process runsv is monitoring is exiting because the port it tries 
> to use is reserved by the extra process (now hopefully correct phrasing:) 
> whose parent is init.

Not quite. The process runsv is monitoring exits when told to by runsv. 
You showed us that. The other ndb_mgmd process, however, doesn't exit, so 
any new process runsv starts will not be able to keep running.

Does ndb_mgmd create a pid file? If so, what is its content?
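
If it does, something like the following sketch would show whether the 
pid it records belongs to a live process, which you can then compare 
with your ps listing. The function name pidfile_status is made up, the 
path in the usage line is a placeholder (I don't know where ndb_mgmd 
would write such a file), and the check assumes Linux with /proc mounted.

```shell
#!/bin/sh
# Sketch only. The pid file path passed in is a placeholder, not a
# location ndb_mgmd is known to use.

pidfile_status() {
    # $1: path to a pid file. Prints the recorded pid and whether a
    # process with that pid currently exists.
    [ -r "$1" ] || { echo "no pid file at $1"; return 1; }
    pid=$(head -n 1 "$1" | tr -d '[:space:]')
    case $pid in
        ''|*[!0-9]*) echo "no usable pid in $1"; return 1 ;;
    esac
    if [ -d "/proc/$pid" ]; then
        echo "$pid running"
    else
        echo "$pid not running"
    fi
}

# Usage: pidfile_status /path/to/ndb_mgmd.pid
```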

