From: Jussi Ramo
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: duplicate processes
Date: Tue, 27 Sep 2005 11:36:04 +0300
Organization: Oy L M Ericsson Ab
Message-ID: <43390474.5050309@ericsson.com>
Cc: supervision@list.skarnet.org
Original-To: charlieb-supervision@budge.apana.org.au

>What evidence do you have that the processes are started directly by init?
>Remember that a process will be inherited by init if its direct
>parent dies.

No evidence. Just looked at the parent process. So you suggest that the "runsv ndb_mgmd" dies and the ndb_mgmd is inherited by init.
Then "runsv ndb_mgmd" is respawned by runsvdir (?) and that starts another ndb_mgmd. This makes sense to me, but now the question is why runsv dies once at first, and only for certain processes.

>> So duplicate process will be generated and system becomes unstable.
>> Both of those processes (the one started by init and the one started by runsv) react on sv command.

>React in what ways?

I do not know if this adds any information, but suppose I first have the following ndb_mgmd processes (one of them, wrongly, as a child of init):

root      1950  1945  0 07:04 ?        00:00:00 runsv ndb_mgmd
ais       1963     1  0 07:04 ?        00:00:00 /opt/SGC/bin/ndb_mgmd -f /opt/SGC/etc/ndbconfig.ini
ais       2276  1950  2 07:05 ?        00:00:00 /opt/SGC/bin/ndb_mgmd -f /opt/SGC/etc/ndbconfig.ini

Then I run:

blade_0_7:~ # /opt/SGC/bin/sv down /var/services/ndb_mgmd/

and the other, "right" ndb_mgmd disappears:

root      1950  1945  0 07:04 ?        00:00:00 runsv ndb_mgmd
ais       1963     1  0 07:04 ?        00:00:00 /opt/SGC/bin/ndb_mgmd -f /opt/SGC/etc/ndbconfig.ini

I then kill the "wrong" ndb_mgmd:

blade_0_7:~ # kill -9 1963

root      1950  1945  0 07:04 ?        00:00:00 runsv ndb_mgmd

And when ndb_mgmd is put "up", there are again those two processes:

blade_0_7:~ # /opt/SGC/bin/sv up /var/services/ndb_mgmd/

root      1950  1945  0 07:04 ?        00:00:00 runsv ndb_mgmd
ais       2805     1  0 07:08 ?        00:00:00 /opt/SGC/bin/ndb_mgmd -f /opt/SGC/etc/ndbconfig.ini
ais       2837  1950  4 07:08 ?        00:00:00 /opt/SGC/bin/ndb_mgmd -f /opt/SGC/etc/ndbconfig.ini

>> For example below: aisexec works properly but ndb_mgmd causes problems.

>Google tells me that ndb_mgmd is part of mysql. mysql is known to be badly designed wrt co-operating with runit/daemontools.

This is a good piece of information. Another third-party product that causes me similar trouble when used with runit is the EMANATE snmp agent.

>> There are no major differences in run scripts but file paths or so. Both of the run scripts are quite simple.

>What are they?

#!/bin/sh
# Source environment settings
. /opt/SGC/etc/sgcenv
exec >> $LOG_FILE 2>&1
aisexec_BIN=/opt/SGC/bin/aisexec
exec $aisexec_BIN

#!/bin/sh
# Source environment settings
. /opt/SGC/etc/sgcenv
exec >> $LOG_FILE 2>&1
ndb_mgmd_BIN=/opt/SGC/bin/ndb_mgmd
exec $ndb_mgmd_BIN -f /opt/SGC/etc/ndbconfig.ini

>> The other ndb_mgmd is restarted frequently by runsv because of the same process is started directly by init for some reason.

>Again, what evidence do you have that there is any process started directly by init? Even if so, why would runsv restart the process it is managing? I expect that the process runsv is monitoring is exiting, and that is why runsv is starting a new process.

Right. The process runsv is monitoring is exiting because the port it tries to use is already taken by the extra process (now hopefully the correct phrasing :) whose parent is init.

Thanks and regards,
Jussi

P.S. This time I tried to create a new message for the thread; the first time I just replied to an existing thread and changed the subject. Sorry for that!
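
For reference, the check described above (looking at the parent PID in the ps listing to spot the copy that belongs to init) can be sketched as a small shell function. This is only an illustrative sketch: the name find_orphans is hypothetical and not part of runit, and it assumes `ps -ef` style input where PID is the 2nd field, PPID the 3rd, and the command starts at the 8th.

```shell
#!/bin/sh
# Hypothetical helper (not part of runit or of the original scripts).
# Reads `ps -ef` style lines on stdin and prints the PID of every process
# whose parent is init (PPID 1) and whose command path contains the name
# given as the first argument.
# Assumed column layout: UID PID PPID C STIME TTY TIME CMD...
find_orphans() {
    awk -v name="$1" '$3 == 1 && index($8, name) { print $2 }'
}
```

It could be used as `ps -ef | find_orphans ndb_mgmd`. With the first listing above it would report only the ndb_mgmd copy whose parent is init, not the one running under runsv. It is a diagnostic only and does not change how the daemon is started.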