From: orc
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Re: Vision for new platform
Date: Sun, 10 Jun 2012 23:51:25 +0800
Message-ID: <20120610235125.31f38cd7@sibserver.ru>
References: <20120518010620.GW163@brightrain.aerifal.cx> <20120609192756.6e72f25e@sibserver.ru> <20120609074426.496a5e13@newbook> <20120609212411.GA163@brightrain.aerifal.cx> <87lijwnmao.fsf@gmail.com> <20120610132246.GF163@brightrain.aerifal.cx> <20120610225226.137363d0@sibserver.ru> <20120610151311.GH163@brightrain.aerifal.cx>
In-Reply-To: <20120610151311.GH163@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Sun, 10 Jun 2012 11:13:11 -0400
Rich Felker wrote:

> On Sun, Jun 10, 2012 at 10:52:26PM +0800, orc wrote:
> > If we need no starting and stopping, than this can be already
> > implemented in init scripts. Only a simple program-wrapper that
> > forcibly daemonizes that daemons with "do not fork" option needed.
> > Optionally it can report a pid after fork() before execvp().
>
> I don't think you're getting the issue at hand. Suppose you want to be
> able to automatically bring down a particular daemon -- perhaps to
> restart it with completely new configuration or to switch to a new
> version of it. This could happen as part of an automated upgrade
> process or under manual admin control.

"Automated" often becomes a source of problems if the automated
subsystem is not engineered properly. If we want a daemon that is
responsible for the status of other daemons, and that starts and stops
them automatically based on the admin's decisions, then it must be
well engineered and tested in many kinds of situations first.

>
> Traditional init scripts DO NOT solve this problem. They are extremely
> buggy, ranging from doing things as stupid as killing any instance
> of the daemon

Are you talking about traditional SysV init scripts? Yes, they're
buggy, I fully agree.

> (even one run by a user as opposed to by root with a
> separate config file and running on a separate port)

Killing processes based on uid/gid and command line can already be
done with pkill.

> to killing
> unrelated processes (by scanning /proc or reading a pid file, then
> subsequently killing the pid which might not belong to a different
> process).

Again, pkill is much better than the "traditional"
"kill $(cat /var/run/daemon.pid)" that most init scripts use today
(am I right?).

> I agree that the problem of daemons crashing or otherwise exiting
> unexpectedly is one that should be fixed in the daemons. Unfortunately
> that's much harder than it sounds.
> A large portion of the daemons in
> modern use are using "xmalloc" type wrappers that abort
> unconditionally on malloc failure, either directly or by virtue of
> using atrociously-bad libraries like glib that abort without the
> caller's consent.

I fully agree that in reality we have no ideal daemons in this
respect; many of them are unreliable.

>
> If daemons really didn't exit unexpectedly, the only race condition in
> pid-based approaches to lifetime management would be races between
> multiple scripted administrative actions (e.g. 2 admins trying to down
> the daemon at the same time) which could be fixed by locking at the
> script level.

Hm, to me that situation sounds a bit strange: either the script will
exit with "daemon already stopped", or it will send an additional
signal to the daemon, which will not harm it (I leave signal handlers
aside here; most daemons do not crash after a second signal unless it
is the KILL signal).

I partially agree with the approach that such a daemon for monitoring
the status of other daemons should be developed, but I think it should
control only processes that are critical for the admin, such as:

- the syslog daemon (this happened to me when rsyslog crashed for no
  reason)
- possibly various daemons for remote network access, such as sshd (?)
- other daemons, if their task is not to read or write something
  important on disk. For example, database daemons should NOT be
  restarted automatically.

P.S. If you are talking about the traditional init scripts that appear
in most distros today, then I fully agree with you on every point you
made here. I was paranoid and rewrote them from scratch some time ago.
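
For illustration only, the program-wrapper idea quoted at the top of this message (daemonize a program that runs with a "do not fork" option, and report its pid after fork() but before execvp()) could be sketched roughly as follows. This is a minimal sketch in Python, not an existing tool: the function name `spawn_detached` and the `pidfile` parameter are hypothetical, and it assumes a POSIX system where the target program stays in the foreground.

```python
import os
import sys


def spawn_detached(argv, pidfile=None):
    """Fork, detach the child into its own session, then exec argv.

    Returns the child's pid in the parent. The pid is known as soon as
    fork() returns, i.e. before the child reaches execvp(), so it can be
    reported or recorded without scanning /proc.
    (spawn_detached and pidfile are hypothetical names for this sketch.)
    """
    pid = os.fork()
    if pid > 0:
        # Parent: optionally record the pid, then hand it back.
        if pidfile is not None:
            with open(pidfile, "w") as f:
                f.write("%d\n" % pid)
        return pid

    # Child: must never return into the caller's code from here on.
    try:
        os.setsid()      # new session, no controlling terminal
        os.chdir("/")    # do not keep the start directory busy
        devnull = os.open(os.devnull, os.O_RDWR)
        for fd in (0, 1, 2):
            os.dup2(devnull, fd)   # detach stdin/stdout/stderr
        if devnull > 2:
            os.close(devnull)
        os.execvp(argv[0], argv)   # only returns by raising on failure
    except BaseException:
        pass
    os._exit(127)        # exec failed; exit without running cleanup
```

A wrapper script built on this would call `spawn_detached(sys.argv[1:])`, print the returned pid, and exit, which leaves the daemon reparented to init. Since the pid comes straight from fork(), it is the correct one at the moment it is reported; it does not by itself solve the later pid-reuse race discussed above if the daemon exits unexpectedly.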