From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dražen Kačar
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: graceful restart under runit
Date: Wed, 22 Nov 2006 20:25:06 +0100
Message-ID: <20061122192506.GA24958@fly.srk.fer.hr>
References: <20061117001519.GA652@home.power> <20061117133435.GB2153@home.power>
 <20061118002245.GB17975@home.power> <20061118123120.GA8388@home.power>
 <20061120182733.GA629@fly.srk.fer.hr>
Original-To: supervision@list.skarnet.org
Mailing-List: contact supervision-help@list.skarnet.org; run by ezmlm

Paul Jarc wrote:
> Dražen Kačar wrote:
> Well, it depends how much downtime you can tolerate. [...]
> > Then you'd need to implement something to take care of metaserver crashes.
> > Probably a way for servers to pass listening sockets back to the
> > new metaserver.
>
> I think that's beyond the point of diminishing returns. The problem
> can never be completely solved, since the metaserver and other servers
> could crash at the same time, or you could lose power, etc. You have
> to give up at some point.

Well, I'm thinking about systems with 99.999% availability (for fun you can
calculate how many seconds per year that is :-). Clustered systems can do
that. There's a heartbeat, and if one component fails then processing waits
until the living ones reach the recovery point. After that, processing
continues and the client sees only a brief pause (or none at all).

But that's for distributed systems, where you just don't have a single point
which you can rely on to work properly. On one machine there's the kernel.
If it crashes, all processes will also crash and burn, so that would be the
point at which I'd give up. :-)

> >> While a server is handling connections, it would have to use
> >> select()/poll() to notice activity on either the listening socket or
> >> the filesystem connection;
> >
> > And that isn't very nice.
>
> Well, I'd probably do that anyway, if I wanted to handle signals,
> since I'd use the self-pipe technique to notice when signals arrived.

I try to use a sig_atomic_t flag in signal handlers whenever I can. Things
are a bit simpler that way, at least to me. For threaded code there'd be a
signal handling thread, so that's allegedly a non-issue. Just a small matter
of inter-thread synchronization (yuck).

> > I meant to implement (when the time comes) something simpler.
> > Either a FIFO or a Unix domain socket[1] is used as a communications
> > channel for passing the listening socket, but without additional daemons.
>
> That was my first thought too, but I couldn't come up with any
> satisfying way to handle the race conditions gracefully.
>
> Open file descriptors can only be passed over sockets, not pipes.

Right. Passing them through a FIFO is a SYSV feature. I forgot it was not
portable.

> Also, using a socket means you have two-way communication, so you
> don't need signals or PID files, which are subject to race conditions.

Files may be, but signals? You can end up losing some signals if they are
sent in rapid succession, but for this purpose you just need one to trigger
an action. Shortly after receiving it the server is supposed to exit, so the
possibility of losing other signals (of the same kind) doesn't matter.

> But without signals, you'll still have to use select()/poll() even
> with all the functionality contained in one program, or else when you
> start a new server to replace the old one, the old one will wait
> indefinitely for one more client connection before waking up and
> noticing that it should hand the listening sockets over to the new
> server.

I'd use a signal to get out of accept(). :-)

> One problem with filesystem sockets is that you have to unlink the
> socket before listening on it, so the operation "listen on this
> socket, which may or may not already exist" isn't atomic. If two
> processes start at the same time, one of them can delete the other's
> socket without knowing that anything was listening on it. So it may
> be useful to atomically acquire some other dummy resource as a
> mutual-exclusion checkpoint before listening on the filesystem socket.

A server binds to the file system socket only after it has got the network
socket, either by a direct bind() or by a passover from an existing server.
That should be enough, I think. It's not atomic, but it's a locking protocol.
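
The passover of a listening socket from an old server to a new one could be
sketched roughly as follows. This is only an illustration, not code from this
thread: it is written in Python (3.9 or later, Unix only), whose
socket.send_fds() and socket.recv_fds() wrap sendmsg()/recvmsg() with
SCM_RIGHTS. The names hand_over, take_over and demo are hypothetical, and a
socketpair() stands in for the file system socket so that the sketch runs in
a single process.

```python
import socket


def hand_over(channel, listener):
    """Old server: send the listening descriptor over a Unix domain socket."""
    # At least one byte of ordinary data has to travel with the descriptor.
    socket.send_fds(channel, [b"L"], [listener.fileno()])


def take_over(channel):
    """New server: receive the descriptor and wrap it in a socket object."""
    data, fds, _flags, _addr = socket.recv_fds(channel, 1, 1)
    if data != b"L" or len(fds) != 1:
        raise RuntimeError("no listening socket was passed")
    # Family and type are detected from the descriptor itself.
    return socket.socket(fileno=fds[0])


def demo():
    # The "old server" owns the network socket (port 0: any free port).
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(5)
    address = listener.getsockname()

    # Assumption: a socketpair replaces the file system socket here.
    old_end, new_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    hand_over(old_end, listener)
    inherited = take_over(new_end)
    inherited_address = inherited.getsockname()

    # The old server goes away; the kernel socket stays open in the new one.
    listener.close()

    # A client can still connect, and the new server accepts it.
    client = socket.create_connection(address)
    conn, _peer = inherited.accept()
    client.sendall(b"ping")
    reply = conn.recv(4)

    for s in (client, conn, inherited, old_end, new_end):
        s.close()
    return address, inherited_address, reply


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is that the port is never closed: the descriptor
received by the new server refers to the same kernel socket, so connections
arriving during the handover wait in the backlog instead of being refused.
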

My description had:

  "If there's no writer [on a file system channel], it binds to the network
  socket, writes the PID file [...]"

If the bind to the network socket fails because something else is listening,
then it can try the file system channel again and bail out with an error if
there's still no writer. After all, that's not supposed to happen.

> Another benefit of making the metaserver a separate program: you can
> also write a library for LD_PRELOAD that masks the listen() function
> to make existing programs use the metaserver instead of opening their
> listening sockets directly.

That's a good one. But shouldn't that mask the bind() function?

As for the lease problem, couldn't the metaserver just SIGTERM the existing
server? It needs to know the PID, but that can be passed to it when the
server connects to the file system socket, before it gets the network
socket from the metaserver.

-- 
 .-.   .-.    Yes, I am an agent of Satan, but my duties are largely
(_  \ /  _)   ceremonial.
     |
     |        dave@fly.srk.fer.hr