From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.comp.sysutils.supervision.general/1339
Path: news.gmane.org!not-for-mail
From: prj@po.cwru.edu (Paul Jarc)
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: graceful restart under runit
Date: Sat, 18 Nov 2006 14:30:19 -0500
Organization: What did you have in mind?  A short, blunt, human pyramid?
Message-ID: <m3ac2oeen1.fsf@multivac.cwru.edu>
References: <20061115114754.GA3759@fly.srk.fer.hr>
	<20061115160850.GA26987@home.power>
	<20061116152446.GA4721@fly.srk.fer.hr>
	<20061117001519.GA652@home.power> <m3slgiganl.fsf@multivac.cwru.edu>
	<20061117133435.GB2153@home.power>
	<Pine.LNX.4.64.0611170944270.17275@e-smith.charlieb.ott.istop.com>
	<20061118002245.GB17975@home.power>
	<Pine.LNX.4.64.0611172027140.24960@e-smith.charlieb.ott.istop.com>
	<20061118123120.GA8388@home.power>
NNTP-Posting-Host: main.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: sea.gmane.org 1163878228 7250 80.91.229.2 (18 Nov 2006 19:30:28 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Sat, 18 Nov 2006 19:30:28 +0000 (UTC)
Original-X-From: supervision-return-1575-gcsg-supervision=m.gmane.org@list.skarnet.org Sat Nov 18 20:30:25 2006
Return-path: <supervision-return-1575-gcsg-supervision=m.gmane.org@list.skarnet.org>
Envelope-to: gcsg-supervision@gmane.org
Original-Received: from antah.skarnet.org ([212.85.147.14])
	by ciao.gmane.org with smtp (Exim 4.43)
	id 1GlVtR-0008FP-1q
	for gcsg-supervision@gmane.org; Sat, 18 Nov 2006 20:30:25 +0100
Original-Received: (qmail 19759 invoked by uid 76); 18 Nov 2006 19:30:45 -0000
Mailing-List: contact supervision-help@list.skarnet.org; run by ezmlm
List-Post: <mailto:supervision@list.skarnet.org>
List-Help: <mailto:supervision-help@list.skarnet.org>
List-Unsubscribe: <mailto:supervision-unsubscribe@list.skarnet.org>
List-Subscribe: <mailto:supervision-subscribe@list.skarnet.org>
List-Archive: <http://www.skarnet.org/lists/>
Original-Received: (qmail 19754 invoked from network); 18 Nov 2006 19:30:45 -0000
Original-To: supervision@list.skarnet.org
In-Reply-To: <20061118123120.GA8388@home.power> (Alex Efros's message of "Sat,
	18 Nov 2006 14:31:20 +0200")
Mail-Copies-To: nobody
Mail-Followup-To: supervision@list.skarnet.org
Original-Lines: 52
User-Agent: Gnus/5.110003 (No Gnus v0.3) Emacs/21.4 (gnu/linux)
Xref: news.gmane.org gmane.comp.sysutils.supervision.general:1339
Archived-At: <http://permalink.gmane.org/gmane.comp.sysutils.supervision.general/1339>

Alex Efros <powerman@powerman.asdfGroup.com> wrote:
> Yep, looks like you right. Looks like I've confused this case and case
> when process with listening socket doing fork and so result in two
> listening sockets for same ip/port in two different processes.

Now that you mention this, I think there is a way to hand off to a new
server process with no unavailability at all.  But if you can tolerate
a small outage during the switchover, you may be better off with the
simpler method of sending a signal to make the old process close its
listening socket, then forgetting the old process using Gerrit's patch
and starting a new one, which will just open a listening socket as
usual.

So, for zero-unavailability: there is a metaserver which listens on a
filesystem socket and takes care of opening other listening sockets
for other servers.  Instead of opening a listening socket directly,
another server would connect to this filesystem socket and ask the
metaserver to open it.  If there is not yet any socket open for the
requested address, the metaserver opens one, and passes the descriptor
to the requestor over the filesystem socket connection with sendmsg().
(The metaserver also keeps the listening socket open for itself, but
never calls accept() on it.)  Both sides keep the filesystem socket
connection open for as long as the requestor is accepting connections.
If the metaserver receives a request for a socket that is already
open, it notifies the previous requestor, which still holds the
listening socket, over the filesystem connection.  The old process can
then close its listening socket (new connections will only be delayed
at this point, not refused, since the metaserver still has the
listening socket open as well), close the filesystem connection,
finish servicing its current connections, and exit.  Once the
metaserver sees that the old requestor's filesystem connection has
been closed, it sends the listening socket to the new requestor.
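
The descriptor passing itself can be sketched roughly as below.  This
is only an illustration in Python 3, not code from any existing
metaserver: the names send_listener, recv_listener and demo are made
up, and socketpair() stands in for a connection to the filesystem
socket.  The mechanism is sendmsg() carrying an SCM_RIGHTS ancillary
message.

```python
import array
import socket


def send_listener(conn, listener):
    """Pass the listening socket's descriptor over a Unix-domain connection."""
    fds = array.array("i", [listener.fileno()])
    # At least one byte of ordinary data must accompany the ancillary message.
    conn.sendmsg([b"L"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_listener(conn):
    """Receive one descriptor and wrap it in a socket object."""
    fds = array.array("i")
    msg, ancdata, flags, addr = conn.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Ignore any truncated trailing bytes in the ancillary data.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
            return socket.socket(fileno=fds[0])
    raise RuntimeError("no descriptor received")


def demo():
    # Hypothetical stand-in for the metaserver <-> requestor connection.
    meta_end, server_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    # "Metaserver" side: open the listening socket on an arbitrary free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    port = listener.getsockname()[1]
    send_listener(meta_end, listener)

    # "Requestor" side: receive the descriptor and accept on it.  The
    # metaserver's copy stays open, but accept() is never called on it.
    received = recv_listener(server_end)
    client = socket.create_connection(("127.0.0.1", port))
    conn, _ = received.accept()
    conn.sendall(b"hi")
    data = client.recv(2)
    same_port = received.getsockname()[1] == port

    for s in (conn, client, received, listener, meta_end, server_end):
        s.close()
    return data, same_port
```

Both copies of the descriptor refer to the same listening socket, so
closing one of them does not close the socket for the other holder.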

While a server is handling connections, it would have to use
select()/poll() to notice activity on either the listening socket or
the filesystem connection; it couldn't just block on accept() in a
loop.  (Well, it could, but that would mean that when a switchover
starts, it wouldn't be completed until the next client connected.)
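
A rough sketch of such a loop, again in Python 3 and with made-up
names (serve_until_handoff, handle).  It assumes that any data or EOF
arriving on the filesystem connection means "hand off now"; the actual
notification format is not specified above.

```python
import select
import socket


def serve_until_handoff(listener, control, handle):
    """Accept clients until the metaserver signals a switchover on `control`.

    `handle` is a hypothetical callback that services one connection.
    Returns the number of connections accepted before the handoff.
    """
    served = 0
    while True:
        # Wait for activity on either descriptor instead of blocking in accept().
        readable, _, _ = select.select([listener, control], [], [])
        if control in readable:
            # Assumption: any byte (or EOF) on the control connection is
            # the metaserver's switchover notification.
            control.recv(1)
            listener.close()   # the metaserver still holds its own copy
            control.close()    # tells the metaserver we have let go
            return served
        if listener in readable:
            conn, _ = listener.accept()
            handle(conn)
            conn.close()
            served += 1
```

A real server would keep servicing its already-accepted connections
after this function returns, and only then exit.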

If the requestor exits, and no other requestors are around to pass the
listening socket to, the metaserver could close it immediately, or
could keep it open for a few seconds to see if a new requestor shows
up.  So quick, non-overlapping restarts would be transparent to the
end clients.
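
The metaserver's bookkeeping for all of this might look something like
the sketch below (Python 3).  It is only the decision logic, with no
socket I/O; the class name Registry, the action names, and the
5-second default are invented for illustration.  It tracks a single
waiting requestor per address and does not handle a waiting requestor
disconnecting before the handoff.

```python
class Registry:
    """Track which requestor currently holds each listening address."""

    def __init__(self, linger=5.0):
        self.linger = linger    # seconds to keep an unclaimed listener open
        self.holder = {}        # addr -> requestor currently accepting
        self.waiting = {}       # addr -> requestor waiting for a handoff
        self.idle_since = {}    # addr -> time the last holder went away

    def request(self, addr, who):
        """A requestor asked for `addr`; return the action to perform."""
        if addr in self.holder:
            # Already in use: ask the current holder to let go.
            self.waiting[addr] = who
            return ("notify", self.holder[addr])
        # Either a brand-new socket, or one kept open during the grace period.
        must_open = addr not in self.idle_since
        self.idle_since.pop(addr, None)
        self.holder[addr] = who
        return ("open_and_send" if must_open else "send", who)

    def disconnected(self, addr, who, now):
        """The holder closed its filesystem connection at time `now`."""
        if self.holder.get(addr) != who:
            return ("none", None)
        del self.holder[addr]
        successor = self.waiting.pop(addr, None)
        if successor is not None:
            self.holder[addr] = successor
            return ("send", successor)
        # Nobody is waiting: start the grace period.
        self.idle_since[addr] = now
        return ("none", None)

    def expire(self, now):
        """Return addresses whose idle listeners should now be closed."""
        dead = [a for a, t in self.idle_since.items() if now - t >= self.linger]
        for a in dead:
            del self.idle_since[a]
        return dead
```

Setting linger to 0 gives the "close it immediately" behaviour.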

To trigger the switchover, you wouldn't need any signals - just make
runsv forget about the old process using Gerrit's patch.  When the new
process starts up and connects to the filesystem socket, that will
trigger everything else.


paul