Re: graceful restart under runit


From: "Dražen Kačar" <dave@fly.srk.fer.hr>
Subject: Re: graceful restart under runit
Date: Wed, 22 Nov 2006 20:25:06 +0100
Message-ID: <20061122192506.GA24958@fly.srk.fer.hr>
In-Reply-To: <m3zmalc3rk.fsf@multivac.cwru.edu>

Paul Jarc wrote:
> Dražen Kačar <dave@fly.srk.fer.hr> wrote:

> Well, it depends how much downtime you can tolerate.
[...]

> > Then you'd need to implement something to take care of metaserver crashes.
> > Probably a way for servers to pass listening sockets back to the
> > new metaserver.
> 
> I think that's beyond the point of diminishing returns.  The problem
> can never be completely solved, since the metaserver and other servers
> could crash at the same time, or you could lose power, etc.  You have
> to give up at some point.

Well, I'm thinking about systems with 99.999% availability (for fun you
can calculate how many seconds per year that is :-). Clustered systems can
do that. There's a heartbeat, and if one component fails, processing
waits until the surviving ones reach the recovery point. After that
processing continues and the client sees only a brief pause, if any.
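
The five-nines arithmetic can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, and a 365-day year is assumed.

```python
from fractions import Fraction

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a non-leap year


def allowed_downtime_seconds(availability_percent: str) -> Fraction:
    """Return the yearly downtime budget for an availability such as "99.999"."""
    unavailable = (100 - Fraction(availability_percent)) / 100
    return unavailable * SECONDS_PER_YEAR


budget = allowed_downtime_seconds("99.999")
print(float(budget), "seconds per year")
print(float(budget) / 60, "minutes per year")
```

Under that assumption the budget is 315.36 seconds, a little over five minutes of downtime per year.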

But that's for distributed systems where you just don't have a single
point on which you can rely to work properly. On one machine there's the
kernel. If it crashes, all processes will also crash and burn, so that
would be the point at which I'd give up. :-)

> >> While a server is handling connections, it would have to use
> >> select()/poll() to notice activity on either the listening socket or
> >> the filesystem connection;
> >
> > And that isn't very nice.
> 
> Well, I'd probably do that anyway, if I wanted to handle signals,
> since I'd use the self-pipe technique to notice when signals arrived.

I try to use a sig_atomic_t flag in signal handlers whenever I can.
Things are a bit simpler that way, at least to me.
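
A rough Python analogue of the flag approach, for illustration only. Python has no sig_atomic_t, so a module-level boolean stands in for it; fake_request() and the choice of SIGTERM are assumptions made for the demo, not something from this thread.

```python
import os
import signal

shutdown_requested = False  # stand-in for a C "volatile sig_atomic_t" flag


def on_sigterm(signum, frame):
    """Do nothing but record that the signal arrived."""
    global shutdown_requested
    shutdown_requested = True


def serve_until_signalled(handle_one_request):
    """Handle requests until the flag is seen; return how many were handled."""
    handled = 0
    while not shutdown_requested:
        handle_one_request()
        handled += 1
    return handled


calls = 0


def fake_request():
    """Hypothetical request handler that signals its own process on call 3."""
    global calls
    calls += 1
    if calls == 3:
        os.kill(os.getpid(), signal.SIGTERM)


previous = signal.signal(signal.SIGTERM, on_sigterm)
handled = serve_until_signalled(fake_request)
signal.signal(signal.SIGTERM, previous)  # put the old disposition back
print("handled", handled, "requests before shutting down")
```

The handler does no real work. Everything else happens in the main loop, which only has to look at the flag between requests.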

For threaded code there'd be a signal handling thread, so that's allegedly
a non-issue. Just a small matter of inter-thread synchronization (yuck).

> > I meant to implement (when the time comes) something simpler. Either a
> > FIFO or a Unix domain socket[1] is used as a communications channel for
> > passing the listening socket, but without additional daemons.
> 
> That was my first thought too, but I couldn't come up with any
> satisfying way to handle the race conditions gracefully.
> 
> Open file descriptors can only be passed over sockets, not pipes.

Right. Passing them through a FIFO is a SYSV feature. I forgot it was not
portable.
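
For reference, passing a listening socket over a Unix domain socket looks roughly like this in Python, whose socket module wraps the same sendmsg()/recvmsg() calls with SCM_RIGHTS ancillary data. It is a sketch under assumptions: send_fd() and recv_fd() are hypothetical helper names, a socketpair stands in for the file system socket, and both "servers" live in one process to keep it short.

```python
import array
import socket


def send_fd(channel, fd):
    """Send one open descriptor over a Unix domain socket using SCM_RIGHTS."""
    rights = array.array("i", [fd])
    channel.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, rights)])


def recv_fd(channel):
    """Receive one descriptor sent by send_fd()."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = channel.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    if not fds:
        raise RuntimeError("no descriptor received")
    return fds[0]


# The "old server" owns a listening TCP socket on an arbitrary free port.
old_listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
old_listener.bind(("127.0.0.1", 0))
old_listener.listen(5)
address = old_listener.getsockname()

# A socketpair stands in for the file system socket between the two servers.
old_end, new_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
send_fd(old_end, old_listener.fileno())
new_listener = socket.socket(fileno=recv_fd(new_end))

# The old server can now drop its copy; the port stays open throughout.
old_listener.close()
client = socket.create_connection(address)
connection, _peer = new_listener.accept()
connection.sendall(b"hello from the new server")
reply = client.recv(64)

for sock in (client, connection, old_end, new_end):
    sock.close()
```

The received descriptor refers to the same open socket as the sender's, so closing the old copy does not close the port.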

> Also, using a socket means you have two-way communication, so you
> don't need signals or PID files, which are subject to race conditions.

PID files maybe, but signals? You can end up losing some signals if
they are sent in rapid succession, but for this purpose you just need one
to trigger an action and shortly after receiving it the server is supposed
to exit, so the possibility of losing other signals (of the same kind)
doesn't matter.

> But without signals, you'll still have to use select()/poll() even
> with all the functionality contained in one program, or else when you
> start a new server to replace the old one, the old one will wait
> indefinitely for one more client connection before waking up and
> noticing that it should hand the listening sockets over to the new
> server.

I'd use a signal to get out of accept(). :-)
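
A small Python sketch of getting out of accept() with a signal. One difference from C has to be stated: in C, accept() returns -1 with errno set to EINTR when a handler installed without SA_RESTART runs, while Python retries the call unless the handler raises an exception. So the handler raises here. StopAccepting and the SIGALRM timer are assumptions made so the demo can signal itself.

```python
import signal
import socket


class StopAccepting(Exception):
    """Raised from the signal handler to break out of a blocking accept()."""


def on_alarm(signum, frame):
    raise StopAccepting


def accept_until_signalled(listener):
    """Serve connections until a signal interrupts the blocking accept()."""
    try:
        while True:
            connection, _peer = listener.accept()
            connection.close()
    except StopAccepting:
        return "interrupted"


listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

previous = signal.signal(signal.SIGALRM, on_alarm)
signal.setitimer(signal.ITIMER_REAL, 0.2)  # deliver SIGALRM in 0.2 seconds
outcome = accept_until_signalled(listener)
signal.signal(signal.SIGALRM, previous)
listener.close()
print(outcome)
```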

> One problem with filesystem sockets is that you have to unlink the
> socket before listening on it, so the operation "listen on this
> socket, which may or may not already exist" isn't atomic.  If two
> processes start at the same time, one of them can delete the other's
> socket without knowing that anything was listening on it.  So it may
> be useful to atomically acquire some other dummy resource as a
> mutual-exclusion checkpoint before listening on the filesystem socket.

A server binds to the file system socket after it has got the network socket,
either by a direct bind() or by a handover from an existing server. That
should be enough, I think. It's not atomic, but it works as a locking
protocol: whoever holds the network socket owns the file system socket.

My description had: "If there's no writer [on a file system channel], it
binds to the network socket, writes the PID file [...]"

If the bind to the network socket fails because something else is
listening, then it can try again on the file system and bail out with an
error if there's no writer. After all, that's not supposed to happen.
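
The startup sequence described above might look like this in Python. It is a sketch under stated assumptions, not a finished protocol: acquire_listener(), try_handover() and recv_fd() are hypothetical names, the running server is assumed to send the descriptor as soon as a peer connects, and the PID file step is left out.

```python
import array
import errno
import os
import socket
import tempfile


def recv_fd(channel):
    """Receive one descriptor sent with SCM_RIGHTS over a Unix domain socket."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = channel.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    if not fds:
        raise RuntimeError("peer sent no descriptor")
    return fds[0]


def try_handover(handover_path):
    """Return the listener passed by a running server, or None if nobody answers."""
    channel = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        channel.connect(handover_path)
    except (FileNotFoundError, ConnectionRefusedError):
        return None
    else:
        return socket.socket(fileno=recv_fd(channel))
    finally:
        channel.close()


def acquire_listener(address, handover_path):
    """Get the network socket from a running server, else bind it directly."""
    listener = try_handover(handover_path)
    if listener is not None:
        return listener
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(address)
        listener.listen(16)
        return listener
    except OSError as exc:
        listener.close()
        if exc.errno != errno.EADDRINUSE:
            raise
    # Something else is listening: try the file system once more, then bail out.
    listener = try_handover(handover_path)
    if listener is None:
        raise RuntimeError("port busy but no server offers a handover")
    return listener


# No server is running yet, so the first start falls through to a direct bind().
demo_path = os.path.join(tempfile.mkdtemp(), "handover.sock")
first = acquire_listener(("127.0.0.1", 0), demo_path)
print("listening on", first.getsockname())
```

Only after acquire_listener() returns would the server bind the file system socket itself, which is what makes the network socket act as the lock.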

> Another benefit of making the metaserver a separate program: you can
> also write a library for LD_PRELOAD that masks the listen() function
> to make existing programs use the metaserver instead of opening their
> listening sockets directly.

That's a good one. But shouldn't that mask the bind() function?

As for the lease problem, couldn't the metaserver just SIGTERM the existing
server? It needs to know the PID, but that can be passed to it when the
server connects to the file system socket and before it gets the network
socket from the metaserver.
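
A minimal sketch of that idea in Python, with several assumptions: the PID travels as a decimal text line, announce_pid() and terminate_announced_server() are hypothetical names, a socketpair stands in for the file system socket, and a sleeping child process plays the existing server.

```python
import os
import signal
import socket
import subprocess
import sys


def announce_pid(channel, pid):
    """Server side: tell the metaserver which process holds the socket."""
    channel.sendall(f"{pid}\n".encode("ascii"))


def terminate_announced_server(channel):
    """Metaserver side: read the announced PID and send that process SIGTERM."""
    with channel.makefile("rb") as reader:
        pid = int(reader.readline())
    os.kill(pid, signal.SIGTERM)
    return pid


# A sleeping child stands in for the existing server.
old_server = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
server_end, metaserver_end = socket.socketpair()

announce_pid(server_end, old_server.pid)
killed_pid = terminate_announced_server(metaserver_end)
exit_status = old_server.wait(timeout=10)

server_end.close()
metaserver_end.close()
print("old server", killed_pid, "exited with status", exit_status)
```

In a real metaserver the PID would be recorded when the server connects and used later, when a replacement asks for the socket.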

-- 
 .-.   .-.    Yes, I am an agent of Satan, but my duties are largely
(_  \ /  _)   ceremonial.
     |
     |        dave@fly.srk.fer.hr
