supervision - discussion about system services, daemon supervision, init, runlevel management, and tools such as s6 and runit
* Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
@ 2006-10-28 19:26 Alex Smith
  2006-10-30 10:49 ` Alex Efros
  0 siblings, 1 reply; 20+ messages in thread
From: Alex Smith @ 2006-10-28 19:26 UTC (permalink / raw)


Hi all,

Is it possible to add an option to runsv which specifies how many times
runsv should try to restart a service in a certain time period before
stopping for, say, 5 minutes? An example of what I mean: if a service has
to be restarted more than 10 times in 10 seconds, then just idle for 5
minutes before trying again. Otherwise you'd get lots of system
resources being taken up by constant attempts to restart it, when it's
totally obvious that it's not gonna work - if you see what I mean :-)

Also, if this was implemented, there should be an option for runsvdir
that specifies this too, which would just be passed on to each runsv
process that it spawns.

Is this a good idea? Or not?

Thanks!
Alex

-- 
Alex Smith
Frugalware Linux developer - http://www.frugalware.org


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-28 19:26 Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up? Alex Smith
@ 2006-10-30 10:49 ` Alex Efros
  2006-10-30 10:50   ` Alex Efros
  2006-10-30 12:13   ` Dražen Kačar
  0 siblings, 2 replies; 20+ messages in thread
From: Alex Efros @ 2006-10-30 10:49 UTC (permalink / raw)


Hi!

On Sat, Oct 28, 2006 at 08:26:27PM +0100, Alex Smith wrote:
> Is this a good idea? Or not?

I think - no, isn't bad idea. Such ideas just make software bloated.
Your services MUST run all the time; that's why they're called services.
If some service doesn't start, it's a big problem, and the admin should be
notified about it urgently and fix it (maybe by disabling the service :)).

Runit provides enough features to write such 'plugins' manually: you have
the ./finish file, in which you can check how often it's executed and do
something: stop the service, sleep, or send an SMS to the admin.

-- 
			WBR, Alex.


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 10:49 ` Alex Efros
@ 2006-10-30 10:50   ` Alex Efros
  2006-10-30 12:13   ` Dražen Kačar
  1 sibling, 0 replies; 20+ messages in thread
From: Alex Efros @ 2006-10-30 10:50 UTC (permalink / raw)


Hi!

Oops, typo fix:

On Mon, Oct 30, 2006 at 12:49:23PM +0200, Alex Efros wrote:
> I think - no, isn't bad idea.
I think - no, it's a bad idea.
 

-- 
			WBR, Alex.


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 10:49 ` Alex Efros
  2006-10-30 10:50   ` Alex Efros
@ 2006-10-30 12:13   ` Dražen Kačar
  2006-10-30 12:30     ` Alex Efros
  1 sibling, 1 reply; 20+ messages in thread
From: Dražen Kačar @ 2006-10-30 12:13 UTC (permalink / raw)


Alex Efros wrote:
> 
> On Sat, Oct 28, 2006 at 08:26:27PM +0100, Alex Smith wrote:
> > Is this a good idea? Or not?
> 
> I think - no, it's a bad idea. Such ideas just make software bloated.

That kind of depends.

> Your services MUST run all the time; that's why they're called services.
> If some service doesn't start, it's a big problem, and the admin should be
> notified about it urgently and fix it (maybe by disabling the service :)).

The admin might not be available (admins have to sleep, after all). Besides,
the problem might be out of the admin's control. For example, a remote
database (on which the service depends) doesn't work, or the network
connectivity was lost, or something like that.

> Runit provides enough features to write such 'plugins' manually: you have
> the ./finish file, in which you can check how often it's executed and do
> something: stop the service, sleep, or send an SMS to the admin.

Sure, but if something's a common need for a large group of users, then
they call it a feature. Some of those who don't need such a feature call it
bloat, but I don't think that's a valid argument.

-- 
 .-.   .-.    Yes, I am an agent of Satan, but my duties are largely
(_  \ /  _)   ceremonial.
     |
     |        dave@fly.srk.fer.hr


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 12:13   ` Dražen Kačar
@ 2006-10-30 12:30     ` Alex Efros
  2006-10-30 13:38       ` Laurent Bercot
                         ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Alex Efros @ 2006-10-30 12:30 UTC (permalink / raw)


Hi!

On Mon, Oct 30, 2006 at 01:13:21PM +0100, Dražen Kačar wrote:
> Sure, but if something's a common need for a large group of users, then
> they call it a feature. Some of those who don't need such a feature call it
> bloat, but I don't think that's a valid argument.

No. Bloat isn't equal to 'new feature'. Bloated software isn't equal to
feature-rich software. But if software has the wrong features added in the
wrong places (from an architectural point of view), then it becomes bloated
very quickly.

Maybe it's a good idea to include an additional script in the runit package
(or distribute it separately) which can be used from the ./finish script this
way:

    # add a 5-minute timeout if the service was started 5 times in the last 10 sec
    restart-timeout --interval 10 --tries 5 300

But adding this functionality to runit is a bad idea, simply because you
can already develop that restart-timeout script using current runit,
and adding this feature to runit doesn't provide any additional gains.
After all, it's the Unix Way.
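
A helper along these lines might look like the sketch below. It is only an
illustration under assumptions: the function name `restart_timeout`, its
positional arguments (instead of the `--interval`/`--tries` options shown
above), the `.restarts` state file and the `SLEEP` override are all
hypothetical and not part of runit. It also assumes a `date` that supports
`+%s`. Each call records one timestamp, drops those older than the interval,
and pauses once the number of recent restarts reaches the limit.

```shell
#!/bin/sh
# Hypothetical helper, not part of runit.
# usage: restart_timeout INTERVAL TRIES PAUSE [STATEFILE]
restart_timeout() {
    rt_interval=$1
    rt_tries=$2
    rt_pause=$3
    rt_state=${4:-./.restarts}
    rt_now=$(date +%s)            # assumes date(1) understands +%s
    rt_tmp="$rt_state.$$"

    # Keep only the timestamps inside the window, then record this restart.
    {
        if [ -f "$rt_state" ]; then
            awk -v min="$((rt_now - rt_interval))" '$1 >= min' "$rt_state"
        fi
        echo "$rt_now"
    } > "$rt_tmp"
    mv "$rt_tmp" "$rt_state"      # replace the state file in one step

    # Count the restarts seen inside the window (arithmetic strips padding).
    rt_count=$(($(wc -l < "$rt_state")))
    if [ "$rt_count" -ge "$rt_tries" ]; then
        : > "$rt_state"           # start a fresh window after the pause
        ${SLEEP:-sleep} "$rt_pause"
    fi
}
```

Called from ./finish as `restart_timeout 10 5 300`, this would correspond to
the example invocation above: a 300-second pause after 5 restarts within 10
seconds.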

-- 
			WBR, Alex.


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 12:30     ` Alex Efros
@ 2006-10-30 13:38       ` Laurent Bercot
  2006-10-30 13:42         ` Alex Efros
  2006-10-30 18:49         ` Vincent Danen
  2006-10-30 17:52       ` Alex Smith
  2006-10-30 18:49       ` Dražen Kačar
  2 siblings, 2 replies; 20+ messages in thread
From: Laurent Bercot @ 2006-10-30 13:38 UTC (permalink / raw)


 I also tend to think this feature goes outside the scope of "basic"
supervision tools such as runit and daemontools. Sure, it could be integrated
in such tools, and not even make them too bloated; nevertheless, I like
the simple guarantees that they offer, and would prefer not to alter
their semantics too much.

 The feature can rather easily be implemented on a higher layer. You
could write a program that checks the restarting rate (provided runit
notifies some place when it restarts the service - I really need to
finish that notify library of mine :/ - is a program currently able
to listen to such an event without polling anything?) and touches the
"down" file when the restart rate is too high.

-- 
 Laurent


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 13:38       ` Laurent Bercot
@ 2006-10-30 13:42         ` Alex Efros
  2006-10-30 13:58           ` Laurent Bercot
  2006-10-30 18:49         ` Vincent Danen
  1 sibling, 1 reply; 20+ messages in thread
From: Alex Efros @ 2006-10-30 13:42 UTC (permalink / raw)


Hi!

On Mon, Oct 30, 2006 at 02:38:47PM +0100, Laurent Bercot wrote:
> notifies some place when it restarts the service - I really need to
> finish that notify library of mine :/ - is a program currently able
> to listen to such an event without polling anything ?) and touches the

Of course there's no need to poll anything: each ./finish execution
is equal to a 'restart event' (or, more precisely, a 'shutdown event', but in
this case there's no significant difference).

-- 
			WBR, Alex.


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 13:42         ` Alex Efros
@ 2006-10-30 13:58           ` Laurent Bercot
  2006-10-30 14:24             ` Alex Efros
  2006-10-30 14:51             ` Charlie Brady
  0 siblings, 2 replies; 20+ messages in thread
From: Laurent Bercot @ 2006-10-30 13:58 UTC (permalink / raw)


> Of course there's no need to poll anything: each ./finish execution
> is equal to a 'restart event' (or, more precisely, a 'shutdown event', but in
> this case there's no significant difference).

 Hmmm. Of course it can be done with the finish script, but my point was:
if I'm going to implement the throttle feature as an external program,
can I do it without changing anything in my runit configuration whatsoever?
Users shouldn't have to patch their finish scripts in order to use the
throttle feature. So the "./run is down, ./finish is executing" information
has to be available somewhere outside for it to work.

 This is a notification problem; we have extensively discussed notification
on this list, and I have written a piece of software to do just that,
except that it's so ugly I retired it ^^"

 Another approach to the throttle feature that doesn't require notification
from runit would be to have a short-lived program, designed to be called in
the finish script, that stores its information (last calling time and such)
in the filesystem. Maybe it's what you were thinking about. But I'm not
sure how to make it reliable; storing short-lived information in the
filesystem is very error-prone, that's the .pid way, which is precisely
what supervision tools were designed to avoid.

-- 
 Laurent


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 13:58           ` Laurent Bercot
@ 2006-10-30 14:24             ` Alex Efros
  2006-10-30 14:51             ` Charlie Brady
  1 sibling, 0 replies; 20+ messages in thread
From: Alex Efros @ 2006-10-30 14:24 UTC (permalink / raw)


Hi!

On Mon, Oct 30, 2006 at 02:58:34PM +0100, Laurent Bercot wrote:
>  Another approach to the throttle feature that doesn't require notification
> from runit would be to have a short-lived program, designed to be called in
> the finish script, that stores its information (last calling time and such)
> in the filesystem. Maybe it's what you were thinking about. But I'm not

Yep, I'm thinking this way.

> sure how to make it reliable; storing short-lived information in the
> filesystem is very error-prone, that's the .pid way, which is precisely
> what supervision tools were designed to avoid.

I don't see any trouble making it reliable. DJB showed how to develop
reliable shell scripts a long time ago: use atomic operations like 'mv'
for updating files.

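    # reliable counter in bash
    prev=$(cat .counter 2>/dev/null || echo 0)   # start from 0 if there is no counter yet
    next=$(( prev + 1 ))
    echo "$next" > ".counter.$$"                 # write the new value to a temporary file
    mv ".counter.$$" .counter                    # atomically replace the old counter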

For this task we need something more complex than just a counter, because we
should count restarts over some time interval, but this can also be done in
a reliable way.

P.S. If my script dies before `mv`, it can leave the .counter.$$ file on disk.
If this is important, we can add cleanup code. If it's important not to run 2
such scripts simultaneously, then we can use something like `chpst -l`.
Etc... Reliable shell scripts are a reality, let's face it. ;-)
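
The cleanup mentioned in the P.S. could be sketched as follows. The function
name `bump_counter` and the default file name are illustrative assumptions.
The body runs in a subshell so that its traps do not leak into the calling
script; the temporary file is removed if the subshell exits before `mv`
completes (this cannot help against SIGKILL).

```shell
#!/bin/sh
# Sketch: counter update that removes its temporary file on early exit.
# usage: bump_counter [COUNTERFILE]
bump_counter() (
    file=${1:-.counter}
    tmp="$file.$$"
    # Remove the temporary file whenever this subshell exits.
    trap 'rm -f "$tmp"' EXIT
    # Turn INT/TERM into a normal exit so the EXIT trap still runs.
    trap 'exit 1' INT TERM
    prev=$(cat "$file" 2>/dev/null || echo 0)   # 0 if no counter exists yet
    echo "$((prev + 1))" > "$tmp"
    mv "$tmp" "$file"                           # atomic replace
)
```

To keep two updates from running at once, a script containing this function
could be started under a lock, for example
`chpst -l .counter.lock ./update-counter`, where `./update-counter` is a
hypothetical script name.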

-- 
			WBR, Alex.


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 13:58           ` Laurent Bercot
  2006-10-30 14:24             ` Alex Efros
@ 2006-10-30 14:51             ` Charlie Brady
  2006-10-31  0:48               ` Laurent Bercot
  1 sibling, 1 reply; 20+ messages in thread
From: Charlie Brady @ 2006-10-30 14:51 UTC (permalink / raw)
  Cc: supervision


On Mon, 30 Oct 2006, Laurent Bercot wrote:

> Another approach to the throttle feature that doesn't require notification
> from runit would be to have a short-lived program, designed to be called in
> the finish script, that stores its information (last calling time and such)
> in the filesystem. Maybe it's what you were thinking about. But I'm not
> sure how to make it reliable; storing short-lived information in the
> filesystem is very error-prone, that's the .pid way, which is precisely
> what supervision tools were designed to avoid.

The problem with pid files is race conditions. I don't see that as being a 
problem here, as there is a single thread of execution between ./run and 
./finish. ./finish should be able to reliably store whatever state it 
needs in the file system. Or do you see something which I don't?


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 12:30     ` Alex Efros
  2006-10-30 13:38       ` Laurent Bercot
@ 2006-10-30 17:52       ` Alex Smith
  2006-10-30 21:41         ` Alex Efros
  2006-10-30 18:49       ` Dražen Kačar
  2 siblings, 1 reply; 20+ messages in thread
From: Alex Smith @ 2006-10-30 17:52 UTC (permalink / raw)


Alex Efros wrote:
> Hi!
> 
> On Mon, Oct 30, 2006 at 01:13:21PM +0100, Dražen Kačar wrote:
>> Sure, but if something's a common need for a large group of users, then
>> they call it a feature. Some of those who don't need such a feature call it
>> bloat, but I don't think that's a valid argument.
> 
> No. Bloat isn't equal to 'new feature'. Bloated software isn't equal to
> feature-rich software. But if software has the wrong features added in the
> wrong places (from an architectural point of view), then it becomes bloated
> very quickly.
> 
> Maybe it's a good idea to include an additional script in the runit package
> (or distribute it separately) which can be used from the ./finish script this
> way:
> 
>     # add a 5-minute timeout if the service was started 5 times in the last 10 sec
>     restart-timeout --interval 10 --tries 5 300
> 
> But adding this functionality to runit is a bad idea, simply because you
> can already develop that restart-timeout script using current runit,
> and adding this feature to runit doesn't provide any additional gains.
> After all, it's the Unix Way.
> 

Ok, I could try to write something like this myself, but just one
question - if the service is running ./finish, what does sv start try
to do? I mean, the documentation on runit's website says to use sv start
for dependencies. If a service is started that runs sv start
some_service, and some_service is running ./finish, what would happen?

Thanks,
Alex

-- 
Alex Smith
Frugalware Linux developer - http://www.frugalware.org


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 12:30     ` Alex Efros
  2006-10-30 13:38       ` Laurent Bercot
  2006-10-30 17:52       ` Alex Smith
@ 2006-10-30 18:49       ` Dražen Kačar
  2006-10-30 22:03         ` Alex Efros
  2 siblings, 1 reply; 20+ messages in thread
From: Dražen Kačar @ 2006-10-30 18:49 UTC (permalink / raw)


Alex Efros wrote:
> On Mon, Oct 30, 2006 at 01:13:21PM +0100, Dražen Kačar wrote:
> > Sure, but if something's a common need for a large group of users, then
> > they call it a feature. Some of those who don't need such a feature call it
> > bloat, but I don't think that's a valid argument.
> 
> No. Bloat isn't equal to 'new feature'. Bloated software isn't equal to
> feature-rich software. But if software has the wrong features added in the
> wrong places (from an architectural point of view), then it becomes bloated
> very quickly.

Yes, but that doesn't depend on features at all, although the wrong
architectural decisions usually happen because someone was trying to add
just one more feature.

> Maybe it's a good idea to include an additional script in the runit package
> (or distribute it separately) which can be used from the ./finish script this
> way:
> 
>     # add a 5-minute timeout if the service was started 5 times in the last 10 sec
>     restart-timeout --interval 10 --tries 5 300

Maybe. However, then you end up with sed or perl as your configuration tool.
Suppose you have 10 or 20 such scripts and you want to change the interval
number. And suppose somebody else wrote the finish scripts, using whichever
tools he had.

> But adding this functionality to runit is a bad idea, simply because you
> can already develop that restart-timeout script using current runit,
> and adding this feature to runit doesn't provide any additional gains.

It does. Ease of configuration, for example. If an administrator has to
configure programs with shell scripts, another admin has to find, read and
understand those scripts before he can change anything. That's an overhead
I'd like to avoid for simple things.

Suppose the problem was fixed, but runit is waiting for the finish scripts
to exit. How does one tell them to exit immediately and let runsv restart
the services?

This problem doesn't exist if the restart period is one second because the
service will be restarted faster than you can type the command. But if the
waiting period can be longer, then there should be a way to manually
terminate the waiting prematurely.
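
One possible way to make a long pause in ./finish cancellable, sketched under
assumptions: instead of a single long sleep, the script sleeps in one-second
steps and stops waiting as soon as a marker file appears. The function name
`wait_or_cancel` and the `cancel-wait` file name are hypothetical; nothing in
runit looks for such a file.

```shell
#!/bin/sh
# usage: wait_or_cancel SECONDS [MARKER]
# Returns 0 if the full time elapsed, 1 if the wait was cancelled.
wait_or_cancel() {
    woc_left=$1
    woc_marker=${2:-./cancel-wait}
    while [ "$woc_left" -gt 0 ]; do
        if [ -e "$woc_marker" ]; then
            rm -f "$woc_marker"   # consume the cancel request
            return 1
        fi
        sleep 1
        woc_left=$((woc_left - 1))
    done
    return 0
}
```

With `wait_or_cancel 300` in ./finish, an administrator who has fixed the
problem could run `touch cancel-wait` in the service directory, and the script
would return within about a second.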

I'm not saying that the ability to write a script is wrong. In this
particular example it would be good to have that option if one wants to
have exponential back-off or some other, more complex timing calculation.
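
As an illustration of what such a script-level calculation might look like,
here is a minimal sketch of an exponential back-off. The function name
`backoff_delay`, the default base of 1 second and the default cap of 300
seconds are assumptions chosen for the example.

```shell
#!/bin/sh
# usage: backoff_delay ATTEMPT [BASE] [MAX]
# Prints BASE * 2^ATTEMPT seconds, capped at MAX.
backoff_delay() {
    bd_attempt=$1
    bd_delay=${2:-1}
    bd_max=${3:-300}
    # Double the delay once per attempt, stopping early at the cap.
    while [ "$bd_attempt" -gt 0 ] && [ "$bd_delay" -lt "$bd_max" ]; do
        bd_delay=$((bd_delay * 2))
        bd_attempt=$((bd_attempt - 1))
    done
    if [ "$bd_delay" -gt "$bd_max" ]; then
        bd_delay=$bd_max
    fi
    echo "$bd_delay"
}
```

A ./finish script could then do something like
`sleep "$(backoff_delay "$failures")"`, where `failures` is a count it keeps
itself.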

> After all, it's the Unix Way.

It's a way of the people who like to write shell scripts.

-- 
 .-.   .-.    Yes, I am an agent of Satan, but my duties are largely
(_  \ /  _)   ceremonial.
     |
     |        dave@fly.srk.fer.hr


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 13:38       ` Laurent Bercot
  2006-10-30 13:42         ` Alex Efros
@ 2006-10-30 18:49         ` Vincent Danen
  2006-10-30 21:28           ` Alex Efros
  1 sibling, 1 reply; 20+ messages in thread
From: Vincent Danen @ 2006-10-30 18:49 UTC (permalink / raw)


* Laurent Bercot <ska-supervision@skarnet.org> [2006-10-30 14:38:47 +0100]:

>  I also tend to think this feature goes outside the scope of "basic"
> supervision tools as runit and daemontools. Sure, it could be integrated
> in such tools, and not even make them too bloated; nevertheless, I like
> the simple guarantees that they offer, and would prefer not to alter
> their semantics too much.
> 
>  The feature can rather easily be implemented on a higher layer. You
> could write a program that checks the restarting rate (provided runit
> notifies some place when it restarts the service - I really need to
> finish that notify library of mine :/ - is a program currently able
> to listen to such an event without polling anything ?) and touches the
> "down" file when the restart rate is too high.

This is something we implemented in the srv wrapper program for Annvix:

http://svn.annvix.org/cgi-bin/viewvc.cgi/srv/trunk/?root=tools

What it does is, before it "releases" a service, it checks to see if
it's looping and, if it is, marks it down.  Of course, the problem here
is at boot... since srv isn't starting services (srv is a tool like sv
or the "service" command with initscripts), it won't help, so there is
some validity to wanting this at the runit layer... i.e. we can check for
looping all we want in srv, but when runsvdir is fired up in stage 2,
unless you put these checks in every finish script and/or run script, it
won't help at all.

Anyways, does runsv run the finish script if run exits on its own?  I
was under the impression that finish was called by sv when you mark a
service down. I.e. if I was running apache supervised and I sent it a
kill -9 (without using sv), would runsv actually execute the finish
script, or see that run died and just restart run?

-- 
{FEE30AD4 : 7F6C A60C 06C2 4811 FA1C  A2BC 2EBC 5E32 FEE3 0AD4}
mysql> SELECT * FROM users WHERE clue > 0;
Empty set (0.00sec)

* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 18:49         ` Vincent Danen
@ 2006-10-30 21:28           ` Alex Efros
  2006-10-30 21:30             ` Vincent Danen
  0 siblings, 1 reply; 20+ messages in thread
From: Alex Efros @ 2006-10-30 21:28 UTC (permalink / raw)


Hi!

On Mon, Oct 30, 2006 at 11:49:47AM -0700, Vincent Danen wrote:
> Anyways, does runsv run the finish script if run exits on its own?  I

2nd paragraph on runsv man page:
    
    runsv switches to the directory service and starts ./run.
    If ./run exits and ./finish exists, runsv starts ./finish.
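
To see this behaviour in practice, a ./finish script can simply report why
./run ended. The sketch below relies on an assumption that may not hold for
the runit version discussed here: the runsv man page of later runit releases
says that ./finish gets two arguments, ./run's exit code (or -1 if ./run did
not exit normally) and the least significant byte of the wait status (the
signal number if ./run was killed by a signal). The function name
`describe_exit` is hypothetical.

```shell
#!/bin/sh
# Sketch of a ./finish body that reports why ./run ended.
# Assumes ./finish is called with: EXITCODE STATUS (later runit releases).
describe_exit() {
    de_code=${1:-unknown}
    de_status=${2:-unknown}
    if [ "$de_code" = "-1" ]; then
        # ./run did not exit normally; STATUS carries the signal number.
        echo "run was terminated by signal $de_status"
    else
        echo "run exited with code $de_code"
    fi
}
```

In ./finish this could be used as `describe_exit "$@" >&2`, so that the
message ends up wherever the service's stderr goes.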

-- 
			WBR, Alex.

* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 21:28           ` Alex Efros
@ 2006-10-30 21:30             ` Vincent Danen
  0 siblings, 0 replies; 20+ messages in thread
From: Vincent Danen @ 2006-10-30 21:30 UTC (permalink / raw)


* Alex Efros <powerman@powerman.asdfGroup.com> [2006-10-30 23:28:42 +0200]:

> > Anyways, does runsv run the finish script if run exits on its own?  I
> 
> 2nd paragraph on runsv man page:
>     
>     runsv switches to the directory service and starts ./run.
>     If ./run exits and ./finish exists, runsv starts ./finish.

Duh.  Thanks, Alex.  Ok, well that's good... I should make more use of
finish then.  =)

-- 
{FEE30AD4 : 7F6C A60C 06C2 4811 FA1C  A2BC 2EBC 5E32 FEE3 0AD4}
mysql> SELECT * FROM users WHERE clue > 0;
Empty set (0.00sec)

* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 17:52       ` Alex Smith
@ 2006-10-30 21:41         ` Alex Efros
  2006-11-01 12:01           ` Gerrit Pape
  0 siblings, 1 reply; 20+ messages in thread
From: Alex Efros @ 2006-10-30 21:41 UTC (permalink / raw)


Hi!

On Mon, Oct 30, 2006 at 05:52:42PM +0000, Alex Smith wrote:
> Ok, I could try to write something like this myself, but just one 
> question - if the service is running ./finish, what does sv start try
> to do? I mean, the documentation on runit's website says to use sv start 
> for dependencies. If a service is started that runs sv start 
> some_service, and some_service is running ./finish, what would happen?

Hmm. That's an interesting question. I suppose nothing happens: if
./finish hangs then runsv will just wait until it exits.

Maybe sending TERM (or any other signal) to the service using `sv t` will send
it to the ./finish script if ./finish is running now instead of ./run, but
I'm not sure.

It's probably a question for Gerrit.

-- 
			WBR, Alex.


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 18:49       ` Dražen Kačar
@ 2006-10-30 22:03         ` Alex Efros
  0 siblings, 0 replies; 20+ messages in thread
From: Alex Efros @ 2006-10-30 22:03 UTC (permalink / raw)


Hi!

On Mon, Oct 30, 2006 at 07:49:23PM +0100, Dražen Kačar wrote:
> It does. Ease of configuration, for example. If an administrator has to
> configure programs with shell scripts, another admin has to find, read and
> understand those scripts before he can change anything. That's an overhead
> I'd like to avoid for simple things.

I tend to agree. But I've two comments:
1) In my experience most *nix admins know only about sysvinit, and when
   they see a server with runit/daemontools they just give up. So, a small
   feature in the ./finish script doesn't make the task much harder for such
   admins. Other admins, who know about runit, know that ./finish scripts
   usually aren't used at all, so if they see a ./finish script they usually
   look into it with curiosity. :)
2) Windows is an example of a thing developed to be easy to configure.
   I prefer harder-to-configure but more controllable things like *nix.

> Suppose the problem was fixed, but runit is waiting for the finish scripts
> to exit. How does one tell them to exit immediately and let runsv restart
> the services?

This is the same interesting question that Alex Smith asked. I don't know; we
should ask Gerrit. Maybe signals will work...

<OT>

> > After all, it's the Unix Way.
> It's a way of the people who like to write shell scripts.

I don't like to write shell scripts. The Unix Way is to have a lot of simple
programs, each doing its small task. I.e. if some task can be split into
two different programs - why not go this way? Each program will be
simpler, and so more reliable, with fewer bugs. But the evil side of this is
the effort to integrate all these programs (using shell scripts or something
else), yeah. Actually, there are a few exceptions to this rule in the *nix
world - the text editor Vim, for example, or the email client Mutt - they are
both big complex programs, and that's good because these programs should parse
user input and provide a comfortable user interface - that's impossible for
a bundle of small programs. But I don't think runit falls into this
software category - it should be reliable, secure, fast, simple and small.
This goal is much easier to reach using the Unix Way of programming. And so,
yes, you should use shell scripts to help runit stay small and reliable.

</OT>

-- 
			WBR, Alex.


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 14:51             ` Charlie Brady
@ 2006-10-31  0:48               ` Laurent Bercot
  0 siblings, 0 replies; 20+ messages in thread
From: Laurent Bercot @ 2006-10-31  0:48 UTC (permalink / raw)


> The problem with pid files is race conditions. I don't see that as being a 
> problem here, as there is a single thread of execution between ./run and 
> ./finish. ./finish should be able to reliably store whatever state it 
> needs in the file system. Or do you see something which I don't?

 I haven't thought of the problem in much detail - and I probably won't
in the foreseeable future: until I've finished writing my own supervision
suite, which may take an indeterminate amount of time, Gerrit's the definite
authority on the subject and I'll trust his decision ;)

 However, I've written enough buggy code to know that the devil is in the
details, and enough overall Unix code to know where danger resides. And
there's a warning sign going off in my head here: bzzzt! storing
short-lived data in the filesystem - potential problems ahead!

 Not saying it can't work, of course; but there might well be more than
meets the eye at first sight, and that implementation decision should not
be taken lightly and without caution.

-- 
 Laurent


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-10-30 21:41         ` Alex Efros
@ 2006-11-01 12:01           ` Gerrit Pape
  2006-11-01 12:17             ` Alex Efros
  0 siblings, 1 reply; 20+ messages in thread
From: Gerrit Pape @ 2006-11-01 12:01 UTC (permalink / raw)


On Mon, Oct 30, 2006 at 11:41:32PM +0200, Alex Efros wrote:
> On Mon, Oct 30, 2006 at 05:52:42PM +0000, Alex Smith wrote:
> > Ok, I could try to write something like this myself, but just one 
> > question - if the service is running ./finish, what does sv start try
> > to do? I mean, the documentation on runit's website says to use sv start 
> > for dependencies. If a service is started that runs sv start 
> > some_service, and some_service is running ./finish, what would happen?
> 
> Hmm. That's an interesting question. I suppose nothing happens: if
> ./finish hangs then runsv will just wait until it exits.

sv will wait for the service to become done (for ./finish to terminate),
and then to be up again (./run running, and using ./check if
available), or report a timeout.
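
In a ./run script, the dependency pattern being asked about might be sketched
as below. `some_service` is the placeholder name from the question,
`my-daemon` is hypothetical, and the `SV` variable exists only so that the
sketch can be exercised without runit installed. The `-w` option is the
timeout override documented in the sv man page; its availability in the runit
version discussed here is an assumption.

```shell
#!/bin/sh
# usage: start_dependency SERVICE [TIMEOUT_SECONDS]
# Asks sv to start SERVICE and waits up to TIMEOUT_SECONDS for it to be up.
# Returns non-zero if sv reports a failure or a timeout.
start_dependency() {
    ${SV:-sv} -w "${2:-7}" start "$1"
}

# In ./run one might then write:
#   start_dependency some_service 15 || exit 1
#   exec my-daemon
```

If the dependency is still in ./finish when the timeout expires, ./run exits
with an error and runsv will try again later, so the dependent service keeps
retrying until the dependency is actually up.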

> Maybe sending TERM (or any other signal) to the service using `sv t` will send
> it to the ./finish script if ./finish is running now instead of ./run, but
> I'm not sure.

No, it's not sending TERM to finish, see
 http://thread.gmane.org/gmane.comp.sysutils.supervision.general/270/focus=272

Regards, Gerrit.


* Re: Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up?
  2006-11-01 12:01           ` Gerrit Pape
@ 2006-11-01 12:17             ` Alex Efros
  0 siblings, 0 replies; 20+ messages in thread
From: Alex Efros @ 2006-11-01 12:17 UTC (permalink / raw)


Hi!

On Wed, Nov 01, 2006 at 12:01:56PM +0000, Gerrit Pape wrote:
> No, it's not sending TERM to finish, see
>  http://thread.gmane.org/gmane.comp.sysutils.supervision.general/270/focus=272

So, if ./finish hangs there's no way to kill it using runit? One has to
manually find the PID of the hung process and kill it to exit from ./finish and
restart the service?

This makes all non-trivial tasks very dangerous to use in ./finish.

I agree with the race you explained, but I think runsv should be able to
control (kill) all processes it spawns, including the ./finish script and
scripts in the ./control/ directory (probably the ./check script is different
because it isn't executed by runsv).

I'm not sure about the right interface for this task (using `sv t` to send TERM
to ./finish is surely a bad idea).
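
One way to reduce this risk inside ./finish itself, sketched under
assumptions, is to bound each non-trivial step with a deadline. The function
name `run_with_deadline` is hypothetical and uses only plain shell job
control: it sends TERM to the step if it is still running after the given
number of seconds. Only the direct child is signalled, so a step that spawns
its own children may still leave processes behind.

```shell
#!/bin/sh
# Hypothetical helper: run a command, sending it TERM after LIMIT seconds.
# usage: run_with_deadline LIMIT command [args...]
# Returns the command's exit status (non-zero if it was terminated).
run_with_deadline() {
    rwd_limit=$1
    shift
    "$@" &
    rwd_cmd=$!
    # Watchdog: after the deadline, TERM the command if it is still there.
    ( sleep "$rwd_limit"; kill -TERM "$rwd_cmd" 2>/dev/null ) >/dev/null 2>&1 &
    rwd_dog=$!
    rwd_status=0
    wait "$rwd_cmd" || rwd_status=$?
    # The command is done (or was terminated); stop the watchdog as well.
    kill "$rwd_dog" 2>/dev/null || true
    wait "$rwd_dog" 2>/dev/null || true
    return "$rwd_status"
}
```

A ./finish script could then call, for example,
`run_with_deadline 30 ./notify-admin`, where `./notify-admin` is a
hypothetical script, and be sure that this step cannot block the restart for
much longer than 30 seconds.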

-- 
			WBR, Alex.


Thread overview: 20+ messages
2006-10-28 19:26 Option for runsv/runsvdir to specify how many times to restart a service in a certain time period before giving up? Alex Smith
2006-10-30 10:49 ` Alex Efros
2006-10-30 10:50   ` Alex Efros
2006-10-30 12:13   ` Dražen Kačar
2006-10-30 12:30     ` Alex Efros
2006-10-30 13:38       ` Laurent Bercot
2006-10-30 13:42         ` Alex Efros
2006-10-30 13:58           ` Laurent Bercot
2006-10-30 14:24             ` Alex Efros
2006-10-30 14:51             ` Charlie Brady
2006-10-31  0:48               ` Laurent Bercot
2006-10-30 18:49         ` Vincent Danen
2006-10-30 21:28           ` Alex Efros
2006-10-30 21:30             ` Vincent Danen
2006-10-30 17:52       ` Alex Smith
2006-10-30 21:41         ` Alex Efros
2006-11-01 12:01           ` Gerrit Pape
2006-11-01 12:17             ` Alex Efros
2006-10-30 18:49       ` Dražen Kačar
2006-10-30 22:03         ` Alex Efros
