* How to kill runsv, no matter what?
From: Daniel Clark @ 2007-02-21 20:14 UTC
To: supervision

I'm integrating runit support into bcfg2 [1], both as something bcfg2
can control, and as an encap [2] package [3]. I'm replacing
daemontools, as djb's annoying redistribution policies wouldn't allow
me to distribute Xen images or LiveCDs (as I patched daemontools since
there hasn't been a release in a very long time, but there have been
bugs).

I'm trying to get to a state where I can add and remove the runit
package without leaving any state behind (I'm using it for runsvdir,
not as an init replacement). When the package is removed, all of the
runit services should stop, and state about what services were started
is saved somewhere; on reinstall, that state should be reintroduced,
and any runit services should be restarted.

In theory this should be pretty trivial (assuming I am RTFMing
correctly); I think something like this in the removal stage:

    test -d /usr/local/var/service/.disabled || mkdir /usr/local/var/service/.disabled
    mv /usr/local/var/service/* /usr/local/var/service/.disabled/ 2>/dev/null \
      || printf "No services to disable.\n"
    printf "Waiting 7 seconds for runsv processes to die...\n"
    sleep 7
    # ... (Code that stops runsvdir) ...
    for service in `ls /usr/local/etc/sv`; do
      test -d /usr/local/etc/sv/$service/supervise \
        && rm -rf /usr/local/etc/sv/$service/supervise
      test -d /usr/local/etc/sv/$service/log/supervise \
        && rm -rf /usr/local/etc/sv/$service/log/supervise
    done

However in practice there are some services that continue to have a
"runsv" process even after I remove them from the directory "runsvdir"
is monitoring and wait >5 seconds. Below is an example of such a
service that refuses to die.

With daemontools I had a script called svrm that did this (below), but
the same idiom doesn't seem to work with runit/runsvdir.

Am I doing something wrong, or is this a bug?

----------------------------------------------------------------------
root@pawn:/usr/local/etc/sv# cat bcfg2-client/run
#!/bin/sh
exec 2>&1
printf "*** exec /usr/local/bin/chpst -e /usr/local/etc/default/bcfg2-client/env ./bcfg2-client.sh ...\n"
exec /usr/local/bin/chpst -e /usr/local/etc/default/bcfg2-client/env ./bcfg2-client.sh
----------------------------------------------------------------------
root@pawn:/usr/local/etc/sv# cat bcfg2-client/bcfg2-client.sh
#!/bin/sh

# note: variables provided from environment with chpst -e:
# /usr/local/etc/default/bcfg2-client/env/OPTIONS
# /usr/local/etc/default/bcfg2-client/env/RUN_INTERVAL_SECONDS

ENVDIR="/usr/local/etc/default/bcfg2-client/env"

# make sure we have options
if [ ! -f ${ENVDIR}/OPTIONS ]; then
  printf "WARNING: ${ENVDIR}/OPTIONS\n"
  printf "WARNING: does not exist. Using default of \"-q -v -d -n\"\n"
  OPTIONS="-q -v -d -n"
fi

# make sure we have a sleep variable
if [ "${RUN_INTERVAL_SECONDS}x" = "x" ]; then
  printf "WARNING: ${ENVDIR}/RUN_INTERVAL_SECONDS\n"
  printf "WARNING: does not exist or has no value.\n"
  printf "WARNING: Using default of 3600 seconds between runs.\n"
  RUN_INTERVAL_SECONDS=3600
fi

# loop forever
while :
do
  printf "*** starting /usr/local/bin/bcfg2 ${OPTIONS} ...\n"
  /usr/local/bin/bcfg2 ${OPTIONS}
  printf "*** sleeping ${RUN_INTERVAL_SECONDS} seconds ...\n"
  sleep ${RUN_INTERVAL_SECONDS}
done

exit 0
----------------------------------------------------------------------
<include_file name="bin/svrm" mode="0755"><![CDATA[
#!/bin/sh
# Remove a daemontools service
PATH=/command:$PATH
export PATH

if [ "${1}x" = "x" -o "${2}x" != "x" ]; then
  printf "Usage: svrm [SERVICE]\n"
  exit 1
fi

SERVICE="`basename ${1}`"
if [ ! -h "/service/$SERVICE" -a ! -f "/service/$SERVICE" ]; then
  printf "Service \"${SERVICE}\" not installed. Installed services:\n"
  svstat /service/*
  exit 1
else
  cd /service/$SERVICE
  REALDIR=`pwd -P`
  rm /service/$SERVICE
  svc -dx . log
  sleep 1
  test -f ${REALDIR}/supervise/status && rm ${REALDIR}/supervise/status
  test -d ${REALDIR}/supervise && rm -rf ${REALDIR}/supervise
  test -f ${REALDIR}/log/supervise/status && rm ${REALDIR}/log/supervise/status
  test -d ${REALDIR}/log/supervise && rm -rf ${REALDIR}/log/supervise
fi
exit 0
]]></include_file>
----------------------------------------------------------------------

[1] http://www.bcfg2.org
[2] http://www.encap.org
[3] http://www.bcfg2.org/browser/trunk/bcfg2/encap/src/encap-profiles/runit-1.7.2.ep

Thanks for any help,
--
Daniel Clark # http://dclark.us # http://opensysadmin.com

* Re: How to kill runsv, no matter what?
From: Daniel Clark @ 2007-02-21 21:04 UTC
To: supervision

On 2/21/07, Daniel Clark <dclark@pobox.com> wrote:
> I'm trying to get to a state where I can add and remove the runit
> package without leaving any state behind (I'm using it for runsvdir,
> not as an init replacement). When the package is removed, all of the
> runit services should stop, and state about what services were started
> is saved somewhere; on reinstall, that state should be reintroduced,
> and any runit services should be restarted.
>
> In theory this should be pretty trivial (assuming I am RTFMing
> correctly); I think something like this in the removal stage:
>
> test -d /usr/local/var/service/.disabled || mkdir /usr/local/var/service/.disabled
> mv /usr/local/var/service/* /usr/local/var/service/.disabled/ 2>/dev/null \
>   || printf "No services to disable.\n"
> printf "Waiting 7 seconds for runsv processes to die...\n"
> sleep 7
> # ... (Code that stops runsvdir) ...
> for service in `ls /usr/local/etc/sv`; do
>   test -d /usr/local/etc/sv/$service/supervise \
>     && rm -rf /usr/local/etc/sv/$service/supervise
>   test -d /usr/local/etc/sv/$service/log/supervise \
>     && rm -rf /usr/local/etc/sv/$service/log/supervise
> done

I happened upon an earlier mailing list thread, "sv exit doesn't seem
to work properly" [1], and changed the code to stop runsvdir and then
do a "sv exit" on each service [2], however it didn't help at all.

For simplification the basic question is why "sv exit" doesn't stop
runsv and runsv's associated processes with this particular service,
bcfg2-client; if that can be fixed, then the rest of the problem is
also solved. The run code for bcfg2-client was in my previous email,
or you can see it here [3] and the script that it calls is here [4].

[1] Re: sv exit doesn't seem to work properly
    http://article.gmane.org/gmane.comp.sysutils.supervision.general/1259
[2] runit-1.7.2.ep: preremove script
    http://www.bcfg2.org/browser/branches/feature/runit/encap/src/encap-profiles/runit-1.7.2.ep#L182
[3] bcfg2-client run script
    http://www.bcfg2.org/browser/branches/feature/runit/encap/src/encap-profiles/bcfg2-0.9.2.ep#L409
[4] Script that the bcfg2-client run script kicks off
    http://www.bcfg2.org/browser/branches/feature/runit/encap/src/encap-profiles/bcfg2-0.9.2.ep#L373

* Re: How to kill runsv, no matter what?
From: Daniel Clark @ 2007-02-23 3:51 UTC
To: supervision

[-- Attachment #1: Type: text/plain, Size: 1656 bytes --]

I made a simple test case that should make this bug (or my error in
using the software) easy to reproduce. I'm attaching it since it is so
tiny; it is also available from
http://opensysadmin.com/bugs/runit/test1-service.tar.bz2

Below is a transcript of using it to demonstrate the problem:

root@cmlab:/tmp# tar xfj test1-service.tar.bz2
root@cmlab:/tmp# cd test1-service/
root@cmlab:/tmp/test1-service# ./runsvdir-here
^C
root@cmlab:/tmp/test1-service# ps auxw | grep [s]v
root     19882  0.0  0.0   2516   348 ?        Ss   22:28   0:00 runsv test1-service
root     19883  0.0  0.0   2656   368 ?        S    22:28   0:00 /usr/local/bin/svlogd -tt ./logs
root     19884  0.0  0.0  10060  1408 ?        S    22:28   0:00 /bin/sh ./test1-sv.sh
root@cmlab:/tmp/test1-service# sv exit /tmp/test1-service/var-service/test1-service
root@cmlab:/tmp/test1-service# sleep 7
root@cmlab:/tmp/test1-service# ps auxw | grep [s]v
root     19882  0.0  0.0   2516   348 ?        Ss   22:28   0:00 runsv test1-service
root     19883  0.0  0.0   2656   368 ?        S    22:28   0:00 /usr/local/bin/svlogd -tt ./logs
root@cmlab:/tmp/test1-service# rm var-service/test1-service
root@cmlab:/tmp/test1-service# sleep 7
root@cmlab:/tmp/test1-service# ps auxw | grep [s]v
root     19882  0.0  0.0   2516   348 ?        Ss   22:28   0:00 runsv test1-service
root     19883  0.0  0.0   2656   368 ?        S    22:28   0:00 /usr/local/bin/svlogd -tt ./logs

(I would think runsv and svlogd should not be showing up here, because
runsvdir is no longer running, sv exit has been called, and the run
directory has been removed, with >5 second pauses between the removal
and the ps)

[-- Attachment #2: test1-service.tar.bz2 --]
[-- Type: application/x-bzip2, Size: 619 bytes --]

* Re: How to kill runsv, no matter what?
From: Laurent Bercot @ 2007-02-23 12:02 UTC
To: supervision

> I made a simple test case that should make this bug (or my error in
> using the software) easy to reproduce. I'm attaching it since it is so
> tiny; it is also available from
> http://opensysadmin.com/bugs/runit/test1-service.tar.bz2

Please try not to send binaries to the list... if it's so tiny, then
some attached text files could do - and if it's not, well, you did the
right thing anyway (i.e. make the tarball available on the Web) so
there's no point in sending the binary to the list... Thank you. ;)

--
Laurent

* Re: How to kill runsv, no matter what?
From: Gerrit Pape @ 2007-02-23 14:05 UTC
To: supervision

On Thu, Feb 22, 2007 at 10:51:50PM -0500, Daniel Clark wrote:
> I made a simple test case that should make this bug (or my error in
> using the software) easy to reproduce. I'm attaching it since it is so
> tiny; it is also available from
> http://opensysadmin.com/bugs/runit/test1-service.tar.bz2
>
> Below is a transcript of using it to demonstrate the problem:
>
> root@cmlab:/tmp# tar xfj test1-service.tar.bz2
> root@cmlab:/tmp# cd test1-service/
> root@cmlab:/tmp/test1-service# ./runsvdir-here
> ^C
> root@cmlab:/tmp/test1-service# ps auxw | grep [s]v
> root     19882  0.0  0.0   2516   348 ?        Ss   22:28   0:00 runsv test1-service
> root     19883  0.0  0.0   2656   368 ?        S    22:28   0:00 /usr/local/bin/svlogd -tt ./logs
> root     19884  0.0  0.0  10060  1408 ?        S    22:28   0:00 /bin/sh ./test1-sv.sh
> root@cmlab:/tmp/test1-service# sv exit /tmp/test1-service/var-service/test1-service
> root@cmlab:/tmp/test1-service# sleep 7
> root@cmlab:/tmp/test1-service# ps auxw | grep [s]v
> root     19882  0.0  0.0   2516   348 ?        Ss   22:28   0:00 runsv test1-service
> root     19883  0.0  0.0   2656   368 ?        S    22:28   0:00 /usr/local/bin/svlogd -tt ./logs
> root@cmlab:/tmp/test1-service# rm var-service/test1-service
> root@cmlab:/tmp/test1-service# sleep 7
> root@cmlab:/tmp/test1-service# ps auxw | grep [s]v
> root     19882  0.0  0.0   2516   348 ?        Ss   22:28   0:00 runsv test1-service
> root     19883  0.0  0.0   2656   368 ?        S    22:28   0:00 /usr/local/bin/svlogd -tt ./logs
>
> (I would think runsv and svlogd should not be showing up here, because
> runsvdir is no longer running, sv exit has been called, and the run
> directory has been removed, with >5 second pauses between the removal
> and the ps)

When asked to exit, the runsv supervisor makes sure that all logs are
written to the log service before terminating; it first sends TERM to
the main service, then waits for it to terminate, and finally waits for
the log service to terminate, before runsv exits itself.

In the case of your example service, the main run script execs into a
shell script that starts a 'sleep' subprocess. Now when runsv is told
to exit, it sends the service (the ./test1-sv.sh shell script) a TERM
signal, the shell script terminates (fine), but is leaving behind the
'sleep' subprocess. The log service's run script execs into a svlogd
process, svlogd will terminate as soon as it sees end-of-file on the
pipe connected to its standard input. Because there's still the 'sleep'
subprocess running with its output connected to the pipe, and so to
svlogd's standard input, svlogd will wait; it might well be that there's
still data available on the pipe to be written to the logs. Once the
'sleep' subprocess exits, runsv should exit too.

HTH, Gerrit.

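One way to act on the explanation above from the service side is to have the looping script clean up its own 'sleep' when it receives TERM, so that nothing keeps the log pipe open. What follows is only a minimal sketch, assuming a POSIX sh; the function name `run_loop` and its interval argument are hypothetical and are not taken from the test1-service tarball or from bcfg2-client.sh.

```shell
#!/bin/sh
# Sketch: a periodic-job loop that cleans up its 'sleep' child on TERM.
# run_loop and its interval argument are hypothetical, for illustration only.

run_loop() {
    interval=$1
    child=

    # On TERM, kill the background sleep (if one is running), then exit.
    trap 'test -n "$child" && kill "$child" 2>/dev/null; exit 0' TERM

    while :; do
        printf '*** doing work ...\n'
        # Run sleep in the background and wait for it: the shell runs a trap
        # promptly while blocked in 'wait', but only after the command
        # finishes when it is blocked on a foreground command.
        sleep "$interval" &
        child=$!
        wait "$child"
    done
}

# When used as a service script, the loop would be started with e.g.:
#   run_loop "${RUN_INTERVAL_SECONDS:-3600}"
```

With this shape the script and its 'sleep' both go away on TERM, the pipe to the logger sees end-of-file, and the logger is free to exit. It does nothing for children that the real job itself leaves behind.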
* Re: How to kill runsv, no matter what?
From: Alex Efros @ 2007-02-23 14:24 UTC
To: supervision

Hi!

On Fri, Feb 23, 2007 at 02:05:03PM +0000, Gerrit Pape wrote:
> to exit, it sends the service (the ./test1-sv.sh shell script) a TERM
> signal, the shell script terminates (fine), but is leaving behind the

There is another similar issue: if a service runs an interactive bash
(getty-like services), it also will not stop.

    # sv t getty1

sends SIGTERM, while bash requires SIGHUP or SIGKILL instead of SIGTERM.
Moreover, if you run mc, it will run its own bash, which also has to be
killed to restart the getty service... and the same is true for things
like su.

To solve this I created the script /usr/local/bin/term-getty-service:

---cut---
#!/bin/bash
bashs() { while [[ -n "$1" ]]; do pgrep -P $1 bash; bashs $(pgrep -P $1); shift; done; }
bashs="$( bashs $(<supervise/pid) )"
[[ -n "$bashs" ]] && kill -HUP $bashs
exit 1  # runsv must still send TERM to getty if no user is logged in on this console
---cut---

You should create a symlink to it from the service's ./control/t:

    # ln -s /usr/local/bin/term-getty-service \
            /var/service/getty-tty1/control/t

--
WBR, Alex.

* Re: How to kill runsv, no matter what?
From: Daniel Clark @ 2007-02-23 17:40 UTC
To: supervision

On 2/23/07, Alex Efros <powerman@powerman.asdfgroup.com> wrote:
> Hi!
>
> On Fri, Feb 23, 2007 at 02:05:03PM +0000, Gerrit Pape wrote:
> > to exit, it sends the service (the ./test1-sv.sh shell script) a TERM
> > signal, the shell script terminates (fine), but is leaving behind the
>
> There is another similar issue: if a service runs an interactive bash
> (getty-like services), it also will not stop.
>
>     # sv t getty1
>
> sends SIGTERM, while bash requires SIGHUP or SIGKILL instead of SIGTERM.
> Moreover, if you run mc, it will run its own bash, which also has to be
> killed to restart the getty service... and the same is true for things
> like su.
>
> To solve this I created the script /usr/local/bin/term-getty-service:
>
> ---cut---
> #!/bin/bash
> bashs() { while [[ -n "$1" ]]; do pgrep -P $1 bash; bashs $(pgrep -P $1); shift; done; }
> bashs="$( bashs $(<supervise/pid) )"
> [[ -n "$bashs" ]] && kill -HUP $bashs
> exit 1  # runsv must still send TERM to getty if no user is logged in on this console
> ---cut---
>
> You should create a symlink to it from the service's ./control/t:
>
>     # ln -s /usr/local/bin/term-getty-service \
>             /var/service/getty-tty1/control/t

That looks very inventive (and dense :-), but not very cross-platform,
which is the primary reason I am interested in runit (e.g. I want to
maintain non-vendor services on AIX, GNU/Linux, Solaris, *BSD etc. in
the same way -- many of these systems don't come standard with bash).

Perhaps a "kill with extreme prejudice" type flag implemented in the
runit code itself is in order? I really like to have commands available
that are deterministic (e.g. if I tell sv to kill something with this
flag, it dies, don't pass go, don't collect $200)

--
Daniel Clark # http://dclark.us # http://opensysadmin.com

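On the portability concern: the descendant walk in the bash script above can also be written with only `ps -o` and `awk`, both of which POSIX specifies. The following is a rough sketch, not a drop-in replacement: the function name `descendants` is hypothetical, it assumes the local `ps` accepts `-A -o pid= -o ppid=`, and unlike the original it lists every descendant rather than only the bash ones.

```shell
#!/bin/sh
# Sketch: print the PIDs of all descendants of the process given as $1,
# one per line, without pgrep or bash. 'descendants' is a hypothetical name.

descendants() {
    ps -A -o pid= -o ppid= | awk -v root="$1" '
        BEGIN { root += 0 }
        # Remember each process and its parent (forced to numbers).
        { parent[$1 + 0] = $2 + 0 }
        END {
            for (pid in parent) {
                # Walk up the parent chain until we reach root, fall off
                # the table, or hit a process that is its own parent.
                p = parent[pid]
                while (p != root && (p in parent) && parent[p] != p)
                    p = parent[p]
                if (p == root)
                    print pid
            }
        }'
}
```

From a control/t script, the list could then be signalled with something like `kill -HUP $(descendants "$(cat supervise/pid)")`; whether HUP to every descendant is appropriate depends on the service, so that line is an assumption rather than a recommendation.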
* Re: How to kill runsv, no matter what?
From: Daniel Clark @ 2007-02-23 17:32 UTC
To: supervision

On 2/23/07, Gerrit Pape <pape@smarden.org> wrote:
> On Thu, Feb 22, 2007 at 10:51:50PM -0500, Daniel Clark wrote:
> > I made a simple test case that should make this bug (or my error in
> > using the software) easy to reproduce. I'm attaching it since it is so
> > tiny; it is also available from
> > http://opensysadmin.com/bugs/runit/test1-service.tar.bz2
> >
> When asked to exit, the runsv supervisor makes sure that all logs are
> written to the log service before terminating; it first sends TERM to
> the main service, then waits for it to terminate, and finally waits for
> the log service to terminate, before runsv exits itself.
>
> In the case of your example service, the main run script execs into a
> shell script that starts a 'sleep' subprocess. Now when runsv is told
> to exit, it sends the service (the ./test1-sv.sh shell script) a TERM
> signal, the shell script terminates (fine), but is leaving behind the
> 'sleep' subprocess. The log service's run script execs into a svlogd
> process, svlogd will terminate as soon as it sees end-of-file on the
> pipe connected to its standard input. Because there's still the 'sleep'
> subprocess running with its output connected to the pipe, and so to
> svlogd's standard input, svlogd will wait; it might well be that there's
> still data available on the pipe to be written to the logs. Once the
> 'sleep' subprocess exits, runsv should exit too.

Ah, that makes a lot of sense. However I'm not seeing how this
behavior can mesh with package management systems. e.g.:

(a) I install a "runit" package, which starts up a runsvdir process

(b) I link some services into my runsvdir /var/service directory; I
can't really control if those processes start child processes in many
cases; let's say there is a service like my example service among the
services (in practice, I'm guessing there is probably some way I can
get my shell script to capture TERM and kill the 'sleep' process
before exiting itself)

(c) I remove the "runit" package. Since I am no longer going to have
"runit" installed, I think it follows that all "runit" processes, such
as svlogd, need to be gracefully shut down, no matter what their
state.

(d) Runit is removed, but there are some svlogd processes still
around, and therefore also still some files tracking runit state in my
/etc/sv directory

(e) I install Runit again.

(f) I want to re-enable my service, so I again link the service into
my /var/service directory. However since there is still a svlogd
process running (or I killed it manually), there is still lingering
state information in /etc/sv, so runit is confused and complains.

So I guess my question is, is there any way to handle the
install-remove-install case cleanly with runit? In practice this may
not be an issue, but I'm running into it all the time in testing.

The previously running svlogd causes failure in 2 ways: (a) the state
in /etc/sv/servicename confuses runit, and (b) it wants to write to
the same log file as any new svlogd daemons that start up.

Actually, wouldn't this also be a problem if I just wanted to
force-restart a service that spawns child processes? If the service is
restarted but the old logging daemon doesn't get force-killed, don't I
run into the same situation as with the install-remove-install (2
conflicting svlogd processes)?

Thanks,
--
Daniel Clark # http://dclark.us # http://opensysadmin.com

* Re: How to kill runsv, no matter what?
From: Paul Jarc @ 2007-02-23 17:39 UTC
To: Daniel Clark
Cc: supervision

"Daniel Clark" <dclark@pobox.com> wrote:
> I can't really control if those processes start child processes in
> many cases

It's fine if they start child processes, but if they don't clean up
their children when exiting, that's a bug in those services.

paul

* Re: How to kill runsv, no matter what?
From: Daniel Clark @ 2007-02-23 17:46 UTC
To: supervision

On 2/23/07, Paul Jarc <prj@po.cwru.edu> wrote:
> "Daniel Clark" <dclark@pobox.com> wrote:
> > I can't really control if those processes start child processes in
> > many cases
>
> It's fine if they start child processes, but if they don't clean up
> their children when exiting, that's a bug in those services.

I don't know enough about services to know if that is correct - Alex
seems to have a counterexample - but the original daemontools seems to
work with services with this "bug", and both daemontools and (I think)
runit have a suite of tools to hack around issues with services that
aren't designed to work with the supervision model of service control
(e.g. the thing that forces processes to stay in the foreground).

Actually, perhaps that would be the best way to deal with this - some
small binary that can be used instead of exec in "run" scripts that
has the property of killing all of its child processes when it dies -
would something like that be feasible?

--
Daniel Clark # http://dclark.us # http://opensysadmin.com

* Re: How to kill runsv, no matter what?
From: Paul Jarc @ 2007-02-23 17:59 UTC
To: Daniel Clark
Cc: supervision

"Daniel Clark" <dclark@pobox.com> wrote:
> the original daemontools seems to work with services with this "bug"

Yes, but only because it makes no attempt to ensure that log data is
written before the logger is shut down.

> and both daemontools and (I think) runit have a suite of tools to
> hack around issues with services that aren't designed to work with
> the supervision model of service control (e.g. the thing that forces
> processes to stay in the foreground).

That's true, but those are just the easy workarounds, and they're
imperfect. (E.g., they don't relay signals.) Reliably handling log
data and simultaneously working around services that leave stray child
processes is a hard problem, and the easiest solution known so far is
to fix each individual service.

> Actually, perhaps that would be the best way to deal with this - some
> small binary that can be used instead of exec in "run" scripts that
> has the property of killing all of its child processes when it dies -
> would something like that be feasible?

That could work for some cases (but, like pgrphack et al., it would be
sandwiched between exec and the real service, not used in place of
exec). It would have to initially put itself in its own process
group, relay SIGTERM to every process in that process group, and relay
other signals to its immediate child. But this won't help if the
service or its children put themselves in their own process group.
Also, SIGKILL and SIGSTOP can't be relayed, so you lose functionality
there too. So fixing the service still remains an attractive option.

paul

* Re: How to kill runsv, no matter what?
From: Daniel Clark @ 2007-02-23 18:25 UTC
To: supervision

On 2/23/07, Paul Jarc <prj@po.cwru.edu> wrote:
> "Daniel Clark" <dclark@pobox.com> wrote:
> > the original daemontools seems to work with services with this "bug"
>
> Yes, but only because it makes no attempt to ensure that log data is
> written before the logger is shut down.
>
> > and both daemontools and (I think) runit have a suite of tools to
> > hack around issues with services that aren't designed to work with
> > the supervision model of service control (e.g. the thing that forces
> > processes to stay in the foreground).
>
> That's true, but those are just the easy workarounds, and they're
> imperfect. (E.g., they don't relay signals.) Reliably handling log
> data and simultaneously working around services that leave stray child
> processes is a hard problem, and the easiest solution known so far is
> to fix each individual service.

Okay, so let's assume we have a service that does not have this "bug",
but that is running and shouldn't be force killed (e.g. we want to
wait until sleep times out, or until some non-atomic process is
complete). Is there any way to block until that happens?

When "sv exit" returns with a 0 exit code and no text, I tend to think
that it was actually successful in killing all of the processes
associated with a service; ditto for using rm to remove a service
link. I think this is what the "principle of least surprise" would
dictate as well.

I guess what I dislike most about the current behavior is that the
dangling runsv/svlogd processes seem to have no connection to anything
any more - you've removed the /var/service/servicename link (and
perhaps the /etc/sv/servicename directory as well), and you have these
zombie-like background processes running, about which there is no
longer any (obvious to me) way to get information with the runit
tools; so if you want to make sure you can reinstall a service
cleanly, or remove and then reinstall runit, you have to grep through
the output of ps, which is exactly the kind of thing that the
supervision scheme was created to avoid.

--
Daniel Clark # http://dclark.us # http://opensysadmin.com

* Re: How to kill runsv, no matter what?
From: Paul Jarc @ 2007-02-23 18:32 UTC
To: Daniel Clark
Cc: supervision

"Daniel Clark" <dclark@pobox.com> wrote:
> Okay, so let's assume we have a service that does not have this "bug",
> but that is running and shouldn't be force killed (e.g. we want to
> wait until sleep times out, or until some non-atomic process is
> complete). Is there any way to block until that happens?

sv -v
http://smarden.org/runit/sv.8.html

paul

* Re: How to kill runsv, no matter what?
From: Daniel Clark @ 2007-02-28 23:24 UTC
To: supervision

On 2/23/07, Paul Jarc <prj@po.cwru.edu> wrote:
> "Daniel Clark" <dclark@pobox.com> wrote:
> > Okay, so let's assume we have a service that does not have this "bug",
> > but that is running and shouldn't be force killed (e.g. we want to
> > wait until sleep times out, or until some non-atomic process is
> > complete). Is there any way to block until that happens?
>
> sv -v
> http://smarden.org/runit/sv.8.html

Thanks; I now have a package of runit that I can
install/uninstall/reinstall consistently without leaving anything
behind. It uses a combination of sv -v (to avoid the problem) on
package remove, and a kill pipeline (not yet tested on *nix other than
GNU/Linux) on install. Sort of ugly, but it works.

If anyone else uses encap, the package is up at:
http://tinyurl.com/2nrdx7

It works for running runit's runsvdir under inittab or upstart control.

--
Daniel Clark # http://dclark.us # http://opensysadmin.com

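The kill pipeline mentioned above is not shown in the thread. As a rough illustration of the general "ask politely, wait a bounded time, then force" idea for a leftover process, here is a minimal POSIX sh sketch. The function name `term_then_kill` and the 7-second default are assumptions made for this example; it is not the code from the encap package.

```shell
#!/bin/sh
# Sketch: send TERM to a process, poll for up to TIMEOUT seconds,
# then send KILL if it is still there.
# term_then_kill is a hypothetical helper, not part of runit.

term_then_kill() {
    pid=$1
    timeout=${2:-7}

    # If TERM cannot be delivered, the process is already gone
    # (or not ours to signal); nothing more to do.
    kill -TERM "$pid" 2>/dev/null || return 0

    while [ "$timeout" -gt 0 ]; do
        # kill -0 only checks whether the process still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done

    # Still alive after the grace period: KILL cannot be caught or ignored.
    kill -KILL "$pid" 2>/dev/null
    return 0
}
```

A caller would still have to decide which PIDs to pass in, for example stray runsv or svlogd processes found with ps; KILL gives the logger no chance to flush, which is the trade-off discussed earlier in the thread.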