From: "Daniel Clark"
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: How to kill runsv, no matter what?
Date: Fri, 23 Feb 2007 13:25:41 -0500
Message-ID: <5422d5e60702231025j690ef1e9lb59d82d0c3c14f39@mail.gmail.com>
References: <5422d5e60702211214q7ecaf23co838e9ff1b9be32de@mail.gmail.com>
 <5422d5e60702211304g5051747aoad3dd893abaf0b16@mail.gmail.com>
 <5422d5e60702221951h1abb7e60l77717192900a63a8@mail.gmail.com>
 <20070223140504.17459.qmail@3f646761ee1f68.315fe32.mid.smarden.org>
 <5422d5e60702230932q609f8ea8n76a3856c8b6cb3cc@mail.gmail.com>
 <5422d5e60702230946w2a69034exa0848c8c5163a7ad@mail.gmail.com>
To: supervision@list.skarnet.org

On 2/23/07, Paul Jarc wrote:
> "Daniel Clark" wrote:
> > the original daemontools seems to work with services with this "bug"
>
> Yes, but only because it makes no attempt to ensure that log data is
> written before the logger is shut down.
>
> > and both daemontools and (I think) runit have a suite of tools to
> > hack around issues with services that aren't designed to work with
> > the supervision model of service control (e.g. the thing that forces
> > processes to stay in the foreground).
>
> That's true, but those are just the easy workarounds, and they're
> imperfect. (E.g., they don't relay signals.) Reliably handling log
> data and simultaneously working around services that leave stray child
> processes is a hard problem, and the easiest solution known so far is
> to fix each individual service.

Okay, so let's assume we have a service that does not have this "bug",
but that is running and shouldn't be force killed (e.g.
we want to wait until sleep times out, or until some non-atomic process
is complete). Is there any way to block until that happens?

When "sv exit" returns with a 0 exit code and no text, I tend to assume
that it actually succeeded in killing all of the processes associated
with a service; ditto for using rm to remove a service link. I think
this is what the "principle of least surprise" would dictate as well.

I guess what I dislike most about the current behavior is that the
dangling runsv/svlogd processes seem to have no connection to anything
any more. You've removed the /var/services/servicename link (and perhaps
the /etc/sv/servicename directory as well), yet these zombie-like
background processes are still running, and there is no longer any way
(at least none obvious to me) to get information about them with the
runit tools. So if you want to make sure you can reinstall a service
cleanly, or remove and then reinstall runit, you have to grep through
the output of ps, which is exactly the kind of thing that the
supervision scheme was created to avoid.

-- 
Daniel Clark # http://dclark.us # http://opensysadmin.com