From: Guillermo
Newsgroups: gmane.comp.sysutils.supervision.general
Subject: Re: race condition in killall
Date: Sat, 11 May 2019 15:29:28 -0300
To: Supervision

On Sat, 4 May 2019 at 22:55, sysinit wrote:
>
> > pkill(1), killall(1) and killall5(8) all retrieve a process list and
> > kill them one by one, instead of calling kill(-1, signal), so a race
> > condition can happen thats let some process escape the final SIGKILL.
>
> interesting. i have not considered this at all.
> looks like kill( -1, sig ) from process #1 ensures correctnes here
> in a cheap and simple way.

I haven't looked at pkill or killall, but it should be noted that
killall5 is supposed to *not* send signals to everyone: process 1,
processes in the same session (in the POSIX sense), and processes
specified with the -o option, if given, are excluded. So it has to
retrieve a process list and classify.
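
To make that classification concrete, here is a minimal Python sketch
of the exclusion rules just described. It is not killall5's actual
code: the name killall5_targets and its parameters are made up for
illustration (omit stands in for -o; getsid and self_pid are
injectable so the logic can be tried against a fake process table),
and a real tool would take its PID list from /proc.

```python
import os


def killall5_targets(pids, omit=(), getsid=os.getsid, self_pid=None):
    """Return the PIDs a killall5-like tool would signal (hypothetical helper).

    Excluded: PID 1, every process in the caller's session, and every
    PID listed in `omit` (standing in for the -o option).
    """
    self_pid = os.getpid() if self_pid is None else self_pid
    my_sid = getsid(self_pid)
    omitted = set(omit)
    targets = []
    for pid in pids:
        if pid == 1 or pid in omitted:
            continue
        try:
            sid = getsid(pid)
        except OSError:
            # The process vanished or cannot be queried: skip it.
            continue
        if sid == my_sid:
            continue  # same session as the caller
        targets.append(pid)
    return targets


# Fake process table for the demo: PID -> session ID (assumed values).
fake_sessions = {1: 1, 100: 100, 101: 100, 200: 200, 201: 200, 300: 300}
print(killall5_targets(sorted(fake_sessions), omit=[300],
                       getsid=fake_sessions.__getitem__, self_pid=101))
```

With the caller being PID 101 in session 100, only the processes of
session 200 are left as targets: PID 1, the caller's own session and
the omitted PID 300 are all filtered out.
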
If the signal is SIGKILL and killall5 is used in a shell script, the
same-session exclusion generally allows the shell process to continue
execution after invocation of the program. And, I suppose, it also
allows the process that invoked the script and maybe other ancestors,
such as rc subsystem components, to continue execution as well.

However, both sysvinit's and BusyBox's killall5 make a
kill(-1, SIGSTOP) call before going through the PID list and
selectively sending the requested signal (and I guess Linux does not
deliver SIGSTOP to the process that contains the call, or it would be
pointless), and make a kill(-1, SIGCONT) call when they are done, so
I'm not sure if there's actually a race condition.

But yeah, in an s6-linux-init 0.4.x.x-style setup, where the stage 3
init can just spawn a process that makes a kill(-1, sig) call, all
this is not needed, and just using 'kill -KILL -1' or some equivalent
is probably the simplest alternative.

BTW, the kill program from procps 3.3.15 segfaulted when I tried to
use it with a -1 PID argument :/ BusyBox's kill applet, as well as
Bash's builtin kill utility (i.e. sh -c 'kill -KILL -1') did work when
used like this. I haven't tried s6-nuke, but I'm assuming it works
since s6-linux-init-0.4.x.x relies on it, and I haven't tried
util-linux's and GNU Coreutils' kill either.

> OpenRC also provides a tool for that task btw:
> /libexec/rc/bin/kill_all

Yeah, ${LIBEXECDIR}/bin/kill_all works like killall5. OpenRC used to
have a killall5 invocation in its 'killprocs' service script, which
meant a runtime dependency on a package that provided the program.
Probably not a problem in a sysvinit + OpenRC or BusyBox init + OpenRC
setup, but ugly in a 'pure' OpenRC setup (i.e. with openrc-init as
process 1).

G.
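
A footnote on the SIGSTOP / SIGCONT bracket mentioned above: the
sequence can be sketched in Python as below. This is only an
illustration of the ordering, not code from sysvinit or BusyBox. The
function name signal_all_frozen is made up, and the kill function is
passed in explicitly so the sketch can be run against a recorder
instead of real processes (passing os.kill as root would really freeze
and signal the system). For what it's worth, the Linux kill(2) manual
page notes that kill(-1, sig) does not signal the calling process.

```python
import signal


def signal_all_frozen(targets, sig, kill):
    """Freeze everything, signal the chosen PIDs, then thaw (hypothetical helper).

    `kill` is any callable with the (pid, sig) signature of os.kill.
    """
    kill(-1, signal.SIGSTOP)  # stop every process we are allowed to signal
    try:
        for pid in targets:
            try:
                kill(pid, sig)
            except ProcessLookupError:
                pass  # the process is already gone
    finally:
        kill(-1, signal.SIGCONT)  # resume whatever is still alive


# Dry run: record the calls instead of sending real signals.
calls = []
signal_all_frozen([200, 201], signal.SIGKILL,
                  kill=lambda pid, sig: calls.append((pid, sig)))
for pid, sig in calls:
    print(pid, signal.Signals(sig).name)
```

Since nothing that was stopped can fork while the PID list is walked,
the list should not go stale between the snapshot and the final
signal, which is presumably the point of the bracket.
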