From: Igmar Palsenberg
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: abort() fails to terminate PID 1 process
Date: Sun, 3 Jul 2016 12:43:59 +0200 (CEST)
To: musl@lists.openwall.com
Reply-To: musl@lists.openwall.com
References: <20160620100443.GV22574@port70.net> <20160620194110.GM10893@brightrain.aerifal.cx>
In-Reply-To: <20160620194110.GM10893@brightrain.aerifal.cx>

> > That rule doesn't apply to pid 1 by default. Pid 1 should be a proper init
> > system, not a full-blown application that makes the system blow up on
> > every error.
>
> abort is specified to terminate the process no matter what.

Yes. But as mentioned: pid 1 is an exception to this.

> For it to
> ever be able to return is a serious bug since both the compiler and
> the programmer can assume any code after abort() is unreachable.

This specific case is about pid 1. pid 1 has kernel protection; normal
userspace processes don't. In that case, the normal assumptions don't hold.

> At
> present musl avoids this worst-case failure (wrongfully returning)
> with an infinite loop, but that's just a fail-safe. The intent is that
> it terminate, and in particular, terminate abnormally as specified,
> which we don't do enough to guarantee (SIGKILL is not "abnormal"
> termination). So there's definitely work to be done to fix this. It's
> an issue I've been aware of for a long time but the kernel makes it
> painful to reliably produce abnormal termination without race
> conditions.

Can this even be reproduced under normal circumstances (i.e. not as pid 1)?
If yes, then I agree: it's a bug. If not, then it isn't. If people have a
broken container init system, then it breaks and they get to keep the pieces.

> > Well, normally abort() does some signal magic, and then raises again.
> > Which is what POSIX mandates I think.
>
> To make this work reliably I think we need to make abort() take a lock
> that precludes further calls to sigaction prior to re-raising SIGABRT
> and resetting the disposition.
> But there are all sorts of
> complications to deal with. For example if another thread performs
> posix_spawn for fork and exec concurrent with abort() munging the
> disposition of SIGABRT, the child process could start with the wrong
> disposition for SIGABRT, which would be non-conforming. Finding ways
> to fix all places where the wrong behavior may be observable is a
> nontrivial problem.

Does the guaranteed termination also apply to threaded programs?

> > If you're pid 1 however, you should behave like one.
>
> I tend to agree, but if you're libc you should also behave as
> specified, and currently we don't in this regard.

Sure, but as mentioned: normal rules don't apply to pid 1.


	Igmar
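
For reference, a minimal sketch in C of the "signal magic, and then raises again" sequence mentioned in this thread. This is not musl's code: `sketch_abort` and `run_sketch_in_child` are hypothetical names used only for illustration. The sketch takes no lock and does nothing about the posix_spawn/fork race raised in the quoted text, so it is only meaningful for a single-threaded process. The final loop stands in for the fail-safe described in the quoted text.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, NOT musl's abort(): raise SIGABRT, and if that
 * did not terminate the process, force the default action and retry. */
static void sketch_abort(void)
{
    struct sigaction sa;
    sigset_t set;

    /* First attempt: honours whatever disposition the program installed. */
    raise(SIGABRT);

    /* Still running: a handler returned, or the signal was ignored or
     * blocked. Restore the default disposition and unblock SIGABRT.
     * (Not thread-safe: another thread could call sigaction here.) */
    sa.sa_handler = SIG_DFL;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGABRT, &sa, NULL);

    sigemptyset(&set);
    sigaddset(&set, SIGABRT);
    sigprocmask(SIG_UNBLOCK, &set, NULL);

    /* Second attempt: default action, abnormal termination expected. */
    raise(SIGABRT);

    /* Fail-safe (assumption for this sketch): never return to the caller. */
    for (;;)
        pause();
}

/* Hypothetical driver: run the sketch in a child that ignores SIGABRT
 * and return the wait status, or -1 on fork/waitpid failure. */
static int run_sketch_in_child(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGABRT, SIG_IGN);  /* make the first raise a no-op */
        sketch_abort();
        _exit(0);                  /* not reached */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

For an ordinary child process, the status returned by `run_sketch_in_child()` should satisfy `WIFSIGNALED(status)` with `WTERMSIG(status) == SIGABRT`. When the caller is pid 1 of its pid namespace, the second raise may be dropped as well, as reported in this thread, and the function would then sit in the final loop instead of terminating.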