From: Rich Felker
To: musl@lists.openwall.com
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: holywar: malloc() vs. OOM
Date: Sun, 24 Jul 2011 08:40:34 -0400
Message-ID: <20110724124034.GI132@brightrain.aerifal.cx>
In-Reply-To: <20110724103325.GA24069@albatros>

On Sun, Jul 24, 2011 at 02:33:25PM +0400, Vasiliy Kulikov wrote:
> Rich,
>
> This is more a question about your malloc() failure policy for musl than
> an actual proposal.
>
> [...]
>
> In theory, these are bugs of applications and not of libc, and they
> should be fully handled in programs, not in libc. Period.
>
> But looking at the problem from the pragmatic point of view we'll see
> that libc is actually the easiest place where the problem may be
> workarounded (not fixed, surely). The workaround would be simply
> raising SIGKILL if malloc() fails (either because of brk() or mmap()).
> For the rare programs craving to handle OOM such code should be used:

This is absolutely wrong and non-conformant. It will also ruin all
robust programs and result in massive data loss, deadlock with shared
locks due to failure to release locks before termination, and all
sorts of ills. It also creates trivial DoS opportunities; for example
you could kill a daemon that uses glob() simply by passing it a glob
expression that matches millions or billions of files. (It may be a
bad idea, from a load standpoint, to be using glob in a daemon, but it
should simply result in high load then failure, not crashing.)

> #define _OOM_MAY_FAIL_
> #include
>
> Then the workaround is disabled.

Being broken by default is not acceptable to me. The other way around
could be acceptable, but I'm very doubtful that it would fix any
real-world bugs. The modern mmap min address is very high, and it's
quite rare for apps to access the end of their allocation before the
beginning anyway. The only common situation I can think of where it
might happen to initially access a high offset first is when calling
glibc's memcpy which sometimes chooses to copy backwards. musl's
memcpy does not take this liberty, even if it might be faster in some
cases, for that very reason - it's dangerous to access high offsets
first if a program was not careful about checking the return value of
malloc.

A better solution might be to have a gcc option to generate a read
from the base address the first time a function performs arithmetic on
a pointer it has not already checked.
This is valid because the C language does not allow pointer arithmetic
to cross object boundaries, and this approach could be made 100%
correct rather than being a heuristic that breaks correct
applications. It would impose some performance cost, but I doubt it
would be high.

(Note: Some special handling might be required for "one past the end
of an array" pointers here. I'd have to think a bit longer to work out
the details but I think it's possible to handle them safely in a
similar way.)

> Probably I overestimate the importance of OOM errors, and (1) in
> particular. However, I think it is worth discussing.

I don't think you overestimate the importance of OOM errors. Actually,
the Linux desktop is full of OOM errors that ruin usability, like file
managers that hang the system for 5 minutes then crash if you navigate
to a directory with a 15000x15000 image file. Unfortunately, I don't
think it's possible to fix this at the libc level, and fixing the worst
issues (DoS from apps crashing when they should not crash) usually
involves both sanity-checking the size prior to calling malloc *and*
checking the return value of malloc...

BTW great subject line! :-)

Rich
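
The last point above (bound the size before calling malloc, then still
check the result) might look roughly like the sketch below. It is only
an illustration, not code from musl or from any particular file
manager: the function name alloc_image, the MAX_IMAGE_BYTES limit, and
the choice of calloc are all assumptions made for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical policy limit (256 MiB); a real application would pick
 * its own bound based on what it can reasonably handle. */
#define MAX_IMAGE_BYTES ((size_t)256 * 1024 * 1024)

/* Hypothetical helper: returns a zeroed buffer of w * h * bpp bytes,
 * or NULL if the request is unreasonable or the allocation fails.
 * The caller must check for NULL and release the buffer with free(). */
unsigned char *alloc_image(size_t w, size_t h, size_t bpp)
{
    if (w == 0 || h == 0 || bpp == 0)
        return NULL;

    /* Reject dimensions whose product would overflow size_t. */
    if (w > SIZE_MAX / h)
        return NULL;
    size_t pixels = w * h;
    if (pixels > SIZE_MAX / bpp)
        return NULL;
    size_t bytes = pixels * bpp;

    /* Sanity check: refuse sizes beyond the policy limit instead of
     * asking the allocator and hoping for the best. */
    if (bytes > MAX_IMAGE_BYTES)
        return NULL;

    /* The allocation can still fail; the NULL is passed on so the
     * caller can report an error rather than crash. */
    return calloc(1, bytes);
}
```

With these assumed limits, a 15000x15000 image at 4 bytes per pixel
(900,000,000 bytes) is refused up front, so the caller can show an
error instead of hanging or crashing, while small requests go through
and are still checked for a NULL return.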