From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Fix pthread_create on some devices failing to initialize guard area
Date: Fri, 20 Jan 2017 14:56:49 -0500
Message-ID: <20170120195649.GS1533@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Fri, Jan 20, 2017 at 11:45:09AM -0800, Eric Hassold wrote:
> Hi All,
>
> While deploying test static executable across farm of different
> embedded systems, found out that pthread_create() is failing
> systematically on some (very few) arm-linux devices whenever non
> null stack guard is enabled (that is, also when calling
> pthread_create with default - i.e.
> null - attributes since default
> is a one page of guard). One of those device is for example a
> Marvell Armada 375 running Linux 3.10.39. Same test code, built with
> alternative libc implementations (glibc, uClibc) works as expected
> on those devices.
>
>
> Issue
>
> This occurs because of call to mprotect() in pthread_create fails.
> In current implementation, if guard size is non null, memory for
> (guard + stack + ...) is first allocated (mmap'ed) with no
> accessibility (PROT_NONE), then mprotect() is called to re-enable
> read/write access to (memory + guardsize). Since call to mprotect()
> systematically fails in this scenario (returning error code EINVAL),
> it is impossible to create thread.

Failure is ignored and the memory is assumed to be writable in this
case, since EINVAL is assumed to imply no MMU. Is this assumption
wrong in your case, and if so, can you explain why?

> Patch
>
> In proposed patch (attached below), memory for (guard + stack + ...)
> is first mmap'ed with read/write accessibility, then guard area is
> protected by calling mprotect() with PROT_NONE on guardsize first
> bytes of returned memory. This call to mprotect() to remove all
> accessibility on guard area, with guard area being at beginning of
> previously mmap'ed memory, works correctly on those platforms having
> issue with current implementation. Incidentally, this makes the
> logic more concise to handle both cases (with or without guard) is a
> more consistent way, and handle systems with partial/invalid page
> protection implementation (e.g. mprotect() returning ENOSYS) more
> gracefully since the stack is explicitly created with read/write
> access.

This doesn't work correctly on normal systems with an MMU, because the
size of the guard pages is accounted against commit charge.
Linux should, but AFAIK doesn't, subtract it from commit charge once
it's changed to PROT_NONE without having been dirtied, but even if
this bug is fixed on the kernel side, there would still be a moment
where excess commit charge is consumed and thus where pthread_create
might spuriously fail or cause allocations in other processes/threads
to fail.

If the kernel is not allocating actually-usable address ranges for
PROT_NONE on all nommu systems, I think the only solution is to handle
EINVAL from mprotect by going back and re-doing the mmap with
PROT_READ|PROT_WRITE. Do you have any better ideas?

Rich