From: Rich Felker
To: musl@lists.openwall.com
Reply-To: musl@lists.openwall.com
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Stdio resource usage
Date: Thu, 21 Feb 2019 12:02:59 -0500
Message-ID: <20190221170259.GH23599@brightrain.aerifal.cx>
In-Reply-To: <20190221160937.GF19969@voyager>
References: <20190220104901.GU21289@port70.net> <20190220154740.GD19969@voyager> <7816f8b4-644c-87e3-24c4-4ea2dd404584@adelielinux.org> <20190220191151.GE19969@voyager> <20190220192423.GD23599@brightrain.aerifal.cx> <20190221160937.GF19969@voyager>

On Thu, Feb 21, 2019 at 05:09:37PM +0100, Markus Wichmann wrote:
> On Wed, Feb 20, 2019 at 02:24:23PM -0500, Rich Felker wrote:
> > For what it's worth, gcc has a -fconserve-stack that in principle
> > should avoid this problem, but I could never get
> > it to do anything. If it works now we should probably detect and
> > add it to default CFLAGS.
> >
> > Rich
>
> Well, that also doesn't help since gcc is the compiler that *doesn't*
> exhibit the problem. clang does. And clang doesn't have an option to
> conserve stack (that I've seen).
>
> I am wondering what other possibilities exist to prevent the issue. If
> we won't change the algorithm, that only leaves exploring other
> possibilities for the memory allocation.

There is no algorithm that takes less space, at least not without some
kind of cubic-in-exponent-value or worse time. The amount of space we
use is optimal up to some small factor. It might be possible to shrink
this factor with a sharper bound on the number of digits needed, with
no change in the algorithm, but I think the reduction would be at most
something like 20%.

> So, what are our choices?
>
> - Heap allocation: But that can fail. Now, printf() is actually allowed
> to fail, but no-one expects it to. I would expect such behavior to be
> problematic at best.

printf can fail for valid reasons, but snprintf cannot. Technically
POSIX allows any interface that can fail to also fail for additional
implementation-defined reasons, but this is unacceptably bad QoI and
completely contrary to the principles of musl, that nothing fails
unless there's an underlying reason it has to be able to fail.

> - Static allocation: Without synchronization this won't be thread-safe,
> with synchronization it won't be re-entrant. Now, as far as I could
> see, the printf() family is actually not required to be re-entrant
> (e.g. signal-safety(7) fails to list any of them), but I have seen
> sprintf() in signal handlers in the wild (well, exception handlers,
> really).

If you can afford to increase .data size by ~8k, why can't you just
increase stack size by ~8k instead?
Of course the latter would scale in number of threads, but presumably
if you're this resource-constrained you're not using threads, or can
avoid using printf from most of them.

> - Thread-local static allocation: Which is always a hassle in libc, and
> does not take care of re-entrancy. It would only solve the
> thread-safety issue.

This is strictly worse than just using the stack. Implementation-wise,
the TLS is equivalent to a stack object on the top-level call frame of
the thread. There's no reason to put it there rather than in the
bottom-level call frame.

> - As-needed stack allocation (e.g. alloca()): This fails to prevent the
> worst case allocation, though it would make the average allocation
> more bearable. But I don't know if especially clever compilers like
> clang wouldn't optimize this stuff away, and we'd be back to square
> one.

This is what I already suggested (via VLA, not alloca, as the latter is
not C and worse in most ways) as a workaround for the clang hoisting of
allocations. But in principle the compiler could still see that, if the
declaration is reachable, the size is constant (or even close enough to
constant that it could just optimize to a fixed-size array of the upper
bound), optimize out its being variable, and then hoist it. So this
really is a hack that's "tricking the optimizer", not any fundamental
fix.

> Any ideas left?

Getting clang to fix their hoisting of (large) stack objects beyond
their scope/lifetime?

Rich
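
For illustration, here is a minimal sketch of the VLA workaround
discussed above. It is not musl's code: the function fmt_uint, the
SCRATCH_MAX bound, and the scratch_len parameter are all hypothetical
names invented for this example. The idea is that the scratch array's
size is a run-time value, so the storage is only allocated when control
reaches the declaration on the slow path. As the message notes, a
compiler that can prove the size constant is still free to turn this
into a fixed-size array and hoist it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound on scratch space; not a figure taken from musl. */
enum { SCRATCH_MAX = 8192 };

/*
 * Hypothetical formatter: writes v in decimal to out.
 * Returns the number of characters written (excluding the terminator),
 * or -1 if out or the scratch buffer is too small.
 */
int fmt_uint(char *out, size_t outsz, unsigned long v, size_t scratch_len)
{
    if (v < 10) {
        /* Fast path: returns before any scratch storage is declared. */
        if (outsz < 2) return -1;
        out[0] = (char)('0' + v);
        out[1] = '\0';
        return 1;
    }

    assert(scratch_len > 0 && scratch_len <= SCRATCH_MAX);

    /* VLA: sized at run time, so it is allocated only on this path. */
    char scratch[scratch_len];
    size_t pos = scratch_len;

    /* Generate digits from least significant to most, filling backwards. */
    while (v != 0) {
        if (pos == 0) return -1;          /* scratch too small */
        scratch[--pos] = (char)('0' + v % 10);
        v /= 10;
    }

    size_t len = scratch_len - pos;
    if (len + 1 > outsz) return -1;       /* output buffer too small */
    memcpy(out, scratch + pos, len);
    out[len] = '\0';
    return (int)len;
}
```

A caller would pass the scratch size as an ordinary argument, for
example fmt_uint(buf, sizeof buf, 12345, 64). Keeping that value opaque
to the optimizer is the whole trick, which is why this is a hack rather
than a fundamental fix.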