From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
Reply-To: musl@lists.openwall.com
Date: Tue, 27 Aug 2024 17:32:10 -0400
From: Rich Felker
To: Pedro Falcato
Cc: musl@lists.openwall.com
Message-ID: <20240827213210.GF10433@brightrain.aerifal.cx>
References: <20240826200958.GD10433@brightrain.aerifal.cx> <20240827152133.GE10433@brightrain.aerifal.cx>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="enLffk0M6cffIOOh"
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
Subject: Re: [musl] Proposed printf stack usage improvement

--enLffk0M6cffIOOh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Tue, Aug 27, 2024 at 04:42:35PM +0100, Pedro Falcato wrote:
> On Tue, Aug 27, 2024 at 11:21:33AM GMT, Rich Felker wrote:
> > On Tue, Aug 27, 2024 at 10:23:57AM +0100, Pedro Falcato wrote:
> > > LGTM.
> > >
> > > But maybe you should also include my __attribute__((noinline))
> > > suggestion, to make sure the integer printf and floating point paths
> > > get mixed by the compiler. Even if current gcc/clang don't seem to
> > > want to do that, it's better to be safe than sorry (and I assume any
> > > LTO/PGO might change that atm).
> >
> > I'm not clear what ill effect you're trying to mitigate here.
>
> (fwiw, if it wasn't clear: I meant "make sure the <...> *don't* get mixed")
>
> fmt_fp with the patch applied still has a significant stack impact (520 bytes according to my
> measurement) which can be avoided on the vast majority of (integer) printfs.

How did you measure? There should be essentially no static stack usage
in fmt_fp with this patch, only dynamic (VLA). On archs with
ld==double, it's possible that the compiler could decide to "optimize"
a VLA whose size can only have one possible value to a non-VLA, then
lift it, but this would be a highly malicious transformation that
could lead to much more catastrophic stack overflows in real-world
usage I think, so I would not expect compilers to do it.

Indeed a quick check of the attached, which I wrote to be as naively
easy to mis-optimize as possible, shows neither gcc nor clang lifting
the VLA.

> printf_core OTOH uses up 472 bytes of stack, so the simple possibility of inlining it can
> (worst case) more than double the stack space used by all printfs.
>
> Granted, the patch seems to convince clang not to inline fmt_fp at all, but AFAIK this is by no means
> a guarantee.

GCC inlines it fine, which is a good thing. This is a function which
is called from only one place, and just outlined in the source for the
sake of readability, having its own locals, etc. There's no good
reason to *want* the call boundary overhead.

At some point it might make sense to move fmt_fp to its own TU if we
want to have a way to suppress it from getting linked at all, and this
would also force non-inlining. But it doesn't seem to be desirable to
suppress inlining for its own sake.

> One could consider this somewhat of a microoptimization, but musl thread stacks are by no
> means big, so...

I think generally we don't care about 500 bytes anyway -- I'm not
going to deem a function that overflows the last 500 bytes of a stack
that's too small a bug.
Even printf using 8k wasn't a "bug"; the main motivation for changing
this is not to let people YOLO calling printf with a stack that's
barely big enough, but to avoid dirtying extra pages for no good
reason. The 8k pretty much unconditionally dirtied 2 extra
otherwise-unused pages for any program using printf.

Rich

--enLffk0M6cffIOOh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="vla_lift.c"

/* Declared but not defined here, so the call stays opaque to the optimizer. */
void bah(char *);

void foo(int n)
{
	int m = 10000;
	if (n) {
		/* VLA with a single possible size; the question is whether the
		 * compiler hoists its allocation out of the branch. */
		char bar[m];
		bah(bar);
	}
}

--enLffk0M6cffIOOh--
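
For readers following the thread, the pattern under discussion can be sketched roughly as below. This is an illustrative stand-in, not musl's printf_core or fmt_fp: the names fmt_demo, fmt_float_demo, fmt_int_demo and the FP_SCRATCH size are hypothetical, and the formatting is delegated to snprintf purely to keep the example short. It assumes a compiler with VLA support (gcc and clang both have it). The scratch buffer in the floating-point helper is a VLA, so its storage is requested when the helper body runs rather than being a fixed part of the frame; the optional noinline attribute (a GCC/Clang extension) corresponds to the suggestion quoted above and is off unless DEMO_NOINLINE is defined.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical scratch size; the real fmt_fp sizes its buffer differently. */
#define FP_SCRATCH 512

/* Define DEMO_NOINLINE to try the noinline variant (GCC/Clang extension). */
#if defined(DEMO_NOINLINE) && defined(__GNUC__)
#define DEMO_ATTR __attribute__((noinline))
#else
#define DEMO_ATTR
#endif

/* Floating-point path: the large scratch buffer is a VLA, so integer-only
 * callers that never reach this code should not pay for it. */
DEMO_ATTR static int fmt_float_demo(char *out, size_t outlen, double d)
{
	size_t n = FP_SCRATCH;   /* runtime value, which makes scratch a VLA */
	char scratch[n];
	int len = snprintf(scratch, n, "%.3f", d);
	if (len < 0 || (size_t)len >= n || (size_t)len >= outlen)
		return -1;       /* formatting error or output too small */
	memcpy(out, scratch, (size_t)len + 1);
	return len;
}

/* Integer path: no large locals at all. */
static int fmt_int_demo(char *out, size_t outlen, long v)
{
	int len = snprintf(out, outlen, "%ld", v);
	if (len < 0 || (size_t)len >= outlen)
		return -1;
	return len;
}

/* Dispatcher standing in for the caller of the two paths. */
static int fmt_demo(char *out, size_t outlen, int is_float, double d, long v)
{
	return is_float ? fmt_float_demo(out, outlen, d)
	                : fmt_int_demo(out, outlen, v);
}
```

Calling fmt_demo(buf, sizeof buf, 0, 0.0, 42) exercises only the integer path, while fmt_demo(buf, sizeof buf, 1, 1.5, 0) goes through the VLA-backed helper. Comparing the output of gcc's -fstack-usage with and without -DDEMO_NOINLINE is one way to see what each compiler actually does with the frame; the result is not guaranteed to match what was measured in this thread.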