Date: Mon, 3 Feb 2020 16:57:13 -0500
From: Rich Felker
To: musl@lists.openwall.com
Cc: Simon
Message-ID: <20200203215713.GS1663@brightrain.aerifal.cx>
Subject: Re: [musl] Why does musl printf() use so much more stack than other implementations when printf()ing floating point numbers?

On Mon, Feb 03, 2020 at 01:14:21PM -0800, Simon wrote:
> I recently noticed that musl printf() implementation uses surprisingly more
> stack space than other implementations, but only if printing floating point
> numbers, and made some notes here [1]. Any ideas why this happens, and any
> chance of fixing it?
>
> [1] https://gist.github.com/simonhf/2a7b7eb98d2a10c549e8cc858bbefd53

It's fundamental; the ability to print arbitrary floating point numbers
exactly takes considerable working space unless you want to spend O(n³)
time or so (n = exponent value) to keep recomputing things. The minimum
needed is probably only around 2/3 of what we use, so it would be
possible to reduce it slightly, but I doubt a savings of <3k is worth
the complexity of ensuring it would still be safe and correct.

Note that on archs without an extended long double type, which covers
everything used in extreme low-memory embedded environments, the memory
usage is far lower. This is because it's proportional to the maximum
possible exponent value, which is 1k instead of 16k if nothing larger
than IEEE double is supported.

I don't know exactly what glibc does, but it's likely they're just
using malloc, which is going to be incorrect because it can fail
dynamically with OOM.

In principle we could also make the working array a VLA and compute
smaller bounds on the size needed when precision is limited (the common
case). This might really be a practical "fix" for the cases people care
about, and it would also solve the problem where LLVM makes printf
*always* use ~9k of stack because it hoists the lifetime of the
floating point working array all the way to the top when inlining.
(This is arguably a serious optimization bug, since it can transform
all sorts of code that's possible to execute into code that's
impossible to execute due to huge stack requirements.) If the array
were a VLA whose size isn't determined except on the floating point
path, LLVM wouldn't be able to hoist it like that.

Making this change would still be significant work though, mainly in
verifying that the bounds are correct and that there are no cases where
the smaller array can be made to overflow.

Rich
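
For anyone following the VLA suggestion above, here is a minimal sketch
of the idea. It assumes a hypothetical fmt_fp_sketch() helper and a
purely illustrative bound formula; it is not musl's actual fmt_fp code.
The point is only that when the big-digit buffer is sized at run time
from the value's exponent and the requested precision, the large
allocation exists only on the floating point path, and a runtime-sized
VLA can't be hoisted into the enclosing frame the way a fixed-size
array can.

#include <math.h>
#include <stdint.h>
#include <stdio.h>

static void fmt_fp_sketch(long double y, int prec)
{
    int e2 = 0;
    frexpl(y, &e2);              /* binary exponent of the value passed in */

    if (prec < 0) prec = 6;      /* default precision, as for "%g" */

    /*
     * Illustrative bound only: each 32-bit slot in this sketch holds 9
     * decimal digits (base 1e9, just under 30 bits), so the integer part
     * needs roughly |e2|/29 slots and the fractional digits roughly
     * prec/9 more. The worst case is proportional to the format's maximum
     * exponent (16384 for IEEE extended long double, 1024 for plain IEEE
     * double), which is why the fixed buffer is so much smaller on archs
     * where long double is just double.
     */
    size_t need = (size_t)(e2 > 0 ? e2 : -e2)/29 + (size_t)prec/9 + 4;

    uint32_t big[need];          /* VLA: size not known until run time */
    for (size_t i = 0; i < need; i++) big[i] = 0;

    /* ... the actual decimal conversion would work in big[] here ... */
    printf("%d big digits reserved for %%.%dLg of %Lg\n",
           (int)need, prec, y);
}

int main(void)
{
    /* The large buffer exists only while this call is on the FP path. */
    fmt_fp_sketch(123456.789L, 17);
    return 0;
}

A real bound would of course need the verification work described
above, i.e. proof that no exponent/precision combination can overflow
the smaller array.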