From: Rich Felker <dalias@libc.org>
To: Michal Terepeta <michalt@google.com>
Cc: musl@lists.openwall.com, t5y-external <t5y-external@google.com>
Subject: Re: Wrong formatting of doubles?
Date: Tue, 1 Jul 2025 21:46:10 -0400 [thread overview]
Message-ID: <20250702014607.GG1827@brightrain.aerifal.cx> (raw)
In-Reply-To: <20250701163758.GD1827@brightrain.aerifal.cx>
[-- Attachment #1: Type: text/plain, Size: 7173 bytes --]
On Tue, Jul 01, 2025 at 12:37:58PM -0400, Rich Felker wrote:
> On Tue, Jul 01, 2025 at 12:19:57PM -0400, Rich Felker wrote:
> > On Tue, Jul 01, 2025 at 10:22:25AM -0400, Rich Felker wrote:
> > > On Tue, Jul 01, 2025 at 07:55:20AM +0200, Michal Terepeta wrote:
> > > > Hi,
> > > >
> > > > We're working on a RISC-V system and using musl as our libc. A recent change
> > > > in musl seems to have caused wrong formatting (printf) of double values:
> > > > https://git.musl-libc.org/cgit/musl/commit/src/stdio/vfprintf.c?id=572a2e2eb91f00f2f25d301cfb50f435e7ae16b3
> > > >
> > > > Using a simple binary to print max double [1] with the current code I get:
> > > >
> > > > ```
> > > > The maximum value of a double (printf %e): 1.348676e+308
> > > > The maximum value of a double (printf %E): 1.348676E+308
> > > > The maximum value of a double (printf %g): 3.13486e+308
> > > > ```
> > > >
> > > > With the patch [2] that reverts most of the above change, I get the expected
> > > > output:
> > > >
> > > > ```
> > > > The maximum value of a double (printf %e): 1.797693e+308
> > > > The maximum value of a double (printf %E): 1.797693E+308
> > > > The maximum value of a double (printf %g): 1.79769e+308
> > > > ```
> > > >
> > > > It'd be great if someone could take a look if they can also repro and perhaps
> > > > revert the change?
> > > >
> > > > Many thanks!
> > > >
> > > > Michal
> > > >
> > > >
> > > >
> > > > [1] Repro program:
> > > >
> > > > ```
> > > > #include <float.h>
> > > > #include <stdio.h>
> > > >
> > > > int main() {
> > > > printf("The maximum value of a double (printf %%e): %e\n", DBL_MAX);
> > > > printf("The maximum value of a double (printf %%E): %E\n", DBL_MAX);
> > > > printf("The maximum value of a double (printf %%g): %g\n", DBL_MAX);
> > > > return 0;
> > > > }
> > > >
> > > > ```
> > > >
> > > >
> > > > [2] Patch to test:
> > > >
> > > > ```
> > > > diff --git a/src/stdio/vfprintf.c b/src/stdio/vfprintf.c
> > > > index 010041ca6c..01004158bf 100644
> > > > --- a/src/stdio/vfprintf.c
> > > > +++ b/src/stdio/vfprintf.c
> > > > @@ -180,12 +180,8 @@
> > > >
> > > > static int fmt_fp(FILE *f, long double y, int w, int p, int fl, int t, int ps)
> > > > {
> > > > - int bufsize = (ps==BIGLPRE)
> > > > - ? (LDBL_MANT_DIG+28)/29 + 1 + // mantissa expansion
> > > > - (LDBL_MAX_EXP+LDBL_MANT_DIG+28+8)/9 // exponent expansion
> > > > - : (DBL_MANT_DIG+28)/29 + 1 +
> > > > - (DBL_MAX_EXP+DBL_MANT_DIG+28+8)/9;
> > > > - uint32_t big[bufsize];
> > > > + uint32_t big[(LDBL_MANT_DIG+28)/29 + 1 // mantissa expansion
> > > > + + (LDBL_MAX_EXP+LDBL_MANT_DIG+28+8)/9]; // exponent expansion
> > > > uint32_t *a, *d, *r, *z;
> > > > int e2=0, e, i, j, l;
> > > > char buf[9+LDBL_MANT_DIG/4], *s;
> > > > ```
> > >
> > > Could you try the attached patch?
> > >
> >
> > > diff --git a/src/stdio/vfprintf.c b/src/stdio/vfprintf.c
> > > index 76733997..2d2f4f3e 100644
> > > --- a/src/stdio/vfprintf.c
> > > +++ b/src/stdio/vfprintf.c
> > > @@ -180,11 +180,10 @@ typedef char compiler_defines_long_double_incorrectly[9-(int)sizeof(long double)
> > >
> > > static int fmt_fp(FILE *f, long double y, int w, int p, int fl, int t, int ps)
> > > {
> > > - int bufsize = (ps==BIGLPRE)
> > > - ? (LDBL_MANT_DIG+28)/29 + 1 + // mantissa expansion
> > > - (LDBL_MAX_EXP+LDBL_MANT_DIG+28+8)/9 // exponent expansion
> > > - : (DBL_MANT_DIG+28)/29 + 1 +
> > > - (DBL_MAX_EXP+DBL_MANT_DIG+28+8)/9;
> > > + int max_mant_dig = (ps==BIGLPRE) ? LDBL_MANT_DIG : DBL_MANT_DIG;
> > > + int max_exp = (ps==BIGLPRE) ? LDBL_MAX_EXP : DBL_MAX_EXP;
> > > + int bufsize = (max_mant_dig+28)/29 + 1 // mantissa expansion
> > > + + (max_exp+max_mant_dig+28+8)/9; // exponent expansion
> > > uint32_t big[bufsize];
> > > uint32_t *a, *d, *r, *z;
> > > int e2=0, e, i, j, l;
> > > @@ -266,7 +265,7 @@ static int fmt_fp(FILE *f, long double y, int w, int p, int fl, int t, int ps)
> > > if (y) y *= 0x1p28, e2-=28;
> > >
> > > if (e2<0) a=r=z=big;
> > > - else a=r=z=big+sizeof(big)/sizeof(*big) - LDBL_MANT_DIG - 1;
> > > + else a=r=z=big+sizeof(big)/sizeof(*big) - max_mant_dig - 1;
> > >
> > > do {
> > > *z = y;
> >
> > To clarify, the root cause here is that the subtraction later down the
> > function to make room for the mantissa, fixed in the last quoted hunk
> > above, was still using a hardcoded LDBL_MANT_DIG rather than adapting
> > to the size of the type being formatted.
> >
> > I was able to reproduce the buf on aarch64, albeit with all-zeros
> > output instead of the prefixed '3'. Hitting it seems to depend on
> > LDBL_MANT_DIG-DBL_MANT_DIG being high enough, which only happens on
> > archs with quad ld (not ld80).
> >
> > For what it's worth, the math does not look right even with the buffer
> > sized for LDBL_MANT_DIG, and it's probably only by luck that it
> > worked. We only reserved one b1b (base-1000000000) slot per 29 bits of
> > mantissa, but adjusted the end pointer to take away a whole b1b slot
> > for each bit of the mantissa, potentially leaving too little room for
> > the exponent expansion. In practice this worked because positive
> > exponent expansions use a lot less (roughly 1/3) the estimated space;
> > it's negative exponent expansions that use the full extimate. (This is
> > because each halving adds an extra digit to the end, but it takes
> > log2(10) doublings to add a digit to the front.)
> >
> > I think the correct code would be something like:
> >
> > + else a=r=z=big+sizeof(big)/sizeof(*big) - (max_mant_dig+28)/29 - 1;
>
> An additional -1 (+1 to the # of slots for mantissa expansion, as in
> the commented expression) is needed here because the do/while loop
> emits a final zero slot for the mantissa. I'm not sure this is
> actually needed later, but as long as it's there it needs to be
> accounted for.
>
> And indeed, with some debug instrumentation, empirically
> z==buf+bufsize after the initial mantissa expansion loop finishes.
>
> > And maybe this math should be done once at the top and reused rather
> > than repeated (i.e. use max_mant_slots instead of max_mant_dig). This
> > would also eliminate the need for the comments by documenting the
> > intermediate size calculations in variable names.
>
> As a result of the above findings, I'm strongly in favor of doing it
> this way. Repeating a confusing expression like this is a formula for
> disaster.
>
> I'd also like to think about ways we could avoid introducing bugs like
> this in the future. In this case, just a testcase for printing DBL_MAX
> in multiple formats would likely have caught it. Simple trapping-only
> UBsan might have caught it, but I'm not sure if that would follow all
> the positional arithmetic in fmt_fp to detect going outside the object
> bounds. This may be worth investigating.
>
> Big thanks to Michal Terepeta for catching this prior to it slipping
> into a release!
Proposed commit attached. Comments welcome. I'll plan to push tomorrow
if there are no objections.
[-- Attachment #2: 0001-printf-fix-regression-in-large-double-formatting-on-.patch --]
[-- Type: text/plain, Size: 2948 bytes --]
From f96e47a26102d537c29435f0abf9ec94676a030e Mon Sep 17 00:00:00 2001
From: Rich Felker <dalias@aerifal.cx>
Date: Tue, 1 Jul 2025 21:30:18 -0400
Subject: [PATCH] printf: fix regression in large double formatting on ld128
archs
commit 572a2e2eb91f00f2f25d301cfb50f435e7ae16b3 adjusted the buffer
for decimal conversion to be a VLA that only uses the full size needed
for long double when the argument type was long double. however, it
failed to update a later expression for the positioning within the
buffer, which still used a fixed offset of LDBL_MANT_DIG. this caused
doubles with a large positive exponent to overflow below the start of
the array, producing wrong output and potentially runaway wrong
execution.
this bug has not been present in any release, and has not been
analyzed in depth for security considerations.
it turns out the original buffer offset expression involving
LDBL_MANT_DIG was incorrect as well, and only worked because the space
reserved for expanding the exponent is roughly 3 times the size it
needs to be when the exponent is positive, leaving plenty of extra
space to compensate for the error. the actual offset should be in
base-1000000000 slot units, not bits, and numerically equal to the
number of slots that were previously allocated for mantissa expansion.
in order to ensure consistency and make the code more comprehensible,
commented subexpressions are replaced by intermediate named variables,
and the newly introduced max_mant_slots is used for both the
allocation and the buffer offset adjustment. the included +1 term
accounts for a trailing zero slot that's always emitted.
---
src/stdio/vfprintf.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/src/stdio/vfprintf.c b/src/stdio/vfprintf.c
index 76733997..a68edabb 100644
--- a/src/stdio/vfprintf.c
+++ b/src/stdio/vfprintf.c
@@ -180,11 +180,11 @@ typedef char compiler_defines_long_double_incorrectly[9-(int)sizeof(long double)
static int fmt_fp(FILE *f, long double y, int w, int p, int fl, int t, int ps)
{
- int bufsize = (ps==BIGLPRE)
- ? (LDBL_MANT_DIG+28)/29 + 1 + // mantissa expansion
- (LDBL_MAX_EXP+LDBL_MANT_DIG+28+8)/9 // exponent expansion
- : (DBL_MANT_DIG+28)/29 + 1 +
- (DBL_MAX_EXP+DBL_MANT_DIG+28+8)/9;
+ int max_mant_dig = (ps==BIGLPRE) ? LDBL_MANT_DIG : DBL_MANT_DIG;
+ int max_exp = (ps==BIGLPRE) ? LDBL_MAX_EXP : DBL_MAX_EXP;
+ int max_mant_slots = (max_mant_dig+28)/29 + 1;
+ int max_exp_slots = (max_exp+max_mant_dig+28+8)/9;
+ int bufsize = max_mant_slots + max_exp_slots;
uint32_t big[bufsize];
uint32_t *a, *d, *r, *z;
int e2=0, e, i, j, l;
@@ -266,7 +266,7 @@ static int fmt_fp(FILE *f, long double y, int w, int p, int fl, int t, int ps)
if (y) y *= 0x1p28, e2-=28;
if (e2<0) a=r=z=big;
- else a=r=z=big+sizeof(big)/sizeof(*big) - LDBL_MANT_DIG - 1;
+ else a=r=z=big+sizeof(big)/sizeof(*big) - max_mant_slots - 1;
do {
*z = y;
--
2.21.0
next prev parent reply other threads:[~2025-07-02 1:46 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-01 5:55 Wrong formatting of doubles? Michal Terepeta
2025-07-01 14:22 ` [musl] " Rich Felker
2025-07-01 16:19 ` Rich Felker
2025-07-01 16:37 ` Rich Felker
2025-07-01 17:02 ` Jeffrey Walton
2025-07-01 17:22 ` [musl] " Markus Wichmann
2025-07-01 17:34 ` Rich Felker
2025-07-02 1:46 ` Rich Felker [this message]
2025-07-02 7:56 ` [musl] " Michal Terepeta
2025-07-08 7:43 ` Michal Terepeta
2025-07-08 16:22 ` [musl] " Rich Felker
2025-07-02 14:58 ` Rich Felker
2025-07-03 4:31 ` Markus Wichmann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250702014607.GG1827@brightrain.aerifal.cx \
--to=dalias@libc.org \
--cc=michalt@google.com \
--cc=musl@lists.openwall.com \
--cc=t5y-external@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).