mailing list of musl libc
* [musl] Double Rounding in ....
@ 2023-11-11  8:11 Damian McGuckin
  2023-11-11 10:26 ` Szabolcs Nagy
  0 siblings, 1 reply; 2+ messages in thread
From: Damian McGuckin @ 2023-11-11  8:11 UTC (permalink / raw)
  To: musl


 	hypot, hypotf, hypotl
and
 	scalbn, scalbnf, scalbnl

Does anybody have a good/clear explanation for how double rounding occurs 
in the case of squaring-of/multiplication-by small numbers and how the 
action taken in the code addresses that? Not sure if it needs to mention

 	FLT_EVAL_METHOD

because certainly the float and double versions of those routines are
written in terms of float_t and double_t respectively.

Thanks - Damian

Pacific Engineering Systems International ..... 20D Grose St, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer


* Re: [musl] Double Rounding in ....
  2023-11-11  8:11 [musl] Double Rounding in Damian McGuckin
@ 2023-11-11 10:26 ` Szabolcs Nagy
  0 siblings, 0 replies; 2+ messages in thread
From: Szabolcs Nagy @ 2023-11-11 10:26 UTC (permalink / raw)
  To: Damian McGuckin; +Cc: musl

* Damian McGuckin <damianm@esi.com.au> [2023-11-11 19:11:41 +1100]:
> 
> 	hypot, hypotf, hypotl
> and
> 	scalbn, scalbnf, scalbnl
> 
> Does anybody have a good/clear explanation for how double rounding occurs in
> the case of squaring-of/multiplication-by small numbers and how the action
> taken in the code addresses that? Not sure if it needs to mention

x * 2^n is normally exact, but in the subnormal range there
are fewer precision bits so rounding can occur.

if you do the scaling in two steps (e.g. because the scale
factor has such a big exponent that it is not representable
as a single floating-point number) then you can get double rounding.

e.g.

scalbn(0x1.000000000000bp-1, -1024) == 0x1.0000000000008p-1025

normally the scaling is done in two steps: x*0x1p-1022*0x1p-2
because up to 0x1p-1022 the scaling is easy to construct from an int,
while 0x1p-1024 is subnormal.

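with a naive implementation the low bits of the significand are
..001011 x
..00110  x*0x1p-1022 (rounded: one bit lost, tie goes to even)
..010    x*0x1p-1022*0x1p-2 (rounded again: two more bits lost, tie goes to even)

so the naive result is 0x1.000000000001p-1025 instead of the
correct 0x1.0000000000008p-1025.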
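
The naive two-step scaling can be reproduced with a minimal sketch. It assumes IEEE binary64 double, round-to-nearest-even and FLT_EVAL_METHOD == 0; naive_scale_down_1024 and check_naive_double_rounding are hypothetical names used only for illustration, not musl code.

```c
#include <assert.h>

/* Hypothetical helper: multiply by 2^-1024 as 2^-1022 * 2^-2.
   volatile forces each intermediate to be stored as a double. */
static double naive_scale_down_1024(double x)
{
    volatile double y = x * 0x1p-1022; /* result may be subnormal: first rounding */
    y = y * 0x1p-2;                    /* deeper into the subnormal range: second rounding */
    return y;
}

/* Hypothetical self-check using the example value from the text. */
static void check_naive_double_rounding(void)
{
    double x = 0x1.000000000000bp-1;
    double correct = 0x1.0000000000008p-1025; /* single correctly rounded result */
    double naive = naive_scale_down_1024(x);

    assert(naive == 0x1.000000000001p-1025);  /* low bits ..010, as in the trace */
    assert(naive != correct);
}
```

Both intermediate roundings here are exact ties, which is why round-to-even moves the value up twice.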

with the fix
..001011 x*0x1p-969 (969 = 1022-53; exact, the result is still normal)
..001    x*0x1p-969*0x1p-55 (rounded once)

so simply split the scale factor such that the first scaling can
only round if the second scaling (by less than 0x1p-53) rounds
the entire result away to zero anyway.
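
A minimal sketch of such a split, under the same assumptions (IEEE binary64, round-to-nearest, FLT_EVAL_METHOD == 0). The names exp2_normal and scale_down_split are hypothetical, and the helper covers only -1991 <= n < -1022 so that the second factor is a normal power of two; it is an illustration of the idea, not the scalbn implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: build 2^e for a normal exponent, -1022 <= e <= 1023,
   by writing the biased exponent into the bit pattern of a double. */
static double exp2_normal(int e)
{
    uint64_t bits = (uint64_t)(0x3ff + e) << 52;
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Hypothetical helper: x * 2^n for -1991 <= n < -1022, split as
   2^-969 * 2^(n+969).  The second factor is at most 2^-54, so if the
   first product is already subnormal (and may have been rounded), the
   final result is below 2^-1075 and rounds to zero either way. */
static double scale_down_split(double x, int n)
{
    assert(n < -1022 && n >= -1991);
    volatile double y = x * 0x1p-969;  /* 969 = 1022 - 53 */
    y = y * exp2_normal(n + 969);      /* the only rounding that matters */
    return y;
}
```

With the example from the text, scale_down_split(0x1.000000000000bp-1, -1024) gives 0x1.0000000000008p-1025. A small input such as 0x1.000000000000bp-100 is rounded in the first step, but the second step then takes it to zero, which is also the correctly rounded result.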

> 
> 	FLT_EVAL_METHOD
> 
> because certainly the float and double versions of those routines are
> written in terms of float_t and double_t respectively.
> 
> Thanks - Damian
> 
> Pacific Engineering Systems International ..... 20D Grose St, Glebe NSW 2037
> Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
> Views & opinions here are mine and not those of any past or present employer

