mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] Double Rounding in ....
@ 2023-11-11  8:11 Damian McGuckin
  2023-11-11 10:26 ` Szabolcs Nagy
  0 siblings, 1 reply; 3+ messages in thread
From: Damian McGuckin @ 2023-11-11  8:11 UTC (permalink / raw)
  To: musl


 	hypot, hypotf, hypotl
and
 	scalbn, scalbnf, scalbnl

Does anybody have a good/clear explanation for how double rounding occurs 
in the case of squaring-of/multiplication-by small numbers and how the 
action taken in the code addresses that? Not sure if it needs to mention

 	FLT_EVAL_METHOD

because certainly the float and double versions of those routines are
written in terms of float_t and double_t respectively.

Thanks - Damian

Pacific Engineering Systems International ..... 20D Grose St, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [musl] Double Rounding in ....
  2023-11-11  8:11 [musl] Double Rounding in Damian McGuckin
@ 2023-11-11 10:26 ` Szabolcs Nagy
  2024-07-12  0:23   ` Damian McGuckin
  0 siblings, 1 reply; 3+ messages in thread
From: Szabolcs Nagy @ 2023-11-11 10:26 UTC (permalink / raw)
  To: Damian McGuckin; +Cc: musl

* Damian McGuckin <damianm@esi.com.au> [2023-11-11 19:11:41 +1100]:
> 
> 	hypot, hypotf, hypotl
> and
> 	scalbn, scalbnf, scalbnl
> 
> Does anybody have a good/clear explanation for how double rounding occurs in
> the case of squaring-of/multiplication-by small numbers and how the action
> taken in the code addresses that? Not sure if it needs to mention

n * 2^n is normally exact, but in the subnormal range there
are less precision bits so rounding can occur.

if you do the scaling in two steps (e.g. because the scale
factor has such a big exponent that it is not representible
in a single float) then you can get double rounding.

e.g.

scalbn(0x1.000000000000bp-1, -1024) == 0x1.0000000000008p-1025

normally the scaling is done in two steps: x*0x1p-1022*0x1p-2
because up to 0x1p-1022 the scaling is easy to construct from an int,
while 0x1p-1024 is subnormal.

with naive implementation the bit pattern at the end is
..001011 x
..00110  x*0x1p-1022 (rounded)
..010    x*0x1p-2 (rounded)

with the fix
..001011 x*0x1p-969 (969 = 1022-53)
..001    x*0x1p-55 (rounded)

so simply split the scale factor such that the first scaling can
only do rounding if the second scaling (<0x1p-53) completely rounds
the entire result away.

> 
> 	FLT_EVAL_METHOD
> 
> because certainly the float and double versions of those routines are
> written in terms of float_t and double_t respectively.
> 
> Thanks - Damian
> 
> Pacific Engineering Systems International ..... 20D Grose St, Glebe NSW 2037
> Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
> Views & opinions here are mine and not those of any past or present employer

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [musl] Double Rounding in ....
  2023-11-11 10:26 ` Szabolcs Nagy
@ 2024-07-12  0:23   ` Damian McGuckin
  0 siblings, 0 replies; 3+ messages in thread
From: Damian McGuckin @ 2024-07-12  0:23 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: MUSL


belatedly ...

On Sat, 11 Nov 2023, Szabolcs Nagy wrote:

> so simply split the scale factor such that the first scaling can
> only do rounding if the second scaling (<0x1p-53) completely rounds
> the entire result away.

That is a very (extremely) elegant solution to the problem.

Who came up with the algorithm used in MUSL? It seems superior to what is 
in most other libraries. It is so much cleaner than the algorithms used in 
most other libraries which use the algorithm (or a minor variant of the 
one) in the SUN library. And it is faster.  Maybe in 30+ years ago when a 
multiplication was much slower compared to branching, the older algorithm 
made more sense. It seems such a long time ago

After your comment about moving to asfloat()/asuint() and friends the 
other day, I used scalbnf() as a test case.

By trying to work only with an unsigned biased exponent, I managed to get 
the code for scalbnf from GCC11 from an original 47 LOC of assembler to 
just 39 in the rewrite. Almost small enough to have as an inline routine.
By LOC I include the labels as well.

Interestingly, CLANG17 was only 42 LOC of assembler for the original. And
only fractionally less at 41 LOC in the rewrite which was unusual.

Thanks again for your explanation - Damian

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-07-12  0:23 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-11  8:11 [musl] Double Rounding in Damian McGuckin
2023-11-11 10:26 ` Szabolcs Nagy
2024-07-12  0:23   ` Damian McGuckin

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).