x87 asin and acos

mailing list of musl libc
 help / color / mirror / code / Atom feed

* x87 asin and acos
@ 2019-02-23 14:21 Shane Seelig
  2019-02-23 15:08 ` Rich Felker
  2019-02-23 20:30 ` Szabolcs Nagy
  0 siblings, 2 replies; 6+ messages in thread
From: Shane Seelig @ 2019-02-23 14:21 UTC (permalink / raw)
  To: musl

Currently 'asin' uses the algorithm:
	arcsin(x) == arctan(x/(sqrt((1-x)(1+x))))
If the following algorithm were to be used instead, an 'fadd' could be
removed.
	arcsin(x) == arctan(x/(sqrt(1-x**2)))


current:
	fldx	X(%esp)		# %st(0) = x

	fld 	%st(0)		# %st(1) = x
				# %st(0) = x

	fld1			# %st(2) = x
				# %st(1) = x
				# %st(0) = 1
	# unixware bug
	fsub	%st(0),%st(1)	# %st(2) = x
				# %st(1) = 1-x
				# %st(0) = 1

	fadd	%st(2), %st(0)	# %st(2) = x
				# %st(1) = 1-x
				# %st(0) = 1+x

	fmulp	%st(0), %st(1)	# %st(1) = x
				# %st(0) = 1-x**2

	fsqrt			# %st(1) = x
				# %st(0) = sqrt(1-x**2)

	fpatan			# %st(0) = arcsin(x)

	ret


new:
	fldx	X(%esp)		# %st(0) = x

	fld	%st(0)		# %st(1) = x
				# %st(0) = x

	fmul	%st(0), %st(0)	# %st(1) = x
				# %st(0) = x**2

	fld1			# %st(2) = x
				# %st(1) = x**2
				# %st(0) = 1
	# unixware bug
	fsubp	%st(0), %st(1)	# %st(1) = x
				# %st(0) = 1-x**2

	fsqrt			# %st(1) = x
				# %st(0) = sqrt(1-x**2)

	fpatan			# %st(0) = arcsin(x)

	ret


affected files:
	math/i386/acos.s
	math/i386/asin.s
	math/x32/acosl.s
	math/x32/asinl.s
	math/x86_64/acosl.s
	math/x86_64/asinl.s

Please CC me.

--

Shane


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: x87 asin and acos
  2019-02-23 14:21 x87 asin and acos Shane Seelig
@ 2019-02-23 15:08 ` Rich Felker
  2019-02-23 19:57   ` Rich Felker
  2019-02-23 20:30 ` Szabolcs Nagy
  1 sibling, 1 reply; 6+ messages in thread
From: Rich Felker @ 2019-02-23 15:08 UTC (permalink / raw)
  To: musl; +Cc: Shane Seelig

On Sat, Feb 23, 2019 at 09:21:08AM -0500, Shane Seelig wrote:
> Currently 'asin' uses the algorithm:
> 	arcsin(x) == arctan(x/(sqrt((1-x)(1+x))))
> If the following algorithm were to be used instead, an 'fadd' could be
> removed.
> 	arcsin(x) == arctan(x/(sqrt(1-x**2)))

They don't seem to be numerically equal. For example, if x is smaller
than sqrt(LDBL_EPSILON/2), 1-x**2 is 1, but (1-x)*(1+x) is not. I
don't recall the process of writing the function in detail, but I'm
pretty sure this matters to the result, especially since sqrt then
expands the magnitude of the error.

Rich

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: x87 asin and acos
  2019-02-23 15:08 ` Rich Felker
@ 2019-02-23 19:57   ` Rich Felker
  2019-02-24  2:53     ` Damian McGuckin
  0 siblings, 1 reply; 6+ messages in thread
From: Rich Felker @ 2019-02-23 19:57 UTC (permalink / raw)
  To: musl; +Cc: Shane Seelig

On Sat, Feb 23, 2019 at 10:08:58AM -0500, Rich Felker wrote:
> On Sat, Feb 23, 2019 at 09:21:08AM -0500, Shane Seelig wrote:
> > Currently 'asin' uses the algorithm:
> > 	arcsin(x) == arctan(x/(sqrt((1-x)(1+x))))
> > If the following algorithm were to be used instead, an 'fadd' could be
> > removed.
> > 	arcsin(x) == arctan(x/(sqrt(1-x**2)))
> 
> They don't seem to be numerically equal. For example, if x is smaller
> than sqrt(LDBL_EPSILON/2), 1-x**2 is 1, but (1-x)*(1+x) is not. I
> don't recall the process of writing the function in detail, but I'm
> pretty sure this matters to the result, especially since sqrt then
> expands the magnitude of the error.

After some discussion on irc, I think the above may be wrong. I'm not
sure if there are other cases where it can matter, but if not this is
probably a valid, and maybe useful, optimization.

Rich


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: x87 asin and acos
  2019-02-23 14:21 x87 asin and acos Shane Seelig
  2019-02-23 15:08 ` Rich Felker
@ 2019-02-23 20:30 ` Szabolcs Nagy
  2019-02-23 20:36   ` Szabolcs Nagy
  1 sibling, 1 reply; 6+ messages in thread
From: Szabolcs Nagy @ 2019-02-23 20:30 UTC (permalink / raw)
  To: musl; +Cc: Shane Seelig

* Shane Seelig <stseelig@mail.com> [2019-02-23 09:21:08 -0500]:
> Currently 'asin' uses the algorithm:
> 	arcsin(x) == arctan(x/(sqrt((1-x)(1+x))))
> If the following algorithm were to be used instead, an 'fadd' could be
> removed.
> 	arcsin(x) == arctan(x/(sqrt(1-x**2)))

that change seems valid as far as the result is concerned.
(the worst case rounding error of the sqrt argument should
be around LDBL_EPS in both cases)

but the fenv behaviour is not valid:
for tiny x, x*x raises spurious underflow exception
for large x, x*x raises spurious overflow exception

(1-x)(1+x) avoids these issues.

note that there is not much performance difference between
the two expressions: 1-x and 1+x are independent computations
so the latency is 1 add + 1 mul in both cases, and the entire
function is likely dominated by fpsqrt followed by fpatan
both of which have huge latency compared to add or mul.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: x87 asin and acos
  2019-02-23 20:30 ` Szabolcs Nagy
@ 2019-02-23 20:36   ` Szabolcs Nagy
  0 siblings, 0 replies; 6+ messages in thread
From: Szabolcs Nagy @ 2019-02-23 20:36 UTC (permalink / raw)
  To: musl, Shane Seelig

* Szabolcs Nagy <nsz@port70.net> [2019-02-23 21:30:31 +0100]:

> * Shane Seelig <stseelig@mail.com> [2019-02-23 09:21:08 -0500]:
> > Currently 'asin' uses the algorithm:
> > 	arcsin(x) == arctan(x/(sqrt((1-x)(1+x))))
> > If the following algorithm were to be used instead, an 'fadd' could be
> > removed.
> > 	arcsin(x) == arctan(x/(sqrt(1-x**2)))
> 
> that change seems valid as far as the result is concerned.
> (the worst case rounding error of the sqrt argument should
> be around LDBL_EPS in both cases)
> 
> but the fenv behaviour is not valid:
> for tiny x, x*x raises spurious underflow exception
> for large x, x*x raises spurious overflow exception
> 
> (1-x)(1+x) avoids these issues.

hm actually this does not solve the overflow prolem,
indeed current asinl(0x1p10000L) raises spurious
overflow, so it's not correct.

(it's unlikely to matter in practice, the valid input
domain is [-1,1] anyway, but we should probably fix
that)

> 
> note that there is not much performance difference between
> the two expressions: 1-x and 1+x are independent computations
> so the latency is 1 add + 1 mul in both cases, and the entire
> function is likely dominated by fpsqrt followed by fpatan
> both of which have huge latency compared to add or mul.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: x87 asin and acos
  2019-02-23 19:57   ` Rich Felker
@ 2019-02-24  2:53     ` Damian McGuckin
  0 siblings, 0 replies; 6+ messages in thread
From: Damian McGuckin @ 2019-02-24  2:53 UTC (permalink / raw)
  To: musl; +Cc: Shane Seelig

On Sat, 23 Feb 2019, Rich Felker wrote:

>> They don't seem to be numerically equal. For example, if x is smaller
>> than sqrt(LDBL_EPSILON/2), 1-x**2 is 1, but (1-x)*(1+x) is not.
>> don't recall the process of writing the function in detail, but I'm
>> pretty sure this matters to the result, especially since sqrt then
>> expands the magnitude of the error.
>
> After some discussion on irc, I think the above may be wrong.

Yes and no.

Interestingly, if you to look at double (or float) for now

 	1 - x**2 == 1 if |x| < sqrt(DBL_EPSILON/2)

whereas
 	(1 - x) is not 1 nor is (1 + x)

although interesting, to the precision (a

 	(1 - x) * (1 + x) == 1 if DBL_EPSILON/2 < |x| < sqrt(DBL_EPSILON/2)

But at |x| == DBL_EPSILON/2 (== the round bit)

 	(1 + x) == 1 but (1 - x) != 1 and so

 	(1 + x) * (1 - x) != 1

because DBL_EPSILON/2 affects the round bit. That said

 	sqrt((1 - x) * (1 + x)) = 1 + 0.5 * x == 1

because (DBL_EPSILON/2) * 0.5 is half the round bit and does not affect 
it. So, the formula does not affect the result of the sqrt().

The same happens for floats.

I think the same can be said for long double.

Regards - Damian

Pacific Engineering Systems International, 277-279 Broadway, Glebe NSW 2037
Ph:+61-2-8571-0847 .. Fx:+61-2-9692-9623 | unsolicited email not wanted here
Views & opinions here are mine and not those of any past or present employer

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-02-24  2:53 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-23 14:21 x87 asin and acos Shane Seelig
2019-02-23 15:08 ` Rich Felker
2019-02-23 19:57   ` Rich Felker
2019-02-24  2:53     ` Damian McGuckin
2019-02-23 20:30 ` Szabolcs Nagy
2019-02-23 20:36   ` Szabolcs Nagy

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).