* Rich Felker <dalias@aerifal.cx> [2012-04-30 21:19:21 -0400]:
> > 	definition of NAN (0/0) raises invalid exception where it is used
> 
> This is only used as a fallback on compilers that don't have
> __builtin_nan, no?
yes

> > 	signbit >>63 vs !!
> 
> Hm? This has been fixed I think.
ok it seems to be fixed

> > >1ulp error:
> > 	tan(pi/2-eps) round toward zero
> > 	exp(-inf) round upward
> 
> What's the issue here? exp(-inf) is an exact zero; there's no rounding
> involved. Are you saying in round-upward mode the result is nonzero
> with the current implementation?
yes, actually
fesetround(FE_UPWARD);
exp(-inf) = 0x1p-1074; // smallest subnormal
instead of 0, which is 1ulp depending on your definition of ulp :)

i'll attach all the known problematic cases
(i386 asm is used, no -ffloat-store, acos is not yet fixed)
(i think acosh,asinh have problems as well but those are not tested)

zero and sign errors are treated as infinite ulp errors
errors >99ulp are printed as 99ulp

> With yesterday's commit, I think all the x86_64 asm that's possible
> with SSE2 is done. If we want asm for non-longdouble versions of
> transcendental functions, I think it will just involve moving the
> value from an SSE register to an FPU register and back... I have no
> idea how this would compare in performance to the C versions using
> SSE.
ok so only some benchmarking is left

> BTW under asm, we may also want to switch to the faster acos
> implementation.
ok i'll add it
(for double it is known to work, but i haven't tested it for
the long double case)

> > long double:
> > 	drop ld128 support? and move ldshape union to arch/
> 
> What ld128 support?
> I don't think we really want to "drop" it since it will probably be
> needed later, 
ok
in freebsd several functions are implemented in a way that
they work on ld80 and ld128 as well but sometimes macro
hacks and workarounds are needed
grep for LDBL_MANT_DIG == 113

> > volatile fix:
> > 	-ffloat-store or -fexcess-precision=standard should fix most
> > 	volatile issues so some of the volatile hacks can be cleaned up
> 
> I think it would be nice to try to ensure that the code still works
> with just -ffloat-store for older gcc versions that don't have
> -fexcess-precision=standard.
ok

> > scalbf:
> > 	scalb is buggy, do we need the *f and *l version?
> 
> How so?
it's just non-standard and i assume some code may use scalb
instead of scalbn, but scalbf and scalbl are probably not used

> > generic code fixes:
> > 	int32_t -> uint32_t conversion (can be subtle, so testing is needed)
> > 	+= 1, -= 1 -> ++, --
> > 	TWO52, twom1000 vs tiny (renames where it makes sense)
> > 	remove overflow thresholds (sinh, cosh) when result overflows anyway?
> > 	sign bit checking convention (sqrt.c)
> > missing:
> > 	sqrtl
> 
> Until we have a better one, couldn't we just use sqrt() as a first
> approximation then use Newton's method or a binary-search for the
> correct long double result? Or just return sqrt() since the only arch
> that doesn't have sqrtl asm yet is the one where ld==double.
yes that's a good idea

> > 	tgamma, tgammaf
> 
> Well we have these but they're inaccurate for some inputs.
ah yes i added dummy versions, but they are not usable for serious work

> > 	(long double bessel)
> > 	nextafterf on ld64
> 
> Eh?
actually it's nexttowardf(float x, long double y)
it's not implemented when long double == double
it cannot be just a simple wrapper around nextafter or nextafterf
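
the wrapper fails because converting y to float can round it to x,
and stepping in double gives the wrong neighbour. a minimal sketch
that steps the float representation directly (nexttowardf_sketch is a
made-up name, it assumes ieee binary32 float and does not raise the
underflow/overflow exceptions):

```c
#include <math.h>
#include <stdint.h>

/* hypothetical sketch: next float after x in the direction of y */
float nexttowardf_sketch(float x, long double y)
{
	union { float f; uint32_t i; } u = { x };

	if (isnan(x) || isnan(y))
		return x + y;
	if (x == y)
		return y;
	if (x == 0) {
		/* smallest subnormal with the sign of the direction */
		u.i = 1;
		if (signbit(y))
			u.i |= 0x80000000;
		return u.f;
	}
	if (x < y) {
		/* moving up: magnitude shrinks for negative x */
		if (signbit(x)) u.i--; else u.i++;
	} else {
		/* moving down: magnitude grows for negative x */
		if (signbit(x)) u.i++; else u.i--;
	}
	return u.f;
}
```

for example with x = 1.0f and y = 1.0L + 1e-10L the sketch steps up
by one float ulp, while nextafterf(x, (float)y) returns x unchanged
because (float)y == 1.0f.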

> > tgamma:
> > 	lanczos approx as in boost/math/special_functions and python/Modules/mathmodule.c
> > complex
> > 	optimizable creal cimag (libm.h macro for internal code?)
> 
> This was done a long time ago.
good