* Rich Felker [2012-04-30 21:19:21 -0400]:
> > definition of NAN (0/0) raises invalid exception where it is used
>
> This is only used as a fallback on compilers that don't have
> __builtin_nan, no?

yes

> > signbit >>63 vs !!
>
> Hm? This has been fixed I think.

ok it seems to be fixed

> > >1ulp error:
> > tan(pi/2-eps) round tozero
> > exp(-inf) round upward
>
> What's the issue here? exp(-inf) is an exact zero; there's no rounding
> involved. Are you saying in round-upward mode the result is nonzero
> with the current implementation?

yes, actually

	fesetround(FE_UPWARD);
	exp(-inf) = 0x1p-1074; // smallest subnormal

instead of 0, which is 1ulp depending on your
definition of ulp :)

i'll attach all the known problematic cases
(i386 asm is used, no -ffloat-store, acos is not yet fixed)
(i think acosh,asinh have problems as well but those are not tested)

zero and sign errors are treated as infinite ulp errors
errors >99ulp are printed as 99ulp

> With yesterday's commit, I think all the x86_64 asm that's possible
> with SSE2 is done. If we want asm for non-longdouble versions of
> transcendental functions, I think it will just involve moving the
> value from an SSE register to an FPU register and back... I have no
> idea how this would compare in performance to the C versions using
> SSE.

ok
so only some benchmarking is left

> BTW under asm, we may also want to switch to the faster acos
> implementation.

ok i'll add it
(for double it is known to work, but i havent tested it
for the long double case)

> > long double:
> > drop ld128 support? and move ldshape union to arch/
>
> What ld128 support?
> I don't think we really want to "drop" it since it will probably be
> needed later,

ok

in freebsd several functions are implemented in a way that
they work on ld80 and ld128 as well
but sometimes macro hacks and workarounds are needed
grep for LDBL_MANT_DIG == 113

> > volatile fix:
> > -ffloat-store or -fexcess-precision=standard should fix most
> > volatile issues so some of the volatile hacks can be cleaned up
>
> I think it would be nice to try to ensure that the code still works
> with just -ffloat-store for older gcc versions that don't have
> -fexcess-precision=standard.

ok

> > scalbf:
> > scalb is buggy, do we need the *f and *l version?
>
> How so?

it's just non standard and i assume some code may use scalb
instead of scalbn, but scalbf and scalbl are probably not used

> > generic code fixes:
> > int32_t -> uint32_t conversion (can be subtle, so testing is needed)
> > += 1, -= 1 -> ++, --
> > TWO52, twom1000 vs tiny (renames where it makes sense)
> > remove overflow thresholds (sinh, cosh) when result overflows anyway?
> > sign bit checking convention (sqrt.c)
> > missing:
> > sqrtl
>
> Until we have a better one, couldn't we just use sqrt() as a first
> approximation then use Newton's method or a binary-search for the
> correct long double result? Or just return sqrt() since the only arch
> that doesn't have sqrtl asm yet is the one where ld==double.

yes that's a good idea

> > tgamma, tgammaf
>
> Well we have these but they're inaccurate for some inputs.

ah yes i added dummy versions, but they are not usable
for serious work

> > (long double bessel)
> > nextafterf on ld64
>
> Eh?

actually it's nexttowardf(float x, long double y)
it's not implemented when long double == double
it cannot be just a simple wrapper around nextafter or nextafterf

> > tgamma:
> > lanczos approx as in boost/math/special_functions and python/Modules/mathmodule.c
> > complex
> > optimizable creal cimag (libm.h macro for internal code?)
>
> This was done a long time ago.

good
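
to illustrate the nexttowardf point above: the direction has to be decided by
comparing x against y at full long double precision, while the step itself has
to be one float ulp. a rough sketch of that idea follows. the name
sketch_nexttowardf is made up for illustration and this is not the code in the
tree. it assumes an IEEE binary32 float and does not raise the overflow and
underflow exceptions that a real implementation should:

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* hypothetical name; illustrative sketch only, assumes IEEE binary32 float */
float sketch_nexttowardf(float x, long double y)
{
	uint32_t i;

	if (isnan(x) || isnan(y))
		return x + y;        /* propagate the nan */
	if (x == y)
		return (float)y;     /* equal: return y, keeps the sign of a zero y */
	if (x == 0) {
		/* smallest subnormal float carrying the sign of y */
		i = signbit(y) ? 0x80000001u : 0x00000001u;
	} else {
		memcpy(&i, &x, sizeof i);
		/* stepping away from zero increments the bit pattern,
		   stepping toward zero decrements it */
		if ((x < y) == !signbit(x))
			i++;
		else
			i--;
	}
	memcpy(&x, &i, sizeof x);
	/* note: overflow/underflow flags are not raised in this sketch */
	return x;
}
```

for example sketch_nexttowardf(1.0f, 1.0L + 0x1p-40L) has to step up to
1 + 2^-23, but (float)y == 1.0f in round-to-nearest, so nextafterf(x, (float)y)
would just return x, and nextafter(x, y) would step by a double ulp instead of
a float ulp.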