From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/13540
Path: news.gmane.org!.POSTED!not-for-mail
From: Szabolcs Nagy
Newsgroups: gmane.linux.lib.musl.general
Subject: [PATCH 00/18] math updates
Date: Sat, 8 Dec 2018 13:50:10 +0100
Message-ID: <20181208125009.GY21289@port70.net>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: blaine.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: blaine.gmane.org 1544273300 20146 195.159.176.226 (8 Dec 2018 12:48:20 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Sat, 8 Dec 2018 12:48:20 +0000 (UTC)
User-Agent: Mutt/1.10.1 (2018-07-13)
To: musl@lists.openwall.com
Original-X-From: musl-return-13556-gllmg-musl=m.gmane.org@lists.openwall.com Sat Dec 08 13:48:16 2018
Return-path:
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
	by blaine.gmane.org with smtp (Exim 4.84_2)
	(envelope-from )
	id 1gVc1n-00058v-5G
	for gllmg-musl@m.gmane.org; Sat, 08 Dec 2018 13:48:15 +0100
Original-Received: (qmail 1662 invoked by uid 550); 8 Dec 2018 12:50:23 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-ID:
Original-Received: (qmail 1603 invoked from network); 8 Dec 2018 12:50:22 -0000
Mail-Followup-To: musl@lists.openwall.com
Content-Disposition: inline
Xref: news.gmane.org gmane.linux.lib.musl.general:13540
Archived-At:

add new code from

  https://github.com/ARM-software/optimized-routines

with small modifications:

- remove all code paths related to errno handling (WANT_ERRNO cases)
- reformat with clang-format to linux style instead of gnu style.
- drop non-default configs (polynomial and table size settings)
- remove SNaN support (but keep the code so adding it back is easy)
- use __FP_FAST_FMA feature test with __builtin_fma
- some macros got renamed: barriers, unlikely, HIDDEN
- kept TOINT_INTRINSICS code paths, but it's never set (requires
  __builtin_round and __builtin_lround support as single insn)
- error handling is split up across several translation units.
- data layout declarations are split into several _data.h headers

todo:

- fp_barrier implementation for various targets
- musl does not enable fma contraction, new code would be better with it
- musl disables fabs etc inlining, using builtins would help
- FENV_ACCESS pragma should be set in some top level header in principle
  (like features.h)
- use the new helper functions/macros in existing code.

overall libc.so code size increase on x86_64: +8540 bytes

(i'll send the patches as attachments in two parts, because
they are too big for one mail)

Szabolcs Nagy (18):
  define FP_FAST_FMA* when fma* can be inlined
  math: move complex math out of libm.h
  math: add asuint, asuint64, asfloat and asdouble
  math: remove sun copyright from libm.h
  math: add fp_arch.h with fp_barrier and fp_force_eval
  math: add eval_as_float and eval_as_double
  math: add single precision error handling functions
  math: add double precision error handling functions
  math: add macros for static branch prediction hints
  math: add configuration macros
  math: new logf
  math: new log2f
  math: new exp2f and expf
  math: new powf
  math: new log
  math: new log2
  math: new exp and exp2
  math: new pow

 arch/aarch64/fp_arch.h      |  25 ++
 arch/generic/fp_arch.h      |   0
 include/math.h              |  12 +
 src/complex/__cexp.c        |   2 +-
 src/complex/__cexpf.c       |   2 +-
 src/complex/cabs.c          |   2 +-
 src/complex/cabsf.c         |   2 +-
 src/complex/cabsl.c         |   2 +-
 src/complex/cacos.c         |   2 +-
 src/complex/cacosf.c        |   2 +-
 src/complex/cacosh.c        |   2 +-
 src/complex/cacoshf.c       |   2 +-
 src/complex/cacoshl.c       |   2 +-
 src/complex/cacosl.c        |   2 +-
 src/complex/carg.c          |   2 +-
 src/complex/cargf.c         |   2 +-
 src/complex/cargl.c         |   2 +-
 src/complex/casin.c         |   2 +-
 src/complex/casinf.c        |   2 +-
 src/complex/casinh.c        |   2 +-
 src/complex/casinhf.c       |   2 +-
 src/complex/casinhl.c       |   2 +-
 src/complex/casinl.c        |   2 +-
 src/complex/catan.c         |   2 +-
 src/complex/catanf.c        |   2 +-
 src/complex/catanh.c        |   2 +-
 src/complex/catanhf.c       |   2 +-
 src/complex/catanhl.c       |   2 +-
 src/complex/catanl.c        |   2 +-
 src/complex/ccos.c          |   2 +-
 src/complex/ccosf.c         |   2 +-
 src/complex/ccosh.c         |   2 +-
 src/complex/ccoshf.c        |   2 +-
 src/complex/ccoshl.c        |   2 +-
 src/complex/ccosl.c         |   2 +-
 src/complex/cexp.c          |   2 +-
 src/complex/cexpf.c         |   2 +-
 src/complex/cexpl.c         |   2 +-
 src/complex/cimag.c         |   2 +-
 src/complex/cimagf.c        |   2 +-
 src/complex/cimagl.c        |   2 +-
 src/complex/clog.c          |   2 +-
 src/complex/clogf.c         |   2 +-
 src/complex/clogl.c         |   2 +-
 src/complex/conj.c          |   2 +-
 src/complex/conjf.c         |   2 +-
 src/complex/conjl.c         |   2 +-
 src/complex/cpow.c          |   2 +-
 src/complex/cpowf.c         |   2 +-
 src/complex/cpowl.c         |   2 +-
 src/complex/cproj.c         |   2 +-
 src/complex/cprojf.c        |   2 +-
 src/complex/cprojl.c        |   2 +-
 src/complex/csin.c          |   2 +-
 src/complex/csinf.c         |   2 +-
 src/complex/csinh.c         |   2 +-
 src/complex/csinhf.c        |   2 +-
 src/complex/csinhl.c        |   2 +-
 src/complex/csinl.c         |   2 +-
 src/complex/csqrt.c         |   2 +-
 src/complex/csqrtf.c        |   2 +-
 src/complex/csqrtl.c        |   2 +-
 src/complex/ctan.c          |   2 +-
 src/complex/ctanf.c         |   2 +-
 src/complex/ctanh.c         |   2 +-
 src/complex/ctanhf.c        |   2 +-
 src/complex/ctanhl.c        |   2 +-
 src/complex/ctanl.c         |   2 +-
 src/internal/complex_impl.h |  22 ++
 src/internal/libm.h         | 223 ++++++++-----
 src/math/__math_divzero.c   |   6 +
 src/math/__math_divzerof.c  |   6 +
 src/math/__math_invalid.c   |   6 +
 src/math/__math_invalidf.c  |   6 +
 src/math/__math_oflow.c     |   6 +
 src/math/__math_oflowf.c    |   6 +
 src/math/__math_uflow.c     |   6 +
 src/math/__math_uflowf.c    |   6 +
 src/math/__math_xflow.c     |   6 +
 src/math/__math_xflowf.c    |   6 +
 src/math/exp.c              | 240 +++++++-------
 src/math/exp2.c             | 466 ++++++---------------------
 src/math/exp2f.c            | 165 ++++------
 src/math/exp2f_data.c       |  35 ++
 src/math/exp2f_data.h       |  23 ++
 src/math/exp_data.c         | 182 +++++++++++
 src/math/exp_data.h         |  26 ++
 src/math/expf.c             | 133 ++++----
 src/math/log.c              | 202 ++++++------
 src/math/log2.c             | 212 ++++++------
 src/math/log2_data.c        | 201 ++++++++++++
 src/math/log2_data.h        |  28 ++
 src/math/log2f.c            | 114 ++++---
 src/math/log2f_data.c       |  33 ++
 src/math/log2f_data.h       |  19 ++
 src/math/log_data.c         | 328 +++++++++++++++++++
 src/math/log_data.h         |  28 ++
 src/math/logf.c             | 110 +++----
 src/math/logf_data.c        |  33 ++
 src/math/logf_data.h        |  20 ++
 src/math/pow.c              | 621 ++++++++++++++++++------------------
 src/math/pow_data.c         | 180 +++++++++++
 src/math/pow_data.h         |  22 ++
 src/math/powf.c             | 406 ++++++++++-------------
 src/math/powf_data.c        |  34 ++
 src/math/powf_data.h        |  26 ++
 106 files changed, 2693 insertions(+), 1666 deletions(-)
 create mode 100644 arch/aarch64/fp_arch.h
 create mode 100644 arch/generic/fp_arch.h
 create mode 100644 src/internal/complex_impl.h
 create mode 100644 src/math/__math_divzero.c
 create mode 100644 src/math/__math_divzerof.c
 create mode 100644 src/math/__math_invalid.c
 create mode 100644 src/math/__math_invalidf.c
 create mode 100644 src/math/__math_oflow.c
 create mode 100644 src/math/__math_oflowf.c
 create mode 100644 src/math/__math_uflow.c
 create mode 100644 src/math/__math_uflowf.c
 create mode 100644 src/math/__math_xflow.c
 create mode 100644 src/math/__math_xflowf.c
 create mode 100644 src/math/exp2f_data.c
 create mode 100644 src/math/exp2f_data.h
 create mode 100644 src/math/exp_data.c
 create mode 100644 src/math/exp_data.h
 create mode 100644 src/math/log2_data.c
 create mode 100644 src/math/log2_data.h
 create mode 100644 src/math/log2f_data.c
 create mode 100644 src/math/log2f_data.h
 create mode 100644 src/math/log_data.c
 create mode 100644 src/math/log_data.h
 create mode 100644 src/math/logf_data.c
 create mode 100644 src/math/logf_data.h
 create mode 100644 src/math/pow_data.c
 create mode 100644 src/math/pow_data.h
 create mode 100644 src/math/powf_data.c
 create mode 100644 src/math/powf_data.h

-- 
2.19.1
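
illustration (not part of the series): the patch "math: add asuint,
asuint64, asfloat and asdouble" adds helpers for reinterpreting the bits
of a floating-point value as an integer and back. the sketch below shows
what such helpers could look like. only the four names come from the
series; the function bodies, the memcpy approach and the assumption of
IEEE 754 binary32/binary64 layouts are mine and may differ from what the
patch actually does.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical implementations; only the names match the patch subject.
   memcpy is a portable way to reinterpret bits without violating
   strict aliasing, and compilers usually reduce it to a register move. */

/* Return the bit pattern of a float as a 32-bit unsigned integer. */
static inline uint32_t asuint(float f)
{
	uint32_t u;
	memcpy(&u, &f, sizeof u);
	return u;
}

/* Return the float whose bit pattern is the given 32-bit integer. */
static inline float asfloat(uint32_t u)
{
	float f;
	memcpy(&f, &u, sizeof f);
	return f;
}

/* Return the bit pattern of a double as a 64-bit unsigned integer. */
static inline uint64_t asuint64(double d)
{
	uint64_t u;
	memcpy(&u, &d, sizeof u);
	return u;
}

/* Return the double whose bit pattern is the given 64-bit integer. */
static inline double asdouble(uint64_t u)
{
	double d;
	memcpy(&d, &u, sizeof d);
	return d;
}
```

on a target with IEEE 754 formats, asuint(1.0f) is 0x3f800000 and
asdouble(0x4000000000000000) is 2.0. helpers like these let math code
inspect sign, exponent and mantissa bits with integer operations instead
of floating-point comparisons.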