From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <musl-return-15199-ml=inbox.vuxu.org@lists.openwall.com>
X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on inbox.vuxu.org
X-Spam-Level: 
X-Spam-Status: No, score=-3.0 required=5.0 tests=HEADER_FROM_DIFFERENT_DOMAINS,
	MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED,RCVD_IN_MSPIKE_H3,
	RCVD_IN_MSPIKE_WL autolearn=ham autolearn_force=no version=3.4.2
Received: from mother.openwall.net (mother.openwall.net [195.42.179.200])
	by inbox.vuxu.org (OpenSMTPD) with SMTP id e6631c6c
	for <ml@inbox.vuxu.org>;
	Sat, 18 Jan 2020 20:14:50 +0000 (UTC)
Received: (qmail 30341 invoked by uid 550); 18 Jan 2020 20:14:48 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Reply-To: musl@lists.openwall.com
Received: (qmail 30318 invoked from network); 18 Jan 2020 20:14:48 -0000
Date: Sat, 18 Jan 2020 21:14:35 +0100
From: Szabolcs Nagy <nsz@port70.net>
To: paul zimmermann <Paul.Zimmermann@inria.fr>
Cc: Szabolcs.Nagy@arm.com, nd@arm.com, jens.gustedt@inria.fr,
	Vincent.Lefevre@ens-lyon.fr, musl@lists.openwall.com
Message-ID: <20200118201435.GH23985@port70.net>
Mail-Followup-To: paul zimmermann <Paul.Zimmermann@inria.fr>,
	Szabolcs.Nagy@arm.com, nd@arm.com, jens.gustedt@inria.fr,
	Vincent.Lefevre@ens-lyon.fr, musl@lists.openwall.com
References: <mwlfqit6tx.fsf@tomate.loria.fr>
 <5cf6c0cf-6988-daac-8b74-43bb0e2c625f@arm.com>
 <mwlfqfqp08.fsf@tomate.loria.fr>
 <20200110173023.GS23985@port70.net>
 <mwv9pj2m8z.fsf@tomate.loria.fr>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <mwv9pj2m8z.fsf@tomate.loria.fr>
User-Agent: Mutt/1.10.1 (2018-07-13)
Subject: Re: [musl] Re: musl mathematical functions

* paul zimmermann <Paul.Zimmermann@inria.fr> [2020-01-10 19:35:08 +0100]:
> > i think libm functions are extremely rarely used with
> > non-nearest rounding mode so i think
> > 
> >   NR accuracy >> DR accuracy >> NR symmetry >> NR speed
> >   >> DR symmetry >> DR speed
> > 
> > where NR is nearest rounding and DR is directed rounding.
> 
> yes this makes sense.
> 
> > and by accuracy i just mean correct behavirour with respect
> > to exceptions and results (i.e. small ulp errors).
> 
> note that if directed rounding is used to implement interval
> arithmetic, it is very important to have the return value on
> the right side with respect to the exact value (at the cost
> of a few ulps of accuracy).

getting on the right side would regress the performance
for all users for something theoretical (existing math
libraries don't support it). at least i think it would
require accessing fenv (changing the rounding mode) or
other expensive operation in the hot code path.
(expensive rounding mode change is one of the reasons
interval arithmetics is not practical, lack of compiler
support for fenv access is another, the instruction set
and language semantics should be designed differently
for it to be practical.)

i'd recommend using an arbitrary precision library or a
correctly rounded math library (e.g. tr 18661-4 reserves
separate cr prefixed symbols for that), not libc functions
for interval arithmetic.

large ulp errors can usually be fixed without significant
penalty for nearest rounding. e.g. i'm considering a fix
for trig arg reduction with two additional branches (i
think this can be simplified with more code changes and
the cost can be eliminated on targets with rounding mode
independent rounding instructions)

diff --git a/src/math/__rem_pio2.c b/src/math/__rem_pio2.c
index d403f81c..80fd72c8 100644
--- a/src/math/__rem_pio2.c
+++ b/src/math/__rem_pio2.c
@@ -36,6 +36,7 @@
  */
 static const double
 toint   = 1.5/EPS,
+pio4    = 0x1.921fb54442d18p-1,
 invpio2 = 6.36619772367581382433e-01, /* 0x3FE45F30, 0x6DC9C883 */
 pio2_1  = 1.57079632673412561417e+00, /* 0x3FF921FB, 0x54400000 */
 pio2_1t = 6.07710050650619224932e-11, /* 0x3DD0B461, 0x1A626331 */
@@ -122,6 +123,17 @@ medium:
 		n = (int32_t)fn;
 		r = x - fn*pio2_1;
 		w = fn*pio2_1t;  /* 1st round, good to 85 bits */
+		if (predict_false(r - w < -pio4)) {
+			n--;
+			fn--;
+			r = x - fn*pio2_1;
+			w = fn*pio2_1t;
+		} else if (predict_false(r - w > pio4)) {
+			n++;
+			fn++;
+			r = x - fn*pio2_1;
+			w = fn*pio2_1t;
+		}
 		y[0] = r - w;
 		u.f = y[0];
 		ey = u.i>>52 & 0x7ff;