From: ticat_fp
To: musl@lists.openwall.com
Cc: lixing@loongson.cn, huajingyun@loongson.cn, wanghongliang@loongson.cn
Reply-To: musl@lists.openwall.com
Date: Tue, 23 Apr 2024 10:26:19 +0800
Message-Id: <20240423022619.1253464-1-fanpeng@loongson.cn>
X-Mailer: git-send-email 2.33.0
Subject: [musl] [PATCH] math: add LoongArch support for common APIs with inline assembly.

This covers ceil, copysign, fabs, floor, fma, fmax, fmin, llrint, lrint,
rint and sqrt, together with their float (f-suffixed) variants.
---
 src/math/loongarch64/ceil.c      | 25 +++++++++++++++++++++++++
 src/math/loongarch64/ceilf.c     | 25 +++++++++++++++++++++++++
 src/math/loongarch64/copysign.c  |  7 +++++++
 src/math/loongarch64/copysignf.c |  7 +++++++
 src/math/loongarch64/fabs.c      |  7 +++++++
 src/math/loongarch64/fabsf.c     |  7 +++++++
 src/math/loongarch64/floor.c     | 22 ++++++++++++++++++++++
 src/math/loongarch64/floorf.c    | 22 ++++++++++++++++++++++
 src/math/loongarch64/fma.c       |  7 +++++++
 src/math/loongarch64/fmaf.c      |  7 +++++++
 src/math/loongarch64/fmax.c      |  7 +++++++
 src/math/loongarch64/fmaxf.c     |  7 +++++++
 src/math/loongarch64/fmin.c      |  7 +++++++
 src/math/loongarch64/fminf.c     |  7 +++++++
 src/math/loongarch64/llrint.c    | 17 +++++++++++++++++
 src/math/loongarch64/llrintf.c   | 17 +++++++++++++++++
 src/math/loongarch64/lrint.c     | 17 +++++++++++++++++
 src/math/loongarch64/lrintf.c    | 17 +++++++++++++++++
 src/math/loongarch64/rint.c      |  7 +++++++
 src/math/loongarch64/rintf.c     |  7 +++++++
 src/math/loongarch64/sqrt.c      |  7 +++++++
 src/math/loongarch64/sqrtf.c     |  7 +++++++
 22 files changed, 260 insertions(+)
 create mode 100644 src/math/loongarch64/ceil.c
 create mode 100644 src/math/loongarch64/ceilf.c
 create mode 100644 src/math/loongarch64/copysign.c
 create mode 100644 src/math/loongarch64/copysignf.c
 create mode 100644 src/math/loongarch64/fabs.c
 create mode 100644 src/math/loongarch64/fabsf.c
 create mode 100644 src/math/loongarch64/floor.c
 create mode 100644 src/math/loongarch64/floorf.c
 create mode 100644 src/math/loongarch64/fma.c
 create mode 100644 src/math/loongarch64/fmaf.c
 create mode 100644 src/math/loongarch64/fmax.c
 create mode 100644 src/math/loongarch64/fmaxf.c
 create mode 100644 src/math/loongarch64/fmin.c
 create mode 100644 src/math/loongarch64/fminf.c
 create mode 100644 src/math/loongarch64/llrint.c
 create mode 100644 src/math/loongarch64/llrintf.c
 create mode 100644 src/math/loongarch64/lrint.c
 create mode 100644 src/math/loongarch64/lrintf.c
 create mode 100644 src/math/loongarch64/rint.c
 create mode 100644 src/math/loongarch64/rintf.c
 create mode 100644 src/math/loongarch64/sqrt.c
 create mode 100644 src/math/loongarch64/sqrtf.c

diff --git a/src/math/loongarch64/ceil.c b/src/math/loongarch64/ceil.c
new file mode 100644
index 00000000..95781f4b
--- /dev/null
+++ b/src/math/loongarch64/ceil.c
@@ -0,0 +1,25 @@
+#include <math.h>
+#include <stdint.h>
+
+double ceil(double x)
+{
+	int32_t old;
+	int32_t new;
+	int32_t tmp1;
+	int32_t tmp2;
+
+	__asm__ __volatile__(
+		"movfcsr2gr %[old], $r0 \n\t"
+		"li.d %[tmp1], 0x200 \n\t"
+		"or %[new], %[old], %[tmp1] \n\t"
+		"li.d %[tmp2], 0xfffffeff \n\t"
+		"and %[new], %[new], %[tmp2] \n\t"
+		"movgr2fcsr $r0, %[new] \n\t"
+		"frint.d %[result], %[orig_x] \n\t"
+		"movgr2fcsr $r0, %[old] \n\t"
+		: [result] "=f"(x), [old] "=&r"(old), [new] "=&r"(new), [tmp1] "=&r"(tmp1), [tmp2] "=&r"(tmp2)
+		: [orig_x] "f"(x)
+		:);
+
+	return x;
+}
diff --git a/src/math/loongarch64/ceilf.c b/src/math/loongarch64/ceilf.c
new file mode 100644
index 00000000..03a2d933
--- /dev/null
+++ b/src/math/loongarch64/ceilf.c
@@ -0,0 +1,25 @@
+#include <math.h>
+#include <stdint.h>
+
+float ceilf(float x)
+{
+	int32_t old;
+	int32_t new;
+	int32_t tmp1;
+	int32_t tmp2;
+
+	__asm__ __volatile__(
+		"movfcsr2gr %[old], $r0 \n\t"
+		"li.d %[tmp1], 0x200 \n\t"
+		"or %[new], %[old], %[tmp1] \n\t"
+		"li.d %[tmp2], 0xfffffeff \n\t"
+		"and %[new], %[new], %[tmp2] \n\t"
+		"movgr2fcsr $r0, %[new] \n\t"
+		"frint.s %[result], %[orig_x] \n\t"
+		"movgr2fcsr $r0, %[old] \n\t"
+		: [result] "=f"(x), [old] "=&r"(old), [new] "=&r"(new), [tmp1] "=&r"(tmp1), [tmp2] "=&r"(tmp2)
+		: [orig_x] "f"(x)
+		:);
+
+	return x;
+}
diff --git a/src/math/loongarch64/copysign.c b/src/math/loongarch64/copysign.c
new file mode 100644
index 00000000..9e3b8de3
--- /dev/null
+++ b/src/math/loongarch64/copysign.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+double copysign(double x, double y)
+{
+	__asm__ __volatile__("fcopysign.d %0, %1, %2" : "=f"(x) : "f"(x), "f"(y));
+	return x;
+}
diff --git a/src/math/loongarch64/copysignf.c b/src/math/loongarch64/copysignf.c
new file mode 100644
index 00000000..98df4254
--- /dev/null
+++ b/src/math/loongarch64/copysignf.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+float copysignf(float x, float y)
+{
+	__asm__ __volatile__("fcopysign.s %0, %1, %2" : "=f"(x) : "f"(x), "f"(y));
+	return x;
+}
diff --git a/src/math/loongarch64/fabs.c b/src/math/loongarch64/fabs.c
new file mode 100644
index 00000000..3db57fb5
--- /dev/null
+++ b/src/math/loongarch64/fabs.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+double fabs(double x)
+{
+	__asm__ __volatile__("fabs.d %0, %1" : "=f"(x) : "f"(x));
+	return x;
+}
diff --git a/src/math/loongarch64/fabsf.c b/src/math/loongarch64/fabsf.c
new file mode 100644
index 00000000..e24201c5
--- /dev/null
+++ b/src/math/loongarch64/fabsf.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+float fabsf(float x)
+{
+	__asm__ __volatile__("fabs.s %0, %1" : "=f"(x) : "f"(x));
+	return x;
+}
diff --git a/src/math/loongarch64/floor.c b/src/math/loongarch64/floor.c
new file mode 100644
index 00000000..7aead2a3
--- /dev/null
+++ b/src/math/loongarch64/floor.c
@@ -0,0 +1,22 @@
+#include <math.h>
+#include <stdint.h>
+
+double floor(double x)
+{
+	int32_t old;
+	int32_t new;
+	int32_t tmp1;
+
+	__asm__ __volatile__(
+		"movfcsr2gr %[old], $r0 \n\t"
+		"li.d %[tmp1], 0x300 \n\t"
+		"or %[new], %[old], %[tmp1] \n\t"
+		"movgr2fcsr $r0, %[new] \n\t"
+		"frint.d %[result], %[orig_x] \n\t"
+		"movgr2fcsr $r0, %[old] \n\t"
+		: [result] "=f"(x), [old] "=&r"(old), [tmp1] "=&r"(tmp1), [new] "=&r"(new)
+		: [orig_x] "f"(x)
+		:);
+
+	return x;
+}
diff --git a/src/math/loongarch64/floorf.c b/src/math/loongarch64/floorf.c
new file mode 100644
index 00000000..772d15eb
--- /dev/null
+++ b/src/math/loongarch64/floorf.c
@@ -0,0 +1,22 @@
+#include <math.h>
+#include <stdint.h>
+
+float floorf(float x)
+{
+	int32_t old;
+	int32_t new;
+	int32_t tmp1;
+
+	__asm__ __volatile__(
+		"movfcsr2gr %[old], $r0 \n\t"
+		"li.d %[tmp1], 0x300 \n\t"
+		"or %[new], %[old], %[tmp1] \n\t"
+		"movgr2fcsr $r0, %[new] \n\t"
+		"frint.s %[result], %[orig_x] \n\t"
+		"movgr2fcsr $r0, %[old] \n\t"
+		: [result] "=f"(x), [old] "=&r"(old), [tmp1] "=&r"(tmp1), [new] "=&r"(new)
+		: [orig_x] "f"(x)
+		:);
+
+	return x;
+}
diff --git a/src/math/loongarch64/fma.c b/src/math/loongarch64/fma.c
new file mode 100644
index 00000000..0b6a3f23
--- /dev/null
+++ b/src/math/loongarch64/fma.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+double fma(double x, double y, double z)
+{
+	__asm__ __volatile__("fmadd.d %0, %1, %2, %3" : "=f"(x) : "f"(x), "f"(y), "f"(z));
+	return x;
+}
diff --git a/src/math/loongarch64/fmaf.c b/src/math/loongarch64/fmaf.c
new file mode 100644
index 00000000..77a8363b
--- /dev/null
+++ b/src/math/loongarch64/fmaf.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+float fmaf(float x, float y, float z)
+{
+	__asm__ __volatile__("fmadd.s %0, %1, %2, %3" : "=f"(x) : "f"(x), "f"(y), "f"(z));
+	return x;
+}
diff --git a/src/math/loongarch64/fmax.c b/src/math/loongarch64/fmax.c
new file mode 100644
index 00000000..2d091877
--- /dev/null
+++ b/src/math/loongarch64/fmax.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+double fmax(double x, double y)
+{
+	__asm__ __volatile__("fmax.d %0, %1, %2" : "=f"(x) : "f"(x), "f"(y));
+	return x;
+}
diff --git a/src/math/loongarch64/fmaxf.c b/src/math/loongarch64/fmaxf.c
new file mode 100644
index 00000000..1106d47c
--- /dev/null
+++ b/src/math/loongarch64/fmaxf.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+float fmaxf(float x, float y)
+{
+	__asm__ __volatile__("fmax.s %0, %1, %2" : "=f"(x) : "f"(x), "f"(y));
+	return x;
+}
diff --git a/src/math/loongarch64/fmin.c b/src/math/loongarch64/fmin.c
new file mode 100644
index 00000000..9c44ce87
--- /dev/null
+++ b/src/math/loongarch64/fmin.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+double fmin(double x, double y)
+{
+	__asm__ __volatile__("fmin.d %0, %1, %2" : "=f"(x) : "f"(x), "f"(y));
+	return x;
+}
diff --git a/src/math/loongarch64/fminf.c b/src/math/loongarch64/fminf.c
new file mode 100644
index 00000000..94a0fa45
--- /dev/null
+++ b/src/math/loongarch64/fminf.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+float fminf(float x, float y)
+{
+	__asm__ __volatile__("fmin.s %0, %1, %2" : "=f"(x) : "f"(x), "f"(y));
+	return x;
+}
diff --git a/src/math/loongarch64/llrint.c b/src/math/loongarch64/llrint.c
new file mode 100644
index 00000000..766222d3
--- /dev/null
+++ b/src/math/loongarch64/llrint.c
@@ -0,0 +1,17 @@
+#include <math.h>
+#include <stdint.h>
+
+long long llrint(double x)
+{
+	long long r;
+
+	__asm__ __volatile__(
+		"frint.d %[x], %[orig_x] \n\t"
+		"ftintrz.l.d %[x], %[x] \n\t"
+		"movfr2gr.d %[result], %[x] \n\t"
+		: [result] "=r"(r), [x] "+f"(x)
+		: [orig_x] "f"(x)
+		:);
+
+	return r;
+}
diff --git a/src/math/loongarch64/llrintf.c b/src/math/loongarch64/llrintf.c
new file mode 100644
index 00000000..f5b9dd9f
--- /dev/null
+++ b/src/math/loongarch64/llrintf.c
@@ -0,0 +1,17 @@
+#include <math.h>
+#include <stdint.h>
+
+long long llrintf(float x)
+{
+	long long r;
+
+	__asm__ __volatile__(
+		"frint.s %[x], %[orig_x] \n\t"
+		"ftintrz.l.s %[x], %[x] \n\t"
+		"movfr2gr.d %[result], %[x] \n\t"
+		: [result] "=r"(r), [x] "+f"(x)
+		: [orig_x] "f"(x)
+		:);
+
+	return r;
+}
diff --git a/src/math/loongarch64/lrint.c b/src/math/loongarch64/lrint.c
new file mode 100644
index 00000000..d82239d1
--- /dev/null
+++ b/src/math/loongarch64/lrint.c
@@ -0,0 +1,17 @@
+#include <math.h>
+#include <stdint.h>
+
+long lrint(double x)
+{
+	long r;
+
+	__asm__ __volatile__(
+		"frint.d %[x], %[orig_x] \n\t"
+		"ftintrz.l.d %[x], %[x] \n\t"
+		"movfr2gr.d %[result], %[x] \n\t"
+		: [result] "=r"(r), [x] "+f"(x)
+		: [orig_x] "f"(x)
+		:);
+
+	return r;
+}
diff --git a/src/math/loongarch64/lrintf.c b/src/math/loongarch64/lrintf.c
new file mode 100644
index 00000000..b30872e9
--- /dev/null
+++ b/src/math/loongarch64/lrintf.c
@@ -0,0 +1,17 @@
+#include <math.h>
+#include <stdint.h>
+
+long lrintf(float x)
+{
+	long r;
+
+	__asm__ __volatile__(
+		"frint.s %[x], %[orig_x] \n\t"
+		"ftintrz.l.s %[x], %[x] \n\t"
+		"movfr2gr.d %[result], %[x] \n\t"
+		: [result] "=r"(r), [x] "+f"(x)
+		: [orig_x] "f"(x)
+		:);
+
+	return r;
+}
diff --git a/src/math/loongarch64/rint.c b/src/math/loongarch64/rint.c
new file mode 100644
index 00000000..862cea8c
--- /dev/null
+++ b/src/math/loongarch64/rint.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+double rint(double x)
+{
+	__asm__ __volatile__("frint.d %0, %1" : "=f"(x) : "f"(x));
+	return x;
+}
diff --git a/src/math/loongarch64/rintf.c b/src/math/loongarch64/rintf.c
new file mode 100644
index 00000000..79ac216b
--- /dev/null
+++ b/src/math/loongarch64/rintf.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+float rintf(float x)
+{
+	__asm__ __volatile__("frint.s %0, %1" : "=f"(x) : "f"(x));
+	return x;
+}
diff --git a/src/math/loongarch64/sqrt.c b/src/math/loongarch64/sqrt.c
new file mode 100644
index 00000000..a70e20e9
--- /dev/null
+++ b/src/math/loongarch64/sqrt.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+double sqrt(double x)
+{
+	__asm__ __volatile__("fsqrt.d %0, %1" : "=f"(x) : "f"(x));
+	return x;
+}
diff --git a/src/math/loongarch64/sqrtf.c b/src/math/loongarch64/sqrtf.c
new file mode 100644
index 00000000..796609b0
--- /dev/null
+++ b/src/math/loongarch64/sqrtf.c
@@ -0,0 +1,7 @@
+#include <math.h>
+
+float sqrtf(float x)
+{
+	__asm__ __volatile__("fsqrt.s %0, %1" : "=f"(x) : "f"(x));
+	return x;
+}
-- 
2.33.0