From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/15088
Path: news.gmane.org!.POSTED.blaine.gmane.org!not-for-mail
From: Alexander Monakov
Newsgroups: gmane.linux.lib.musl.general
Subject: [PATCH] math: move x86_64 fabs, fabsf to C with inline asm
Date: Sun, 5 Jan 2020 19:36:39 +0300
Message-ID: <20200105163639.25963-1-amonakov@ispras.ru>
References:
Reply-To: musl@lists.openwall.com
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="------------2.11.0"
Injection-Info: blaine.gmane.org; posting-host="blaine.gmane.org:195.159.176.226"; logging-data="84561"; mail-complaints-to="usenet@blaine.gmane.org"
Cc: Alexander Monakov
To: musl@lists.openwall.com
Original-X-From: musl-return-15104-gllmg-musl=m.gmane.org@lists.openwall.com Sun Jan 05 17:37:12 2020
Return-path:
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200]) by blaine.gmane.org with smtp (Exim 4.89) (envelope-from ) id 1io8ts-000Lu8-2V for gllmg-musl@m.gmane.org; Sun, 05 Jan 2020 17:37:12 +0100
Original-Received: (qmail 14306 invoked by uid 550); 5 Jan 2020 16:37:10 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-ID:
Original-Received: (qmail 14271 invoked from network); 5 Jan 2020 16:37:09 -0000
X-Mailer: git-send-email 2.11.0
In-Reply-To:
Xref: news.gmane.org gmane.linux.lib.musl.general:15088
Archived-At:

This is a multi-part message in MIME format.
--------------2.11.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit

---

Questions:

Why are there amd64-specific fabs implementations in the first place?
(Only) because GCC generated poor code for the generic C version?

Do annotations for mask manipulation in the patch help? Any way to make them
less ambiguous?
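
For reference, the mask manipulation that the annotations describe can be
sketched in portable C. This is only an illustration under stated
assumptions, not code from the patch: the names fabs_sketch and fabsf_sketch
are hypothetical, IEEE 754 binary64/binary32 layouts are assumed, and the
sketch works on a single scalar integer, whereas the SSE instructions act on
every lane of the 128-bit register (only the low lane carries the argument).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: double and float use the usual 64-bit and 32-bit layouts. */
static_assert(sizeof(double) == sizeof(uint64_t), "double must be 64 bits");
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical name, chosen so it does not clash with libm's fabs. */
double fabs_sketch(double x)
{
	uint64_t bits;
	uint64_t t = ~(uint64_t)0;      /* t = ~0   (all bits set, as pcmpeqd does) */
	t >>= 1;                        /* t >>= 1  (logical shift: 0x7fffffffffffffff) */
	memcpy(&bits, &x, sizeof bits); /* reinterpret the double as an integer */
	bits &= t;                      /* x &= t   (clears only the sign bit) */
	memcpy(&x, &bits, sizeof x);
	return x;
}

/* Hypothetical name; same idea with a 32-bit mask. */
float fabsf_sketch(float x)
{
	uint32_t bits;
	uint32_t t = ~(uint32_t)0;      /* t = ~0 */
	t >>= 1;                        /* t >>= 1  (logical shift: 0x7fffffff) */
	memcpy(&bits, &x, sizeof bits);
	bits &= t;                      /* x &= t */
	memcpy(&x, &bits, sizeof x);
	return x;
}
```

In the sketch the shift is unambiguous because t has an unsigned integer
type. In the patch, "t >>= 1" stands for a logical right shift applied per
64-bit lane (psrlq) or per 32-bit lane (psrld), which is the part a reader
may find ambiguous when t is declared as double or float.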
 src/math/x86_64/fabs.c  | 10 ++++++++++
 src/math/x86_64/fabs.s  |  9 ---------
 src/math/x86_64/fabsf.c | 10 ++++++++++
 src/math/x86_64/fabsf.s |  7 -------
 4 files changed, 20 insertions(+), 16 deletions(-)
 create mode 100644 src/math/x86_64/fabs.c
 delete mode 100644 src/math/x86_64/fabs.s
 create mode 100644 src/math/x86_64/fabsf.c
 delete mode 100644 src/math/x86_64/fabsf.s

--------------2.11.0
Content-Type: text/x-patch; name="0001-math-move-x86_64-fabs-fabsf-to-C-with-inline-asm.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: inline; filename="0001-math-move-x86_64-fabs-fabsf-to-C-with-inline-asm.patch"

diff --git a/src/math/x86_64/fabs.c b/src/math/x86_64/fabs.c
new file mode 100644
index 00000000..16562477
--- /dev/null
+++ b/src/math/x86_64/fabs.c
@@ -0,0 +1,10 @@
+#include <math.h>
+
+double fabs(double x)
+{
+	double t;
+	__asm__ ("pcmpeqd %0, %0" : "=x"(t));          // t = ~0
+	__asm__ ("psrlq   $1, %0" : "+x"(t));          // t >>= 1
+	__asm__ ("andps   %1, %0" : "+x"(x) : "x"(t)); // x &= t
+	return x;
+}
diff --git a/src/math/x86_64/fabs.s b/src/math/x86_64/fabs.s
deleted file mode 100644
index 5715005e..00000000
--- a/src/math/x86_64/fabs.s
+++ /dev/null
@@ -1,9 +0,0 @@
-.global fabs
-.type fabs,@function
-fabs:
-	xor %eax,%eax
-	dec %rax
-	shr %rax
-	movq %rax,%xmm1
-	andpd %xmm1,%xmm0
-	ret
diff --git a/src/math/x86_64/fabsf.c b/src/math/x86_64/fabsf.c
new file mode 100644
index 00000000..36ea7481
--- /dev/null
+++ b/src/math/x86_64/fabsf.c
@@ -0,0 +1,10 @@
+#include <math.h>
+
+float fabsf(float x)
+{
+	float t;
+	__asm__ ("pcmpeqd %0, %0" : "=x"(t));          // t = ~0
+	__asm__ ("psrld   $1, %0" : "+x"(t));          // t >>= 1
+	__asm__ ("andps   %1, %0" : "+x"(x) : "x"(t)); // x &= t
+	return x;
+}
diff --git a/src/math/x86_64/fabsf.s b/src/math/x86_64/fabsf.s
deleted file mode 100644
index 501a1f17..00000000
--- a/src/math/x86_64/fabsf.s
+++ /dev/null
@@ -1,7 +0,0 @@
-.global fabsf
-.type fabsf,@function
-fabsf:
-	mov $0x7fffffff,%eax
-	movq %rax,%xmm1
-	andps %xmm1,%xmm0
-	ret

--------------2.11.0--