From: Denys Vlasenko <vda.linux@googlemail.com>
To: Rich Felker <dalias@libc.org>
Cc: Denys Vlasenko <dvlasenk@redhat.com>, musl@lists.openwall.com
Subject: [musl] [PATCH] x86/memset: avoid performing final store twice
Date: Sun, 4 Oct 2020 00:32:09 +0200
Message-ID: <20201003223209.10307-1-vda.linux@googlemail.com>
From: Denys Vlasenko <dvlasenk@redhat.com>
For the case where NBYTES is not very short:
To handle the tail alignment, the code performs a potentially
misaligned word store to fill the final 8 bytes of the buffer.
This is done even if the buffer's end is aligned.
The code then fills the rest of the buffer, which is now a multiple
of 8 bytes, with NBYTES / 8 aligned word stores.
However, this means that if NBYTES *was* divisible by 8,
the last word is stored a second time.
This patch decrements the byte count before dividing it by 8,
which saves one store in the "NBYTES is divisible by 8" case
and changes nothing in all other cases.
(The numbers above are for x86_64; the i386 code uses 4-byte
words and shifts the count by 2 instead of 3.)
CC: Rich Felker <dalias@libc.org>
CC: musl@lists.openwall.com
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
---
src/string/i386/memset.s | 7 ++++---
src/string/x86_64/memset.s | 2 +-
2 files changed, 5 insertions(+), 4 deletions(-)
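The arithmetic described in the commit message can be checked with a
small C model. This is only a sketch under stated assumptions, not the
musl code: model_fill() and model_mismatches() are hypothetical names,
the buffer start is assumed to be aligned, NBYTES is assumed to be at
least one word, and each word store is modeled with an 8-byte memset().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { WORD = 8 };  /* word size in the x86_64 variant; the i386 variant uses 4 */

/* Hypothetical model, not the musl code: fill buf[0..n) with byte c using
 * one tail word store plus `count` word stores from the start of the buffer.
 * Returns the total number of word stores issued. Requires n >= WORD. */
static size_t model_fill(unsigned char *buf, size_t n, int c, int patched)
{
    assert(n >= WORD);
    size_t count = (patched ? n - 1 : n) / WORD;  /* the patch adds the "- 1" */
    size_t stores = 1;

    memset(buf + n - WORD, c, WORD);         /* tail store, possibly misaligned */
    for (size_t i = 0; i < count; i++, stores++)
        memset(buf + i * WORD, c, WORD);     /* stands in for rep stos */
    return stores;
}

/* Returns the number of failed checks over all sizes in [WORD, 200]:
 * a byte left unfilled by the patched variant, a write past the end,
 * or a store count that differs from the expected saving. */
static int model_mismatches(void)
{
    unsigned char buf[256];
    int bad = 0;

    for (size_t n = WORD; n <= 200; n++) {
        memset(buf, 0, sizeof buf);
        size_t before = model_fill(buf, n, 0xAA, 0);
        memset(buf, 0, sizeof buf);
        size_t after = model_fill(buf, n, 0xAA, 1);

        for (size_t i = 0; i < n; i++)
            bad += buf[i] != 0xAA;                 /* every byte must be filled */
        bad += buf[n] != 0;                        /* nothing written past the end */
        bad += after != before - (n % WORD == 0);  /* one store saved iff WORD divides n */
    }
    return bad;
}
```

Calling model_mismatches() from a test harness should return 0 if the
reasoning holds: the patched count still covers the whole buffer, and
the store count drops by one only when NBYTES is a multiple of the
word size.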
diff --git a/src/string/i386/memset.s b/src/string/i386/memset.s
index d00422c4..b1c5c2f8 100644
--- a/src/string/i386/memset.s
+++ b/src/string/i386/memset.s
@@ -47,7 +47,7 @@ memset:
mov %edx,(-1-2-4-8-8)(%eax,%ecx)
mov %edx,(-1-2-4-8-4)(%eax,%ecx)
-1: ret
+1: ret
2: movzbl 8(%esp),%eax
mov %edi,12(%esp)
@@ -57,13 +57,14 @@ memset:
mov %eax,-4(%edi,%ecx)
jnz 2f
-1: shr $2, %ecx
+1: dec %ecx
+ shr $2, %ecx
rep
stosl
mov 4(%esp),%eax
mov 12(%esp),%edi
ret
-
+
2: xor %edx,%edx
sub %edi,%edx
and $15,%edx
diff --git a/src/string/x86_64/memset.s b/src/string/x86_64/memset.s
index 2d3f5e52..85bb686c 100644
--- a/src/string/x86_64/memset.s
+++ b/src/string/x86_64/memset.s
@@ -53,7 +53,7 @@ memset:
2: test $15,%edi
mov %rdi,%r8
mov %rax,-8(%rdi,%rdx)
- mov %rdx,%rcx
+ lea -1(%rdx),%rcx
jnz 2f
1: shr $3,%rcx
--
2.25.0
Thread overview: 3+ messages
2020-10-03 22:32 Denys Vlasenko [this message]
2020-10-11 0:25 ` Rich Felker
2020-10-12 12:18 ` Denys Vlasenko