mailing list of musl libc
From: Denys Vlasenko <vda.linux@googlemail.com>
To: Rich Felker <dalias@libc.org>
Cc: Denys Vlasenko <vda.linux@googlemail.com>, musl@lists.openwall.com
Subject: [PATCH 2/3] i386/memset: do not fetch fill char from memory again
Date: Mon, 12 Oct 2015 20:30:33 +0200
Message-ID: <1444674635-25421-2-git-send-email-vda.linux@googlemail.com>
In-Reply-To: <1444674635-25421-1-git-send-email-vda.linux@googlemail.com>

 shl $16,%edx
 mov 8(%esp),%dl
 mov 8(%esp),%dh

The above code has two partial-register merge stalls, and it goes to the
load unit to fetch the data. I don't know which is worse. Neither is pleasant.

Replace them with a single IMUL by 0x10001. It has ~3 cycle latency, but
no stalls. Move it a bit up to hide its latency.
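
To illustrate why the multiply can stand in for the shift-and-reload
sequence, here is a minimal C sketch of the arithmetic. It is not code
from musl: the function names are hypothetical, and it assumes that on
entry %dl and %dh both hold the fill byte and that the upper 16 bits of
%edx are zero. Whether the second condition holds depends on code outside
the hunk shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Models the removed sequence (hypothetical helper, not musl code):
 *   shl $16,%edx ; mov 8(%esp),%dl ; mov 8(%esp),%dh
 * where c stands for the fill byte reloaded from the stack. */
static uint32_t fill_by_shift_and_reload(uint32_t edx, uint8_t c)
{
    edx <<= 16;                                     /* shl $16,%edx    */
    edx = (edx & 0xffffff00u) | c;                  /* mov 8(%esp),%dl */
    edx = (edx & 0xffff00ffu) | ((uint32_t)c << 8); /* mov 8(%esp),%dh */
    return edx;
}

/* Models the replacement: imul $0x10001,%edx.
 * x * 0x10001 == x + (x << 16), taken modulo 2^32. */
static uint32_t fill_by_imul(uint32_t edx)
{
    return edx * 0x10001u;
}

/* Compares both forms for every fill byte and returns how many were
 * checked. ASSUMPTION: the upper 16 bits of edx are zero on entry. */
static int check_all_fill_bytes(void)
{
    int checked = 0;
    for (uint32_t c = 0; c < 256; c++) {
        uint32_t edx = c | (c << 8);  /* dl == dh == c, upper half zero */
        assert(fill_by_shift_and_reload(edx, (uint8_t)c) == fill_by_imul(edx));
        assert(fill_by_imul(edx) == c * 0x01010101u);
        checked++;
    }
    return checked;
}
```

Calling check_all_fill_bytes() from a test program exercises all 256 fill
values. The equivalence relies on the zero upper half: in this model,
fill_by_imul(0x00014141) yields 0x41424141 rather than 0x41414141, because
the multiply adds the stale upper bits instead of discarding them as the
shift does.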

   text	   data	    bss	    dec	    hex	filename
    182	      0	      0	    182	     b6	memset1.o
    177	      0	      0	    177	     b1	memset2.o

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
CC: Rich Felker <dalias@libc.org>
CC: musl@lists.openwall.com
---
 src/string/i386/memset.s | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/string/i386/memset.s b/src/string/i386/memset.s
index d6118c7..cd13f41 100644
--- a/src/string/i386/memset.s
+++ b/src/string/i386/memset.s
@@ -19,13 +19,10 @@ memset:
 
 	mov %dx,1(%eax)
 	mov %dx,(-1-2)(%eax,%ecx)
+	imul $0x10001,%edx
 	cmp $6,%ecx
 	jbe 1f
 
-	shl $16,%edx
-	mov 8(%esp),%dl
-	mov 8(%esp),%dh
-
 	mov %edx,(1+2)(%eax)
 	mov %edx,(-1-2-4)(%eax,%ecx)
 	cmp $14,%ecx
-- 
1.8.1.4



Thread overview: 5+ messages
2015-10-12 18:30 [PATCH 1/3] i386/memset: argument load code need not be separate Denys Vlasenko
2015-10-12 18:30 ` Denys Vlasenko [this message]
2015-11-05  2:54   ` [PATCH 2/3] i386/memset: do not fetch fill char from memory again Rich Felker
2015-10-12 18:30 ` [PATCH 3/3] i386/memset: move byte-extending IMUL up, drop one insn Denys Vlasenko
2015-10-14 23:20 ` [PATCH 1/3] i386/memset: argument load code need not be separate Rich Felker
