From: Denys Vlasenko
To: Rich Felker
Cc: Denys Vlasenko, musl@lists.openwall.com
Subject: [PATCH 2/3] i386/memset: do not fetch fill char from memory again
Date: Mon, 12 Oct 2015 20:30:33 +0200
Message-ID: <1444674635-25421-2-git-send-email-vda.linux@googlemail.com>
In-Reply-To: <1444674635-25421-1-git-send-email-vda.linux@googlemail.com>
References: <1444674635-25421-1-git-send-email-vda.linux@googlemail.com>
Reply-To: musl@lists.openwall.com

	shl $16,%edx
	mov 8(%esp),%dl
	mov 8(%esp),%dh

The above code has two register merge stalls, and it goes to the load
unit to fetch the data. I don't know which is worse. Neither is pleasant.

Replace them with IMUL. It has ~3 cycle latency, but no stalls.
Move it up a bit to hide its latency.

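The equivalence the replacement relies on can be sketched in C. This is only an illustrative model, not part of the patch: the function names are hypothetical, and it assumes that on entry %edx holds the fill byte replicated into its low two bytes with the upper 16 bits zero (the surrounding context stores %dx as a 16-bit pattern, but the code that sets up %edx is not shown in this hunk).

```c
#include <stdint.h>

/* Hypothetical model of the removed sequence:
 *   shl $16,%edx ; mov 8(%esp),%dl ; mov 8(%esp),%dh
 * 'c' stands for the fill byte reloaded from the stack. */
uint32_t widen_old(uint32_t edx, uint8_t c)
{
    edx <<= 16;                                /* shl $16,%edx    */
    edx = (edx & 0xFFFFFF00u) | c;             /* mov ...,%dl     */
    edx = (edx & 0xFFFF00FFu) | ((uint32_t)c << 8); /* mov ...,%dh */
    return edx;
}

/* Hypothetical model of the replacement:
 *   imul $0x10001,%edx
 * Unsigned multiplication wraps modulo 2^32, matching the low
 * 32 bits that imul leaves in the register. */
uint32_t widen_new(uint32_t edx)
{
    return edx * UINT32_C(0x10001);
}
```

Under the stated assumption, edx * 0x10001 equals edx + (edx << 16), which copies the low 16-bit pattern into the upper half without touching memory or writing a partial register.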
   text	   data	    bss	    dec	    hex	filename
    182	      0	      0	    182	     b6	memset1.o
    177	      0	      0	    177	     b1	memset2.o

Signed-off-by: Denys Vlasenko
CC: Rich Felker
CC: musl@lists.openwall.com
---
 src/string/i386/memset.s | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/string/i386/memset.s b/src/string/i386/memset.s
index d6118c7..cd13f41 100644
--- a/src/string/i386/memset.s
+++ b/src/string/i386/memset.s
@@ -19,13 +19,10 @@ memset:
 
 	mov %dx,1(%eax)
 	mov %dx,(-1-2)(%eax,%ecx)
+	imul $0x10001,%edx
 	cmp $6,%ecx
 	jbe 1f
 
-	shl $16,%edx
-	mov 8(%esp),%dl
-	mov 8(%esp),%dh
-
 	mov %edx,(1+2)(%eax)
 	mov %edx,(-1-2-4)(%eax,%ecx)
 	cmp $14,%ecx
-- 
1.8.1.4