mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: [PATCH 2/3] i386/memset: do not fetch fill char from memory again
Date: Wed, 4 Nov 2015 21:54:33 -0500
Message-ID: <20151105025433.GW8645@brightrain.aerifal.cx>
In-Reply-To: <1444674635-25421-2-git-send-email-vda.linux@googlemail.com>

On Mon, Oct 12, 2015 at 08:30:33PM +0200, Denys Vlasenko wrote:
>  shl $16,%edx
>  mov 8(%esp),%dl
>  mov 8(%esp),%dh
> 
> The above code has two register merge stalls, and it goes to load unit
> to fetch the data. I don't know what's worse. Both are not pleasant.

Do you have measurements to back this?

> Replace them with IMUL. It has ~3 cycle latency, but no stalls.

While we probably don't need to care about ancient chips like the 486 or
original Pentium for performance purposes (although maybe Quark?), I'd
rather not do anything that would make performance catastrophically
worse on them unless it actually has a significant (measurable) benefit
for modern systems. The code as is was written to be non-hostile to
systems where imul has some nontrivial cost.
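
For readers following the thread, the two ways of replicating the fill
byte across a 32-bit word can be sketched in C roughly as below. This is
only an illustration, not code from musl or from the patch: the function
names splat_shift and splat_mul are hypothetical, and what instructions a
compiler emits for either one (and how fast they run on a given CPU) is
not something the sketch can show.

```c
#include <stdint.h>

/* Hypothetical helper: replicate the byte using shifts and ORs only,
 * i.e. without a multiply. */
static uint32_t splat_shift(unsigned char c)
{
    uint32_t x = c;   /* 0x000000cc */
    x |= x << 8;      /* 0x0000cccc */
    x |= x << 16;     /* 0xcccccccc */
    return x;
}

/* Hypothetical helper: replicate the byte with one multiply by
 * 0x01010101, the arithmetic that an IMUL-based version relies on.
 * Each byte lane receives c * 1, so no carries cross between lanes. */
static uint32_t splat_mul(unsigned char c)
{
    return (uint32_t)c * UINT32_C(0x01010101);
}
```

Both helpers return the same value for every input byte; for example,
an input of 0xab gives 0xabababab either way. The disagreement in the
thread is only about which form is cheaper on which hardware.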

> Move it a bit up to hide its latency.

The movement puts it before the branch which exits early, which is
probably a huge performance loss on old CPUs.

Of course even better than evidence that your code helps a lot on
modern CPUs would be evidence that it doesn't hurt at all on old ones.
Anyone have a 486 or 586 lying around to run timings on? I suppose I
could see if my old K6 still boots...

Rich


Thread overview:
2015-10-12 18:30 [PATCH 1/3] i386/memset: argument load code need not be separate Denys Vlasenko
2015-10-12 18:30 ` [PATCH 2/3] i386/memset: do not fetch fill char from memory again Denys Vlasenko
2015-11-05  2:54   ` Rich Felker [this message]
2015-10-12 18:30 ` [PATCH 3/3] i386/memset: move byte-extending IMUL up, drop one insn Denys Vlasenko
2015-10-14 23:20 ` [PATCH 1/3] i386/memset: argument load code need not be separate Rich Felker
