mailing list of musl libc
From: John Spencer <maillist-musl@barfooze.de>
To: musl@lists.openwall.com
Subject: Re: Re: musl libc, memcpy
Date: Sat, 04 Aug 2012 01:22:10 +0200
Message-ID: <501C5D22.1000405@barfooze.de>
In-Reply-To: <20120801061904.GD544@brightrain.aerifal.cx>

i've set up a performance test ( https://github.com/rofl0r/memcpy-test )

these are the average results for i386 (100 runs on big sizes, 10000 on 
smaller ones)

                asm version      current c-version
size: 3         172 ticks        199 ticks
size: 4         167 ticks        167 ticks
size: 5         197 ticks        186 ticks
size: 8         187 ticks        186 ticks
size: 15        195 ticks        196 ticks
size: 16        186 ticks        185 ticks
size: 23        202 ticks        199 ticks
size: 24        193 ticks        188 ticks
size: 25        205 ticks        212 ticks
size: 31        199 ticks        198 ticks
size: 32        195 ticks        192 ticks
size: 33        204 ticks        192 ticks
size: 63        213 ticks        255 ticks
size: 64        219 ticks        226 ticks
size: 65        208 ticks        238 ticks
size: 95        220 ticks        247 ticks
size: 96        214 ticks        239 ticks
size: 97        217 ticks        243 ticks
size: 127       233 ticks        261 ticks
size: 128       225 ticks        254 ticks
size: 129       229 ticks        266 ticks
size: 159       242 ticks        279 ticks
size: 160       235 ticks        268 ticks
size: 161       238 ticks        273 ticks
size: 191       255 ticks        288 ticks
size: 192       264 ticks        288 ticks
size: 193       248 ticks        287 ticks
size: 255       279 ticks        323 ticks
size: 256       266 ticks        313 ticks
size: 257       269 ticks        319 ticks
size: 383       332 ticks        391 ticks
size: 384       308 ticks        370 ticks
size: 385       307 ticks        384 ticks
size: 511       345 ticks        439 ticks
size: 512       315 ticks        434 ticks
size: 513       318 ticks        439 ticks
size: 767       370 ticks        571 ticks
size: 768       330 ticks        555 ticks
size: 769       334 ticks        566 ticks
size: 1023      382 ticks        740 ticks
size: 1024      349 ticks        727 ticks
size: 1025      358 ticks        694 ticks
size: 1535      423 ticks        936 ticks
size: 1536      393 ticks        930 ticks
size: 1537      400 ticks        929 ticks
size: 2048      448 ticks        1176 ticks
size: 4096      822 ticks        2404 ticks
size: 8192      3136 ticks       8310 ticks
size: 16384     6481 ticks       9780 ticks
size: 32768     11645 ticks      19060 ticks
size: 65536     29700 ticks      52051 ticks
size: 131072    307029 ticks     310875 ticks
size: 262144    608502 ticks     617698 ticks
size: 524288    1222116 ticks    1244987 ticks
size: 1048576   2500207 ticks    2712991 ticks
size: 2097152   5279016 ticks    5566665 ticks
size: 4194304   10586333 ticks   10849110 ticks
size: 8388608   21961730 ticks   22473953 ticks
size: 16777216  45966254 ticks   47159258 ticks
size: 33554432  92434464 ticks   95873868 ticks
size: 67108864  189858530 ticks  190456107 ticks

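it looks as if the asm version is up to nearly three times as fast
(822 vs. 2404 ticks at 4096 bytes), depending on the size of data
copied. for the smallest sizes and from 128 KiB upward there is little
difference between the two.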
now waiting for the x86_64 version (if you could provide a working 64-bit
rdtsc inline asm function, i'll gladly take that as well)
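
for reference, a minimal sketch of what such a function could look like
with GCC/Clang inline asm. the names rdtsc64 and tsc_combine are made up
for illustration and are not taken from memcpy-test. RDTSC always puts
the low 32 bits in EAX and the high 32 bits in EDX, so reading the two
halves separately works on both i386 and x86_64, whereas the "=A"
constraint means the EDX:EAX pair only on i386.

```c
#include <stdint.h>

/* Join the two 32-bit halves that RDTSC returns into one 64-bit tick count. */
static inline uint64_t tsc_combine(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Read the time-stamp counter (x86 only, GCC/Clang inline asm syntax).
 * The low half comes back in EAX ("=a"), the high half in EDX ("=d"),
 * on both i386 and x86_64. */
static inline uint64_t rdtsc64(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return tsc_combine(hi, lo);
}
```

usage would be along the lines of t0 = rdtsc64(); memcpy(...);
ticks = rdtsc64() - t0; note that RDTSC is not a serializing
instruction, so very short measurements can be skewed by out-of-order
execution.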

someone on ##asm suggested that movaps with xmm regs was fastest in his 
tests.
would be interesting to test such a version as well.
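
a rough sketch of what a movaps-based variant might look like, written
with SSE intrinsics rather than hand-written asm. this is an assumption
about the idea, not the code from ##asm; the name memcpy_movaps is
hypothetical. _mm_load_ps/_mm_store_ps are the aligned load/store
intrinsics that correspond to movaps, which faults on addresses that
are not 16-byte aligned, so the sketch only takes the SSE path when
both pointers are aligned. on i386 it needs -msse.

```c
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE intrinsics; needs -msse on i386 */

/* Hypothetical movaps-style copy, for benchmarking only. */
static void *memcpy_movaps(void *restrict dest, const void *restrict src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Aligned 16-byte loads/stores are only legal when both
     * addresses are multiples of 16. */
    if ((((uintptr_t)d | (uintptr_t)s) & 15) == 0) {
        for (; n >= 16; n -= 16, d += 16, s += 16) {
            __m128 v = _mm_load_ps((const float *)s);  /* aligned load  */
            _mm_store_ps((float *)d, v);               /* aligned store */
        }
    }

    /* Remaining tail, or the whole buffer if misaligned: byte copy. */
    while (n--)
        *d++ = *s++;

    return dest;
}
```

a serious version would first copy bytes until the destination is
aligned and then pick an aligned or unaligned load depending on the
source, instead of falling back to a byte loop for the whole buffer.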

On 08/01/2012 08:19 AM, Rich Felker wrote:
> On Wed, Aug 01, 2012 at 01:40:11AM -0400, Rich Felker wrote:
>> On Wed, Aug 01, 2012 at 12:27:22AM -0400, Rich Felker wrote:
>>> I'm attaching a (possibly buggy; not heavily tested) rep-movsd-based
>>> version. I'd be interested in hearing how it performs.
>> And here is the attachment...
> And here's a version that might be faster; reportedly, rep movsd works
> better when the destination address is aligned.
>
> Rich



