From: Rich Felker <dalias@aerifal.cx>
To: Kim Walisch <kim.walisch@gmail.com>
Cc: musl@lists.openwall.com
Subject: Re: musl libc, memcpy
Date: Wed, 1 Aug 2012 00:27:22 -0400
Message-ID: <20120801042722.GB544@brightrain.aerifal.cx>
In-Reply-To: <CAE_DJ1328EfKQp6t33bq0k+9Hbo0Fvu6dhvO2OOePcx2xa3QeQ@mail.gmail.com>
On Tue, Jul 31, 2012 at 12:19:13AM +0200, Kim Walisch wrote:
> > I'd like to know what block sizes you were looking at, because for
> > memcpy that makes all the difference in the world:
>
> I copied blocks of 16 kilobytes.
OK, that sounds (off-hand) like a good size for testing.
> > I don't think this is necessary or useful. If we want better
> > performance on these archs, a tiny asm file that does almost nothing
> > but "rep movsd" is known to be the fastest solution on 32-bit x86, and
> > is at least the second-fastest on 64-bit, with the faster solutions
> > not being available on all cpus. On pretty much all other archs,
> > unaligned access is illegal.
>
> My point is that your code uses byte (char) copying for unaligned data
> but on x86 this is not necessary. Using a simple macro in your memcpy
> implementation that always uses the size_t copying path for x86 speeds
> up your memcpy implementation by about 500% for unaligned data on my
> PC (Intel i5-670 3.46GHz, gcc-4.7, SL Linux 6.2 x86_64). You can also
> use a separate asm file with "rep movsd" for x86, I guess it will run
> at the same speed as my macro solution.
I'm attaching a (possibly buggy; not heavily tested) rep-movsd-based
version. I'd be interested in hearing how it performs.
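For readers without the attachment, here is a minimal sketch of what a rep-movsd-style copy can look like. It is not the attached file: the function name sketch_memcpy_rep is hypothetical, and the sketch uses GCC extended inline asm from C rather than a standalone asm file. It assumes the direction flag is clear on entry (as the x86 ABIs require) and falls back to a plain byte loop on other architectures. In AT&T syntax, "rep movsd" is spelled "rep movsl".

```c
#include <stddef.h>

/* Hypothetical name; an illustration only, not musl's implementation. */
void *sketch_memcpy_rep(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
#if defined(__x86_64__) || defined(__i386__)
    size_t words = n / 4;  /* count of 32-bit units for rep movsl */
    size_t tail = n % 4;   /* 0..3 leftover bytes for rep movsb */

    /* Copies `words` 4-byte units; advances d and s, counts words down to 0.
       Assumes the direction flag is clear, per the ABI. */
    __asm__ volatile ("rep movsl"
                      : "+D"(d), "+S"(s), "+c"(words)
                      :
                      : "memory");
    /* Copies the remaining bytes, continuing from the updated pointers. */
    __asm__ volatile ("rep movsb"
                      : "+D"(d), "+S"(s), "+c"(tail)
                      :
                      : "memory");
#else
    /* Portable fallback for archs where this asm does not apply. */
    while (n--) *d++ = *s++;
#endif
    return dest;
}
```

No alignment handling is done here, since x86 permits unaligned accesses; whether that is the fastest choice on a given cpu is exactly what would need measuring.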
> Another interesting thing to mention is that gcc-4.5 vectorizes the 3
> copying loops of your memcpy implementation if it is compiled with the
> -ftree-vectorize flag (add -ftree-vectorizer-verbose=1 for
> vectorization report) but not if simply compiled with -O2 or -O3. With
Odd, the gcc manual claims -ftree-vectorize is included in -O3:
http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
> $ gcc -O2 -ftree-vectorize -ftree-vectorizer-verbose=1 memcpy.c main.c -o memcpy
>
> memcpy.c:25: note: created 1 versioning for alias checks.
> memcpy.c:25: note: LOOP VECTORIZED.
> memcpy.c:21: note: created 1 versioning for alias checks.
> memcpy.c:21: note: LOOP VECTORIZED.
> memcpy.c:9: note: vectorized 2 loops in function.
From the sound of those notes, I suspect duplicate code (and wasteful
conditional branches) are getting generated to handle the possibility
that the source and destination pointers might alias. I think this
means it would be a good idea to add proper use of "restrict" pointers
(per C99 requirements) in musl sooner rather than later; it might both
reduce code size and improve performance.
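As a rough illustration of the "restrict" idea, the sketch below shows a byte-copy loop whose pointers are restrict-qualified in the C99 style. The function name sketch_copy_restrict is hypothetical and this is not musl's memcpy. The qualifiers promise the compiler that the two regions do not overlap, which in principle lets a vectorizer skip the runtime alias check and the duplicated loop version; whether a particular gcc release actually does so is an assumption that would need checking against its vectorizer report.

```c
#include <stddef.h>

/* Hypothetical name; illustrates C99 restrict, not musl's actual code. */
void *sketch_copy_restrict(void *restrict dest, const void *restrict src,
                           size_t n)
{
    /* restrict: the caller guarantees dest and src do not overlap, so
       stores through d cannot change what is later read through s. */
    unsigned char *restrict d = dest;
    const unsigned char *restrict s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dest;
}
```

Calling this with overlapping regions is undefined behavior, which matches the contract memcpy already has.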
Rich