From: Rich Felker <dalias@aerifal.cx>
To: Harald Becker <ralda@gmx.de>
Cc: musl@lists.openwall.com
Subject: Re: ARM memcpy post-0.9.12-release thread
Date: Tue, 30 Jul 2013 23:23:15 -0400
Message-ID: <20130731032315.GA221@brightrain.aerifal.cx>
In-Reply-To: <20130731051347.7d8340ac@ralda.gmx.de>
On Wed, Jul 31, 2013 at 05:13:47AM +0200, Harald Becker wrote:
> Hi Rich !
>
> 30-07-2013 22:26 Rich Felker <dalias@aerifal.cx>:
>
> > Some rough times (128k copy repeated 10000 times):
> >
> > Aligned case:
> > Current C code: 1.2s
> > My best-attempt C code: 0.75s
> > My best-attempt inline asm: 0.57s
> > Bionic asm: 0.63s
> > Bionic asm without prefetch: 0.57s
> >
> > Misaligned case:
> > Current C code: 4.7s
> > My best-attempt inline asm: 2.9s
> > Bionic asm: 1.1s
>
> I'd like to throw in a question, as my two cents on this topic:
>
> Don't modern C compilers try to align all data types? If so,
> then in most cases aligned data structures are used, and copying
> them around usually hits the aligned case. The

Yes, but these are small anyway, and the compiler will generate
inline code to copy them with ldmia/stmia.

> misaligned case happens mostly when working with strings, and
> those are usually short. Can't we consider other misaligned cases
> a fault of the programmer or code generator? If so, I would
> prefer the best-attempt inline asm versions of the code, or even
> the best-attempt C code, over arch-specific asm versions ... and add

Part of the problem discussed on #musl was that I had to be really
careful with "best attempt C", since GCC will _generate_ calls to
memcpy for some code, even when -ffreestanding is used. The folks on
#gcc claim this is not a bug. So, if compilers deem themselves at
liberty to make this kind of transformation, they could turn the copy
loop inside memcpy into a call to memcpy itself. Any C implementation
of memcpy that's not intentionally crippled (e.g. using volatile temps
and 20x slower than it should be) is then a time bomb that might blow
up on us with the next GCC version...

This makes asm (either inline or standalone) a lot more appealing for
memcpy than it otherwise would be.

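To illustrate the concern, here is a minimal sketch, not code from musl. The name copy_bytes is hypothetical. A plain byte-copy loop is the kind of code a compiler may recognize as a memcpy idiom and replace with a call to memcpy; inside memcpy itself that would mean the function calling itself. The empty asm statement with a "memory" clobber is a GCC/Clang extension used here as a compiler barrier, on the assumption that a loop containing it will not be treated as a recognizable copy idiom. It also blocks most other optimization of the loop, which is the "intentionally crippled" trade-off.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only. */
void *copy_bytes(void *restrict dest, const void *restrict src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n--) {
        *d++ = *s++;
        /* Empty asm with a "memory" clobber (GCC/Clang extension).
         * Assumption: keeping this inside the loop body stops the
         * compiler from rewriting the loop as a memcpy call. It also
         * prevents vectorization, so this version is slow by design. */
        __asm__ __volatile__("" ::: "memory");
    }
    return dest;
}
```

GCC also documents the -ftree-loop-distribute-patterns optimization, which can be disabled with -fno-tree-loop-distribute-patterns; whether that covers every case where the compiler emits a memcpy call is not something this sketch establishes.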
> a warning about the performance loss on misaligned data to the
> documentation, giving a rough percentage for this loss.

You'd prefer video processing to be 4 to 5 times slower? Video
typically consists of single-byte samples (planar YUV), and operations
like cropping to a non-multiple-of-4 size, motion compensation, etc.
all involve misaligned memcpy. The same goes for image transformations
in GIMP, image blitting in web browsers (not necessarily aligned to
multiple-of-4 boundaries unless you're using 32bpp), etc...

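As a rough sketch of why cropping produces misaligned copies, consider one 8-bit plane stored row by row. The function crop_plane and its parameter names are hypothetical, not taken from any particular video library. Each output row is one memcpy whose source address is offset by x0 bytes, so unless x0, the stride, and the width all happen to be multiples of 4, the source and destination pointers are not word-aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy a w-by-h window starting at (x0, y0)
 * out of an 8-bit plane whose rows are src_stride bytes apart.
 * The destination is packed, with rows exactly w bytes apart. */
void crop_plane(uint8_t *dst, const uint8_t *src, size_t src_stride,
                size_t x0, size_t y0, size_t w, size_t h)
{
    for (size_t y = 0; y < h; y++) {
        /* The source pointer moves by x0 bytes within the row and the
         * destination by w bytes per row, so either can land on any
         * byte boundary. */
        memcpy(dst + y * w, src + (y0 + y) * src_stride + x0, w);
    }
}
```

With x0 = 3 and w = 3, for example, every row copy starts 3 bytes into a source row and writes to a destination offset that is a multiple of 3, so almost none of the calls see 4-byte-aligned pointers.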
Rich