mailing list of musl libc
From: Rich Felker <dalias@aerifal.cx>
To: musl@lists.openwall.com
Cc: Andre Renaud <andre@bluewatersys.com>
Subject: Re: ARM memcpy post-0.9.12-release thread
Date: Fri, 2 Aug 2013 16:41:47 -0400
Message-ID: <20130802204146.GO221@brightrain.aerifal.cx>
In-Reply-To: <20130731022631.GA6655@brightrain.aerifal.cx>

Andre, do you have any input on this? (Cc'ing)

Rich


On Tue, Jul 30, 2013 at 10:26:31PM -0400, Rich Felker wrote:
> Hi all (especially Andre),
> 
> I've been doing some experimenting with ARM memcpy, and I have not
> found any way to beat the Bionic asm file for misaligned copies. The
> best I could do with simple inline asm (reading multi-words and
> writing byte-at-a-time or vice versa) improved the performance nearly
> 40% compared to musl's current code, but it was still worse than half
> the speed of the Bionic asm.
> 
> For the aligned case, however, as I've said before, the Bionic code
> runs 10% slower for me than the C-with-inline-asm I posted to the
> list. Commenting out the prefetch code in the Bionic version brings
> the performance up to the same as my version.
> 
> I also found that the Bionic code was mysteriously crashing on the
> real system I test on (it worked on my toolchain with qemu). On
> further investigation, the test system's toolchain had -mthumb (with
> thumb2) as the default; adding -marm made it work. Both ways the asm
> was being interpreted as arm; the problem was that the *calling* code
> being thumb broke it. The solution was adding .type memcpy,%function
> to the asm file. Without that, the linker cannot know that the symbol
> it's resolving is a function name and thus that it has to adjust the
> low bit of the relocated address as a flag for whether the code is arm
> or thumb. With that fix, the code now seems to work reliably.
> 
> Sizes so far:
> Current C code: 260 bytes
> My best-attempt inline asm: 352 bytes
> Bionic (with prefetch removed): 764 bytes
> 
> Obviously the Bionic code is a bit larger than the others and than I'd
> like it to be, but it looks really hard to trim it down without
> ruining performance for misaligned copies; roughly half of the asm
> covers the misaligned case, which is expensive because it needs three
> separate code paths, one for each way the copy can be off mod 4
> (1, 2, or 3 bytes).
> 
> One other issue we have to consider if we go with the Bionic code is
> that we'd need to add sub-arch asm dirs to use it. As-is, the code is
> hard-coded for little endian. It will shuffle the byte order badly
> when copying on a big endian machine.
> 
> Some rough times (128k copy repeated 10000 times):
> 
> Aligned case:
> Current C code: 1.2s
> My best-attempt C code: 0.75s
> My best-attempt inline asm: 0.57s
> Bionic asm: 0.63s
> Bionic asm without prefetch: 0.57s
> 
> Misaligned case:
> Current C code: 4.7s
> My best-attempt inline asm: 2.9s
> Bionic asm: 1.1s
> 
> Rich
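
The shift-and-merge technique that the quoted message alludes to for misaligned copies can be sketched in portable C. This is only an illustration, not the musl or Bionic code: `copy_shifted` is a hypothetical helper, it assumes the destination is word-aligned and the source starts 1, 2 or 3 bytes into an aligned word, and it relies on the GCC/Clang `__BYTE_ORDER__` macro. It folds the three offsets into one loop with variable shifts, where the asm described above uses a separate path per offset, and the `#if` shows why code hard-coded for little endian scrambles bytes on a big endian machine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration; not taken from musl or Bionic.
 * Writes nwords 32-bit words to the aligned dst. The logical source
 * starts off bytes (1, 2 or 3) into the aligned array src, so the loop
 * reads src[0] through src[nwords] inclusive (nwords + 1 words). */
static void copy_shifted(uint32_t *dst, const uint32_t *src,
                         unsigned off, size_t nwords)
{
	unsigned lo = 8 * off;   /* bits to drop from the current word */
	unsigned hi = 32 - lo;   /* bits to take from the next word */
	uint32_t cur = src[0];

	for (size_t i = 0; i < nwords; i++) {
		uint32_t next = src[i + 1];
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
		/* Lowest-addressed byte is the least significant. */
		dst[i] = (cur >> lo) | (next << hi);
#else
		/* Big endian: the shift directions swap. */
		dst[i] = (cur << lo) | (next >> hi);
#endif
		cur = next;
	}
}
```

For example, with a five-word `src` array, `copy_shifted(dst, src, 1, 4)` fills `dst` with the 16 bytes that start one byte into `src`.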

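A timing loop along the lines of the rough numbers quoted above might look like the following. It is a sketch under assumptions: `time_copies` and `copy_fn` are hypothetical names, CPU time is measured with `clock()`, and misalignment is forced by offsetting only the source pointer, which may differ from how the figures in the message were obtained.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Same signature as memcpy, so any candidate can be plugged in. */
typedef void *(*copy_fn)(void *, const void *, size_t);

/* Hypothetical harness: returns CPU seconds spent on reps copies of
 * len bytes, or -1.0 on allocation failure or a wrong copy result.
 * src_off shifts the source start to force a misaligned source.
 * reps must be at least 1 so the final check sees copied data. */
static double time_copies(copy_fn fn, size_t len, int reps, size_t src_off)
{
	unsigned char *src = malloc(len + src_off);
	unsigned char *dst = calloc(len, 1);
	if (!src || !dst) {
		free(src);
		free(dst);
		return -1.0;
	}
	memset(src, 0x5a, len + src_off);

	clock_t t0 = clock();
	for (int i = 0; i < reps; i++)
		fn(dst, src + src_off, len);
	clock_t t1 = clock();

	/* Sanity check so a broken copy routine is not timed as a win. */
	int ok = memcmp(dst, src + src_off, len) == 0;
	free(src);
	free(dst);
	return ok ? (double)(t1 - t0) / CLOCKS_PER_SEC : -1.0;
}
```

Calling `time_copies(memcpy, 128 * 1024, 10000, 0)` and again with a source offset of 1 would reproduce the shape of the aligned and misaligned runs; absolute times depend on the machine.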

Thread overview: 9+ messages
2013-07-31  2:26 Rich Felker
2013-07-31  3:13 ` Harald Becker
2013-07-31  3:23   ` Rich Felker
2013-07-31  4:18     ` Harald Becker
2013-07-31  6:13       ` Rich Felker
2013-08-02 20:41 ` Rich Felker [this message]
2013-08-02 22:03   ` Andre Renaud
2013-08-03  0:01     ` Rich Felker
2013-08-05 21:24     ` Rich Felker