From: Rich Felker <dalias@aerifal.cx>
To: musl@lists.openwall.com
Subject: Re: crypt_blowfish integration, optimization
Date: Thu, 9 Aug 2012 18:32:59 -0400
Message-ID: <20120809223258.GW27715@brightrain.aerifal.cx>
In-Reply-To: <20120809222103.GA29365@openwall.com>

On Fri, Aug 10, 2012 at 02:21:03AM +0400, Solar Designer wrote:
> On Thu, Aug 09, 2012 at 05:46:54PM -0400, Rich Felker wrote:
> > I've taken this version and made some minimum changes based on my
> > version, mainly for integration with musl where I'm testing it. I also
> > think we've reached the final word on loop unrolling:
> >
> > Just For Fun, I tried replacing your unrolled BF_ROUND loop with a for
> > loop and compiling with -O3 on gcc 4.6.3. After noticing the
> > performance numbers were coming out near-identical, and that the .o
> > sizes were mysteriously identical, I decided, Just For Fun, to
> > disassemble both versions with objdump and diff them. They are
> > identical. That is, modern gcc generates byte-for-byte identical code
> > with -O3 for the manually unrolled loop and the for loop.
>
> What about -O2?
>
> -O3 is probably not what will be used for most musl builds, is it?
>
> Hmm, for me "gcc -Q -O2 --help=optimizers" and ditto for -O3 both show
> "disabled" for -funroll-loops. Why was the loop unrolled for you?

Not sure. I've found -Q --help=optimizers completely unreliable in the
past though. It only reports minimal differences between -Os, -O2, and
-O3, and trying to start with -O3 and reproduce -Os by just changing
the options that are different does not give effects even remotely
similar to -Os.

> Did you also have -funroll-loops specified explicitly? If so, does this
> happen for normal musl builds? I guess not?

No, I did not explicitly specify it. At present, -Os is default for
static libc and -O3 is default for shared libc. The reason for this
discrepancy is that -fPIC generates a lot of size and speed bloat at
each function call, so the inlining from -O3 comes at reduced cost (it
eliminates wasteful prologue, compensating for some of the size
increase) and much greater performance benefits (again, from killing
prologue).

I've been thinking of making -O3 default across the board rather than
having different defaults for the two, which are ugly from a
build-system perspective, but some people are still against it even
though it's easy to override.

> As discussed, the problem with avoiding such hand-unrolls is that the
> compiler doesn't know just which loops are most important to unroll.

My experience has been that it tends to make good decisions overall,
and that if somebody is using -Os, they really want smallest size, not
performance.
> BTW, what speeds are you getting on your Atom?

I was clocking 0.573 seconds for one run with the 2^12 iterations on
one test, and about 4 million cycles per run with 2^4 iterations. This
is with my version of the code (essentially the same as yours;
compiled at -O3).
> How does this compare to
> the original crypt_blowfish-1.2 with asm code (both on 32-bit)?

I'll have to get the code and try it... The asm doesn't seem to have
ever been present in the code sent to the list.

Rich