mailing list of musl libc
From: Andre McCurdy <armccurdy@gmail.com>
To: musl@lists.openwall.com
Subject: Re: building musl libc.so with gcc -flto
Date: Wed, 22 Apr 2015 22:34:40 -0700
Message-ID: <CAJ86T=WdsdSVsBjLqge9o3C6u+KK_XP0xrd0q1VwFBqfv_MarA@mail.gmail.com>
In-Reply-To: <20150423022309.GH6817@brightrain.aerifal.cx>

On Wed, Apr 22, 2015 at 7:23 PM, Rich Felker <dalias@libc.org> wrote:
> On Wed, Apr 22, 2015 at 03:48:52PM -0700, Andre McCurdy wrote:
>> Hi all,
>>
>> Below are some observations from building musl libc.so with gcc's -flto
>> (link time optimization) option.
>
> Interesting!
>
>> 1) With today's master (afbcac68), adding -flto to CFLAGS causes the
>> build to fail:
>>
>>  | `_dlstart_c' referenced in section `.text' of /tmp/cc8ceNIy.ltrans0.ltrans.o: defined in discarded section `.text' of src/ldso/dlstart.lo (symbol from plugin)
>>  | collect2: error: ld returned 1 exit status
>>  | make: *** [lib/libc.so] Error 1
>>
>> Reverting f1faa0e1 (make _dlstart_c function use hidden visibility)
>> seems to be a workaround.
>
> I think the problem is that LTO is garbage collecting "unused" symbols
> before it gets to the step of linking with asm for which there is no
> IR code, thereby losing anything that's only referenced from asm. A
> better workaround might be to define _dlstart_c with a different name
> as a non-hidden function (e.g. call it __dls1) and then make
> _dlstart_c a hidden alias for it via:
>
> __attribute__((__visibility__("hidden")))
> void _dlstart_c(size_t *, size_t *);
>
> weak_alias(__dls1, _dlstart_c);
>
> If you get a chance to try that, let me know if it works.
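
For reference, a minimal standalone sketch of the aliasing pattern suggested
above. The weak_alias macro is written out the way musl is believed to define
it in libc.h (an assumption, not quoted from this thread). The body of __dls1
and the seen_sp variable are placeholders for illustration only, and this is
not the exact patch that was tested.

```c
#include <stddef.h>

/* Assumed shape of musl's weak_alias macro, reproduced so the sketch
 * compiles on its own with gcc on an ELF target. */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

/* Hypothetical placeholder state, only here to make the call observable. */
static size_t *seen_sp;

/* The real body lives in a function with default (non-hidden) visibility. */
void __dls1(size_t *sp, size_t *dynv)
{
	(void)dynv;
	seen_sp = sp;
}

/* The name referenced from the asm entry point is declared hidden ... */
__attribute__((__visibility__("hidden")))
void _dlstart_c(size_t *, size_t *);

/* ... and then defined as a weak alias for the non-hidden function. */
weak_alias(__dls1, _dlstart_c);
```

Calling _dlstart_c() from the same object ends up in __dls1(). Whether the
alias survives LTO's symbol pruning is a separate question that the sketch
does not answer.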

That change does fix the build, but the resulting binary fails to run:

$ gdb ./lib/libc.so
...
(gdb) run
Starting program: /home/andre/.../lib/libc.so

Program received signal SIGILL, Illegal instruction.
0x56572ab8 in _dlstart ()
(gdb) disassemble
Dump of assembler code for function _dlstart:
   0x56572aa0 <+0>:    xor    %ebp,%ebp
   0x56572aa2 <+2>:    mov    %esp,%eax
   0x56572aa4 <+4>:    and    $0xfffffff0,%esp
   0x56572aa7 <+7>:    push   %eax
   0x56572aa8 <+8>:    push   %eax
   0x56572aa9 <+9>:    call   0x56572aae <_dlstart+14>
   0x56572aae <+14>:    addl   $0x7864a,(%esp)
   0x56572ab5 <+21>:    push   %eax
   0x56572ab6 <+22>:    call   0x56572ab7 <_dlstart+23>
   0x56572abb <+27>:    nop
   0x56572abc <+28>:    lea    0x0(%esi,%eiz,1),%esi
End of assembler dump.
(gdb)

> Another
> option might be adding -Wl,-u,_dlstart_c to LDFLAGS.

That change alone doesn't fix the build.

>> 2) With f1faa0e1 reverted, the build succeeds, but with a warning about
>> differing declarations for dummy_tsd and __pthread_tsd_main:
>>
>>  | src/thread/pthread_create.c:169:1: warning: type of '__pthread_tsd_main' does not match original declaration
>>  |  weak_alias(dummy_tsd, __pthread_tsd_main);
>>  |  ^
>>  | src/thread/pthread_key_create.c:4:7: note: previously declared here
>>  |  void *__pthread_tsd_main[PTHREAD_KEYS_MAX] = { 0 };
>>  |        ^
>
> This should be harmless but perhaps there's a better way it could be
> done.
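
A rough sketch of the pattern behind that warning, under some assumptions:
the dummy object is taken to be a one-element array (its size is not shown
in this thread), PTHREAD_KEYS_MAX is set to 128 purely for illustration, and
the helper functions and typedef are hypothetical names added so the two
sizes can be compared within one file.

```c
#include <stddef.h>

/* Assumed shape of musl's weak_alias macro (not quoted from this thread). */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

#define PTHREAD_KEYS_MAX 128 /* illustrative value only */

/* pthread_create.c side: a tiny fallback object, exported as a weak alias.
 * The array length of 1 is an assumption. */
static void *dummy_tsd[1] = { 0 };
weak_alias(dummy_tsd, __pthread_tsd_main);

/* pthread_key_create.c side defines, in its own translation unit:
 *     void *__pthread_tsd_main[PTHREAD_KEYS_MAX] = { 0 };
 * The typedef below only names that type for comparison. */
typedef void *full_tsd_t[PTHREAD_KEYS_MAX];

/* Hypothetical helpers: size of the object as each file declares it. */
size_t weak_tsd_size(void) { return sizeof __pthread_tsd_main; }
size_t full_tsd_size(void) { return sizeof(full_tsd_t); }
```

Without LTO each file is compiled separately and nothing compares the two
declarations. With -flto the compiler sees both at link time and reports
that the array types differ.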
>
>> 3) Overall build times are similar, but achieving the best results
>> with -flto relies on manually duplicating any 'make -j' options for
>> the linker. Times below are from a quad core + hyperthreading system
>> running 'make -j8 lib/libc.so':
>>
>>   original : real 0m8.501s
>>   -flto    : real 0m18.034s
>>   -flto=4  : real 0m9.885s
>>   -flto=8  : real 0m8.876s
>
> Yeah that would be expected.
>
>> 4) Changes in code size seem to be minor, except when compiling with
>> -O3, where the code gets noticeably larger (presumably due to -flto
>> giving a lot more scope for inlining?). Results below are from building
>> with gcc 4.9.2 for 32bit x86:
>>
>>     text    data     bss     dec     hex filename
>>
>>   536405    1416    8800  546621   8573d lib/libc.so      ( -Os )
>>   536324    1324    8780  546428   8567c lib/libc.so.lto  ( -Os )
>>
>>   612028    1416    8928  622372   97f24 lib/libc.so      ( -O2 )
>>   611701    1304    9132  622137   97e39 lib/libc.so.lto  ( -O2 )
>>
>>   687708    1416    8992  698116   aa704 lib/libc.so      ( -O3 )
>>   713704    1312    9208  724224   b0d00 lib/libc.so.lto  ( -O3 )
>
> Also seems rather like what I would expect. Any idea if performance is
> significantly better? It's not very comprehensive but you could try
> libc-bench.

I modified libc-bench so that it loops through everything in main() ten
times and then ran the same libc-bench binary with each version of
libc.so, sending output to /dev/null.

The -O3 -flto build seems to be consistently very slightly *slower*
than the non -flto version...

----
./lib/libc.so.Os
----
19.92user 9.88system 0:25.38elapsed 117%CPU (0avgtext+0avgdata 39344maxresident)k
0inputs+195360outputs (0major+416745minor)pagefaults 0swaps
----
./lib/libc.so.O2
----
18.72user 9.83system 0:24.20elapsed 117%CPU (0avgtext+0avgdata 39348maxresident)k
0inputs+195360outputs (0major+417364minor)pagefaults 0swaps
----
./lib/libc.so.O3
----
17.97user 9.77system 0:23.48elapsed 118%CPU (0avgtext+0avgdata 39344maxresident)k
0inputs+195360outputs (0major+418251minor)pagefaults 0swaps

----
./lib/libc.so.lto.Os
----
20.52user 9.79system 0:26.05elapsed 116%CPU (0avgtext+0avgdata 39344maxresident)k
0inputs+195360outputs (0major+418684minor)pagefaults 0swaps
----
./lib/libc.so.lto.O2
----
18.58user 9.85system 0:24.13elapsed 117%CPU (0avgtext+0avgdata 39348maxresident)k
0inputs+195360outputs (0major+419825minor)pagefaults 0swaps
----
./lib/libc.so.lto.O3
----
18.85user 9.77system 0:24.38elapsed 117%CPU (0avgtext+0avgdata 39344maxresident)k
0inputs+195360outputs (0major+419888minor)pagefaults 0swaps
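
To put the user times above side by side, here is a small sketch that
computes the relative change with -flto at each optimization level. The
numbers are copied from the time output above. The function and struct names
are made up for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Percentage change of `other` relative to `base`; positive means slower. */
double pct_change(double base, double other)
{
	return (other - base) / base * 100.0;
}

/* One row per optimization level: user seconds without and with -flto. */
struct run {
	const char *opt;
	double plain;
	double lto;
};

static const struct run runs[] = {
	{ "-Os", 19.92, 20.52 },
	{ "-O2", 18.72, 18.58 },
	{ "-O3", 17.97, 18.85 },
};

/* Print one line per optimization level with the signed percentage. */
void print_user_time_deltas(void)
{
	for (size_t i = 0; i < sizeof runs / sizeof runs[0]; i++)
		printf("%s: %+.1f%% user time with -flto\n",
		       runs[i].opt, pct_change(runs[i].plain, runs[i].lto));
}
```

Calling print_user_time_deltas() from main() prints the three deltas. These
are single runs on one machine, so differences of a few percent may well be
within run-to-run noise.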


>
> Rich

