* building musl libc.so with gcc -flto
From: Andre McCurdy @ 2015-04-22 22:48 UTC
To: musl; +Cc: Andre McCurdy

Hi all,

Below are some observations from building musl libc.so with gcc's -flto
(link time optimization) option.

1) With today's master (afbcac68), adding -flto to CFLAGS causes the
build to fail:

 | `_dlstart_c' referenced in section `.text' of /tmp/cc8ceNIy.ltrans0.ltrans.o: defined in discarded section `.text' of src/ldso/dlstart.lo (symbol from plugin)
 | collect2: error: ld returned 1 exit status
 | make: *** [lib/libc.so] Error 1

Reverting f1faa0e1 (make _dlstart_c function use hidden visibility)
seems to be a workaround.

2) With f1faa0e1 reverted, the build succeeds, but with a warning about
differing declarations for dummy_tsd and __pthread_tsd_main:

 | src/thread/pthread_create.c:169:1: warning: type of '__pthread_tsd_main' does not match original declaration
 |  weak_alias(dummy_tsd, __pthread_tsd_main);
 |  ^
 | src/thread/pthread_key_create.c:4:7: note: previously declared here
 |  void *__pthread_tsd_main[PTHREAD_KEYS_MAX] = { 0 };
 |  ^

3) Overall build times are similar, but achieving the best results
with -flto relies on manually duplicating any 'make -j' options for
the linker. Times below are from a quad core + hyperthreading system
running 'make -j8 lib/libc.so':

  original : real 0m8.501s
  -flto    : real 0m18.034s
  -flto=4  : real 0m9.885s
  -flto=8  : real 0m8.876s

4) Changes in code size seem to be minor, except when compiling with
-O3, where the code gets noticeably larger (presumably due to -flto
giving a lot more scope for inlining?). Results below are from building
with gcc 4.9.2 for 32bit x86:

   text    data     bss     dec     hex filename

 536405    1416    8800  546621   8573d lib/libc.so     ( -Os )
 536324    1324    8780  546428   8567c lib/libc.so.lto ( -Os )

 612028    1416    8928  622372   97f24 lib/libc.so     ( -O2 )
 611701    1304    9132  622137   97e39 lib/libc.so.lto ( -O2 )

 687708    1416    8992  698116   aa704 lib/libc.so     ( -O3 )
 713704    1312    9208  724224   b0d00 lib/libc.so.lto ( -O3 )

Andre
--
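To put rough numbers on the size table above, the relative change in the text column can be computed directly. This is only a sketch: the helper names percent_change and text_change are made up for illustration, and the inputs are the text sizes reported in the table.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent.
 * Hypothetical helper, not part of musl or binutils. */
double percent_change(long before, long after)
{
    assert(before > 0);
    return 100.0 * (double)(after - before) / (double)before;
}

/* text sizes copied from the table above (gcc 4.9.2, 32bit x86),
 * in the order -Os, -O2, -O3 */
static const long text_plain[] = { 536405, 612028, 687708 };
static const long text_lto[]   = { 536324, 611701, 713704 };

/* Change in text size with -flto for level index i
 * (0 = -Os, 1 = -O2, 2 = -O3). */
double text_change(int i)
{
    assert(i >= 0 && i < 3);
    return percent_change(text_plain[i], text_lto[i]);
}
```

With these inputs, text_change(2) comes out a little under 4% for -O3, while the -Os and -O2 entries shrink by well under 0.1%.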
* Re: building musl libc.so with gcc -flto
From: Rich Felker @ 2015-04-23 2:23 UTC
To: musl

On Wed, Apr 22, 2015 at 03:48:52PM -0700, Andre McCurdy wrote:
> Hi all,
>
> Below are some observations from building musl libc.so with gcc's -flto
> (link time optimization) option.

Interesting!

> 1) With today's master (afbcac68), adding -flto to CFLAGS causes the
> build to fail:
>
>  | `_dlstart_c' referenced in section `.text' of /tmp/cc8ceNIy.ltrans0.ltrans.o: defined in discarded section `.text' of src/ldso/dlstart.lo (symbol from plugin)
>  | collect2: error: ld returned 1 exit status
>  | make: *** [lib/libc.so] Error 1
>
> Reverting f1faa0e1 (make _dlstart_c function use hidden visibility)
> seems to be a workaround.

I think the problem is that LTO is garbage collecting "unused" symbols
before it gets to the step of linking with asm for which there is no
IR code, thereby losing anything that's only referenced from asm. A
better workaround might be to define _dlstart_c with a different name
as a non-hidden function (e.g. call it __dls1) and then make
_dlstart_c a hidden alias for it via:

__attribute__((__visibility__("hidden")))
void _dlstart_c(size_t *, size_t *);

weak_alias(__dls1, _dlstart_c);

If you get a chance to try that, let me know if it works. Another
option might be adding -Wl,-u,_dlstart_c to LDFLAGS.

> 2) With f1faa0e1 reverted, the build succeeds, but with a warning about
> differing declarations for dummy_tsd and __pthread_tsd_main:
>
>  | src/thread/pthread_create.c:169:1: warning: type of '__pthread_tsd_main' does not match original declaration
>  |  weak_alias(dummy_tsd, __pthread_tsd_main);
>  |  ^
>  | src/thread/pthread_key_create.c:4:7: note: previously declared here
>  |  void *__pthread_tsd_main[PTHREAD_KEYS_MAX] = { 0 };
>  |  ^

This should be harmless but perhaps there's a better way it could be
done.

> 3) Overall build times are similar, but achieving the best results
> with -flto relies on manually duplicating any 'make -j' options for
> the linker. Times below are from a quad core + hyperthreading system
> running 'make -j8 lib/libc.so':
>
>   original : real 0m8.501s
>   -flto    : real 0m18.034s
>   -flto=4  : real 0m9.885s
>   -flto=8  : real 0m8.876s

Yeah that would be expected.

> 4) Changes in code size seem to be minor, except when compiling with
> -O3, where the code gets noticeably larger (presumably due to -flto
> giving a lot more scope for inlining?). Results below are from building
> with gcc 4.9.2 for 32bit x86:
>
>    text    data     bss     dec     hex filename
>
>  536405    1416    8800  546621   8573d lib/libc.so     ( -Os )
>  536324    1324    8780  546428   8567c lib/libc.so.lto ( -Os )
>
>  612028    1416    8928  622372   97f24 lib/libc.so     ( -O2 )
>  611701    1304    9132  622137   97e39 lib/libc.so.lto ( -O2 )
>
>  687708    1416    8992  698116   aa704 lib/libc.so     ( -O3 )
>  713704    1312    9208  724224   b0d00 lib/libc.so.lto ( -O3 )

Also seems rather like what I would expect. Any idea if performance is
significantly better? It's not very comprehensive but you could try
libc-bench.

Rich
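The hidden-alias idea from the message above can be tried in isolation. What follows is only a sketch under stated assumptions: demo_stage1 and demo_start_c are hypothetical stand-ins for __dls1 and _dlstart_c, the weak_alias macro is modeled on the one musl uses (treat its exact definition here as an assumption), and the function body is a placeholder. It needs gcc or a compatible compiler on an ELF target, and it says nothing about how LTO treats the symbol.

```c
#include <assert.h>
#include <stddef.h>

/* Modeled on musl's weak_alias macro; the exact definition is an assumption. */
#define weak_alias(old, new) \
    extern __typeof(old) new __attribute__((weak, alias(#old)))

/* Non-hidden function carrying the real body (stand-in for __dls1).
 * The body is a placeholder so the alias can be observed. */
size_t demo_stage1(size_t *sp, size_t *dynv)
{
    assert(sp && dynv);
    return *sp + *dynv;
}

/* Hidden declaration of the name the asm entry point would call
 * (stand-in for _dlstart_c). */
__attribute__((__visibility__("hidden")))
size_t demo_start_c(size_t *, size_t *);

/* Make the hidden name an alias for the non-hidden definition. */
weak_alias(demo_stage1, demo_start_c);
```

Calling demo_start_c here reaches the body of demo_stage1, since both names refer to the same address. Whether the alias survives an -flto link of libc.so is a separate question that this sketch does not answer.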
* Re: building musl libc.so with gcc -flto
From: Andre McCurdy @ 2015-04-23 5:34 UTC
To: musl

On Wed, Apr 22, 2015 at 7:23 PM, Rich Felker <dalias@libc.org> wrote:
> On Wed, Apr 22, 2015 at 03:48:52PM -0700, Andre McCurdy wrote:
>> Hi all,
>>
>> Below are some observations from building musl libc.so with gcc's -flto
>> (link time optimization) option.
>
> Interesting!
>
>> 1) With today's master (afbcac68), adding -flto to CFLAGS causes the
>> build to fail:
>>
>>  | `_dlstart_c' referenced in section `.text' of /tmp/cc8ceNIy.ltrans0.ltrans.o: defined in discarded section `.text' of src/ldso/dlstart.lo (symbol from plugin)
>>  | collect2: error: ld returned 1 exit status
>>  | make: *** [lib/libc.so] Error 1
>>
>> Reverting f1faa0e1 (make _dlstart_c function use hidden visibility)
>> seems to be a workaround.
>
> I think the problem is that LTO is garbage collecting "unused" symbols
> before it gets to the step of linking with asm for which there is no
> IR code, thereby losing anything that's only referenced from asm. A
> better workaround might be to define _dlstart_c with a different name
> as a non-hidden function (e.g. call it __dls1) and then make
> _dlstart_c a hidden alias for it via:
>
> __attribute__((__visibility__("hidden")))
> void _dlstart_c(size_t *, size_t *);
>
> weak_alias(__dls1, _dlstart_c);
>
> If you get a chance to try that, let me know if it works.

That change does fix the build, but the resulting binary fails to run:

 $ gdb ./lib/libc.so
 ...
 (gdb) run
 Starting program: /home/andre/.../lib/libc.so

 Program received signal SIGILL, Illegal instruction.
 0x56572ab8 in _dlstart ()
 (gdb) disassemble
 Dump of assembler code for function _dlstart:
    0x56572aa0 <+0>:     xor    %ebp,%ebp
    0x56572aa2 <+2>:     mov    %esp,%eax
    0x56572aa4 <+4>:     and    $0xfffffff0,%esp
    0x56572aa7 <+7>:     push   %eax
    0x56572aa8 <+8>:     push   %eax
    0x56572aa9 <+9>:     call   0x56572aae <_dlstart+14>
    0x56572aae <+14>:    addl   $0x7864a,(%esp)
    0x56572ab5 <+21>:    push   %eax
    0x56572ab6 <+22>:    call   0x56572ab7 <_dlstart+23>
    0x56572abb <+27>:    nop
    0x56572abc <+28>:    lea    0x0(%esi,%eiz,1),%esi
 End of assembler dump.
 (gdb)

> Another
> option might be adding -Wl,-u,_dlstart_c to LDFLAGS.

That change alone doesn't fix the build.

>> 2) With f1faa0e1 reverted, the build succeeds, but with a warning about
>> differing declarations for dummy_tsd and __pthread_tsd_main:
>>
>>  | src/thread/pthread_create.c:169:1: warning: type of '__pthread_tsd_main' does not match original declaration
>>  |  weak_alias(dummy_tsd, __pthread_tsd_main);
>>  |  ^
>>  | src/thread/pthread_key_create.c:4:7: note: previously declared here
>>  |  void *__pthread_tsd_main[PTHREAD_KEYS_MAX] = { 0 };
>>  |  ^
>
> This should be harmless but perhaps there's a better way it could be
> done.
>
>> 3) Overall build times are similar, but achieving the best results
>> with -flto relies on manually duplicating any 'make -j' options for
>> the linker. Times below are from a quad core + hyperthreading system
>> running 'make -j8 lib/libc.so':
>>
>>   original : real 0m8.501s
>>   -flto    : real 0m18.034s
>>   -flto=4  : real 0m9.885s
>>   -flto=8  : real 0m8.876s
>
> Yeah that would be expected.
>
>> 4) Changes in code size seem to be minor, except when compiling with
>> -O3, where the code gets noticeably larger (presumably due to -flto
>> giving a lot more scope for inlining?). Results below are from building
>> with gcc 4.9.2 for 32bit x86:
>>
>>    text    data     bss     dec     hex filename
>>
>>  536405    1416    8800  546621   8573d lib/libc.so     ( -Os )
>>  536324    1324    8780  546428   8567c lib/libc.so.lto ( -Os )
>>
>>  612028    1416    8928  622372   97f24 lib/libc.so     ( -O2 )
>>  611701    1304    9132  622137   97e39 lib/libc.so.lto ( -O2 )
>>
>>  687708    1416    8992  698116   aa704 lib/libc.so     ( -O3 )
>>  713704    1312    9208  724224   b0d00 lib/libc.so.lto ( -O3 )
>
> Also seems rather like what I would expect. Any idea if performance is
> significantly better? It's not very comprehensive but you could try
> libc-bench.

I modified libc-bench so that it loops through everything in main() ten
times and then ran the same libc-bench binary with each version of
libc.so, sending output to /dev/null.

The -O3 -flto build seems to be consistently very slightly *slower*
than the non -flto version...

---- ./lib/libc.so.Os ----
19.92user 9.88system 0:25.38elapsed 117%CPU (0avgtext+0avgdata 39344maxresident)k
0inputs+195360outputs (0major+416745minor)pagefaults 0swaps

---- ./lib/libc.so.O2 ----
18.72user 9.83system 0:24.20elapsed 117%CPU (0avgtext+0avgdata 39348maxresident)k
0inputs+195360outputs (0major+417364minor)pagefaults 0swaps

---- ./lib/libc.so.O3 ----
17.97user 9.77system 0:23.48elapsed 118%CPU (0avgtext+0avgdata 39344maxresident)k
0inputs+195360outputs (0major+418251minor)pagefaults 0swaps

---- ./lib/libc.so.lto.Os ----
20.52user 9.79system 0:26.05elapsed 116%CPU (0avgtext+0avgdata 39344maxresident)k
0inputs+195360outputs (0major+418684minor)pagefaults 0swaps

---- ./lib/libc.so.lto.O2 ----
18.58user 9.85system 0:24.13elapsed 117%CPU (0avgtext+0avgdata 39348maxresident)k
0inputs+195360outputs (0major+419825minor)pagefaults 0swaps

---- ./lib/libc.so.lto.O3 ----
18.85user 9.77system 0:24.38elapsed 117%CPU (0avgtext+0avgdata 39344maxresident)k
0inputs+195360outputs (0major+419888minor)pagefaults 0swaps

>
> Rich
* Re: building musl libc.so with gcc -flto
From: Rich Felker @ 2015-04-23 9:45 UTC
To: musl

On Wed, Apr 22, 2015 at 10:34:40PM -0700, Andre McCurdy wrote:
> On Wed, Apr 22, 2015 at 7:23 PM, Rich Felker <dalias@libc.org> wrote:
> > On Wed, Apr 22, 2015 at 03:48:52PM -0700, Andre McCurdy wrote:
> >> Hi all,
> >>
> >> Below are some observations from building musl libc.so with gcc's -flto
> >> (link time optimization) option.
> >
> > Interesting!
> >
> >> 1) With today's master (afbcac68), adding -flto to CFLAGS causes the
> >> build to fail:
> >>
> >>  | `_dlstart_c' referenced in section `.text' of /tmp/cc8ceNIy.ltrans0.ltrans.o: defined in discarded section `.text' of src/ldso/dlstart.lo (symbol from plugin)
> >>  | collect2: error: ld returned 1 exit status
> >>  | make: *** [lib/libc.so] Error 1
> >>
> >> Reverting f1faa0e1 (make _dlstart_c function use hidden visibility)
> >> seems to be a workaround.
> >
> > I think the problem is that LTO is garbage collecting "unused" symbols
> > before it gets to the step of linking with asm for which there is no
> > IR code, thereby losing anything that's only referenced from asm. A
> > better workaround might be to define _dlstart_c with a different name
> > as a non-hidden function (e.g. call it __dls1) and then make
> > _dlstart_c a hidden alias for it via:
> >
> > __attribute__((__visibility__("hidden")))
> > void _dlstart_c(size_t *, size_t *);
> >
> > weak_alias(__dls1, _dlstart_c);
> >
> > If you get a chance to try that, let me know if it works.
>
> That change does fix the build, but the resulting binary fails to run:
>
>  $ gdb ./lib/libc.so
>  ....
>  (gdb) run
>  Starting program: /home/andre/.../lib/libc.so
>
>  Program received signal SIGILL, Illegal instruction.
>  0x56572ab8 in _dlstart ()
>  (gdb) disassemble
>  Dump of assembler code for function _dlstart:
>     0x56572aa0 <+0>:     xor    %ebp,%ebp
>     0x56572aa2 <+2>:     mov    %esp,%eax
>     0x56572aa4 <+4>:     and    $0xfffffff0,%esp
>     0x56572aa7 <+7>:     push   %eax
>     0x56572aa8 <+8>:     push   %eax
>     0x56572aa9 <+9>:     call   0x56572aae <_dlstart+14>
>     0x56572aae <+14>:    addl   $0x7864a,(%esp)
>     0x56572ab5 <+21>:    push   %eax
>     0x56572ab6 <+22>:    call   0x56572ab7 <_dlstart+23>
>     0x56572abb <+27>:    nop
>     0x56572abc <+28>:    lea    0x0(%esi,%eiz,1),%esi
>  End of assembler dump.
>  (gdb)

OK, it looks like the _dlstart_c symbol got removed before linking the
asm. What about selectively compiling this file with -fno-lto via
something like this in config.mak:

src/ldso/dlstart.lo: CFLAGS += -fno-lto

> > Also seems rather like what I would expect. Any idea if performance is
> > significantly better? It's not very comprehensive but you could try
> > libc-bench.
>
> I modified libc-bench so that it loops through everything in main() ten
> times and then ran the same libc-bench binary with each version of
> libc.so, sending output to /dev/null.
>
> The -O3 -flto build seems to be consistently very slightly *slower*
> than the non -flto version...

That makes the whole thing somewhat less interesting. LTO is probably
more interesting for static libc.

Rich
* Re: building musl libc.so with gcc -flto
From: Andre McCurdy @ 2015-04-28 0:16 UTC
To: musl

On Thu, Apr 23, 2015 at 2:45 AM, Rich Felker <dalias@libc.org> wrote:
> On Wed, Apr 22, 2015 at 10:34:40PM -0700, Andre McCurdy wrote:
>> On Wed, Apr 22, 2015 at 7:23 PM, Rich Felker <dalias@libc.org> wrote:
>> > On Wed, Apr 22, 2015 at 03:48:52PM -0700, Andre McCurdy wrote:
>> >> Hi all,
>> >>
>> >> Below are some observations from building musl libc.so with gcc's -flto
>> >> (link time optimization) option.
>> >
>> > Interesting!
>> >
>> >> 1) With today's master (afbcac68), adding -flto to CFLAGS causes the
>> >> build to fail:
>> >>
>> >>  | `_dlstart_c' referenced in section `.text' of /tmp/cc8ceNIy.ltrans0.ltrans.o: defined in discarded section `.text' of src/ldso/dlstart.lo (symbol from plugin)
>> >>  | collect2: error: ld returned 1 exit status
>> >>  | make: *** [lib/libc.so] Error 1
>> >>
>> >> Reverting f1faa0e1 (make _dlstart_c function use hidden visibility)
>> >> seems to be a workaround.
>> >
>> > I think the problem is that LTO is garbage collecting "unused" symbols
>> > before it gets to the step of linking with asm for which there is no
>> > IR code, thereby losing anything that's only referenced from asm. A
>> > better workaround might be to define _dlstart_c with a different name
>> > as a non-hidden function (e.g. call it __dls1) and then make
>> > _dlstart_c a hidden alias for it via:
>> >
>> > __attribute__((__visibility__("hidden")))
>> > void _dlstart_c(size_t *, size_t *);
>> >
>> > weak_alias(__dls1, _dlstart_c);
>> >
>> > If you get a chance to try that, let me know if it works.
>>
>> That change does fix the build, but the resulting binary fails to run:
>>
>>  $ gdb ./lib/libc.so
>>  ....
>>  (gdb) run
>>  Starting program: /home/andre/.../lib/libc.so
>>
>>  Program received signal SIGILL, Illegal instruction.
>>  0x56572ab8 in _dlstart ()
>>  (gdb) disassemble
>>  Dump of assembler code for function _dlstart:
>>     0x56572aa0 <+0>:     xor    %ebp,%ebp
>>     0x56572aa2 <+2>:     mov    %esp,%eax
>>     0x56572aa4 <+4>:     and    $0xfffffff0,%esp
>>     0x56572aa7 <+7>:     push   %eax
>>     0x56572aa8 <+8>:     push   %eax
>>     0x56572aa9 <+9>:     call   0x56572aae <_dlstart+14>
>>     0x56572aae <+14>:    addl   $0x7864a,(%esp)
>>     0x56572ab5 <+21>:    push   %eax
>>     0x56572ab6 <+22>:    call   0x56572ab7 <_dlstart+23>
>>     0x56572abb <+27>:    nop
>>     0x56572abc <+28>:    lea    0x0(%esi,%eiz,1),%esi
>>  End of assembler dump.
>>  (gdb)
>
> OK, it looks like the _dlstart_c symbol got removed before linking the
> asm. What about selectively compiling this file with -fno-lto via
> something like this in config.mak:
>
> src/ldso/dlstart.lo: CFLAGS += -fno-lto

That works. Should I send a patch?

>> > Also seems rather like what I would expect. Any idea if performance is
>> > significantly better? It's not very comprehensive but you could try
>> > libc-bench.
>>
>> I modified libc-bench so that it loops through everything in main() ten
>> times and then ran the same libc-bench binary with each version of
>> libc.so, sending output to /dev/null.
>>
>> The -O3 -flto build seems to be consistently very slightly *slower*
>> than the non -flto version...
>
> That makes the whole thing somewhat less interesting. LTO is probably
> more interesting for static libc.

Yes, quite disappointing...

I'll try to experiment a little with static linking.

>
> Rich
* Re: building musl libc.so with gcc -flto
From: Rich Felker @ 2015-04-28 0:24 UTC
To: musl

On Mon, Apr 27, 2015 at 05:16:12PM -0700, Andre McCurdy wrote:
> > OK, it looks like the _dlstart_c symbol got removed before linking the
> > asm. What about selectively compiling this file with -fno-lto via
> > something like this in config.mak:
> >
> > src/ldso/dlstart.lo: CFLAGS += -fno-lto
>
> That works. Should I send a patch?

Yes, but configure would need to detect support for -fno-lto and add
it appropriately. See what's done for CFLAGS_NOSSP. I suspect the crt
files also need -fno-lto in principle even if they're not currently
breaking for lack of it.

> >> > Also seems rather like what I would expect. Any idea if performance is
> >> > significantly better? It's not very comprehensive but you could try
> >> > libc-bench.
> >>
> >> I modified libc-bench so that it loops through everything in main() ten
> >> times and then ran the same libc-bench binary with each version of
> >> libc.so, sending output to /dev/null.
> >>
> >> The -O3 -flto build seems to be consistently very slightly *slower*
> >> than the non -flto version...
> >
> > That makes the whole thing somewhat less interesting. LTO is probably
> > more interesting for static libc.
>
> Yes, quite disappointing...
>
> I'll try to experiment a little with static linking.

Great. Let us know how it goes.

Rich
* Re: building musl libc.so with gcc -flto
From: Andre McCurdy @ 2015-04-28 6:23 UTC
To: musl

On Mon, Apr 27, 2015 at 5:24 PM, Rich Felker <dalias@libc.org> wrote:
> On Mon, Apr 27, 2015 at 05:16:12PM -0700, Andre McCurdy wrote:
>> > OK, it looks like the _dlstart_c symbol got removed before linking the
>> > asm. What about selectively compiling this file with -fno-lto via
>> > something like this in config.mak:
>> >
>> > src/ldso/dlstart.lo: CFLAGS += -fno-lto
>>
>> That works. Should I send a patch?
>
> Yes, but configure would need to detect support for -fno-lto and add
> it appropriately. See what's done for CFLAGS_NOSSP. I suspect the crt
> files also need -fno-lto in principle even if they're not currently
> breaking for lack of it.

Patch sent.

I think the crt files might be OK as they are, since the _start_c
symbol isn't being hidden?

>> >> > Also seems rather like what I would expect. Any idea if performance is
>> >> > significantly better? It's not very comprehensive but you could try
>> >> > libc-bench.
>> >>
>> >> I modified libc-bench so that it loops through everything in main() ten
>> >> times and then ran the same libc-bench binary with each version of
>> >> libc.so, sending output to /dev/null.
>> >>
>> >> The -O3 -flto build seems to be consistently very slightly *slower*
>> >> than the non -flto version...
>> >
>> > That makes the whole thing somewhat less interesting. LTO is probably
>> > more interesting for static libc.
>>
>> Yes, quite disappointing...
>>
>> I'll try to experiment a little with static linking.
>
> Great. Let us know how it goes.
>
> Rich
* Re: building musl libc.so with gcc -flto
From: Rich Felker @ 2015-04-28 13:44 UTC
To: musl

On Mon, Apr 27, 2015 at 11:23:40PM -0700, Andre McCurdy wrote:
> On Mon, Apr 27, 2015 at 5:24 PM, Rich Felker <dalias@libc.org> wrote:
> > On Mon, Apr 27, 2015 at 05:16:12PM -0700, Andre McCurdy wrote:
> >> > OK, it looks like the _dlstart_c symbol got removed before linking the
> >> > asm. What about selectively compiling this file with -fno-lto via
> >> > something like this in config.mak:
> >> >
> >> > src/ldso/dlstart.lo: CFLAGS += -fno-lto
> >>
> >> That works. Should I send a patch?
> >
> > Yes, but configure would need to detect support for -fno-lto and add
> > it appropriately. See what's done for CFLAGS_NOSSP. I suspect the crt
> > files also need -fno-lto in principle even if they're not currently
> > breaking for lack of it.
>
> Patch sent.
>
> I think the crt files might be OK as they are, since the _start_c
> symbol isn't being hidden?

I think you'll find the exact same thing happens if you use a crt1.o
produced from crt1.c for static linking with LTO. Note that on i386
(and x86_64) we still have a crt1.s which overrides crt1.c; I want to
remove it at some point. Temporarily removing/renaming it yourself
will allow you to test what happens with LTO on this file.

Rich
* Re: building musl libc.so with gcc -flto
From: Andre McCurdy @ 2015-04-29 1:42 UTC
To: musl

On Tue, Apr 28, 2015 at 6:44 AM, Rich Felker <dalias@libc.org> wrote:
> On Mon, Apr 27, 2015 at 11:23:40PM -0700, Andre McCurdy wrote:
>> On Mon, Apr 27, 2015 at 5:24 PM, Rich Felker <dalias@libc.org> wrote:
>> > On Mon, Apr 27, 2015 at 05:16:12PM -0700, Andre McCurdy wrote:
>> >> > OK, it looks like the _dlstart_c symbol got removed before linking the
>> >> > asm. What about selectively compiling this file with -fno-lto via
>> >> > something like this in config.mak:
>> >> >
>> >> > src/ldso/dlstart.lo: CFLAGS += -fno-lto
>> >>
>> >> That works. Should I send a patch?
>> >
>> > Yes, but configure would need to detect support for -fno-lto and add
>> > it appropriately. See what's done for CFLAGS_NOSSP. I suspect the crt
>> > files also need -fno-lto in principle even if they're not currently
>> > breaking for lack of it.
>>
>> Patch sent.
>>
>> I think the crt files might be OK as they are, since the _start_c
>> symbol isn't being hidden?
>
> I think you'll find the exact same thing happens if you use a crt1.o
> produced from crt1.c for static linking with LTO. Note that on i386
> (and x86_64) we still have a crt1.s which overrides crt1.c; I want to
> remove it at some point. Temporarily removing/renaming it yourself
> will allow you to test what happens with LTO on this file.

Yes, you're right. The same workaround is needed for crt1.c, so my
original patch is incomplete.

The next issue, as Khem mentioned, is that AR and RANLIB need to be changed to:

  AR = $(CROSS_COMPILE)gcc-ar
  RANLIB = $(CROSS_COMPILE)gcc-ranlib

Is it safe to use these gcc-xx wrappers in all cases (after having
configure test that the toolchain provides them)? Or should they only
be used with LTO?

Beyond that I'm not able to statically link a hello.c test app using
LTO yet, since LTO linking with .a archives requires the gold linker
and I specifically have that disabled (to avoid issues I had seen
previously with musl+gold+dynamic linking). It looks like I need to
enable gold again before continuing. Are there any known issues, or
known success stories, when using musl with gold?

>
> Rich
* Re: building musl libc.so with gcc -flto
From: Rich Felker @ 2015-04-29 3:27 UTC
To: musl

On Tue, Apr 28, 2015 at 06:42:13PM -0700, Andre McCurdy wrote:
> On Tue, Apr 28, 2015 at 6:44 AM, Rich Felker <dalias@libc.org> wrote:
> > On Mon, Apr 27, 2015 at 11:23:40PM -0700, Andre McCurdy wrote:
> >> On Mon, Apr 27, 2015 at 5:24 PM, Rich Felker <dalias@libc.org> wrote:
> >> > On Mon, Apr 27, 2015 at 05:16:12PM -0700, Andre McCurdy wrote:
> >> >> > OK, it looks like the _dlstart_c symbol got removed before linking the
> >> >> > asm. What about selectively compiling this file with -fno-lto via
> >> >> > something like this in config.mak:
> >> >> >
> >> >> > src/ldso/dlstart.lo: CFLAGS += -fno-lto
> >> >>
> >> >> That works. Should I send a patch?
> >> >
> >> > Yes, but configure would need to detect support for -fno-lto and add
> >> > it appropriately. See what's done for CFLAGS_NOSSP. I suspect the crt
> >> > files also need -fno-lto in principle even if they're not currently
> >> > breaking for lack of it.
> >>
> >> Patch sent.
> >>
> >> I think the crt files might be OK as they are, since the _start_c
> >> symbol isn't being hidden?
> >
> > I think you'll find the exact same thing happens if you use a crt1.o
> > produced from crt1.c for static linking with LTO. Note that on i386
> > (and x86_64) we still have a crt1.s which overrides crt1.c; I want to
> > remove it at some point. Temporarily removing/renaming it yourself
> > will allow you to test what happens with LTO on this file.
>
> Yes, you're right. The same workaround is needed for crt1.c, so my
> original patch is incomplete.
>
> The next issue, as Khem mentioned, is that AR and RANLIB need to be changed to:
>
>   AR = $(CROSS_COMPILE)gcc-ar
>   RANLIB = $(CROSS_COMPILE)gcc-ranlib
>
> Is it safe to use these gcc-xx wrappers in all cases (after having
> configure test that the toolchain provides them)? Or should they only
> be used with LTO?

I have no idea. But why can't a normal archive produced with ar be
used for LTO? Inability to use standard toolchain components/build
scripts sounds like a fairly big problem for LTO deployability.

> Beyond that I'm not able to statically link a hello.c test app using
> LTO yet, since LTO linking with .a archives requires the gold linker
> and I specifically have that disabled (to avoid issues I had seen
> previously with musl+gold+dynamic linking). It looks like I need to
> enable gold again before continuing. Are there any known issues, or
> known success stories, when using musl with gold?

I would actually love to see new reports on gold with musl dynamic
linking. The new dynamic linker bootstrap design should be a lot more
robust against poor choices by the linker, but there could be stupid
details that are breaking still which might call for fixes on musl's
side.

If you have citations for the problems that previously existed with
musl+gold, it would be nice to see those again too just to make sure
the issues were properly taken care of and not just swept under a rug.

Rich
* Re: building musl libc.so with gcc -flto
From: Andre McCurdy @ 2015-05-01 5:48 UTC
To: musl

On Tue, Apr 28, 2015 at 8:27 PM, Rich Felker <dalias@libc.org> wrote:
>
> I would actually love to see new reports on gold with musl dynamic
> linking. The new dynamic linker bootstrap design should be a lot more
> robust against poor choices by the linker, but there could be stupid
> details that are breaking still which might call for fixes on musl's
> side.
>
> If you have citations for the problems that previously existed with
> musl+gold, it would be nice to see those again too just to make sure
> the issues were properly taken care of and not just swept under a rug.
>

I'm able to reproduce the problem I saw previously with the latest
musl git version. Behaviour is that some binaries dynamically linked
with gold (notably busybox) seem to run well but most binaries
segfault at startup.

I'm using gcc 4.9.2 and binutils 2.25, but I should also mention that
I'm using OpenEmbedded to build the toolchain and musl support in OE
is still quite experimental...

Below is a link to a base64 encoded tar file containing two
dynamically linked "hello world" x86 binaries. Both were created using
the same OE toolchain (the only difference was the -fuse-ld=XXX option
used). "hello.bfd" runs well, "hello.gold" segfaults. Hopefully they
can give some clues about what's happening.

  http://pastebin.com/raw.php?i=RKJBqAg1

Andre
--
* Re: building musl libc.so with gcc -flto
From: Szabolcs Nagy @ 2015-05-01 10:10 UTC
To: musl

* Andre McCurdy <armccurdy@gmail.com> [2015-04-30 22:48:09 -0700]:
> I'm able to reproduce the problem I saw previously with the latest
> musl git version. Behaviour is that some binaries dynamically linked
> with gold (notably busybox) seem to run well but most binaries
> segfault at startup.
>
> I'm using gcc 4.9.2 and binutils 2.25, but I should also mention that
> I'm using OpenEmbedded to build the toolchain and musl support in OE
> is still quite experimental...
>
> Below is a link to a base64 encoded tar file containing two
> dynamically linked "hello world" x86 binaries. Both were created using
> the same OE toolchain (the only difference was the -fuse-ld=XXX option
> used). "hello.bfd" runs well, "hello.gold" segfaults. Hopefully they
> can give some clues about what's happening.
>
>   http://pastebin.com/raw.php?i=RKJBqAg1

$ nm -D hello.gold
         w _ITM_deregisterTMCloneTable
         w _ITM_registerTMCloneTable
         w _Jv_RegisterClasses
08049670 A __bss_start
         w __deregister_frame_info
         U __libc_start_main
         w __register_frame_info
08049670 A _edata
08049690 A _end
         U printf

$ nm -D hello.bfd
08049564 B __bss_start
         U __libc_start_main
08049564 D _edata
08049584 B _end
         U printf

i'm not sure where gold expects __register_frame_info to be defined..

the call chain is

__dls3 -> do_init_fini -> _init -> frame_dummy -> __register_frame_info@plt -> 0

objdump:

0804846e <frame_dummy>:
 804846e:   55                      push   %ebp
 804846f:   b8 70 83 04 08          mov    $0x8048370,%eax
 8048474:   89 e5                   mov    %esp,%ebp
 8048476:   83 ec 08                sub    $0x8,%esp
 8048479:   85 c0                   test   %eax,%eax
 804847b:   74 14                   je     8048491 <frame_dummy+0x23>
 804847d:   50                      push   %eax
 804847e:   50                      push   %eax
 804847f:   68 78 96 04 08          push   $0x8049678
 8048484:   68 18 85 04 08          push   $0x8048518
 8048489:   e8 e2 fe ff ff          call   8048370 <__register_frame_info@plt>
...
08048370 <__register_frame_info@plt>:
 8048370:   ff 25 50 96 04 08       jmp    *0x8049650
...

GOT at this point:

0x08049648: 0xf7f92263 (__libc_start_main)
0x0804964c: 0x00000000 (should be __deregister_frame_info?)
0x08049650: 0x00000000 (should be __register_frame_info?)
0x08049654: 0xf7fbedeb (printf)

readelf says:

Relocation section '.rel.plt' at offset 0x308 contains 4 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
08049648  00000107 R_386_JUMP_SLOT   00000000   __libc_start_main
0804964c  00000907 R_386_JUMP_SLOT   08048360   __deregister_frame_inf
08049650  00000607 R_386_JUMP_SLOT   08048370   __register_frame_info
08049654  00000507 R_386_JUMP_SLOT   00000000   printf

Symbol table '.dynsym' contains 11 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main
     2: 00000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_deregisterTMCloneTab
     3: 00000000     0 NOTYPE  WEAK   DEFAULT  UND _ITM_registerTMCloneTable
     4: 00000000     0 NOTYPE  WEAK   DEFAULT  UND _Jv_RegisterClasses
     5: 00000000     0 FUNC    GLOBAL DEFAULT  UND printf
     6: 08048370     0 FUNC    WEAK   DEFAULT  UND __register_frame_info
     7: 08049690     0 NOTYPE  GLOBAL DEFAULT  ABS _end
     8: 08049670     0 NOTYPE  GLOBAL DEFAULT  ABS _edata
     9: 08048360     0 FUNC    WEAK   DEFAULT  UND __deregister_frame_info
    10: 08049670     0 NOTYPE  GLOBAL DEFAULT  ABS __bss_start

Symbol table '.symtab' contains 39 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
...
    32: 00000000     0 FUNC    WEAK   DEFAULT  UND __deregister_frame_info
    33: 00000000     0 FUNC    WEAK   DEFAULT  UND __register_frame_info
* Re: building musl libc.so with gcc -flto
  2015-05-01 10:10 ` Szabolcs Nagy
@ 2015-05-01 15:49 ` Rich Felker
  0 siblings, 0 replies; 16+ messages in thread
From: Rich Felker @ 2015-05-01 15:49 UTC
To: musl

On Fri, May 01, 2015 at 12:10:16PM +0200, Szabolcs Nagy wrote:
> * Andre McCurdy <armccurdy@gmail.com> [2015-04-30 22:48:09 -0700]:
> > I'm able to reproduce the problem I saw previously with the latest
> > musl git version. Behaviour is that some binaries dynamically linked
> > with gold (notably busybox) seem to run well but most binaries
> > segfault at startup.
> >
> > I'm using gcc 4.9.2 and binutils 2.25, but I should also mention that
> > I'm using OpenEmbedded to build the toolchain and musl support in OE
> > is still quite experimental...
> >
> > Below is a link to a base64 encoded tar file containing two
> > dynamically linked "hello world" x86 binaries. Both were created using
> > the same OE toolchain (the only difference was the -fuse-ld=XXX option
> > used). "hello.bfd" runs well, "hello.gold" segfaults. Hopefully they
> > can give some clues about what's happening.
> >
> > http://pastebin.com/raw.php?i=RKJBqAg1
>
> $ nm -D hello.gold
>          w _ITM_deregisterTMCloneTable
>          w _ITM_registerTMCloneTable
>          w _Jv_RegisterClasses
> 08049670 A __bss_start
>          w __deregister_frame_info
>          U __libc_start_main
>          w __register_frame_info
> 08049670 A _edata
> 08049690 A _end
>          U printf
>
> $ nm -D hello.bfd
> 08049564 B __bss_start
>          U __libc_start_main
> 08049564 D _edata
> 08049584 B _end
>          U printf
>
> i'm not sure where gold expects __register_frame_info to be defined..
>
> the call chain is
>
> __dls3 -> do_init_fini -> _init -> frame_dummy -> __register_frame_info@plt -> 0
>
> objdump:
>
> 0804846e <frame_dummy>:
>  804846e:   55                      push   %ebp
>  804846f:   b8 70 83 04 08          mov    $0x8048370,%eax
>  8048474:   89 e5                   mov    %esp,%ebp
>  8048476:   83 ec 08                sub    $0x8,%esp
>  8048479:   85 c0                   test   %eax,%eax
>  804847b:   74 14                   je     8048491 <frame_dummy+0x23>
>  804847d:   50                      push   %eax
>  804847e:   50                      push   %eax
>  804847f:   68 78 96 04 08          push   $0x8049678
>  8048484:   68 18 85 04 08          push   $0x8048518
>  8048489:   e8 e2 fe ff ff          call   8048370 <__register_frame_info@plt>
> ....
> 08048370 <__register_frame_info@plt>:
>  8048370:   ff 25 50 96 04 08       jmp    *0x8049650
> ....

The problem is that gold does not know how to process relocations for
undefined weak references correctly. When the code in question is
PIC/PIE, the weak reference can be kept for resolving at runtime.
Instead of:

 804846f:   b8 70 83 04 08          mov    $0x8048370,%eax

where the linker filled in a fixed address (the PLT slot) which the
code happily sees is non-zero and then calls it, PIC code would read
the address from the GOT. In non-PIC code, the linker (ld) *MUST*
resolve undefined weak references to the address zero; they are not
overridable at runtime because non-PIC doesn't support that.

This is a bug in gold, but I have no idea how it works at all, even
with glibc. The same issue should arise in gcc's crt files.

You can probably work around it for now by building the app as PIE.

Rich
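For readers following along, the pattern that frame_dummy relies on can be sketched in C roughly as below. This is only an illustration under assumptions: demo_optional_hook and demo_call_hook_if_present are made-up names, not anything from gcc's crt files or musl, and the sketch assumes a GNU C compiler targeting ELF. The caller tests the address of a weak reference and calls it only when the address is non-zero, so the check is only meaningful if an unresolved weak reference really ends up as address zero.

```c
#include <stddef.h>

/* Hypothetical optional hook: declared weak, and deliberately not
 * defined anywhere in this program. */
extern void demo_optional_hook(const void *) __attribute__((weak));

/* Returns 1 if the hook was present and called, 0 if it was skipped.
 * The address test is the same kind of check that the
 * "test %eax,%eax; je" pair performs in the disassembly. */
int demo_call_hook_if_present(const void *arg)
{
    if (demo_optional_hook) {
        demo_optional_hook(arg);
        return 1;
    }
    return 0;
}
```

With a linker that resolves the undefined weak reference to zero, the call is skipped. If the linker instead substitutes a non-zero PLT address, as in the hello.gold disassembly, the test passes and the call jumps through an empty GOT slot.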
* Re: building musl libc.so with gcc -flto
  2015-04-23  2:23 ` Rich Felker
  2015-04-23  5:34 ` Andre McCurdy
@ 2015-04-30 20:46 ` Andy Lutomirski
  2015-04-30 23:44 ` Rich Felker
  1 sibling, 1 reply; 16+ messages in thread
From: Andy Lutomirski @ 2015-04-30 20:46 UTC
To: musl

On 04/22/2015 07:23 PM, Rich Felker wrote:
> On Wed, Apr 22, 2015 at 03:48:52PM -0700, Andre McCurdy wrote:
>> Hi all,
>>
>> Below are some observations from building musl libc.so with gcc's -flto
>> (link time optimization) option.
>
> Interesting!
>
>> 1) With today's master (afbcac68), adding -flto to CFLAGS causes the
>> build to fail:
>>
>> | `_dlstart_c' referenced in section `.text' of /tmp/cc8ceNIy.ltrans0.ltrans.o: defined in discarded section `.text' of src/ldso/dlstart.lo (symbol from plugin)
>> | collect2: error: ld returned 1 exit status
>> | make: *** [lib/libc.so] Error 1
>>
>> Reverting f1faa0e1 (make _dlstart_c function use hidden visibility)
>> seems to be a workaround.
>
> I think the problem is that LTO is garbage collecting "unused" symbols
> before it gets to the step of linking with asm for which there is no
> IR code, thereby losing anything that's only referenced from asm. A
> better workaround might be to define _dlstart_c with a different name
> as a non-hidden function (e.g. call it __dls1) and then make
> _dlstart_c a hidden alias for it via:
>
> __attribute__((__visibility__("hidden")))
> void _dlstart_c(size_t *, size_t *);
>
> weak_alias(__dls1, _dlstart_c);
>
> If you get a chance to try that, let me know if it works. Another
> option might be adding -Wl,-u,_dlstart_c to LDFLAGS.

Wouldn't adding __attribute__((externally_visible)) to the relevant
symbols be more appropriate? It's intended to solve exactly this
problem.

--Andy
* Re: Re: building musl libc.so with gcc -flto
  2015-04-30 20:46 ` Andy Lutomirski
@ 2015-04-30 23:44 ` Rich Felker
  2015-05-01  6:57 ` Alexander Monakov
  0 siblings, 1 reply; 16+ messages in thread
From: Rich Felker @ 2015-04-30 23:44 UTC
To: musl

On Thu, Apr 30, 2015 at 01:46:21PM -0700, Andy Lutomirski wrote:
> On 04/22/2015 07:23 PM, Rich Felker wrote:
> >On Wed, Apr 22, 2015 at 03:48:52PM -0700, Andre McCurdy wrote:
> >>Hi all,
> >>
> >>Below are some observations from building musl libc.so with gcc's -flto
> >>(link time optimization) option.
> >
> >Interesting!
> >
> >>1) With today's master (afbcac68), adding -flto to CFLAGS causes the
> >>build to fail:
> >>
> >> | `_dlstart_c' referenced in section `.text' of /tmp/cc8ceNIy.ltrans0.ltrans.o: defined in discarded section `.text' of src/ldso/dlstart.lo (symbol from plugin)
> >> | collect2: error: ld returned 1 exit status
> >> | make: *** [lib/libc.so] Error 1
> >>
> >>Reverting f1faa0e1 (make _dlstart_c function use hidden visibility)
> >>seems to be a workaround.
> >
> >I think the problem is that LTO is garbage collecting "unused" symbols
> >before it gets to the step of linking with asm for which there is no
> >IR code, thereby losing anything that's only referenced from asm. A
> >better workaround might be to define _dlstart_c with a different name
> >as a non-hidden function (e.g. call it __dls1) and then make
> >_dlstart_c a hidden alias for it via:
> >
> >__attribute__((__visibility__("hidden")))
> >void _dlstart_c(size_t *, size_t *);
> >
> >weak_alias(__dls1, _dlstart_c);
> >
> >If you get a chance to try that, let me know if it works. Another
> >option might be adding -Wl,-u,_dlstart_c to LDFLAGS.
>
> Wouldn't adding __attribute__((externally_visible)) to the relevant
> symbols be more appropriate? It's intended to solve exactly this
> problem.

I'm not clear whether it would be reliable to use this or not.
Semantically externally_visible and visibility=hidden are
contradictory. Even if we weren't trying to avoid relying on
additional GNU C features, I think it would be a bad idea to rely on
this working since the behavior under such contradictory annotations
could potentially vary widely between compilers.

Rich
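The alias workaround quoted above could be sketched as a standalone file along the following lines. This is a rough illustration, not musl's actual code: demo_stage1 and demo_entry are hypothetical stand-ins for __dls1 and _dlstart_c, the function body is invented so the sketch has something to return, and the weak_alias macro is written out here on the assumption that it expands to a weak alias declaration. It also assumes a GNU C compiler targeting ELF, since the alias attribute is not available everywhere.

```c
#include <stddef.h>

/* Assumed shape of a weak_alias helper: declare `new` as a weak alias
 * for the already-defined symbol `old`. */
#define weak_alias(old, new) \
    extern __typeof(old) new __attribute__((weak, alias(#old)))

/* Non-hidden definition under a different (hypothetical) name, so the
 * compiler has an externally visible symbol it must keep. */
size_t demo_stage1(size_t *sp, size_t *dynv)
{
    (void)dynv;
    return sp ? *sp : 0;
}

/* Hidden declaration of the name the assembly entry point would use. */
__attribute__((__visibility__("hidden")))
size_t demo_entry(size_t *, size_t *);

/* Make the hidden name an alias for the non-hidden definition. */
weak_alias(demo_stage1, demo_entry);
```

Calls through demo_entry reach the same code as demo_stage1. Whether this is enough to keep the symbol alive under -flto is exactly what the thread is asking someone to test, so the sketch makes no claim about that.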
* Re: Re: building musl libc.so with gcc -flto
  2015-04-30 23:44 ` Rich Felker
@ 2015-05-01  6:57 ` Alexander Monakov
  0 siblings, 0 replies; 16+ messages in thread
From: Alexander Monakov @ 2015-05-01 6:57 UTC
To: musl

> > Wouldn't adding __attribute__((externally_visible)) to the relevant
> > symbols be more appropriate? It's intended to solve exactly this
> > problem.
>
> I'm not clear whether it would be reliable to use this or not.
> Semantically externally_visible and visibility=hidden are
> contradictory. Even if we weren't trying to avoid relying on
> additional GNU C features, I think it would be a bad idea to rely on
> this working since the behavior under such contradictory annotations
> could potentially vary widely between compilers.

The attribute that's suitable for this case is "used", not
"externally_visible" (um, I already mentioned that in the other thread
where I explained why the reference from the asm is 'ignored'). Andy
is not correct in saying that "it's intended to solve exactly this
problem". To quote GCC manual,

externally_visible
    This attribute, attached to a global variable or function,
    nullifies the effect of the -fwhole-program command-line option,
    so the object remains visible outside the current compilation
    unit.

used
    This attribute, attached to a function, means that code must be
    emitted for the function even if it appears that the function is
    not referenced. This is useful, for example, when the function is
    referenced only in inline assembly.

( https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html )

Alexander
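As a small illustration of the "used" attribute described in the manual excerpt, here is a minimal sketch assuming GNU C. The function name demo_asm_only_target is hypothetical, and the file contains no assembly at all. It only shows where the attribute goes on a function that no C code would otherwise reference.

```c
/* Hypothetical function that, in a real program, would be referenced
 * only from assembly. Without the attribute, a static function with no
 * C callers may be dropped by the compiler. With "used", code must be
 * emitted for it even though it appears unreferenced. */
static __attribute__((used)) int demo_asm_only_target(void)
{
    return 42;
}
```

How this would combine with hidden visibility on a non-static function such as _dlstart_c is not shown here and would need to be tried against the actual -flto build.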
end of thread, other threads:[~2015-05-01 15:49 UTC | newest]

Thread overview: 16+ messages
2015-04-22 22:48 building musl libc.so with gcc -flto Andre McCurdy
2015-04-23  2:23 ` Rich Felker
2015-04-23  5:34 ` Andre McCurdy
2015-04-23  9:45 ` Rich Felker
2015-04-28  0:16 ` Andre McCurdy
2015-04-28  0:24 ` Rich Felker
2015-04-28  6:23 ` Andre McCurdy
2015-04-28 13:44 ` Rich Felker
2015-04-29  1:42 ` Andre McCurdy
2015-04-29  3:27 ` Rich Felker
2015-05-01  5:48 ` Andre McCurdy
2015-05-01 10:10 ` Szabolcs Nagy
2015-05-01 15:49 ` Rich Felker
2015-04-30 20:46 ` Andy Lutomirski
2015-04-30 23:44 ` Rich Felker
2015-05-01  6:57 ` Alexander Monakov