From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: building musl libc.so with gcc -flto
Date: Wed, 22 Apr 2015 22:23:09 -0400
Message-ID: <20150423022309.GH6817@brightrain.aerifal.cx>
References: <1429742932-6026-1-git-send-email-armccurdy@gmail.com>
In-Reply-To: <1429742932-6026-1-git-send-email-armccurdy@gmail.com>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Wed, Apr 22, 2015 at 03:48:52PM -0700, Andre McCurdy wrote:
> Hi all,
>
> Below are some observations from building musl libc.so with gcc's -flto
> (link time optimization) option.

Interesting!

> 1) With today's master (afbcac68), adding -flto to CFLAGS causes the
> build to fail:
>
> | `_dlstart_c' referenced in section `.text' of /tmp/cc8ceNIy.ltrans0.ltrans.o: defined in discarded section `.text' of src/ldso/dlstart.lo (symbol from plugin)
> | collect2: error: ld returned 1 exit status
> | make: *** [lib/libc.so] Error 1
>
> Reverting f1faa0e1 (make _dlstart_c function use hidden visibility)
> seems to be a workaround.

I think the problem is that LTO is garbage collecting "unused" symbols
before it gets to the step of linking with asm for which there is no
IR code, thereby losing anything that's only referenced from asm. A
better workaround might be to define _dlstart_c with a different name
as a non-hidden function (e.g. call it __dls1) and then make
_dlstart_c a hidden alias for it via:

__attribute__((__visibility__("hidden")))
void _dlstart_c(size_t *, size_t *);
weak_alias(__dls1, _dlstart_c);

If you get a chance to try that, let me know if it works. Another
option might be adding -Wl,-u,_dlstart_c to LDFLAGS.

> 2) With f1faa0e1 reverted, the build succeeds, but with a warning about
> differing declarations for dummy_tsd and __pthread_tsd_main:
>
> | src/thread/pthread_create.c:169:1: warning: type of '__pthread_tsd_main' does not match original declaration
> | weak_alias(dummy_tsd, __pthread_tsd_main);
> | ^
> | src/thread/pthread_key_create.c:4:7: note: previously declared here
> | void *__pthread_tsd_main[PTHREAD_KEYS_MAX] = { 0 };
> | ^

This should be harmless, but perhaps there's a better way it could be
done.

> 3) Overall build times are similar, but achieving the best results
> with -flto relies on manually duplicating any 'make -j' options for
> the linker. Times below are from a quad core + hyperthreading system
> running 'make -j8 lib/libc.so':
>
>   original : real 0m8.501s
>   -flto    : real 0m18.034s
>   -flto=4  : real 0m9.885s
>   -flto=8  : real 0m8.876s

Yeah, that would be expected.
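
For reference, the alias workaround suggested for (1) might look roughly
like the following when written out in full. This is only a sketch, not
tested against the musl tree: demo_dls1 and demo_dlstart_c are
hypothetical stand-ins for __dls1 and _dlstart_c, the function body is a
placeholder, and the weak_alias macro is written out here on the
assumption of GCC or Clang on an ELF target.

```c
#include <stddef.h>

/* Same shape as musl's weak_alias macro: declare `new` as a weak alias
 * for the already-defined symbol `old`. Relies on GCC/Clang extensions
 * (__typeof, the weak and alias attributes) and an ELF target. */
#define weak_alias(old, new) \
	extern __typeof(old) new __attribute__((__weak__, __alias__(#old)))

/* Hidden declaration: the name internal callers use (in musl, the
 * startup asm). demo_dlstart_c is a hypothetical stand-in for
 * _dlstart_c. */
__attribute__((__visibility__("hidden")))
void demo_dlstart_c(size_t *, size_t *);

/* Non-hidden definition under a different name. demo_dls1 is a
 * hypothetical stand-in for __dls1; the body is a placeholder that
 * just makes the call observable. */
void demo_dls1(size_t *sp, size_t *dynv)
{
	*sp = *dynv + 1;
}

/* Make the hidden name resolve to the non-hidden definition. */
weak_alias(demo_dls1, demo_dlstart_c);
```

The idea is that the definition itself stays non-hidden, so LTO has less
reason to treat it as unreferenced, while callers still reach it through
the hidden name. Whether that is enough to avoid the discarded-section
error is exactly what would need testing.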

> 4) Changes in code size seem to be minor, except when compiling with
> -O3, where the code gets noticeably larger (presumably due to -flto
> giving a lot more scope for inlining?). Results below are from building
> with gcc 4.9.2 for 32bit x86:
>
>    text    data     bss     dec     hex filename
>
>  536405    1416    8800  546621   8573d lib/libc.so     ( -Os )
>  536324    1324    8780  546428   8567c lib/libc.so.lto ( -Os )
>
>  612028    1416    8928  622372   97f24 lib/libc.so     ( -O2 )
>  611701    1304    9132  622137   97e39 lib/libc.so.lto ( -O2 )
>
>  687708    1416    8992  698116   aa704 lib/libc.so     ( -O3 )
>  713704    1312    9208  724224   b0d00 lib/libc.so.lto ( -O3 )

Also seems rather like what I would expect. Any idea if performance is
significantly better? It's not very comprehensive, but you could try
libc-bench.

Rich