From: Denys Vlasenko <vda.linux@googlemail.com>
To: Rob Landley <rob@landley.net>, Rich Felker <dalias@libc.org>,
musl <musl@lists.openwall.com>
Subject: Results of Aboriginal/musl CFLAGS experiment
Date: Fri, 23 Oct 2015 09:35:39 +0200
Message-ID: <CAK1hOcNXZc+9dpZ9W+bYbXyReOaQDP48PBu82rFS86n4+hb3NA@mail.gmail.com>
Hi Rob, Rich,
I decided to take a look at how well building busybox against musl
would fare compared to building it against a custom-configured
uclibc I was using for quite some time.
Instead of reinventing the wheel, I decided to use Rob's excellent
Aboriginal Linux build scripts. Here's what I did.
I took Aboriginal's tip.tar.bz2, which was aboriginal-0b3b780ea942.
I built "./build.sh x86_64" without any tweaking.
Then I started adding gcc options I was using in my old custom uclibc build
to sources/sections/musl.build, and not changing anything else:
--- a.0/sources/sections/musl.build 2015-10-11 10:10:26.000000000 +0200
+++ a.1/sources/sections/musl.build 2015-10-23 02:37:45.803972995 +0200
@@ -1,7 +1,10 @@
# Build and install musl
+(
+export CFLAGS="-Wl,--sort-section,alignment -Wl,--sort-common"
+
CC= CROSS_COMPILE=${ARCH}- ./configure --prefix=/ &&
DESTDIR="$STAGE_DIR" make -j $CPUS CROSS_COMPILE=${ARCH}- all install &&
echo '#define __MUSL__' >> "$STAGE_DIR"/include/features.h &&
ln -s libc.so "$STAGE_DIR/lib/ld-musl.so.0"
-
+)
I made four steps:
step 1 - CFLAGS+="-Wl,--sort-section,alignment -Wl,--sort-common"
step 2 - CFLAGS+="-ffunction-sections -fdata-sections"
step 3 - CFLAGS+="-falign-jumps=1 -falign-labels=1"
step 4 - CFLAGS+="-falign-functions=1 -falign-loops=1"
and collected size information from several executables after each step:
ls -l */build/native-compiler-x86_64/usr/lib/libc.a
size */build/native-compiler-x86_64/usr/lib/libc.so
size */build/root-filesystem-x86_64/usr/bin/toybox
size */build/root-filesystem-x86_64/usr/bin/busybox
size */build/native-compiler-x86_64/usr/bin/as
size */build/native-compiler-x86_64/usr/bin/ld
size */build/native-compiler-x86_64/usr/bin/bash
size */build/native-compiler-x86_64/usr/x86_64-unknown-linux/bin/collect2
Here is what I discovered.
Step 1, which added "-Wl,--sort-section,alignment -Wl,--sort-common"
affects only the size of libc.so:
text data bss dec filename
572242 1920 11640 585802 a.0/native-compiler/lib/libc.so
572068 1916 11576 585560 a.1/native-compiler/lib/libc.so
What it does is reduce the chance that, when sections are merged
during linking, a small section with no alignment restrictions
(such as one resulting from "static char flag_var")
gets lodged between two bigger ones (say, "static int global_cnt")
which want e.g. 32-bit alignment.
Without section sorting, byte-sized "flag_var" is then followed by 3 bytes of padding.
With section sorting by alignment, one-byte flag variables have
a higher chance of being grouped together and not requiring padding.
(It could be made even better. The linker is too dumb.)
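To make the padding argument concrete, here is a minimal sketch in Python. It is a toy model, not what ld actually does internally: the section list is made up, and the only assumption is that sorting by alignment places the most strictly aligned sections first.

```python
def layout_padding(sections):
    """Return total padding bytes when sections are placed in the given order.

    Each section is a (size, alignment) pair; alignment is a power of two.
    """
    offset = 0
    padding = 0
    for size, align in sections:
        gap = (-offset) % align  # bytes needed to reach the next aligned offset
        padding += gap
        offset += gap + size
    return padding


# Hypothetical input: 4-byte counters interleaved with 1-byte flags.
unsorted = [(4, 4), (1, 1), (4, 4), (1, 1), (4, 4), (1, 1)]

# Model of sorting by alignment: largest alignment first.
by_alignment = sorted(unsorted, key=lambda s: s[1], reverse=True)

print("padding, link order:       ", layout_padding(unsorted))
print("padding, sorted by alignment:", layout_padding(by_alignment))
```

In this toy input every flag that sits in front of a counter costs 3 bytes of padding, and sorting removes all of it. Real savings depend on how many such small sections the linker actually sees.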
Step 2: adding "-ffunction-sections -fdata-sections"
The previous optimization doesn't work too well on its own because data
objects don't live in separate sections: they are all grouped into one .data
and one .bss section per *.o file.
"-ffunction-sections -fdata-sections" fix this by putting every function
and data object into its own section. Then section sorting eliminates
many more padding gaps:
text data bss dec filename
572068 1916 11576 585560 a.1/native-compiler/lib/libc.so
570356 1900 11480 583736 a.2/native-compiler/lib/libc.so
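Why per-object sections help the sorting can be sketched the same way. This is a simplified model under stated assumptions: the object files and their variables are hypothetical, and each per-file .data section is modelled as one blob whose alignment is the largest alignment of its members.

```python
def place(sections):
    """Lay out (size, align) sections in order; return (end_offset, padding)."""
    offset = padding = 0
    for size, align in sections:
        gap = (-offset) % align
        padding += gap
        offset += gap + size
    return offset, padding


def sort_by_alignment(sections):
    """Model of section sorting: largest alignment first."""
    return sorted(sections, key=lambda s: s[1], reverse=True)


# Hypothetical object files, each with one int and one char, as (size, align).
objects = {
    "a.o": [(4, 4), (1, 1)],
    "b.o": [(4, 4), (1, 1)],
    "c.o": [(4, 4), (1, 1)],
}

# Without -fdata-sections: one .data blob per object file.
blobs = []
inner_padding = 0
for variables in objects.values():
    size, pad = place(variables)
    inner_padding += pad
    blobs.append((size, max(align for _, align in variables)))
coarse_padding = inner_padding + place(sort_by_alignment(blobs))[1]

# With -fdata-sections: every variable is its own section.
per_variable = [v for variables in objects.values() for v in variables]
fine_padding = place(sort_by_alignment(per_variable))[1]

print("padding with per-file .data:     ", coarse_padding)
print("padding with per-object sections:", fine_padding)
```

With one blob per file, all three blobs have the same alignment, so sorting cannot rearrange anything useful and each 5-byte blob is padded up to the next 4-byte boundary. With one section per variable the chars end up together at the end.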
There is more to it. Object files in the static libc.a also have their
functions and objects each in its own section. This means that programs
linked with -Wl,--gc-sections (toybox and busybox do this)
are able to drop unused code and data not on a per-.o-file basis,
but on a per-function and per-object basis, resulting in a size decrease
of roughly 0.5-0.9% in the measurements below:
text data bss dec filename
338047 6608 22384 367039 a.1/root-filesystem/usr/bin/toybox
336143 6560 22352 365055 a.2/root-filesystem/usr/bin/toybox
text data bss dec filename
324711 862 7648 333221 a.1/root-filesystem/bin/busybox
321913 826 7520 330259 a.2/root-filesystem/bin/busybox
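The difference between per-file and per-function garbage collection can be illustrated with a small reachability sketch. Everything in it is hypothetical (function names, sizes, and which .o file holds what), and it only models the idea of keeping what is reachable from the entry point, not the linker's real algorithm.

```python
def reachable(roots, refs):
    """Return the set of nodes reachable from roots by following refs."""
    seen = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(refs.get(name, ()))
    return seen


# Hypothetical program: function -> (object file, size in bytes, callees).
functions = {
    "main":   ("app.o",    100, ["strlen"]),
    "strlen": ("string.o",  40, []),
    "strtok": ("string.o", 120, ["strspn"]),  # never called by main
    "strspn": ("span.o",    60, []),
}

# Per-function sections: the unit that is kept or dropped is one function.
func_refs = {name: callees for name, (_, _, callees) in functions.items()}
kept_funcs = reachable(["main"], func_refs)
per_function_bytes = sum(functions[f][1] for f in kept_funcs)

# One section per object file: the unit is the whole .o, so an unused
# function still keeps everything it references alive.
obj_size = {}
obj_refs = {}
for name, (obj, size, callees) in functions.items():
    obj_size[obj] = obj_size.get(obj, 0) + size
    obj_refs.setdefault(obj, set()).update(functions[c][0] for c in callees)
kept_objs = reachable(["app.o"], obj_refs)
per_file_bytes = sum(obj_size[o] for o in kept_objs)

print("kept with per-file sections:    ", per_file_bytes)
print("kept with per-function sections:", per_function_bytes)
```

Note how the unused strtok not only stays in the per-file case, it also drags in span.o.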
Most programs, alas, don't use -Wl,--gc-sections, but they still get
a tiny bit smaller:
text data bss dec filename
1029977 8752 60192 1098921 a.1/native-compiler/bin/as
1029945 8720 60192 1098857 a.2/native-compiler/bin/as
text data bss dec filename
1122513 9328 25120 1156961 a.1/native-compiler/bin/ld
1122513 9296 25120 1156929 a.2/native-compiler/bin/ld
text data bss dec filename
425757 50652 16448 492857 a.1/native-compiler/bin/bash
425725 50604 16416 492745 a.2/native-compiler/bin/bash
text data bss dec filename
140624 880 9472 150976 a.1/native-compiler/x86_64-unknown-linux/bin/collect2
140624 848 9440 150912 a.2/native-compiler/x86_64-unknown-linux/bin/collect2
I would say there is no reason to not do steps 1 and 2 always.
They don't pessimize execution speed. They simply get rid of some
data padding, and drop dead, unreachable code.
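To put the step 2 numbers in proportion, the following sketch computes the absolute and relative savings from the "dec" totals reported above (a.1 versus a.2). The figures are copied from the size output; the helper name is mine.

```python
# "dec" totals from the size output: (after step 1, after step 2)
totals = {
    "toybox":   (367039, 365055),
    "busybox":  (333221, 330259),
    "as":       (1098921, 1098857),
    "ld":       (1156961, 1156929),
    "bash":     (492857, 492745),
    "collect2": (150976, 150912),
}


def savings(before, after):
    """Return (bytes saved, percent saved relative to `before`)."""
    saved = before - after
    return saved, 100.0 * saved / before


for name, (before, after) in totals.items():
    saved, percent = savings(before, after)
    print(f"{name:10s} {saved:6d} bytes  {percent:.2f}%")
```

The two programs linked with -Wl,--gc-sections (toybox and busybox) save thousands of bytes, while the others save on the order of a hundred or less.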
Step 3: add "-falign-jumps=1 -falign-labels=1"
Step 4: add "-falign-functions=1 -falign-loops=1"
Not particularly interesting: they do reduce the size of every program I measured,
but some (many?) people would prefer to leave it to gcc to decide when
and how to align code, for speed reasons. Anyway, here are the stats:
-rw-r--r-- 1 root root 2514966 a.2/native-compiler/lib/libc.a
-rw-r--r-- 1 root root 2514726 a.3/native-compiler/lib/libc.a
-rw-r--r-- 1 root root 2514646 a.4/native-compiler/lib/libc.a
text data bss dec filename
570356 1900 11480 583736 a.2/native-compiler/lib/libc.so
570148 1900 11480 583528 a.3/native-compiler/lib/libc.so
569637 1900 11480 583017 a.4/native-compiler/lib/libc.so
text data bss dec filename
336143 6560 22352 365055 a.2/root-filesystem/usr/bin/toybox
335999 6560 22352 364911 a.3/root-filesystem/usr/bin/toybox
335743 6560 22352 364655 a.4/root-filesystem/usr/bin/toybox
text data bss dec filename
321913 826 7520 330259 a.2/root-filesystem/bin/busybox
321801 826 7520 330147 a.3/root-filesystem/bin/busybox
321541 826 7520 329887 a.4/root-filesystem/bin/busybox
text data bss dec filename
1029945 8720 60192 1098857 a.2/native-compiler/bin/as
1029817 8720 60192 1098729 a.3/native-compiler/bin/as
1029609 8720 60192 1098521 a.4/native-compiler/bin/as
text data bss dec filename
1122513 9296 25120 1156929 a.2/native-compiler/bin/ld
1122369 9296 25120 1156785 a.3/native-compiler/bin/ld
1122161 9296 25120 1156577 a.4/native-compiler/bin/ld
text data bss dec filename
425725 50604 16416 492745 a.2/native-compiler/bin/bash
425629 50604 16416 492649 a.3/native-compiler/bin/bash
425437 50604 16416 492457 a.4/native-compiler/bin/bash
text data bss dec filename
140624 848 9440 150912 a.2/native-compiler/x86_64-unknown-linux/bin/collect2
140560 848 9440 150848 a.3/native-compiler/x86_64-unknown-linux/bin/collect2
140336 848 9440 150624 a.4/native-compiler/x86_64-unknown-linux/bin/collect2
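Where the savings from the -falign-* options come from can be sketched as follows. This is a toy model: the function sizes are invented, and the 16-byte figure is only an assumed default for function alignment (the real default depends on gcc version, optimization level and tuning).

```python
def text_size(function_sizes, align):
    """Total bytes used when each function starts at a multiple of `align`."""
    offset = 0
    for size in function_sizes:
        offset += (-offset) % align  # padding inserted before this function
        offset += size
    return offset


# Hypothetical function sizes in bytes.
sizes = [37, 12, 150, 5, 64]

aligned = text_size(sizes, 16)  # assumed default alignment
packed = text_size(sizes, 1)    # as with -falign-functions=1

print("aligned to 16:", aligned)
print("packed:       ", packed)
print("padding saved:", aligned - packed)
```

The same reasoning applies to jumps, labels and loops inside a function; the cost is that branch targets may no longer start on an aligned address, which is the speed concern mentioned above.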