mailing list of musl libc
* [musl] Dropping -Os
@ 2023-05-22 16:48 Rich Felker
  2023-05-23 22:20 ` Szabolcs Nagy
  0 siblings, 1 reply; 2+ messages in thread
From: Rich Felker @ 2023-05-22 16:48 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 1015 bytes --]

It's been known for a long time now that -Os is bad, mainly because it
imposes a few really ugly pessimizations without their own switches,
like forcing use of div instructions for div-by-constant instead of
allowing strength reduction to a mul (because the mul takes a couple
more bytes of .text O_o).
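
As an illustrative sketch of the strength reduction in question (not taken
from musl or from GCC's output; the function names and the choice of divisor
10 are assumptions made for the example), an unsigned 32-bit division by a
constant can be written as a multiply and a shift:

```c
#include <stdint.h>

/* Plain division by a constant. A compiler that declines to
 * strength-reduce will typically emit a hardware divide here. */
static uint32_t div10_plain(uint32_t x)
{
	return x / 10;
}

/* Hand-written multiply-and-shift equivalent of x / 10.
 * 0xCCCCCCCD is ceil(2^35 / 10); the product needs 64 bits. */
static uint32_t div10_mul(uint32_t x)
{
	return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* Hypothetical helper: count inputs on which the two forms disagree,
 * sampling every `step`-th value of the 32-bit range (step must be
 * nonzero). */
static unsigned long count_mismatches(uint32_t step)
{
	unsigned long bad = 0;
	uint32_t x = 0;
	for (;;) {
		if (div10_plain(x) != div10_mul(x)) bad++;
		if (x > UINT32_MAX - step) break; /* next step would wrap */
		x += step;
	}
	return bad;
}
```

The two functions should agree on every 32-bit input. The multiply form
usually costs a few more bytes of code than a single div instruction, which
is the tradeoff described above.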

The attached proposed change switches over to starting with -O2 and
patching it up with the actually-desirable parts of -Os.

AIUI, at least with GCC this has other side effects, because the -O3
used with OPTIMIZE_GLOBS (--enable-optimize for particular components)
will not override explicit -f options. So there might be more work
that should be done splitting out the size/speed CFLAGS into separate
variables and only applying one to each file, rather than putting -O3
on top like we do now. Or it might not matter.

It's also perhaps worth considering whether this breakdown still makes
sense, or if there are unified options that would have low size cost
but achieve the bulk of the benefit of -O3.

Rich

[-- Attachment #2: 0001-configure-replace-Os-with-equivalent-based-on-O2.patch --]
[-- Type: text/plain, Size: 1930 bytes --]

From b90841e2583237a4132bbbd74752e0e9563660cd Mon Sep 17 00:00:00 2001
From: Rich Felker <dalias@aerifal.cx>
Date: Sun, 21 May 2023 12:16:11 -0400
Subject: [PATCH] configure: replace -Os with equivalent based on -O2

aside from the documented differences, which are the contents of this
patch, GCC's -Os also has hard-coded unwanted behaviors which are
impossible to override, like refusing to strength-reduce division by a
constant to multiplication, presumably because the div saves a couple
bytes of code. for this reason, getting rid of -Os and switching to an
equivalent default optimization profile based on -O2 has been a
long-term goal.

as follow-ups, it may make sense to evaluate which of these variations
from -O2 actually do anything useful, and eliminate the ones which are
not helpful or which throw away performance for insignificant size
savings. but for now, I've replicated -Os as closely as possible to
provide a baseline for such evaluation.
---
 configure | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 853bf05e..0b966ede 100755
--- a/configure
+++ b/configure
@@ -444,7 +444,20 @@ xno|x) printf "disabled\n" ; optimize=no ;;
 *) printf "custom\n" ;;
 esac
 
-test "$optimize" = no || tryflag CFLAGS_AUTO -Os || tryflag CFLAGS_AUTO -O2
+if test "$optimize" = no ; then :
+else
+tryflag CFLAGS_AUTO -O2
+tryflag CFLAGS_AUTO -fno-align-jumps
+tryflag CFLAGS_AUTO -fno-align-functions
+tryflag CFLAGS_AUTO -fno-align-loops
+tryflag CFLAGS_AUTO -fno-align-labels
+tryflag CFLAGS_AUTO -fira-region=one
+tryflag CFLAGS_AUTO -fira-hoist-pressure
+tryflag CFLAGS_AUTO -freorder-blocks-algorithm=simple \
+|| tryflag CFLAGS_AUTO -fno-reorder-blocks
+tryflag CFLAGS_AUTO -fno-prefetch-loop-arrays
+tryflag CFLAGS_AUTO -fno-tree-ch
+fi
 test "$optimize" = yes && optimize="internal,malloc,string"
 
 if fnmatch 'no|size' "$optimize" ; then :
-- 
2.21.0



* Re: [musl] Dropping -Os
  2023-05-22 16:48 [musl] Dropping -Os Rich Felker
@ 2023-05-23 22:20 ` Szabolcs Nagy
  0 siblings, 0 replies; 2+ messages in thread
From: Szabolcs Nagy @ 2023-05-23 22:20 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

* Rich Felker <dalias@libc.org> [2023-05-22 12:48:42 -0400]:
> It's been known for a long time now that -Os is bad, mainly because it
> imposes a few really ugly pessimizations without their own switches,
> like forcing use of div instructions for div-by-constant instead of
> allowing strength reduction to a mul (because the mul takes a couple
> more bytes of .text O_o).
> 
> The attached proposed change switches over to starting with -O2 and
> patching it up with the actually-desirable parts of -Os.
> 
> AIUI, at least with GCC this has other side effects, because the -O3
> used with OPTIMIZE_GLOBS (--enable-optimize for particular components)
> will not override explicit -f options. So there might be more work
> that should be done splitting out the size/speed CFLAGS into separate
> variables and only applying one to each file, rather than putting -O3
> on top like we do now. Or it might not matter.
> 
> It's also perhaps worth considering whether this breakdown still makes
> sense, or if there are unified options that would have low size cost
> but achieve the bulk of the benefit of -O3.

sounds good.

on aarch64 with gcc 12, going from -Os to -O2 plus the -f* options is
a 4% size increase, and going from -Os to plain -O2 is a 10% size
increase (libc.so).

the 4% seems to be mostly inlining decisions (including inlining
small memset/memcpy).

the -f options do change -O3 code gen, but i can't tell just by
looking at the asm how much that matters.

