From mboxrd@z Thu Jan  1 00:00:00 1970
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on inbox.vuxu.org
X-Spam-Level:
X-Spam-Status: No, score=-1.0 required=5.0 tests=MAILING_LIST_MULTI,
	RCVD_IN_MSPIKE_H2,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no
	version=3.4.4
Received: (qmail 20301 invoked from network); 22 May 2023 16:49:00 -0000
Received: from second.openwall.net (193.110.157.125)
  by inbox.vuxu.org with ESMTPUTF8; 22 May 2023 16:49:00 -0000
Received: (qmail 5717 invoked by uid 550); 22 May 2023 16:48:55 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-ID:
Reply-To: musl@lists.openwall.com
Received: (qmail 5683 invoked from network); 22 May 2023 16:48:54 -0000
Date: Mon, 22 May 2023 12:48:42 -0400
From: Rich Felker
To: musl@lists.openwall.com
Message-ID: <20230522164841.GS4163@brightrain.aerifal.cx>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="nqkreNcslJAfgyzk"
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
Subject: [musl] Dropping -Os

--nqkreNcslJAfgyzk
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

It's been known for a long time now that -Os is bad, mainly because it
imposes a few really ugly pessimizations without their own switches,
like forcing use of div instructions for div-by-constant instead of
allowing strength reduction to a mul (because the mul takes a couple
more bytes of .text O_o).

The attached proposed change switches over to starting with -O2 and
patching it up with the actually-desirable parts of -Os.

AIUI, at least with GCC this has other side effects, because the -O3
used with OPTIMIZE_GLOBS (--enable-optimize for particular components)
will not override explicit -f options. So there might be more work
that should be done splitting out the size/speed CFLAGS into separate
variables and only applying one to each file, rather than putting -O3
on top like we do now.
Or it might not matter.

It's also perhaps worth considering whether this breakdown still makes
sense, or if there are unified options that would have low size cost
but achieve the bulk of the benefit of -O3.

Rich

--nqkreNcslJAfgyzk
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="0001-configure-replace-Os-with-equivalent-based-on-O2.patch"

>From b90841e2583237a4132bbbd74752e0e9563660cd Mon Sep 17 00:00:00 2001
From: Rich Felker
Date: Sun, 21 May 2023 12:16:11 -0400
Subject: [PATCH] configure: replace -Os with equivalent based on -O2

aside from the documented differences, which are the contents of this
patch, GCC's -Os also has hard-coded unwanted behaviors which are
impossible to override, like refusing to strength-reduce division by a
constant to multiplication, presumably because the div saves a couple
bytes of code. for this reason, getting rid of -Os and switching to an
equivalent default optimization profile based on -O2 has been a
long-term goal.

as follow-ups, it may make sense to evaluate which of these variations
from -O2 actually do anything useful, and eliminate the ones which are
not helpful or which throw away performance for insignificant size
savings. but for now, I've replicated -Os as closely as possible to
provide a baseline for such evaluation.
---
 configure | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 853bf05e..0b966ede 100755
--- a/configure
+++ b/configure
@@ -444,7 +444,20 @@ xno|x) printf "disabled\n" ; optimize=no ;;
 *) printf "custom\n" ;;
 esac
 
-test "$optimize" = no || tryflag CFLAGS_AUTO -Os || tryflag CFLAGS_AUTO -O2
+if test "$optimize" = no ; then :
+else
+tryflag CFLAGS_AUTO -O2
+tryflag CFLAGS_AUTO -fno-align-jumps
+tryflag CFLAGS_AUTO -fno-align-functions
+tryflag CFLAGS_AUTO -fno-align-loops
+tryflag CFLAGS_AUTO -fno-align-labels
+tryflag CFLAGS_AUTO -fira-region=one
+tryflag CFLAGS_AUTO -fira-hoist-pressure
+tryflag CFLAGS_AUTO -freorder-blocks-algorithm=simple \
+|| tryflag CFLAGS_AUTO -fno-reorder-blocks
+tryflag CFLAGS_AUTO -fno-prefetch-loop-arrays
+tryflag CFLAGS_AUTO -fno-tree-ch
+fi
 test "$optimize" = yes && optimize="internal,malloc,string"
 
 if fnmatch 'no|size' "$optimize" ; then :
-- 
2.21.0

--nqkreNcslJAfgyzk--
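
The patch relies on two properties of a tryflag-style helper: each flag is appended only if the compiler accepts it, and the helper's exit status lets a rejected flag fall through to an alternative via ||. The following is a minimal sketch of that pattern, not musl's actual configure code. The tryflag body is simplified, and fake_cc is a hypothetical stand-in for a compiler that does not know -freorder-blocks-algorithm, so the example runs without a real toolchain.

```shell
#!/bin/sh
# Hypothetical stand-in for a compiler that lacks
# -freorder-blocks-algorithm: it "rejects" only that option
# and accepts everything else. Not a real compiler.
fake_cc () {
    for arg in "$@" ; do
        case "$arg" in
        -freorder-blocks-algorithm=*) return 1 ;;
        esac
    done
    return 0
}

# Simplified sketch of a tryflag-style helper (assumed shape, not
# copied from musl): append the flag in $2 to the variable named
# by $1 if $CC accepts it, and report the result via exit status.
tryflag () {
    if $CC "$2" -c -o /dev/null /dev/null >/dev/null 2>&1 ; then
        eval "$1=\"\${$1} \$2\""
        return 0
    fi
    return 1
}

CC=fake_cc
CFLAGS_AUTO=

tryflag CFLAGS_AUTO -O2
# First choice is rejected by fake_cc, so the fallback is tried.
tryflag CFLAGS_AUTO -freorder-blocks-algorithm=simple \
|| tryflag CFLAGS_AUTO -fno-reorder-blocks

# Drop the leading space left by the first append.
CFLAGS_AUTO=${CFLAGS_AUTO# }
printf '%s\n' "$CFLAGS_AUTO"
```

With a compiler that does accept -freorder-blocks-algorithm=simple, the first call succeeds and the || branch is never run, so only one of the two block-reordering options ends up in CFLAGS_AUTO.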