From: Markus Wichmann
Date: Wed, 17 Apr 2024 03:56:05 +0200
To: musl@lists.openwall.com
Cc: Viktor Reznov
Subject: Re: [musl] [PATCH] Decreasing the number of divisions

On Wed, Apr 17, 2024 at 01:25:18AM +0000, NRK wrote:
> > I played around with this change on godbolt: https://godbolt.org/z/9PoGK9zae
>
> You're looking at clang -O3, if you use gcc -Os (usual for musl
> users/distros) you'll notice that gcc actually ends up emitting a div
> instruction, which are known to be slow.
>
> But I don't think trying to optimize around gcc's bad codegen is the
> right move. It's better to just not use -Os with gcc. Which musl already
> does since commit b90841e25832.
>
> - NRK

Well, yeah, if you use -Os and expect fast code, you are doing it wrong.
-Os explicitly asks for small code rather than fast.
It is appropriate for code that ends up having to fit in a ROM or a
boot sector or something, but otherwise I can't really see the point.

I remember once reading the insane claim that -Os code ends up being
faster than -O3 because the code fits in cache, but much as with the OP
in this thread, there was no benchmark to actually show this. I call it
insane because -O3 is telling GCC explicitly to output the fastest code
possible, and in my experience it generally does.

Ciao,
Markus
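
For anyone who wants to look at the codegen difference discussed in this
thread, here is a minimal sketch. It is not taken from the patch under
discussion: the function names div10_plain, div10_mulshift and
check_div10 and the choice of 10 as divisor are made up for
illustration. It shows a division by a constant next to the
multiply-and-shift form that an optimizing compiler may substitute for
it, plus a spot check that the two agree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain division by a constant. The compiler decides whether this
 * becomes a div instruction or a multiply-and-shift sequence. */
uint32_t div10_plain(uint32_t x)
{
	return x / 10;
}

/* Hand-written multiply-and-shift equivalent (illustrative only).
 * 0xCCCCCCCD is ceil(2^35 / 10), so the 64-bit product shifted right
 * by 35 gives floor(x / 10) for every 32-bit x. */
uint32_t div10_mulshift(uint32_t x)
{
	return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* Spot-check that both forms agree, including at the range edges. */
void check_div10(void)
{
	static const uint32_t samples[] = {
		0, 1, 9, 10, 11, 99, 100, 12345,
		0x7fffffffu, 0x80000000u, 0xfffffffeu, 0xffffffffu
	};
	for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
		assert(div10_plain(samples[i]) == div10_mulshift(samples[i]));
}
```

Compiling this once with gcc -Os -S and once with gcc -O2 -S and
comparing the assembly for div10_plain should show which form your
compiler picks. The result depends on compiler version and target, so
treat it as something to check, not a given, and it says nothing about
overall speed without an actual benchmark.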