From: Michael Clark
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Do not use 64 bit division if possible
Date: Sun, 26 Nov 2017 12:15:06 +1300
Message-ID: <9716E0B3-B86C-4CFF-8636-6DE4BAA0D716@mac.com>
References: <424674f0-8460-7807-7366-a87d8588e8bc@davidgf.es>
In-Reply-To: <424674f0-8460-7807-7366-a87d8588e8bc@davidgf.es>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

At -O0 and above, clang and gcc strength-reduce division by a constant
power of two into a right shift (arithmetic or logical depending on
the signedness of the types).

- https://cx.rv8.io/g/kDrEkB

a_ctz_l is not exactly inexpensive, given that it needs a multiply, an
AND, a negate, a shift and a load (which may miss in the cache).

We'd be better off defining PAGE_SHIFT if we want to be certain the
code uses a shift when optimisation is disabled; however, I trust the
compilers to turn the division into a shift.

    #ifndef a_ctz_l
    #define a_ctz_l a_ctz_l
    static inline int a_ctz_l(unsigned long x)
    {
        static const char debruijn32[32] = {
            0, 1, 23, 2, 29, 24, 19, 3, 30, 27, 25, 11, 20, 8, 4, 13,
            31, 22, 28, 18, 26, 10, 7, 12, 21, 17, 9, 6, 16, 5, 15, 14
        };
        if (sizeof(long) == 8) return a_ctz_64(x);
        return debruijn32[(x&-x)*0x076be629 >> 27];
    }
    #endif

If you study the codegen then this might be a better change (including
to all other archs).

$ git diff arch/x86_64/bits/limits.h
diff --git a/arch/x86_64/bits/limits.h b/arch/x86_64/bits/limits.h
index 792a30b..32f29bf 100644
--- a/arch/x86_64/bits/limits.h
+++ b/arch/x86_64/bits/limits.h
@@ -1,6 +1,6 @@
 #if defined(_POSIX_SOURCE) || defined(_POSIX_C_SOURCE) \
  || defined(_XOPEN_SOURCE) || defined(_GNU_SOURCE) || defined(_BSD_SOURCE)
-#define PAGE_SIZE 4096
+#define PAGE_SIZE 4096UL
 #define LONG_BIT 64
 #endif

Try removing the UL suffix from the constant in the compiler explorer
example above and see the change in codegen.

> On 26/11/2017, at 9:52 AM, David Guillen Fandos wrote:
>
> Hey there,
>
> Just noticed that my binary was getting some gcc functions for integer
> division in some places coming from musl. I checked and it seems that,
> even though musl assumes PAGE_SIZE is always power of two, that we
> divide by it instead of using shifts for that. This results in extra
> overhead and slow division on platforms that do not have a 64 bit
> divider (even the ones that do have 32 bit divider).
>
> So I propose a patch here, let me know what you people think about.
>
> David
>
>
> diff --git a/src/conf/sysconf.c b/src/conf/sysconf.c
> index b8b761d0..aa9fc9d1 100644
> --- a/src/conf/sysconf.c
> +++ b/src/conf/sysconf.c
> @@ -4,6 +4,7 @@ long sysconf(int name)
>  #include
>  #include "syscall.h"
>  #include "libc.h"
> +#include "atomic.h"
>
>  #define JT(x) (-256|(x))
>  #define VER JT(1)
> @@ -206,7 +206,7 @@ long sysconf(int name)
>  		if (name==_SC_PHYS_PAGES) mem = si.totalram;
>  		else mem = si.freeram + si.bufferram;
>  		mem *= si.mem_unit;
> -		mem /= PAGE_SIZE;
> +		mem >>= (unsigned)(a_ctz_l(PAGE_SIZE));
>  		return (mem > LONG_MAX) ? LONG_MAX : mem;
>  	case JT_ZERO & 255:
>  		return 0;
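
For readers who want to experiment with the two approaches discussed in this thread, here is a minimal standalone sketch in C. It is not musl code: the names ctz32_debruijn, bytes_to_pages and EXAMPLE_PAGE_SIZE are hypothetical and chosen only for illustration, and the page size of 4096 is an assumption. Only the lookup table and the multiplier 0x076be629 are taken from the a_ctz_l definition quoted in the message. The fragment shows that, for an unsigned value and a power-of-two page size, shifting by the trailing-zero count gives the same result as dividing.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative value only (assumption); musl takes PAGE_SIZE from bits/limits.h. */
#define EXAMPLE_PAGE_SIZE 4096UL

/* Hypothetical standalone copy of the 32-bit de Bruijn lookup from a_ctz_l. */
static int ctz32_debruijn(uint32_t x)
{
	static const char debruijn32[32] = {
		0, 1, 23, 2, 29, 24, 19, 3, 30, 27, 25, 11, 20, 8, 4, 13,
		31, 22, 28, 18, 26, 10, 7, 12, 21, 17, 9, 6, 16, 5, 15, 14
	};
	/* The result is meaningless for zero, so reject it up front. */
	assert(x != 0);
	/* x & -x isolates the lowest set bit. Multiplying by the de Bruijn
	 * constant places a unique 5-bit pattern in the top bits, and that
	 * pattern indexes the table. */
	return debruijn32[(uint32_t)((x & -x) * 0x076be629u) >> 27];
}

/* Convert a byte count to whole pages with a shift instead of a division.
 * Valid only because EXAMPLE_PAGE_SIZE is a power of two and the operand
 * is unsigned. */
static unsigned long long bytes_to_pages(unsigned long long bytes)
{
	return bytes >> ctz32_debruijn((uint32_t)EXAMPLE_PAGE_SIZE);
}
```

Calling bytes_to_pages(8192) yields the same value as 8192 / EXAMPLE_PAGE_SIZE. Comparing the generated code for the two forms in a compiler explorer, with and without the UL suffix on the constant, is one way to check the claims made in the thread on a given target.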