From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Guillen Fandos
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Do not use 64 bit division if possible
Date: Sun, 26 Nov 2017 01:49:09 +0100
Message-ID: <796e366e-f321-25a3-78e7-8a3800e62eeb@davidgf.es>
References: <424674f0-8460-7807-7366-a87d8588e8bc@davidgf.es>
 <9716E0B3-B86C-4CFF-8636-6DE4BAA0D716@mac.com>
 <5575a0c9-0f53-f8e7-e0dc-6c1ff2b594f7@davidgf.es>
 <20171125235333.GQ1627@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com
Xref: news.gmane.org gmane.linux.lib.musl.general:12128

Hey,

Wow, that's an awesome optimization (the a&-a); I didn't know gcc was
smart enough to figure that out by itself :D

I just realized that PAGE_SIZE is indeed defined as a constant on some
architectures. I did not notice because I was running on MIPS, where the
page size differs per uarch.

I'd say the (a&-a) trick is a very simple optimization and we should use
it: it adds almost no complexity and saves some cycles and some .text
bytes, and .text space is sometimes a bit tight.

Something like this?
Doesn't hurt constants, improves some arches :)

diff --git a/src/conf/sysconf.c b/src/conf/sysconf.c
index b8b761d0..aa9fc9d1 100644
--- a/src/conf/sysconf.c
+++ b/src/conf/sysconf.c
@@ -206,7 +206,7 @@ long sysconf(int name)
 		if (name==_SC_PHYS_PAGES) mem = si.totalram;
 		else mem = si.freeram + si.bufferram;
 		mem *= si.mem_unit;
-		mem /= PAGE_SIZE;
+		mem /= (unsigned)(PAGE_SIZE & -PAGE_SIZE);
 		return (mem > LONG_MAX) ? LONG_MAX : mem;
 	case JT_ZERO & 255:
 		return 0;

On 26/11/17 01:10, Michael Clark wrote:
>
>
>> On 26/11/2017, at 12:53 PM, Rich Felker wrote:
>>
>> On Sun, Nov 26, 2017 at 12:46:56AM +0100, David Guillen Fandos wrote:
>>> Thanks for your response.
>>> Please note that PAGE_SIZE is not a constant but an alias to
>>> libc.page_size which is a variable of type size_t (signed).
>>> That's why at O1+ gcc doesn't generate a shift.
>>
>> Indeed; this varies by arch.
>
> Oh, I wasn’t aware of that.
>
>>> I also created a patch to include libc.page_shift, but as far as I
>>> can see no other functions would benefit from it, since there's no
>>> other divides there (only negations, additions and subtractions).
>>
>> Adding infrastructure complexity except in cases where it makes a
>> significant improvement to size or performance is generally not
>> desirable. mmap() is one other place where, in principle, division by
>> PAGE_SIZE might take place, but in practice the size is constant 4096
>> or 8192 on all archs.
>>
>>> And yeah I agree, a_ctz_l is not exactly inexpensive but I guess it
>>> is better than full 64 bit signed division (that's why I cast
>>> unsigned otherwise the shift right is not trivial due to the sign).
>>
>> The cost here is more a matter of adding a reading complexity
>> dependency on musl internals (a_*) where it's not needed. I wonder if
>> GCC could optimize it if we instead of /PAGE_SIZE wrote
>> /(PAGE_SIZE&-PAGE_SIZE). Or if we did something like define PAGE_SIZE
>> as ((libc.page_size&-libc.page_size)==libc.page_size ? libc.page_size
>> : 1/0) so that "PAGE_SIZE is not a power of 2" would become an
>> unreachable case.
>
> Interesting. It seems GCC figures out the division by zero is
> unreachable but the (n&-n) expression leads to a power of two, not to
> a log2 n so the ctz is still required.
>
> - https://cx.rv8.io/g/eHf2Ah
>
> One could do so once at initialisation time and add PAGE_SHIFT and on
> architectures with variable page sizes do this:
>
> #define PAGE_SHIFT libc.page_shift
>
> diff --git a/src/env/__libc_start_main.c b/src/env/__libc_start_main.c
> index 2d758af..f24d10a 100644
> --- a/src/env/__libc_start_main.c
> +++ b/src/env/__libc_start_main.c
> @@ -29,6 +29,7 @@ void __init_libc(char **envp, char *pn)
>  	__hwcap = aux[AT_HWCAP];
>  	__sysinfo = aux[AT_SYSINFO];
>  	libc.page_size = aux[AT_PAGESZ];
> +	libc.page_shift = a_ctz_l(libc.page_size);
>  
>  	if (!pn) pn = (void*)aux[AT_EXECFN];
>  	if (!pn) pn = "";
>
> That isolates the a_ctz_l to one place.
>
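
For reference, the two approaches discussed in this thread can be compared
outside of musl with a small standalone sketch. None of this is musl code:
lowest_set_bit, ctz_ul, pages_by_div and pages_by_shift are hypothetical
names chosen for illustration, and ctz_ul is a portable loop standing in
for the internal a_ctz_l. The sketch assumes the page size is a nonzero
power of two, in which case x & -x is x itself and dividing by it gives
the same result as shifting right by the number of trailing zero bits.

```c
#include <assert.h>
#include <limits.h>

/* Isolate the lowest set bit of x. Unsigned negation is modular, so this
 * is well defined; for a power of two the result is x itself. */
static unsigned long lowest_set_bit(unsigned long x)
{
	return x & -x;
}

/* Count trailing zero bits with a plain loop. Hypothetical stand-in for
 * musl's internal a_ctz_l; x must be nonzero. */
static int ctz_ul(unsigned long x)
{
	int n = 0;
	assert(x != 0);
	while (!(x & 1)) {
		x >>= 1;
		n++;
	}
	return n;
}

/* Bytes to pages by dividing by (page_size & -page_size), in the spirit
 * of the sysconf patch above. Whether the compiler turns this into a
 * shift depends on the compiler and target. */
static long pages_by_div(unsigned long long mem, unsigned long page_size)
{
	mem /= lowest_set_bit(page_size);
	return (mem > LONG_MAX) ? LONG_MAX : (long)mem;
}

/* Bytes to pages by shifting, in the spirit of the page_shift proposal.
 * Here the shift is recomputed on every call; the quoted proposal would
 * compute it once at startup instead. */
static long pages_by_shift(unsigned long long mem, unsigned long page_size)
{
	/* Assumption of this sketch: page_size is a nonzero power of two. */
	assert(page_size != 0 && lowest_set_bit(page_size) == page_size);
	mem >>= ctz_ul(page_size);
	return (mem > LONG_MAX) ? LONG_MAX : (long)mem;
}
```

With a 4096-byte page, both helpers map 1ULL<<32 bytes to the same page
count. The division variant keeps the call site free of any ctz helper,
while the shift variant avoids the division entirely at the cost of
computing (or storing) the shift amount.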