mailing list of musl libc
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: Do not use 64 bit division if possible
Date: Sat, 25 Nov 2017 18:53:33 -0500
Message-ID: <20171125235333.GQ1627@brightrain.aerifal.cx>
In-Reply-To: <5575a0c9-0f53-f8e7-e0dc-6c1ff2b594f7@davidgf.es>

On Sun, Nov 26, 2017 at 12:46:56AM +0100, David Guillen Fandos wrote:
> Thanks for your response.
> Please note that PAGE_SIZE is not a constant but an alias to
> libc.page_size which is a variable of type size_t (unsigned).
> That's why at O1+ gcc doesn't generate a shift.

Indeed; this varies by arch.
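
To make the distinction concrete, here is a minimal sketch in C. All
names in it (CONST_PAGE_SIZE, runtime_page_size, pages_const,
pages_runtime) are hypothetical and not taken from musl;
runtime_page_size only stands in for a value like libc.page_size that
is not known until run time.

```c
#include <stddef.h>

/* Hypothetical compile-time constant page size. */
#define CONST_PAGE_SIZE 4096UL

/* Hypothetical stand-in for a page size that is only known at run time. */
size_t runtime_page_size = 4096;

/* The divisor is a constant power of two, so an optimizing compiler
 * is free to replace the division with a right shift. */
static unsigned long long pages_const(unsigned long long bytes)
{
	return bytes / CONST_PAGE_SIZE;
}

/* The divisor is loaded from a variable, so the compiler cannot assume
 * it is a power of two and generally has to emit a real division. On
 * 32-bit targets without a 64-bit divider this usually means a call
 * into a compiler runtime helper such as libgcc's __udivdi3. */
static unsigned long long pages_runtime(unsigned long long bytes)
{
	return bytes / runtime_page_size;
}
```

Both functions return the same values for the same page size; only the
generated code differs, which is easiest to confirm by inspecting the
compiler output for the target in question.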

> I also created a patch to include libc.page_shift, but as far as I
> can see no other functions would benefit from it, since there's no
> other divides there (only negations, additions and subtractions).

Adding infrastructure complexity except in cases where it makes a
significant improvement to size or performance is generally not
desirable. mmap() is one other place where, in principle, division by
PAGE_SIZE might take place, but in practice the size is constant 4096
or 8192 on all archs.

> And yeah I agree, a_ctz_l is not exactly inexpensive but I guess it
> is better than full 64 bit signed division (that's why I cast
> unsigned otherwise the shift right is not trivial due to the sign).

The cost here is more a matter of added reading complexity and a
dependency on musl internals (a_*) where it's not needed. I wonder if
GCC could optimize it if, instead of /PAGE_SIZE, we wrote
/(PAGE_SIZE&-PAGE_SIZE). Or if we did something like define PAGE_SIZE
as ((libc.page_size&-libc.page_size)==libc.page_size ? libc.page_size
: 1/0) so that "PAGE_SIZE is not a power of 2" would become an
unreachable case.
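
As a rough illustration of the x&-x idea, the following C sketch shows
that the expression isolates the lowest set bit and that a value is a
power of two exactly when it equals that bit. The helper names
(lowest_set_bit, is_power_of_two, pages_via_low_bit) are hypothetical
and do not exist in musl. Whether a given compiler actually turns the
division into a shift is not something this sketch demonstrates; that
would have to be checked in the generated code.

```c
#include <stddef.h>

/* x & -x keeps only the lowest set bit of an unsigned value. */
static size_t lowest_set_bit(size_t x)
{
	return x & -x;
}

/* A nonzero value is a power of two exactly when it equals its own
 * lowest set bit. */
static int is_power_of_two(size_t x)
{
	return x != 0 && lowest_set_bit(x) == x;
}

/* Hypothetical helper mirroring the /(PAGE_SIZE&-PAGE_SIZE) form:
 * the divisor is always a power of two by construction, and equals
 * page_size whenever page_size itself is a power of two. */
static unsigned long long pages_via_low_bit(unsigned long long bytes,
                                            size_t page_size)
{
	return bytes / (page_size & -page_size);
}
```

For a page size that really is a power of two, pages_via_low_bit gives
the same result as a plain division by the page size.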

Rich



> On 26/11/17 00:15, Michael Clark wrote:
> >At -O1 and above, clang and gcc strength-reduce division by a constant power of two into a right shift (arithmetic or logical depending on signedness of the types).
> >
> >- https://cx.rv8.io/g/kDrEkB
> >
> >a_ctz_l is not exactly inexpensive, given it has a multiply, AND, negate, shift, and load (possible cache miss).
> >
> >We’d be better off defining PAGE_SHIFT if we want to be certain the code uses a shift when optimisation is disabled; however, I trust the compilers to turn the division into a shift.
> >
> >#ifndef a_ctz_l
> >#define a_ctz_l a_ctz_l
> >static inline int a_ctz_l(unsigned long x)
> >{
> >         static const char debruijn32[32] = {
> >                 0, 1, 23, 2, 29, 24, 19, 3, 30, 27, 25, 11, 20, 8, 4, 13,
> >                 31, 22, 28, 18, 26, 10, 7, 12, 21, 17, 9, 6, 16, 5, 15, 14
> >         };
> >         if (sizeof(long) == 8) return a_ctz_64(x);
> >         return debruijn32[(x&-x)*0x076be629 >> 27];
> >}
> >#endif
> >
> >If you study the codegen then this might be a better change (including to all other archs).
> >
> >$ git diff arch/x86_64/bits/limits.h
> >diff --git a/arch/x86_64/bits/limits.h b/arch/x86_64/bits/limits.h
> >index 792a30b..32f29bf 100644
> >--- a/arch/x86_64/bits/limits.h
> >+++ b/arch/x86_64/bits/limits.h
> >@@ -1,6 +1,6 @@
> >  #if defined(_POSIX_SOURCE) || defined(_POSIX_C_SOURCE) \
> >   || defined(_XOPEN_SOURCE) || defined(_GNU_SOURCE) || defined(_BSD_SOURCE)
> >-#define PAGE_SIZE 4096
> >+#define PAGE_SIZE 4096UL
> >  #define LONG_BIT 64
> >  #endif
> >
> >Try removing the UL suffix from the constant in the compiler explorer example above and see the change in codegen.
> >
> >>On 26/11/2017, at 9:52 AM, David Guillen Fandos <david@davidgf.es> wrote:
> >>
> >>Hey there,
> >>
> >>Just noticed that my binary was getting some gcc functions for integer division in some places coming from musl. I checked and it seems that, even though musl assumes PAGE_SIZE is always a power of two, we divide by it instead of using shifts. This results in extra overhead and slow division on platforms that do not have a 64 bit divider (even the ones that do have a 32 bit divider).
> >>
> >>So I propose a patch here, let me know what you think about it.
> >>
> >>David
> >>
> >>
> >>diff --git a/src/conf/sysconf.c b/src/conf/sysconf.c
> >>index b8b761d0..aa9fc9d1 100644
> >>--- a/src/conf/sysconf.c
> >>+++ b/src/conf/sysconf.c
> >>@@ -4,6 +4,7 @@ long sysconf(int name)
> >>#include <sys/sysinfo.h>
> >>#include "syscall.h"
> >>#include "libc.h"
> >>+#include "atomic.h"
> >>
> >>#define JT(x) (-256|(x))
> >>#define VER JT(1)
> >>@@ -206,7 +207,7 @@ long sysconf(int name)
> >>		if (name==_SC_PHYS_PAGES) mem = si.totalram;
> >>		else mem = si.freeram + si.bufferram;
> >>		mem *= si.mem_unit;
> >>-		mem /= PAGE_SIZE;
> >>+		mem >>= (unsigned)(a_ctz_l(PAGE_SIZE));
> >>		return (mem > LONG_MAX) ? LONG_MAX : mem;
> >>		case JT_ZERO & 255:
> >>		return 0;
> >
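
For reference, a small self-contained C sketch of the two techniques
quoted above: counting trailing zeros with the de Bruijn table, and
using that count to shift instead of divide. It reuses the table and
multiplier from the quoted a_ctz_l, but the function names
(ctz32_debruijn, pages_by_shift) are hypothetical, the code is fixed to
32-bit inputs, and it assumes the page size is a nonzero power of two.
It is not the musl implementation.

```c
#include <stdint.h>

/* Count trailing zeros of a nonzero 32-bit value. x & -x isolates the
 * lowest set bit; multiplying by the de Bruijn constant moves a unique
 * 5-bit pattern into the top bits, which then indexes the table. */
static int ctz32_debruijn(uint32_t x)
{
	static const char debruijn32[32] = {
		0, 1, 23, 2, 29, 24, 19, 3, 30, 27, 25, 11, 20, 8, 4, 13,
		31, 22, 28, 18, 26, 10, 7, 12, 21, 17, 9, 6, 16, 5, 15, 14
	};
	return debruijn32[(uint32_t)((x & -x) * 0x076be629u) >> 27];
}

/* Hypothetical helper: number of whole pages in `bytes`, computed with
 * a shift. Only valid when page_size is a nonzero power of two. */
static unsigned long long pages_by_shift(unsigned long long bytes,
                                         uint32_t page_size)
{
	return bytes >> ctz32_debruijn(page_size);
}
```

With a power-of-two page size, pages_by_shift(bytes, page_size) equals
bytes / page_size, at the cost of the multiply and table load mentioned
earlier in the thread.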


Thread overview: 11+ messages
2017-11-25 20:52 David Guillen Fandos
2017-11-25 23:15 ` Michael Clark
2017-11-25 23:46   ` David Guillen Fandos
2017-11-25 23:53     ` Rich Felker [this message]
2017-11-26  0:10       ` Michael Clark
2017-11-26  0:49         ` David Guillen Fandos
2017-11-26  0:59           ` Rich Felker
2017-11-26  1:12             ` David Guillen Fandos
2017-11-26  1:23               ` Rich Felker
2017-11-26  1:40                 ` David Guillen Fandos
2017-11-26  0:49         ` Rich Felker
