* [PATCH] use lookup table for malloc bin index instead of float conversion
@ 2016-11-27 14:15 Szabolcs Nagy
2016-12-17 5:50 ` Rich Felker
0 siblings, 1 reply; 5+ messages in thread
From: Szabolcs Nagy @ 2016-11-27 14:15 UTC (permalink / raw)
To: musl
Float conversion is slow and large on soft-float targets.
The lookup table increases code size slightly on most hard-float targets
(and adds 64 bytes of rodata). Performance can be slightly worse because of
position-independent data access and dependence on CPU internal state
(cache, extra branches), but the overall effect should be minimal
(common, small allocations should be unaffected).
---
src/malloc/malloc.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/src/malloc/malloc.c b/src/malloc/malloc.c
index b90636c..ce2e97d 100644
--- a/src/malloc/malloc.c
+++ b/src/malloc/malloc.c
@@ -111,19 +111,29 @@ static int first_set(uint64_t x)
#endif
}
+static const unsigned char bin_tab[64] = {
+ 0, 0, 0, 0,32,33,34,35,36,36,37,37,38,38,39,39,
+ 40,40,40,40,41,41,41,41,42,42,42,42,43,43,43,43,
+ 44,44,44,44,44,44,44,44,45,45,45,45,45,45,45,45,
+ 46,46,46,46,46,46,46,46,47,47,47,47,47,47,47,47,
+};
+
static int bin_index(size_t x)
{
x = x / SIZE_ALIGN - 1;
if (x <= 32) return x;
+ if (x < 512) return bin_tab[x/8];
if (x > 0x1c00) return 63;
- return ((union { float v; uint32_t r; }){(int)x}.r>>21) - 496;
+ return bin_tab[x/128] + 16;
}
static int bin_index_up(size_t x)
{
x = x / SIZE_ALIGN - 1;
if (x <= 32) return x;
- return ((union { float v; uint32_t r; }){(int)x}.r+0x1fffff>>21) - 496;
+ x--;
+ if (x < 512) return bin_tab[x/8] + 1;
+ return bin_tab[x/128] + 17;
}
#if 0
--
2.10.2
* Re: [PATCH] use lookup table for malloc bin index instead of float conversion
2016-11-27 14:15 [PATCH] use lookup table for malloc bin index instead of float conversion Szabolcs Nagy
@ 2016-12-17 5:50 ` Rich Felker
2016-12-17 7:36 ` u-uy74
0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2016-12-17 5:50 UTC (permalink / raw)
To: musl
On Sun, Nov 27, 2016 at 03:15:41PM +0100, Szabolcs Nagy wrote:
> float conversion is slow and big on soft-float targets.
>
> The lookup table increases code size a bit on most hard float targets
> (and adds 64byte rodata), performance can be a bit slower because of
> position independent data access and cpu internal state dependence
> (cache, extra branches), but the overall effect should be minimal
> (common, small size allocations should be unaffected).
> ---
> src/malloc/malloc.c | 14 ++++++++++++--
> 1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/src/malloc/malloc.c b/src/malloc/malloc.c
> index b90636c..ce2e97d 100644
> --- a/src/malloc/malloc.c
> +++ b/src/malloc/malloc.c
> @@ -111,19 +111,29 @@ static int first_set(uint64_t x)
> #endif
> }
>
> +static const unsigned char bin_tab[64] = {
> + 0, 0, 0, 0,32,33,34,35,36,36,37,37,38,38,39,39,
> + 40,40,40,40,41,41,41,41,42,42,42,42,43,43,43,43,
> + 44,44,44,44,44,44,44,44,45,45,45,45,45,45,45,45,
> + 46,46,46,46,46,46,46,46,47,47,47,47,47,47,47,47,
> +};
> +
> static int bin_index(size_t x)
> {
> x = x / SIZE_ALIGN - 1;
> if (x <= 32) return x;
> + if (x < 512) return bin_tab[x/8];
> if (x > 0x1c00) return 63;
> - return ((union { float v; uint32_t r; }){(int)x}.r>>21) - 496;
> + return bin_tab[x/128] + 16;
> }
>
> static int bin_index_up(size_t x)
> {
> x = x / SIZE_ALIGN - 1;
> if (x <= 32) return x;
> - return ((union { float v; uint32_t r; }){(int)x}.r+0x1fffff>>21) - 496;
> + x--;
> + if (x < 512) return bin_tab[x/8] + 1;
> + return bin_tab[x/128] + 17;
> }
>
> #if 0
> --
> 2.10.2
Looks good mostly, but wouldn't it be better to drop the 4 unused
entries from the table and add -4's to the indices?
Rich
* Re: [PATCH] use lookup table for malloc bin index instead of float conversion
2016-12-17 5:50 ` Rich Felker
@ 2016-12-17 7:36 ` u-uy74
2016-12-17 14:03 ` [PATCH v2] " Szabolcs Nagy
0 siblings, 1 reply; 5+ messages in thread
From: u-uy74 @ 2016-12-17 7:36 UTC (permalink / raw)
To: musl
On Sat, Dec 17, 2016 at 12:50:58AM -0500, Rich Felker wrote:
> On Sun, Nov 27, 2016 at 03:15:41PM +0100, Szabolcs Nagy wrote:
> > +static const unsigned char bin_tab[64] = {
> > + 0, 0, 0, 0,32,33,34,35,36,36,37,37,38,38,39,39,
> > + 40,40,40,40,41,41,41,41,42,42,42,42,43,43,43,43,
> > + 44,44,44,44,44,44,44,44,45,45,45,45,45,45,45,45,
> > + 46,46,46,46,46,46,46,46,47,47,47,47,47,47,47,47,
> > +};
> > +
> > static int bin_index(size_t x)
> > {
> > x = x / SIZE_ALIGN - 1;
> > if (x <= 32) return x;
> > + if (x < 512) return bin_tab[x/8];
> > if (x > 0x1c00) return 63;
> > - return ((union { float v; uint32_t r; }){(int)x}.r>>21) - 496;
> > + return bin_tab[x/128] + 16;
> > }
> >
> > static int bin_index_up(size_t x)
> > {
> > x = x / SIZE_ALIGN - 1;
> > if (x <= 32) return x;
> > - return ((union { float v; uint32_t r; }){(int)x}.r+0x1fffff>>21) - 496;
> > + x--;
> > + if (x < 512) return bin_tab[x/8] + 1;
> > + return bin_tab[x/128] + 17;
> > }
> >
> > #if 0
> > --
> > 2.10.2
> Looks good mostly, but wouldn't it be better to drop the 4 unused
> entries from the table and add -4's to the indices?
Wouldn't this enlarge the code more than reduce the data?
Rune
* [PATCH v2] use lookup table for malloc bin index instead of float conversion
2016-12-17 7:36 ` u-uy74
@ 2016-12-17 14:03 ` Szabolcs Nagy
2016-12-17 14:30 ` u-uy74
0 siblings, 1 reply; 5+ messages in thread
From: Szabolcs Nagy @ 2016-12-17 14:03 UTC (permalink / raw)
To: musl
Float conversion is slow and large on soft-float targets.
The lookup table increases code size slightly on most hard-float targets
(and adds 60 bytes of rodata). Performance can be slightly worse because of
position-independent data access and dependence on CPU internal state
(cache, extra branches), but the overall effect should be minimal
(common, small allocations should be unaffected).
---
* u-uy74@aetey.se <u-uy74@aetey.se> [2016-12-17 08:36:00 +0100]:
> On Sat, Dec 17, 2016 at 12:50:58AM -0500, Rich Felker wrote:
> > Looks good mostly, but wouldn't it be better to drop the 4 unused
> > entries from the table and add -4's to the indices?
>
> Wouldn't this enlarge the code more than reduce the data?
Most targets have a load instruction with a small immediate offset,
and on some targets the compiler emits a relocation against
(tab-4) instead of tab, so the code size is not affected.
src/malloc/malloc.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
diff --git a/src/malloc/malloc.c b/src/malloc/malloc.c
index b90636c..c38c46f 100644
--- a/src/malloc/malloc.c
+++ b/src/malloc/malloc.c
@@ -111,19 +111,29 @@ static int first_set(uint64_t x)
#endif
}
+static const unsigned char bin_tab[60] = {
+ 32,33,34,35,36,36,37,37,38,38,39,39,
+ 40,40,40,40,41,41,41,41,42,42,42,42,43,43,43,43,
+ 44,44,44,44,44,44,44,44,45,45,45,45,45,45,45,45,
+ 46,46,46,46,46,46,46,46,47,47,47,47,47,47,47,47,
+};
+
static int bin_index(size_t x)
{
x = x / SIZE_ALIGN - 1;
if (x <= 32) return x;
+ if (x < 512) return bin_tab[x/8-4];
if (x > 0x1c00) return 63;
- return ((union { float v; uint32_t r; }){(int)x}.r>>21) - 496;
+ return bin_tab[x/128-4] + 16;
}
static int bin_index_up(size_t x)
{
x = x / SIZE_ALIGN - 1;
if (x <= 32) return x;
- return ((union { float v; uint32_t r; }){(int)x}.r+0x1fffff>>21) - 496;
+ x--;
+ if (x < 512) return bin_tab[x/8-4] + 1;
+ return bin_tab[x/128-4] + 17;
}
#if 0
--
2.10.2
* Re: [PATCH v2] use lookup table for malloc bin index instead of float conversion
2016-12-17 14:03 ` [PATCH v2] " Szabolcs Nagy
@ 2016-12-17 14:30 ` u-uy74
0 siblings, 0 replies; 5+ messages in thread
From: u-uy74 @ 2016-12-17 14:30 UTC (permalink / raw)
To: musl
On Sat, Dec 17, 2016 at 03:03:24PM +0100, Szabolcs Nagy wrote:
> > > ... wouldn't it be better to drop the 4 unused
> > > entries from the table and add -4's to the indices?
>
> > Wouldn't this enlarge the code more than reduce the data?
>
> most targets have a load instruction with small offset
> and on some targets the compiler emits relocation against
> (tab-4) instead of tab so the code size is not affected.
Oh indeed, I overlooked this link-time evaluation. Thanks.
Rune