mailing list of musl libc
* [PATCH] use lookup table for malloc bin index instead of float conversion
@ 2016-11-27 14:15 Szabolcs Nagy
  2016-12-17  5:50 ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: Szabolcs Nagy @ 2016-11-27 14:15 UTC (permalink / raw)
  To: musl

Float conversion is slow and large on soft-float targets.

The lookup table increases code size slightly on most hard-float
targets (and adds 64 bytes of rodata). Performance can be slightly
worse because of position-independent data access and dependence on
CPU internal state (cache, extra branches), but the overall effect
should be minimal (common, small allocations should be unaffected).
---
 src/malloc/malloc.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/malloc/malloc.c b/src/malloc/malloc.c
index b90636c..ce2e97d 100644
--- a/src/malloc/malloc.c
+++ b/src/malloc/malloc.c
@@ -111,19 +111,29 @@ static int first_set(uint64_t x)
 #endif
 }
 
+static const unsigned char bin_tab[64] = {
+	 0, 0, 0, 0,32,33,34,35,36,36,37,37,38,38,39,39,
+	40,40,40,40,41,41,41,41,42,42,42,42,43,43,43,43,
+	44,44,44,44,44,44,44,44,45,45,45,45,45,45,45,45,
+	46,46,46,46,46,46,46,46,47,47,47,47,47,47,47,47,
+};
+
 static int bin_index(size_t x)
 {
 	x = x / SIZE_ALIGN - 1;
 	if (x <= 32) return x;
+	if (x < 512) return bin_tab[x/8];
 	if (x > 0x1c00) return 63;
-	return ((union { float v; uint32_t r; }){(int)x}.r>>21) - 496;
+	return bin_tab[x/128] + 16;
 }
 
 static int bin_index_up(size_t x)
 {
 	x = x / SIZE_ALIGN - 1;
 	if (x <= 32) return x;
-	return ((union { float v; uint32_t r; }){(int)x}.r+0x1fffff>>21) - 496;
+	x--;
+	if (x < 512) return bin_tab[x/8] + 1;
+	return bin_tab[x/128] + 17;
 }
 
 #if 0
-- 
2.10.2
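
A reader who wants to check that the table reproduces the float
conversion could compare both computations over the range the
allocator uses. The following is a minimal sketch, not part of the
patch: the names float_bits, old_index, new_index, old_index_up,
new_index_up and count_mismatches are hypothetical, and x is taken to
be the already scaled value (size / SIZE_ALIGN - 1). It also assumes
an IEEE 754 single-precision float, as the original code did.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 64-entry table copied from the patch above. */
static const unsigned char bin_tab[64] = {
	 0, 0, 0, 0,32,33,34,35,36,36,37,37,38,38,39,39,
	40,40,40,40,41,41,41,41,42,42,42,42,43,43,43,43,
	44,44,44,44,44,44,44,44,45,45,45,45,45,45,45,45,
	46,46,46,46,46,46,46,46,47,47,47,47,47,47,47,47,
};

/* Bit pattern of x converted to float (assumes IEEE 754 binary32). */
static uint32_t float_bits(size_t x)
{
	union { float v; uint32_t r; } u = { (float)(int)x };
	return u.r;
}

/* Old bin_index body: exponent plus top two mantissa bits. */
static int old_index(size_t x)
{
	if (x <= 32) return x;
	if (x > 0x1c00) return 63;
	return (float_bits(x) >> 21) - 496;
}

/* New bin_index body from the patch. */
static int new_index(size_t x)
{
	if (x <= 32) return x;
	if (x < 512) return bin_tab[x/8];
	if (x > 0x1c00) return 63;
	return bin_tab[x/128] + 16;
}

/* Old bin_index_up body: same as above, but rounding up. */
static int old_index_up(size_t x)
{
	if (x <= 32) return x;
	return ((float_bits(x) + 0x1fffff) >> 21) - 496;
}

/* New bin_index_up body from the patch. */
static int new_index_up(size_t x)
{
	if (x <= 32) return x;
	x--;
	if (x < 512) return bin_tab[x/8] + 1;
	return bin_tab[x/128] + 17;
}

/* Number of inputs in 0..0x1c00 where old and new disagree. */
static int count_mismatches(void)
{
	int bad = 0;
	for (size_t x = 0; x <= 0x1c00; x++) {
		if (old_index(x) != new_index(x)) bad++;
		if (old_index_up(x) != new_index_up(x)) bad++;
	}
	return bad;
}
```

Calling count_mismatches() from a small test program and checking for
a zero result exercises every scaled size up to 0x1c00, which is the
point where bin_index starts returning 63 unconditionally.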




* Re: [PATCH] use lookup table for malloc bin index instead of float conversion
  2016-11-27 14:15 [PATCH] use lookup table for malloc bin index instead of float conversion Szabolcs Nagy
@ 2016-12-17  5:50 ` Rich Felker
  2016-12-17  7:36   ` u-uy74
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2016-12-17  5:50 UTC (permalink / raw)
  To: musl

On Sun, Nov 27, 2016 at 03:15:41PM +0100, Szabolcs Nagy wrote:
> float conversion is slow and big on soft-float targets.
> 
> The lookup table increases code size a bit on most hard float targets
> (and adds 64byte rodata), performance can be a bit slower because of
> position independent data access and cpu internal state dependence
> (cache, extra branches), but the overall effect should be minimal
> (common, small size allocations should be unaffected).
> ---
>  src/malloc/malloc.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/src/malloc/malloc.c b/src/malloc/malloc.c
> index b90636c..ce2e97d 100644
> --- a/src/malloc/malloc.c
> +++ b/src/malloc/malloc.c
> @@ -111,19 +111,29 @@ static int first_set(uint64_t x)
>  #endif
>  }
>  
> +static const unsigned char bin_tab[64] = {
> +	 0, 0, 0, 0,32,33,34,35,36,36,37,37,38,38,39,39,
> +	40,40,40,40,41,41,41,41,42,42,42,42,43,43,43,43,
> +	44,44,44,44,44,44,44,44,45,45,45,45,45,45,45,45,
> +	46,46,46,46,46,46,46,46,47,47,47,47,47,47,47,47,
> +};
> +
>  static int bin_index(size_t x)
>  {
>  	x = x / SIZE_ALIGN - 1;
>  	if (x <= 32) return x;
> +	if (x < 512) return bin_tab[x/8];
>  	if (x > 0x1c00) return 63;
> -	return ((union { float v; uint32_t r; }){(int)x}.r>>21) - 496;
> +	return bin_tab[x/128] + 16;
>  }
>  
>  static int bin_index_up(size_t x)
>  {
>  	x = x / SIZE_ALIGN - 1;
>  	if (x <= 32) return x;
> -	return ((union { float v; uint32_t r; }){(int)x}.r+0x1fffff>>21) - 496;
> +	x--;
> +	if (x < 512) return bin_tab[x/8] + 1;
> +	return bin_tab[x/128] + 17;
>  }
>  
>  #if 0
> -- 
> 2.10.2

Looks good mostly, but wouldn't it be better to drop the 4 unused
entries from the table and add -4's to the indices?

Rich
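
The first four entries are unused because both lookups only ever see
x > 32 (or x >= 512 for the second one), so x/8 and x/128 are never
below 4. A minimal sketch that enumerates the indices actually used;
struct range and used_indices are hypothetical names introduced only
for this illustration, and the loop bound 0x1c00 is taken from the
cutoff in bin_index:

```c
#include <assert.h>
#include <stddef.h>

struct range { size_t lo, hi; };

/* Smallest and largest table index reached by the patched lookups. */
static struct range used_indices(void)
{
	struct range r = { (size_t)-1, 0 };
	for (size_t x = 33; x <= 0x1c00; x++) {
		/* index used by bin_index */
		size_t i = x < 512 ? x/8 : x/128;
		/* index used by bin_index_up, which decrements x first */
		size_t y = x - 1;
		size_t j = y < 512 ? y/8 : y/128;

		if (i < r.lo) r.lo = i;
		if (j < r.lo) r.lo = j;
		if (i > r.hi) r.hi = i;
		if (j > r.hi) r.hi = j;
	}
	return r;
}
```

The lowest index reached is 4 and the highest is 63, so a 60-entry
table indexed with an offset of -4 covers the same range.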



* Re: [PATCH] use lookup table for malloc bin index instead of float conversion
  2016-12-17  5:50 ` Rich Felker
@ 2016-12-17  7:36   ` u-uy74
  2016-12-17 14:03     ` [PATCH v2] " Szabolcs Nagy
  0 siblings, 1 reply; 5+ messages in thread
From: u-uy74 @ 2016-12-17  7:36 UTC (permalink / raw)
  To: musl

On Sat, Dec 17, 2016 at 12:50:58AM -0500, Rich Felker wrote:
> On Sun, Nov 27, 2016 at 03:15:41PM +0100, Szabolcs Nagy wrote:
> > +static const unsigned char bin_tab[64] = {
> > +	 0, 0, 0, 0,32,33,34,35,36,36,37,37,38,38,39,39,
> > +	40,40,40,40,41,41,41,41,42,42,42,42,43,43,43,43,
> > +	44,44,44,44,44,44,44,44,45,45,45,45,45,45,45,45,
> > +	46,46,46,46,46,46,46,46,47,47,47,47,47,47,47,47,
> > +};
> > +
> >  static int bin_index(size_t x)
> >  {
> >  	x = x / SIZE_ALIGN - 1;
> >  	if (x <= 32) return x;
> > +	if (x < 512) return bin_tab[x/8];
> >  	if (x > 0x1c00) return 63;
> > -	return ((union { float v; uint32_t r; }){(int)x}.r>>21) - 496;
> > +	return bin_tab[x/128] + 16;
> >  }
> >  
> >  static int bin_index_up(size_t x)
> >  {
> >  	x = x / SIZE_ALIGN - 1;
> >  	if (x <= 32) return x;
> > -	return ((union { float v; uint32_t r; }){(int)x}.r+0x1fffff>>21) - 496;
> > +	x--;
> > +	if (x < 512) return bin_tab[x/8] + 1;
> > +	return bin_tab[x/128] + 17;
> >  }
> >  
> >  #if 0
> > -- 
> > 2.10.2

> Looks good mostly, but wouldn't it be better to drop the 4 unused
> entries from the table and add -4's to the indices?

Wouldn't this enlarge the code more than reduce the data?

Rune




* [PATCH v2] use lookup table for malloc bin index instead of float conversion
  2016-12-17  7:36   ` u-uy74
@ 2016-12-17 14:03     ` Szabolcs Nagy
  2016-12-17 14:30       ` u-uy74
  0 siblings, 1 reply; 5+ messages in thread
From: Szabolcs Nagy @ 2016-12-17 14:03 UTC (permalink / raw)
  To: musl

Float conversion is slow and large on soft-float targets.

The lookup table increases code size slightly on most hard-float
targets (and adds 60 bytes of rodata). Performance can be slightly
worse because of position-independent data access and dependence on
CPU internal state (cache, extra branches), but the overall effect
should be minimal (common, small allocations should be unaffected).
---

* u-uy74@aetey.se <u-uy74@aetey.se> [2016-12-17 08:36:00 +0100]:
> On Sat, Dec 17, 2016 at 12:50:58AM -0500, Rich Felker wrote:
> > Looks good mostly, but wouldn't it be better to drop the 4 unused
> > entries from the table and add -4's to the indices?
> 
> Wouldn't this enlarge the code more than reduce the data?

Most targets have a load instruction that takes a small constant
offset, and on some targets the compiler emits a relocation against
(tab-4) instead of tab, so the code size is not affected.

 src/malloc/malloc.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/malloc/malloc.c b/src/malloc/malloc.c
index b90636c..c38c46f 100644
--- a/src/malloc/malloc.c
+++ b/src/malloc/malloc.c
@@ -111,19 +111,29 @@ static int first_set(uint64_t x)
 #endif
 }
 
+static const unsigned char bin_tab[60] = {
+	            32,33,34,35,36,36,37,37,38,38,39,39,
+	40,40,40,40,41,41,41,41,42,42,42,42,43,43,43,43,
+	44,44,44,44,44,44,44,44,45,45,45,45,45,45,45,45,
+	46,46,46,46,46,46,46,46,47,47,47,47,47,47,47,47,
+};
+
 static int bin_index(size_t x)
 {
 	x = x / SIZE_ALIGN - 1;
 	if (x <= 32) return x;
+	if (x < 512) return bin_tab[x/8-4];
 	if (x > 0x1c00) return 63;
-	return ((union { float v; uint32_t r; }){(int)x}.r>>21) - 496;
+	return bin_tab[x/128-4] + 16;
 }
 
 static int bin_index_up(size_t x)
 {
 	x = x / SIZE_ALIGN - 1;
 	if (x <= 32) return x;
-	return ((union { float v; uint32_t r; }){(int)x}.r+0x1fffff>>21) - 496;
+	x--;
+	if (x < 512) return bin_tab[x/8-4] + 1;
+	return bin_tab[x/128-4] + 17;
 }
 
 #if 0
-- 
2.10.2
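
To see that v2 only shifts the table and leaves the results alone,
the two tables can be compared entry by entry. This is a sketch under
the assumption that the tables are copied exactly from the two
patches; tab64, tab60 and tables_agree are hypothetical names used
only here.

```c
#include <assert.h>

/* v1 table: 64 entries, the first four never read. */
static const unsigned char tab64[64] = {
	 0, 0, 0, 0,32,33,34,35,36,36,37,37,38,38,39,39,
	40,40,40,40,41,41,41,41,42,42,42,42,43,43,43,43,
	44,44,44,44,44,44,44,44,45,45,45,45,45,45,45,45,
	46,46,46,46,46,46,46,46,47,47,47,47,47,47,47,47,
};

/* v2 table: the same data with the four unused entries dropped. */
static const unsigned char tab60[60] = {
	            32,33,34,35,36,36,37,37,38,38,39,39,
	40,40,40,40,41,41,41,41,42,42,42,42,43,43,43,43,
	44,44,44,44,44,44,44,44,45,45,45,45,45,45,45,45,
	46,46,46,46,46,46,46,46,47,47,47,47,47,47,47,47,
};

/* Returns 1 if tab60[i-4] matches tab64[i] for every used index i. */
static int tables_agree(void)
{
	for (int i = 4; i < 64; i++)
		if (tab64[i] != tab60[i-4]) return 0;
	return 1;
}
```

Since bin_tab[i-4] is the same as *(bin_tab + i - 4), the constant -4
can be absorbed into the load's address computation. Whether a given
target does this in the instruction offset or in the relocation is
something one would have to confirm in the generated assembly.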




* Re: [PATCH v2] use lookup table for malloc bin index instead of float conversion
  2016-12-17 14:03     ` [PATCH v2] " Szabolcs Nagy
@ 2016-12-17 14:30       ` u-uy74
  0 siblings, 0 replies; 5+ messages in thread
From: u-uy74 @ 2016-12-17 14:30 UTC (permalink / raw)
  To: musl

On Sat, Dec 17, 2016 at 03:03:24PM +0100, Szabolcs Nagy wrote:
> > > ... wouldn't it be better to drop the 4 unused
> > > entries from the table and add -4's to the indices?
>
> > Wouldn't this enlarge the code more than reduce the data?
> 
> most targets have a load instruction with small offset
> and on some targets the compiler emits relocation against
> (tab-4) instead of tab so the code size is not affected.

Oh indeed, I overlooked this link-time evaluation. Thanks.

Rune




Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/
