> Is there a reason you're relying on an unreliable and nonstandard
> function (malloc_usable_size) to do this rather than your program
> keeping track of its own knowledge of the allocated size? This is what
> the C language expects you to do. For example if you have a structure
> that contains a pointer to a dynamically sized buffer, normally you
> store the size in a size_t member right next to that pointer, allowing
> you to make these kinds of decisions without having to probe anything.

Yes. As I have said, by comparing the number of bytes that realloc would need to copy in the worst case (the return value of malloc_usable_size) with the number of bytes we actually need to copy, we can improve realloc performance in real scenarios and avoid unnecessary memory copies.

In fact, with glibc, tcmalloc, the Windows CRT, Mac OS X, uClibc and musl 1.1, we did achieve good optimization results.

On the other hand, of course we keep track of the size we allocated, but that number does not objectively reflect how many bytes realloc copies when the memcpy actually occurs. malloc_usable_size() reflects more accurately how many bytes realloc needs to copy when it degenerates to malloc-memcpy-free mode.
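
A minimal sketch of the decision described above, assuming glibc or musl on Linux (where malloc_usable_size is declared in <malloc.h>). The name realloc_min_copy and the "less than half" threshold are illustrative assumptions, not our production code:

```c
#include <stdlib.h>
#include <string.h>
#include <malloc.h>   /* malloc_usable_size: nonstandard (glibc, musl) */

/* Hypothetical helper: resize `ptr` to `new_size`, preserving only the
 * first `min_copy` bytes of the old block. */
static void *realloc_min_copy(void *ptr, size_t new_size, size_t min_copy)
{
    if (ptr == NULL)
        return malloc(new_size);

    /* Worst case: realloc falls back to malloc-memcpy-free and copies
     * up to this many bytes. */
    size_t usable = malloc_usable_size(ptr);

    if (min_copy > new_size) min_copy = new_size;
    if (min_copy > usable)   min_copy = usable;

    /* If the block already fits, or we would save less than half of the
     * copy (assumed threshold), let realloc do its job; it may also be
     * able to merge chunks or use mremap. */
    if (new_size <= usable || min_copy > usable / 2)
        return realloc(ptr, new_size);

    /* Otherwise do malloc-memcpy-free ourselves, copying only the
     * bytes that are actually live. */
    void *fresh = malloc(new_size);
    if (fresh == NULL)
        return NULL;              /* old block stays valid, as with realloc */
    memcpy(fresh, ptr, min_copy);
    free(ptr);
    return fresh;
}
```

A real implementation would also weigh the chance of in-place growth per allocator and block size, which this sketch ignores.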

So our expectation matches what the documentation for Linux, Mac OS and Windows states: "The value returned by malloc_usable_size() may be **greater than** the requested size of the allocation" or "The memory block size is always at least as large as the allocation it backs, **and may be larger**." We expect to get the block's internal size so that we can evaluate the cost of the memory copy.

Thanks :-)

--

   Best Regards
  BaiYang
  baiyang@gmail.com
  http://i.baiy.cn
**** < END OF EMAIL > ****
 
 
 
From: Rich Felker
Date: 2022-09-20 02:15
To: baiyang
CC: musl
Subject: Re: Re: [musl] The heap memory performance (malloc/free/realloc) is significantly degraded in musl 1.2 (compared to 1.1)
On Tue, Sep 20, 2022 at 01:32:31AM +0800, baiyang wrote:
> Hi Rich,
>
> Thanks for your reply.
>
> > Unless you have an application that's explicitly using
> > malloc_usable_size all over the place, it's highly unlikely that this
> > is the cause of your real-world performance problems.
>
> 1. Yes, we have a real scenario where `malloc_usable_size` is called
> frequently: we need to optimize realloc. We add an extra parameter
> to realloc, minimalCopyBytes, which is the amount of data that
> actually needs to be copied after falling back to malloc-copy-free
> mode. We decide whether to call realloc or to do malloc-memcpy-free
> ourselves based on factors such as the amount of data that realloc
> would need to copy (obtained through `malloc_usable_size`), the
> amount we actually need to copy when doing malloc-memcpy-free
> ourselves (minimalCopyBytes), and the chance of merging chunks
> (small blocks) or using mremap (large blocks) in the underlying
> realloc. So this is a real scenario in which we need to call
> `malloc_usable_size` frequently.
 
Is there a reason you're relying on an unreliable and nonstandard
function (malloc_usable_size) to do this rather than your program
keeping track of its own knowledge of the allocated size? This is what
the C language expects you to do. For example if you have a structure
that contains a pointer to a dynamically sized buffer, normally you
store the size in a size_t member right next to that pointer, allowing
you to make these kinds of decisions without having to probe anything.
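
A minimal sketch of that convention; the type and function names (struct buf, buf_reserve) are hypothetical and only illustrate keeping the sizes next to the pointer:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer type: the program records its own sizes instead
 * of asking the allocator. */
struct buf {
    char  *data;
    size_t len;   /* bytes currently in use */
    size_t cap;   /* bytes requested from malloc */
};

/* Ensure room for at least `need` bytes. Only the `len` live bytes are
 * copied. Returns 0 on success, -1 on allocation failure (in which case
 * the buffer is left unchanged). */
static int buf_reserve(struct buf *b, size_t need)
{
    if (need <= b->cap)
        return 0;

    char *fresh = malloc(need);
    if (fresh == NULL)
        return -1;
    if (b->len > 0)
        memcpy(fresh, b->data, b->len);
    free(b->data);
    b->data = fresh;
    b->cap  = need;
    return 0;
}
```

With len and cap at hand, the choice between realloc and a manual malloc-memcpy-free can be made from the program's own bookkeeping.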
 
> 2. As I mentioned before, this isn't just a problem with
> `malloc_usable_size`. Since both `realloc` and `free` actually
> include a full `malloc_usable_size` procedure, this problem slows
> down not only the `malloc_usable_size` call itself, but also the
> `realloc` and `free` calls.
 
If this is affecting you too, that's a separate issue. But I can't
tell from what you've reported so far whether you're just claiming
this on a theoretical basis or whether you're actually experiencing
unacceptable performance.