> There is no hidden "size actually allocated internally". The size you
> get is the size you requested. Everything else is allocator data
> structures *outside of the object* that the caller has no entitlement
> to peek or poke at, and malloc_usable_size's return value reflects
> that.

If I understand correctly, according to the definition of size_classes in the mallocng code:
1. When I call `void* p = malloc(6600)`, mallocng actually allocates more than 8100 bytes of usable space, right?
2. According to your previous explanation, calling malloc_usable_size(p) at this time returns 6600, right?

My question is, if malloc_usable_size(p) can directly return 8191 (or similar actual allocated size, as other libc do) instead of 6600, is it possible to make mallocng achieve higher performance both in time and space?

--
Best Regards
BaiYang
baiyang@gmail.com
http://i.baiy.cn
**** < END OF EMAIL > ****

From: Rich Felker
Date: 2022-09-20 09:00
To: baiyang
CC: musl
Subject: Re: Re: [musl] The heap memory performance (malloc/free/realloc) is significantly degraded in musl 1.2 (compared to 1.1)

On Tue, Sep 20, 2022 at 08:47:07AM +0800, baiyang wrote:
> > Would it be possible to limit use of the list to actually requesting
> > help or making reports, rather than inciting debates about what is UB
> > or what the consequences of UB might be?
>
> You are right.
>
> The real question is: if we only need malloc_usable_size to return
> the size actually allocated internally (not the size requested by
> the user, **just as musl version 1.1 and all other libc
> implementations do**), is it possible to improve its time and space
> efficiency?

There is no hidden "size actually allocated internally". The size you
get is the size you requested. Everything else is allocator data
structures *outside of the object* that the caller has no entitlement
to peek or poke at, and malloc_usable_size's return value reflects
that.

If you want to see what portion of the time is being spent on different parts of processing the metadata, you could sit down and actually run it under perf to get a profiling report/flame graph. I'm pretty sure you'll find that the final get_nominal_size step is a small portion of the time spent. get_meta is probably the majority of the time, some of it fundamental, and some of it hardening. But don't take my word for it. Measure.

One thing I can tell you definitively though: if you did what the C language (which lacks malloc_usable_size) intended you to do, and kept track of the size of your own buffer, and just used that, you would spend 0% of the time you're spending on this. You would also save the entire "several hundred ms per 10 million calls" it's costing on other malloc implementations, by just *not doing something you don't need to do*.

Rich
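
As an illustration of "keeping track of the size of your own buffer", here is a minimal sketch in C. It is not taken from musl or from this thread: the names `struct buf`, `buf_append`, and `buf_free`, the initial capacity of 64 bytes, and the doubling growth policy are all assumptions chosen for the example. The point is only that the caller stores the capacity it requested, so no call to malloc_usable_size is needed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer. The caller records the capacity it
 * asked for, so it never has to query the allocator afterwards. */
struct buf {
    unsigned char *data;
    size_t len; /* bytes currently in use */
    size_t cap; /* bytes requested from malloc/realloc */
};

/* Append n bytes from src. Returns 0 on success, -1 on failure; on
 * failure the buffer is left unchanged. */
static int buf_append(struct buf *b, const void *src, size_t n)
{
    assert(b->len <= b->cap);
    if (n == 0)
        return 0;
    if (n > b->cap - b->len) {
        if (n > SIZE_MAX - b->len)
            return -1; /* len + n would overflow */
        size_t need = b->len + n;
        size_t new_cap = b->cap ? b->cap : 64; /* assumed starting size */
        while (new_cap < need) {
            if (new_cap > SIZE_MAX / 2) {
                new_cap = need; /* cannot double; request the exact size */
                break;
            }
            new_cap *= 2;
        }
        unsigned char *p = realloc(b->data, new_cap);
        if (!p)
            return -1;
        b->data = p;
        b->cap = new_cap; /* remember what we requested */
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/* Release the storage and reset the bookkeeping. */
static void buf_free(struct buf *b)
{
    free(b->data);
    b->data = NULL;
    b->len = 0;
    b->cap = 0;
}
```

Starting from `struct buf b = {0};`, repeated calls to `buf_append(&b, src, n)` only reach realloc when `b.cap` is exhausted, and the capacity check itself is a comparison of two fields the caller already holds.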