* [musl] realloci(): A realloc() variant that works in-place
@ 2025-10-30 23:15 Alejandro Colomar
2025-10-30 23:25 ` A. Wilcox
` (3 more replies)
0 siblings, 4 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-10-30 23:15 UTC (permalink / raw)
To: libc-alpha, musl; +Cc: Arthur O'Dwyer, Jonathan Wakely, Thiago Macieira
Hi,
A discussion within the C++ std-proposals@ mailing list triggered the
discussion about the need for a realloc() variant that works in-place,
that is, that doesn't move the address of the memory, and thus that
doesn't invalidate existing pointers derived from it.
It's not the first time I've had people want this. While I worked on
fixing the situation of realloc(p,0) in the last year, several people
came to me saying they'd be interested in something like that.
How about adding a realloci() function to musl and glibc, with the
following specification:
    void *realloci(void *p, size_t size);
- It returns the input pointer on success, or a null pointer on
  error. Usual code using it would look like this:
    if (realloci(p, size) == NULL)
        goto fail;
  without needing to store the return value anywhere, and it's
  just like fgets(3) where it's mainly useful for the null
  check.
- 'p' must be non-null. This is because it doesn't make sense
  to keep in place a null pointer.
  Forbidding null pointers here will also result in better
  static analysis: this function will never end any lifetime,
  and it will neither start any lifetime. It's just a regular
  function, that happens to extend (or shrink) the storage of
  a block of memory.
- We could perfectly return int (0 for success, -1 for error),
  but returning the pointer makes it a drop-in replacement for
  realloc(3), and also allows using it in chained code
    foo(realloci(p, size));
About the name, I chose the 'i' because of sed(1) -i. 'i' seems to be
common for meaning in-place in several commands, so it would make sense
here, I think.
I'd like to hear opinions from implementers about feasibility of this
API, before writing a standards proposal. Please let me know any
feedback.
Have a lovely night!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] realloci(): A realloc() variant that works in-place
2025-10-30 23:15 [musl] realloci(): A realloc() variant that works in-place Alejandro Colomar
@ 2025-10-30 23:25 ` A. Wilcox
2025-10-30 23:35 ` Collin Funk
2025-10-31 6:47 ` Lénárd Szolnoki
2025-10-31 12:16 ` Thorsten Glaser
` (2 subsequent siblings)
3 siblings, 2 replies; 116+ messages in thread
From: A. Wilcox @ 2025-10-30 23:25 UTC (permalink / raw)
To: musl; +Cc: libc-alpha, Arthur O'Dwyer, Jonathan Wakely, Thiago Macieira
On Oct 30, 2025, at 18:15, Alejandro Colomar <alx@kernel.org> wrote:
>
> Hi,
>
>
> A discussion within the C++ std-proposals@ mailing list triggered the
> discussion about the need for a realloc() variant that works in-place,
> that is, that doesn't move the address of the memory, and thus that
> doesn't invalidate existing pointers derived from it.
>
> It's not the first time I've had people want this. While I worked on
> fixing the situation of realloc(p,0) in the last year, several people
> came to me saying they'd be interested in something like that.
>
> How about adding a realloci() function to musl and glibc, with the
> following specification:
>
> void *realloci(void *p, size_t size);
>
> - It returns the input pointer on success, or a null pointer on
> error. Usual code using it would look like this:
>
> if (realloci(p, size) == NULL)
> goto fail;
>
> without needing to store the return value anywhere, and it's
> just like fgets(3) where it's mainly useful for the null
> check.
So, if it fails, does it free the pointer?
If it doesn’t, then you can’t use it in simple assignment.
>
> - 'p' must be non-null. This is because it doesn't make sense
> to keep in place a null pointer.
>
> Forbidding null pointers here will also result in better
> static analysis: this function will never end any lifetime,
> and it will neither start any lifetime. It's just a regular
> function, that happens to extend (or shrink) the storage of
> a block of memory.
>
> - We could perfectly return int (0 for success, -1 for error),
> but returning the pointer makes it a drop-in replacement for
> realloc(3), and also allows using it in chained code
>
> foo(realloci(p, size));
This is never safe if `realloci` can return `NULL`, IMO.
>
> About the name, I chose the 'i' because of sed(1) -i. 'i' seems to be
> common for meaning in-place in several commands, so it would make sense
> here, I think.
>
> I'd like to hear opinions from implementers about feasibility of this
> API, before writing a standards proposal. Please let me know any
> feedback.
Sigh. The only safe way I can think to make this work is:
* It leaves the existing allocation alone if it can’t satisfy the
new size.
* It always returns the existing pointer, allowing for the above
example chain.
* It is one of those dances where you need to zero out errno first,
and pay attention to *that* as the “side channel” return value, to
know whether the pointer actually points to memory of the desired
size or not.
But all of that is complicated, easy to get wrong, and just introduces
more foot guns to C. Aren’t there enough already?
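The errno dance described in the list above can be sketched as follows. This is only an illustration: realloci_errno_style() is a hypothetical stand-in that pretends every block has TOY_CAPACITY bytes of room, and resize_checked() is an invented helper name. Neither is part of any libc.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in: pretends every block has 64 bytes of room. */
enum { TOY_CAPACITY = 64 };

static void *realloci_errno_style(void *p, size_t n)
{
    if (n > TOY_CAPACITY)
        errno = ENOMEM;   /* failure is reported only through errno */
    return p;             /* the pointer comes back either way */
}

/* Returns 1 if the block now has n bytes, 0 if it was left untouched. */
static int resize_checked(void *p, size_t n)
{
    errno = 0;                           /* step 1: clear the side channel */
    (void) realloci_errno_style(p, n);   /* step 2: return value says nothing */
    return errno == 0;                   /* step 3: inspect errno */
}
```

Forgetting step 1 or step 3 still compiles and mostly works, which is the kind of foot gun being objected to.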
I like what you’ve done with `realloc`, but I think something like
this belongs in a higher level language than C.
Best,
-Anna
> Have a lovely night!
> Alex
--
Anna Wilcox (she/her)
SW Engineering: C++/Rust, DevOps, POSIX, Py/Ruby
Wilcox Technologies Inc.
* Re: [musl] realloci(): A realloc() variant that works in-place
2025-10-30 23:25 ` A. Wilcox
@ 2025-10-30 23:35 ` Collin Funk
2025-10-31 6:47 ` Lénárd Szolnoki
1 sibling, 0 replies; 116+ messages in thread
From: Collin Funk @ 2025-10-30 23:35 UTC (permalink / raw)
To: A. Wilcox
Cc: musl, libc-alpha, Arthur O'Dwyer, Jonathan Wakely,
Thiago Macieira
"A. Wilcox" <AWilcox@Wilcox-Tech.com> writes:
>> - We could perfectly return int (0 for success, -1 for error),
>> but returning the pointer makes it a drop-in replacement for
>> realloc(3), and also allows using it in chained code
>>
>> foo(realloci(p, size));
>
>
> This is never safe if `realloci` can return `NULL`, IMO.
This was my concern as well. The code looks nicer than a normal realloc,
but I think it will lead to more people not checking for allocation
failures. I would rather have a nice error message than a segmentation
fault when a program fails to allocate memory.
Collin
* Re: [musl] realloci(): A realloc() variant that works in-place
2025-10-30 23:25 ` A. Wilcox
2025-10-30 23:35 ` Collin Funk
@ 2025-10-31 6:47 ` Lénárd Szolnoki
1 sibling, 0 replies; 116+ messages in thread
From: Lénárd Szolnoki @ 2025-10-31 6:47 UTC (permalink / raw)
To: musl, A. Wilcox
Cc: libc-alpha, Arthur O'Dwyer, Jonathan Wakely, Thiago Macieira
On 30/10/2025 23:25, A. Wilcox wrote:
> On Oct 30, 2025, at 18:15, Alejandro Colomar <alx@kernel.org> wrote:
>>
>> Hi,
>>
>>
>> A discussion within the C++ std-proposals@ mailing list triggered the
>> discussion about the need for a realloc() variant that works in-place,
>> that is, that doesn't move the address of the memory, and thus that
>> doesn't invalidate existing pointers derived from it.
>>
>> It's not the first time I've had people want this. While I worked on
>> fixing the situation of realloc(p,0) in the last year, several people
>> came to me saying they'd be interested in something like that.
>>
>> How about adding a realloci() function to musl and glibc, with the
>> following specification:
>>
>> void *realloci(void *p, size_t size);
>>
>> - It returns the input pointer on success, or a null pointer on
>> error. Usual code using it would look like this:
>>
>> if (realloci(p, size) == NULL)
>> goto fail;
>>
>> without needing to store the return value anywhere, and it's
>> just like fgets(3) where it's mainly useful for the null
>> check.
>
>
> So, if it fails, does it free the pointer?
>
> If it doesn’t, then you can’t use it in simple assignment.
>
>
>>
>> - 'p' must be non-null. This is because it doesn't make sense
>> to keep in place a null pointer.
>>
>> Forbidding null pointers here will also result in better
>> static analysis: this function will never end any lifetime,
>> and it will neither start any lifetime. It's just a regular
>> function, that happens to extend (or shrink) the storage of
>> a block of memory.
>>
>> - We could perfectly return int (0 for success, -1 for error),
>> but returning the pointer makes it a drop-in replacement for
>> realloc(3), and also allows using it in chained code
>>
>> foo(realloci(p, size));
>
>
> This is never safe if `realloci` can return `NULL`, IMO.
>
>
>>
>> About the name, I chose the 'i' because of sed(1) -i. 'i' seems to be
>> common for meaning in-place in several commands, so it would make sense
>> here, I think.
>>
>> I'd like to hear opinions from implementers about feasibility of this
>> API, before writing a standards proposal. Please let me know any
>> feedback.
>
>
> Sigh. The only safe way I can think to make this work is:
>
> * It leaves the existing allocation alone if it can’t satisfy the
> new size.
>
> * It always returns the existing pointer, allowing for the above
> example chain.
>
> * It is one of those dances where you need to zero out errno first,
> and pay attention to *that* as the “side channel” return value, to
> know whether the pointer actually points to memory of the desired
> size or not.
>
> But all of that is complicated, easy to get wrong, and just introduces
> more foot guns to C. Aren’t there enough already?
>
> I like what you’ve done with `realloc`, but I think something like
> this belongs in a higher level language than C.
Some higher level language implementations (like libstdc++ and libc++ for C++) build on top
of the C library's allocator, so they can't feasibly add a "try to extend allocation
in place" operation without cooperation from the C library.
I agree that the proposed interface might have problems, but the capability is useful,
even for higher level languages to build upon.
Cheers,
Lénárd
> Best,
> -Anna
>
>
>> Have a lovely night!
>> Alex
>
> --
> Anna Wilcox (she/her)
> SW Engineering: C++/Rust, DevOps, POSIX, Py/Ruby
> Wilcox Technologies Inc.
>
* Re: [musl] realloci(): A realloc() variant that works in-place
2025-10-30 23:15 [musl] realloci(): A realloc() variant that works in-place Alejandro Colomar
2025-10-30 23:25 ` A. Wilcox
@ 2025-10-31 12:16 ` Thorsten Glaser
2025-11-01 1:03 ` Rich Felker
2025-10-31 13:43 ` [musl] " Alejandro Colomar
2025-11-01 13:05 ` Florian Weimer
3 siblings, 1 reply; 116+ messages in thread
From: Thorsten Glaser @ 2025-10-31 12:16 UTC (permalink / raw)
To: musl; +Cc: libc-alpha, Arthur O'Dwyer, Jonathan Wakely, Thiago Macieira
Alejandro Colomar dixit:
>A discussion within the C++ std-proposals@ mailing list triggered the
>discussion about the need for a realloc() variant that works in-place,
>that is, that doesn't move the address of the memory, and thus that
>doesn't invalidate existing pointers derived from it.
How is that supposed to work if you want to grow the
allocation?
This seems like increasing burden on the implementation
for everyone, just for niche corner use cases.
bye,
//mirabilos
--
15:41⎜<Lo-lan-do:#fusionforge> Somebody write a testsuite for helloworld :-)
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-30 23:15 [musl] realloci(): A realloc() variant that works in-place Alejandro Colomar
2025-10-30 23:25 ` A. Wilcox
2025-10-31 12:16 ` Thorsten Glaser
@ 2025-10-31 13:43 ` Alejandro Colomar
2025-10-31 14:13 ` Laurent Bercot
2025-11-01 13:05 ` Florian Weimer
3 siblings, 1 reply; 116+ messages in thread
From: Alejandro Colomar @ 2025-10-31 13:43 UTC (permalink / raw)
To: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Thorsten Glaser, Collin Funk
Cc: Arthur O'Dwyer, Jonathan Wakely, Thiago Macieira
Hi,
I didn't receive the replies that didn't CC me; I only learnt of them by
reading the archives. Please keep me in CC, as I'm subscribed to neither
musl@ nor libc-alpha@. For those who don't have the address, here it
is:
Cc: alx@kernel.org
I'll try to reply to the feedback I saw in the archives.
Thorsten wrote:
> How is that supposed to work if you want to grow the
> allocation?
>
> This seems like increasing burden on the implementation
> for everyone, just for niche corner use cases.
realloc(3) already does this sometimes. If the memory has some empty
storage after the end of the existing allocation, it can just change the
metadata to adjust the allocated size.
Here's how it could be implemented (this is my current draft for
implementing in musl, which just copies part of the existing
realloc(3) implementation):
+int realloci(void *p, size_t n)
+{
+	if (size_overflows(n)) return -1;
+
+	struct meta *g = get_meta(p);
+	int idx = get_slot_index(p);
+	size_t stride = get_stride(g);
+	unsigned char *start = g->mem->storage + stride*idx;
+	unsigned char *end = start + stride - IB;
+	size_t avail_size = end-(unsigned char *)p;
+
+	// only resize in-place if size class matches
+	if (n <= avail_size && n<MMAP_THRESHOLD
+	    && size_to_class(n)+1 >= g->sizeclass) {
+		set_size(p, end, n);
+		return 0;
+	}
+
+	return -1;
+}
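The idea that a block can grow when free storage follows it can be shown with a toy model. This is a minimal sketch and not musl's allocator: struct toy_arena, toy_alloc() and toy_realloci() are hypothetical names, alignment is ignored, and only the most recently allocated block can ever be resized.

```c
#include <assert.h>
#include <stddef.h>

/* Toy bump arena; every name here is hypothetical. */
struct toy_arena {
    unsigned char mem[256];
    size_t used;       /* bytes handed out so far */
    size_t last_off;   /* offset of the most recent block */
};

static void *toy_alloc(struct toy_arena *a, size_t n)
{
    if (n > sizeof(a->mem) - a->used)
        return NULL;
    a->last_off = a->used;
    a->used += n;
    return a->mem + a->last_off;
}

/* Resize in place: 0 on success, -1 if the block would have to move. */
static int toy_realloci(struct toy_arena *a, void *p, size_t n)
{
    unsigned char *start = p;

    /* Only the last block has free storage right after it. */
    if (start != a->mem + a->last_off)
        return -1;
    if (n > sizeof(a->mem) - a->last_off)
        return -1;
    a->used = a->last_off + n;   /* metadata-only update, no copy */
    return 0;
}
```

In this model an older block followed by a live neighbour always fails to grow, which matches the expectation that such a call fails whenever the adjacent storage is taken.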
On 30/10/2025 23:25, A. Wilcox wrote:
>> void *realloci(void *p, size_t size);
>>
>> - It returns the input pointer on success, or a null pointer on
>> error. Usual code using it would look like this:
>>
>> if (realloci(p, size) == NULL)
>> goto fail;
>>
>> without needing to store the return value anywhere, and it's
>> just like fgets(3) where it's mainly useful for the null
>> check.
>
>
> So, if it fails, does it free the pointer?
No. In at least some cases (maybe most of them), the user will want to
fall back to realloc(3) if it failed, so we shouldn't free(3).
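A caller that falls back to realloc(3) might look roughly like the sketch below. Since realloci() does not exist yet, the in-place attempt is passed in as a hook; resize_in_place_fn, grow() and never_in_place() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical hook standing in for realloci(): 0 on success, -1 on failure. */
typedef int (*resize_in_place_fn)(void *p, size_t n);

/*
 * Grow *pp to n bytes.  Returns 0 on success and sets *moved to 1 when
 * old pointers into the block must be treated as stale; returns -1 on
 * failure, leaving *pp valid and unchanged.
 */
static int grow(void **pp, size_t n, resize_in_place_fn try_in_place, int *moved)
{
    void *q;

    if (try_in_place(*pp, n) == 0) {
        *moved = 0;          /* same address, existing pointers stay valid */
        return 0;
    }
    q = realloc(*pp, n);     /* fallback: may or may not move */
    if (q == NULL)
        return -1;
    *pp = q;
    *moved = 1;              /* conservative: assume realloc(3) moved it */
    return 0;
}

/* Stub hook that always refuses, so the fallback path is exercised. */
static int never_in_place(void *p, size_t n)
{
    (void) p;
    (void) n;
    return -1;
}
```

The caller only pays for fixing up derived pointers when moved is set.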
If one wants a reallocf(3) variant, as in the BSDs, it could be
reallocif(). However, I'd wait to see if anyone is actually interested
in that before even attempting to implement it; I suspect it could have
zero users.
BTW, I'm going to eventually propose reallocf(3) for standardization
(I want to first settle the realloc(p,0) discussion, as I'm worried that
the C committee might be confused by more than one realloc(3)-like
proposal at the same time). I'll write a separate proposal mail to
introduce it in musl and glibc soon-ish.
> If it doesn’t, then you can’t use it in simple assignment.
If you implement your own reallocif(), then you can.
>>
>> - 'p' must be non-null. This is because it doesn't make sense
>> to keep in place a null pointer.
>>
>> Forbidding null pointers here will also result in better
>> static analysis: this function will never end any lifetime,
>> and it will neither start any lifetime. It's just a regular
>> function, that happens to extend (or shrink) the storage of
>> a block of memory.
>>
>> - We could perfectly return int (0 for success, -1 for error),
>> but returning the pointer makes it a drop-in replacement for
>> realloc(3), and also allows using it in chained code
>>
>> foo(realloci(p, size));
>
>
> This is never safe if `realloci` can return `NULL`, IMO.
Unless foo() exit(3)s if it receives a null pointer.
See what we do in shadow utils:
    $ grepc -h XREALLOC .
    #define XREALLOC(p, n, T) exit_if_null(REALLOC(p, n, T))
    $ grepc -h REALLOC .
    #define REALLOC(p, n, T) \
    ( \
        _Generic(p, T *: (T *) reallocarray(p, (n) ?: 1, sizeof(T))) \
    )
    $ grepc -h exit_if_null .
    #define exit_if_null(p) \
    ({ \
        __auto_type p_ = p; \
        \
        exit_if_null_(p_); \
        p_; \
    })
    $ grepc -htfd exit_if_null_ .
    inline void
    exit_if_null_(void *p)
    {
        if (p == NULL) {
            fprintf(log_get_logfd(), "%s: %s\n",
                    log_get_progname(), strerror(errno));
            exit(13);
        }
    }
However, after thinking a bit more, I don't think this would be useful.
It would be rare to want to exit on a realloci() failure, I think.
Normally, one would then try with realloc(3).
And one can always write something like this:
    void
    xrealloci(void *p, size_t n)
    {
        if (realloci(p, n) == -1)
            exit_if_null(NULL);
    }
>> About the name, I chose the 'i' because of sed(1) -i. 'i' seems to be
>> common for meaning in-place in several commands, so it would make sense
>> here, I think.
>>
>> I'd like to hear opinions from implementers about feasibility of this
>> API, before writing a standards proposal. Please let me know any
>> feedback.
>
>
> Sigh. The only safe way I can think to make this work is:
>
> * It leaves the existing allocation alone if it can’t satisfy the
> new size.
Yes.
> * It always returns the existing pointer, allowing for the above
> example chain.
Let's forget about that. I don't think it would be very useful.
Let's return an int. That will also make it more obvious that it can't
possibly change the pointer at all, and that one can (has to) reuse old
pointers.
> * It is one of those dances where you need to zero out errno first,
> and pay attention to *that* as the “side channel” return value, to
> know whether the pointer actually points to memory of the desired
> size or not.
I prefer returning an int.
> But all of that is complicated, easy to get wrong, and just introduces
> more foot guns to C. Aren’t there enough already?
Let's try with
int realloci(void *p, size_t size);
This is much less of a footgun, and is also not complicated at all.
> I like what you’ve done with `realloc`,
Thanks!
> but I think something like
> this belongs in a higher level language than C.
As Lénárd said, those higher level languages depend on libc for this.
Have a lovely day!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 13:43 ` [musl] " Alejandro Colomar
@ 2025-10-31 14:13 ` Laurent Bercot
2025-10-31 14:36 ` Thorsten Glaser
2025-10-31 15:35 ` Thiago Macieira
0 siblings, 2 replies; 116+ messages in thread
From: Laurent Bercot @ 2025-10-31 14:13 UTC (permalink / raw)
To: musl, libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Thorsten Glaser, Collin Funk
Cc: Arthur O'Dwyer, Jonathan Wakely, Thiago Macieira
>If one wants a reallocf(3) variant, as in the BSDs, it could be
>reallocif(). However, I'd wait to see if anyone is actually interested
>in that before even attempting to implement it; I suspect it could have
>zero users.
As usual, and this applies to realloci() too, the question is: what
exact problem are you trying to solve?
What problem is realloci() a solution to? In what circumstance would
a programmer want to use it rather than realloc?
In most cases, it will be very difficult for an implementation to
increase the size of an allocated block without relocating the block.
So you can expect realloci() to fail often. What should users do in that
case?
- Fail? the program will not be reliable at all. Who wants that?
- Fall back on realloc()? then the workarounds for relocation need to
be implemented anyway, so why not use realloc() in the first place?
When designing an API, you want it to be *usable*. As it is described,
realloci() is not really usable for a programmer. When I want to resize
a chunk of memory, I want to get the memory I asked for and get on with
my day. I don't want the library call to fail, except in a real ENOMEM
situation, which shouldn't happen more than once in two blue moons.
I already have my own structures and functions to deal with relocation
from realloc(), and so does every half-decent C programmer around the
world; that includes implementors of other languages. I see
no reason to ever use realloci(), if it's going to return a failure code
in circumstances I cannot predict while there is ample memory available.
It sounds to me that realloci() is a solution in search of a problem -
and we have more than enough of those already.
--
Laurent
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 14:13 ` Laurent Bercot
@ 2025-10-31 14:36 ` Thorsten Glaser
2025-10-31 15:14 ` Alejandro Colomar
2025-10-31 15:35 ` Thiago Macieira
1 sibling, 1 reply; 116+ messages in thread
From: Thorsten Glaser @ 2025-10-31 14:36 UTC (permalink / raw)
To: Alejandro Colomar
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely, Thiago Macieira
(no need to Cc me)
On Fri, 31 Oct 2025, Laurent Bercot wrote:
> As usual, and this applies to realloci() too, the question is: what
> exact problem are you trying to solve?
Yes. This.
> > > How is that supposed to work if you want to grow the
> > > allocation?
> > >
> > > This seems like increasing burden on the implementation
> > > for everyone, just for niche corner use cases.
> >
> > realloc(3) already does this sometimes. If the memory has some empty
> In most cases, it will be very difficult for an implementation to
> increase the size of an allocated block without relocating the block.
> So you can expect realloci() to fail often. What should users do in
> that case?
Exactly. This makes me doubt its usefulness as a generic function.
Even were you to limit its use to shrinking allocations only, that
would constrain implementations, e.g. those that spread the allocations
based on their size and thus could not free anything.
So, in short:
When called with a larger size, it’ll just fail most of the time.
When called with a smaller size, it’ll just be a no-op in many cases.
bye,
//mirabilos
--
22:20⎜<asarch> The crazy that persists in his craziness becomes a master
22:21⎜<asarch> And the distance between the craziness and geniality is
only measured by the success 18:35⎜<asarch> "Psychotics are consistently
inconsistent. The essence of sanity is to be inconsistently inconsistent
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 14:36 ` Thorsten Glaser
@ 2025-10-31 15:14 ` Alejandro Colomar
2025-10-31 15:45 ` Thorsten Glaser
0 siblings, 1 reply; 116+ messages in thread
From: Alejandro Colomar @ 2025-10-31 15:14 UTC (permalink / raw)
To: libc-alpha, musl
Cc: A. Wilcox, Lénárd Szolnoki, Collin Funk,
Arthur O'Dwyer, Jonathan Wakely, Thiago Macieira,
Alejandro Colomar
Hi Thorsten,
On Fri, Oct 31, 2025 at 03:36:17PM +0100, Thorsten Glaser wrote:
> (no need to Cc me)
>
> On Fri, 31 Oct 2025, Laurent Bercot wrote:
>
> > As usual, and this applies to realloci() too, the question is: what
> > exact problem are you trying to solve?
>
> Yes. This.
I don't know much C++; I'll let C++ committee members speak about it.
In C, I've sometimes seen programmers trying to check if realloc(3)
moved or not, to skip some work. That's a micro-optimization that I've
never written myself, so I won't defend it. But for some reason, some
programmers keep wanting to do it.
Have a lovely day!
Alex
> > > > How is that supposed to work if you want to grow the
> > > > allocation?
> > > >
> > > > This seems like increasing burden on the implementation
> > > > for everyone, just for niche corner use cases.
> > >
> > > realloc(3) already does this sometimes. If the memory has some empty
>
> > In most cases, it will be very difficult for an implementation to
> > increase the size of an allocated block without relocating the block.
> > So you can expect realloci() to fail often. What should users do in
> > that case?
>
> Exactly. This makes me doubt its usefulness as a generic function.
>
> Even were you to limit its use to shrinking allocations only, that
> would constrain implementations, e.g. those that spread the allocations
> based on their size and thus could not free anything.
>
> So, in short:
>
> When called with a larger size, it’ll just fail most of the time.
>
> When called with a smaller size, it’ll just be a no-op in many cases.
>
> bye,
> //mirabilos
> --
> 22:20⎜<asarch> The crazy that persists in his craziness becomes a master
> 22:21⎜<asarch> And the distance between the craziness and geniality is
> only measured by the success 18:35⎜<asarch> "Psychotics are consistently
> inconsistent. The essence of sanity is to be inconsistently inconsistent
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 14:13 ` Laurent Bercot
2025-10-31 14:36 ` Thorsten Glaser
@ 2025-10-31 15:35 ` Thiago Macieira
1 sibling, 0 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-10-31 15:35 UTC (permalink / raw)
To: musl, libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Thorsten Glaser, Collin Funk, Laurent Bercot
Cc: Arthur O'Dwyer, Jonathan Wakely
On Friday, 31 October 2025 07:13:24 Pacific Daylight Time Laurent Bercot wrote:
> What problem is realloci() a solution to? In what circumstance would
> a programmer want to use it rather than realloc?
This is very common in C++ but as Alex has pointed out, it happens in C too
when the caller code checks if the pointer changed and thus needs to perform
extra work.
In C++, most of the time in generic code, we simply cannot use realloc() in
the first place, because the container does not know if the object in question
can be memcpy()ed around. Some language has been added to the C++26 standard
to begin addressing this, which was the origin of this discussion. If the
container does not know this, then it must:
1) allocate a brand-new block
2) iterate over the elements in the container, asking the object to
move+destroy/relocate itself
3) free the old block
A resize-in-place operation that did not move would allow the container to be
much faster, especially if the move+destroy operation is in any way expensive.
It will increase the size of emitted code, but not by much and has a great
potential upside.
> In most cases, it will be very difficult for an implementation to
> increase the size of an allocated block without relocating the block.
> So you can expect realloci() to fail often.
Failing often is fine, though I dispute that a bit. The common case I can think of
is when a generic container is being populated without being told its final
target size. This implies the container is growing as elements are appended.
For an allocator implementation without slabs, which can allocate regions of
any size next to each other, the heap ahead is often free, so the block can be
grown in place by simply adjusting its size.
Without the realloci() function, a generic container will keep performing the
three operations above. And since the new block is always bigger than the
previous one, usually by a factor of 2, there's never enough free space in the
heap before it, so the heap keeps growing. In that case, adding ~8000
256-byte objects to a container will use approximately 4 MB of heap, not 2 MB,
unless the allocator performs MADV_DONTNEED on free().
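The 4 MB versus 2 MB figure can be checked with a small calculation. This sketch assumes a starting capacity of one element, a growth factor of exactly 2, and that no freed block is ever reused or extended; doubling_total_bytes() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Total bytes requested from the heap while a doubling container grows
 * from capacity 1 until it can hold `count` elements, assuming every
 * growth step needs a fresh block and old blocks are never reused.
 */
static size_t doubling_total_bytes(size_t count, size_t elem_size)
{
    size_t cap = 1, total = 0;

    for (;;) {
        total += cap * elem_size;   /* one fresh block per growth step */
        if (cap >= count)
            break;
        cap *= 2;
    }
    return total;
}
```

For 8000 elements of 256 bytes the capacities run 1, 2, 4, ..., 8192, so the blocks sum to 16383 * 256 = 4194048 bytes, while the final block alone is 8192 * 256 = 2097152 bytes.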
> What should users do in that case?
>
> - Fail? the program will not be reliable at all. Who wants that?
> - Fall back on realloc()? then the workarounds for relocation need to
> be implemented anyway, so why not use realloc() in the first place?
Neither.
The fallback is to malloc() + object-specific callback + free().
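Expressed in C terms, the growth path of such a container could be sketched as below. All names (struct vec, vec_reserve(), in_place_fn, relocate_fn, refuse_in_place(), relocate_int()) are hypothetical; the in-place hook stands in for the proposed realloci(), and the relocate callback stands in for a C++ move+destroy.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int (*in_place_fn)(void *p, size_t n);      /* stand-in for realloci() */
typedef void (*relocate_fn)(void *dst, void *src);  /* move one element, destroy source */

struct vec {
    void *data;
    size_t len, cap, elem_size;
    relocate_fn relocate;
};

/* Ensure capacity for new_cap elements: 0 on success, -1 on failure. */
static int vec_reserve(struct vec *v, size_t new_cap, in_place_fn try_in_place)
{
    unsigned char *fresh, *old = v->data;
    size_t i;

    assert(v->elem_size > 0);
    if (new_cap <= v->cap)
        return 0;
    if (new_cap > (size_t) -1 / v->elem_size)
        return -1;                       /* byte count would overflow */
    /* Fast path: extend the block where it is; no element is touched. */
    if (old != NULL && try_in_place(old, new_cap * v->elem_size) == 0) {
        v->cap = new_cap;
        return 0;
    }
    /* Slow path: malloc() + per-element callback + free(). */
    fresh = malloc(new_cap * v->elem_size);
    if (fresh == NULL)
        return -1;
    for (i = 0; i < v->len; i++)
        v->relocate(fresh + i * v->elem_size, old + i * v->elem_size);
    free(old);
    v->data = fresh;
    v->cap = new_cap;
    return 0;
}

/* Stub hook that always refuses, and a trivial relocation for int. */
static int refuse_in_place(void *p, size_t n) { (void) p; (void) n; return -1; }
static void relocate_int(void *dst, void *src) { memcpy(dst, src, sizeof(int)); }
```

realloc(3) never appears here, because the container cannot assume its elements survive a raw byte copy.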
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 15:14 ` Alejandro Colomar
@ 2025-10-31 15:45 ` Thorsten Glaser
2025-10-31 16:02 ` Thiago Macieira
0 siblings, 1 reply; 116+ messages in thread
From: Thorsten Glaser @ 2025-10-31 15:45 UTC (permalink / raw)
To: Alejandro Colomar
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely, Thiago Macieira,
Alejandro Colomar
On Fri, 31 Oct 2025, Alejandro Colomar wrote:
>In C, I've sometimes seen programmers trying to check if realloc(3)
>moved or not, to skip some work. That's a micro-optimization that I've
>never written myself, so I won't defend it. But for some reason, some
>programmers keep wanting to do it.
Huh. That one if is probably more effort than just doing the
arithmetics always… if it’s not actually UB…
We don’t always need to follow what “some programmers keep
wanting to do” ☻
bye,
//mirabilos
--
<igli> exceptions: a truly awful implementation of quite a nice idea.
<igli> just about the worst way you could do something like that, afaic.
<igli> it's like anti-design. <mirabilos> that too… may I quote you on that?
<igli> sure, tho i doubt anyone will listen ;)
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 15:45 ` Thorsten Glaser
@ 2025-10-31 16:02 ` Thiago Macieira
2025-10-31 16:22 ` Alejandro Colomar
2025-10-31 23:46 ` Morten Welinder
0 siblings, 2 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-10-31 16:02 UTC (permalink / raw)
To: Alejandro Colomar, Thorsten Glaser
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Alejandro Colomar
On Friday, 31 October 2025 08:45:09 Pacific Daylight Time Thorsten Glaser
wrote:
> >In C, I've sometimes seen programmers trying to check if realloc(3)
> >moved or not, to skip some work. That's a micro-optimization that I've
> >never written myself, so I won't defend it. But for some reason, some
> >programmers keep wanting to do it.
>
> Huh. That one if is probably more effort than just doing the
> arithmetics always… if it’s not actually UB…
The conclusion among C++ developers is that using the previous pointer in any
way is UB. Therefore, you simply cannot know if the area was moved or not.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 16:02 ` Thiago Macieira
@ 2025-10-31 16:22 ` Alejandro Colomar
2025-10-31 16:59 ` Paul Eggert
` (2 more replies)
2025-10-31 23:46 ` Morten Welinder
1 sibling, 3 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-10-31 16:22 UTC (permalink / raw)
To: Thiago Macieira
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Alejandro Colomar
Hi Thiago, Thorsten, Paul,
On Fri, Oct 31, 2025 at 09:02:42AM -0700, Thiago Macieira wrote:
> On Friday, 31 October 2025 08:45:09 Pacific Daylight Time Thorsten Glaser
> wrote:
> > >In C, I've sometimes seen programmers trying to check if realloc(3)
> > >moved or not, to skip some work. That's a micro-optimization that I've
> > >never written myself, so I won't defend it. But for some reason, some
> > >programmers keep wanting to do it.
> >
> > Huh. That one if is probably more effort than just doing the
> > arithmetics always… if it’s not actually UB…
>
> The conclusion among C++ developers is that using the previous pointer in any
> way is UB. Therefore, you simply cannot know if the area was moved or not.
Yes, it is UB. realloc(3) zaps old pointers. Paul McKenney is
proposing to not zap the old pointer in some cases, but with current
ISO C, it is UB.
I don't remember well the details of what Paul told me in Paris, so I've
CCed him, in case he can clarify, or maybe if he has some reasons to
defend wanting to use the old pointer.
Paul, for context, this is a discussion for adding a function
int realloci(void *p, size_t n);
that changes the size of a memory block without moving it. (And thus,
fails rather often, for some implementations of allocators.)
Thiago, if you need this, it would also be useful to clarify what it
would be useful for, and numbers if the micro-optimizations are
important for you.
Here's the start of the thread, for anyone reading new:
<https://inbox.sourceware.org/libc-alpha/tzrznth5ng3qukc4dlym5woctbppcabjglsxgfnfvdrd45rr5d@573xvnl5twv6/T/#m625715f975b04cd7dd3d96276a7a83ace9f40d52>
Have a lovely day!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 16:22 ` Alejandro Colomar
@ 2025-10-31 16:59 ` Paul Eggert
2025-10-31 17:25 ` Thiago Macieira
` (2 more replies)
2025-10-31 17:07 ` Thiago Macieira
2025-10-31 17:29 ` Paul E. McKenney
2 siblings, 3 replies; 116+ messages in thread
From: Paul Eggert @ 2025-10-31 16:59 UTC (permalink / raw)
To: Alejandro Colomar
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Thiago Macieira
On 10/31/25 10:22, Alejandro Colomar wrote:
> Paul, for context, this is a discussion for adding a function
>
> int realloci(void *p, size_t n);
>
> that changes the size of a memory block without moving it. (And thus,
> fails rather often, for some implementations of allocators.)
Reading the threads leading into this, the motivation for this seems to
be C++ and similar memory allocators that want a cheap way to grow an
object - if the object doesn't move they can skip some reinitialization
work, otherwise they have more work to do.
With that in mind, the proposed API is not the best way to go about the
problem. What these users want is a function that acts just like
R=realloc(P,N) EXCEPT that it lets you compare R==P, and if the two
values are the same pointer you know the object did not move and you can
skip some work. This is simpler than realloci because it means that you
need only one call (not two) in the common case when realloci returns
the null pointer.
In other words, these uses want the realloc function the way it was in
7th Edition Unix, before sanitizers got in the way and insisted that
it's an error to compare realloc's first argument with its result even
if they happen to have the same value.
There's an easy way to change the C standard to support these uses: just
change the spec for realloc to support this usage. There is no need to
change the C library, or musl, or any of the commonly used production C
libraries. The only change you'd need to make is to the C standard and
to picky sanitizers.
This would be *much* better than adding a new, hard-to-explain realloci API.
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 16:22 ` Alejandro Colomar
2025-10-31 16:59 ` Paul Eggert
@ 2025-10-31 17:07 ` Thiago Macieira
2025-10-31 17:29 ` Paul E. McKenney
2 siblings, 0 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-10-31 17:07 UTC (permalink / raw)
To: Alejandro Colomar
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Alejandro Colomar
[-- Attachment #1: Type: text/plain, Size: 2269 bytes --]
On Friday, 31 October 2025 09:22:40 Pacific Daylight Time Alejandro Colomar
wrote:
> Thiago, if you need this, it would also be useful to clarify what it
> would be useful for, and numbers if the micro-optimizations are
> important for you.
As in the other email, this is important for us in C++ because we cannot
guarantee in a generic container's implementation that the object's type can
be moved in memory. A very common example of this is libstdc++'s std::string,
which may contain a pointer pointing to itself. If the object is moved
elsewhere in memory, the pointer may need adjusting.
Plus, as in the other email, in the common case of a growing container, the
implementation will need to perform multiple iterations of malloc()
+move+free(), each with a bigger buffer. This is a situation where
there's a good chance of realloci() succeeding and the current solution
wasting heap space because it can't satisfy the new, bigger allocation with
space previously freed.
To show this, see this Godbolt example:
https://godbolt.org/z/b8xKb4jjj
This shows what the inner reallocation of a vector of std::string would be.
The difference between resize_current() and resize_realloci() is the call to
realloci() before falling back to the current code. In the latter's
implementation, if realloci() succeeds in resizing the block, the function
block below the .L56 label is skipped. If it can't resize, then we're no worse
than before, modulo an extra function call and branch.
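
The control flow described above can be sketched in plain C. This is only an
illustration under stated assumptions: realloci() does not exist in any libc,
so realloci_stub() below is a hypothetical stand-in that always fails (using
the int-returning variant discussed in this thread), and struct elem is a
made-up type that mimics a string with a pointer into itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the proposed realloci(); no libc provides it.
 * This stub always reports failure, so the fallback path below is taken. */
static int realloci_stub(void *p, size_t size)
{
    (void)p;
    (void)size;
    return -1;
}

/* Made-up element type holding a pointer into itself (SSO-like). */
struct elem {
    char *data;      /* always points at local[] in this sketch */
    char local[16];
};

static void elem_init(struct elem *e, const char *s)
{
    size_t len = strlen(s);

    if (len >= sizeof e->local)
        len = sizeof e->local - 1;   /* truncate to fit the local buffer */
    memcpy(e->local, s, len);
    e->local[len] = '\0';
    e->data = e->local;
}

/* "Move" one element: copy the bytes, then re-point the self-pointer. */
static void elem_move(struct elem *dst, const struct elem *src)
{
    memcpy(dst->local, src->local, sizeof dst->local);
    dst->data = dst->local;
}

/* Grow *arr from n live elements to room for new_cap elements.
 * Returns 0 on success, -1 on allocation failure. */
static int grow(struct elem **arr, size_t n, size_t new_cap)
{
    assert(arr != NULL && *arr != NULL && new_cap >= n);

    /* Cheap attempt first: if the block grows in place, nothing moves. */
    if (realloci_stub(*arr, new_cap * sizeof **arr) == 0)
        return 0;

    /* Fallback: the existing malloc() + move + free() sequence. */
    struct elem *fresh = malloc(new_cap * sizeof *fresh);
    if (fresh == NULL)
        return -1;
    for (size_t i = 0; i < n; i++)
        elem_move(&fresh[i], &(*arr)[i]);
    free(*arr);
    *arr = fresh;
    return 0;
}
```

With a real in-place resize, the first branch would skip the O(n) move loop
whenever the allocator has room after the block; with the stub, the code
behaves exactly like the current malloc()+move+free() path.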
Additionally, not shown in the code above, the implementation could be used
for shrinking too. This would allow the higher layer to decide to do it in
places where it currently doesn't, because moving the elements is unacceptably
expensive (just the fact it is O(n) is an impediment in some cases). And even
if an allocator implementation can't reuse the just-freed memory for new
allocations after this shrinking, it *can* advise the OS that the physical
memory can be reclaimed (MADV_DONTNEED), and I'd expect this to be part of the
Quality of Implementation for any good libc, with implementation-defined
thresholds.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 870 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 16:59 ` Paul Eggert
@ 2025-10-31 17:25 ` Thiago Macieira
2025-10-31 17:31 ` Paul Eggert
2025-10-31 18:12 ` [musl] " Paul E. McKenney
2025-10-31 20:13 ` Alejandro Colomar
2025-11-01 12:57 ` Florian Weimer
2 siblings, 2 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-10-31 17:25 UTC (permalink / raw)
To: Alejandro Colomar, Paul Eggert
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney
[-- Attachment #1: Type: text/plain, Size: 2646 bytes --]
On Friday, 31 October 2025 09:59:43 Pacific Daylight Time Paul Eggert wrote:
> Reading the threads leading into this, the motivation for this seems to
> be C++ and similar memory allocators that want a cheap way to grow an
> object - if the object doesn't move they can skip some reinitialization
> work, otherwise they have more work to do.
>
> With that in mind, the proposed API is not the best way to go about the
> problem. What these users want is a function that acts just like
> R=realloc(P,N) EXCEPT that it lets you compare R==P, and if the two
> values are the same pointer you know the object did not move and you can
> skip some work. This is simpler than realloci because it means that you
> need only one call (not two) in the common case when realloci returns
> the null pointer.
If you meant a "resize-in-place or malloc new" function, I might agree. That
would reduce the number of function calls in the example I posted in the other
email, eliminating the malloc() call. Though it remains to be seen if that's
actually better, because there are cases where the malloc()+move+free() is
still needed, so the code could be shared across multiple uses and thus reduce
codegen size. Then there's the case of shrinking, where knowing that the
operation is guaranteed to be O(1) is beneficial.
But not this:
> In other words, these uses want the realloc function the way it was in
> 7th Edition Unix
From the C++ side, we *cannot* allow the object to be relocated in memory
without its permission. We need to know the function *will not* move the
memory region before calling it, because there's currently no way to
generically adjust the relocated objects after the fact. That is why realloc()
is currently little used in C++.
That would require a new extension point for almost every class, which means
it becomes an opt-in and only available for code written to support C++29 at
the earliest.
That's assuming it's even possible. How would one adjust sub-objects of an
object? One thing that keeps coming up are the ARM64e authenticated
pointers[1], and from my limited understanding of the feature, it might not be
possible to write the new authenticated pointer to the new location without
reading from the old. I think the Committee would balk at adding a function
that takes a pointer to already-freed memory whose purpose is to allow the
contents of the new object to be adjusted solely based on arithmetic.
[1] https://clang.llvm.org/docs/PointerAuthentication.html
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 870 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 16:22 ` Alejandro Colomar
2025-10-31 16:59 ` Paul Eggert
2025-10-31 17:07 ` Thiago Macieira
@ 2025-10-31 17:29 ` Paul E. McKenney
2 siblings, 0 replies; 116+ messages in thread
From: Paul E. McKenney @ 2025-10-31 17:29 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Thiago Macieira, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely
On Fri, Oct 31, 2025 at 05:22:40PM +0100, Alejandro Colomar wrote:
> Hi Thiago, Thorsten, Paul,
>
> On Fri, Oct 31, 2025 at 09:02:42AM -0700, Thiago Macieira wrote:
> > On Friday, 31 October 2025 08:45:09 Pacific Daylight Time Thorsten Glaser
> > wrote:
> > > >In C, I've sometimes seen programmers trying to check if realloc(3)
> > > >moved or not, to skip some work. That's a micro-optimization that I've
> > > >never written myself, so I won't defend it. But for some reason, some
> > > >programmers keep wanting to do it.
> > >
> > > Huh. That one if is probably more effort than just doing the
> > > arithmetics always… if it’s not actually UB…
> >
> > The conclusion among C++ developers is that using the previous pointer in any
> > way is UB. Therefore, you simply cannot know if the area was moved or not.
>
> Yes, it is UB. realloc(3) zaps old pointers. Paul McKenney is
> proposing to not zap the old pointer in some cases, but with current
> ISO C, it is UB.
>
> I don't remember well the details of what Paul told me in Paris, so I've
> CCed him, in case he can clarify, or maybe if he has some reasons to
> defend wanting to use the old pointer.
>
> Paul, for context, this is a discussion for adding a function
>
> int realloci(void *p, size_t n);
>
> that changes the size of a memory block without moving it. (And thus,
> fails rather often, for some implementations of allocators.)
The proposal that would address this in C++ (sadly, not in C) is this:
https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3790r1.pdf
("Pointer lifetime-end zap proposed solutions: Bag-of-bits pointer class").
You could then write something like this:
ptr_bits<char> p = malloc(12 * sizeof(*p));
ptr_bits<char> q;
q = realloc(p, 16 * sizeof(*p));
if (p != q)
do_something(p, q);
This works because ptr_bits<T> defines comparison in terms of the actual
in-memory representation, just like .compare_exchange() would do on an
atomic pointer of the same underlying type.
I will turn my attention to C after C++. C, C++, and Rust have slightly
different definitions of pointer provenance, so one at a time!
In the meantime, in theory you can get this effect as follows:
char *p = malloc(12 * sizeof(*p));
uintptr_t p_rep = (uintptr_t)p;
char *q;
uintptr_t q_rep;
q = realloc(p, 16 * sizeof(*p));
q_rep = (uintptr_t)q;
if (p_rep != q_rep)
do_something(p_rep, q_rep);
Just so you all know, many long-time C and C++ programmers are completely
and absolutely flabbergasted to learn that just comparing the pointers
is UB. We have an education problem. I have done my part:
https://people.kernel.org/paulmck/what-on-earth-does-lifetime-end-pointer-zap-have-to-do-with-rcu
In addition to large numbers of working papers, that is. ;-)
Thanx, Paul
> Thiago, if you need this, it would also be useful to clarify what it
> would be useful for, and numbers if the micro-optimizations are
> important for you.
>
> Here's the start of the thread, for anyone reading new:
> <https://inbox.sourceware.org/libc-alpha/tzrznth5ng3qukc4dlym5woctbppcabjglsxgfnfvdrd45rr5d@573xvnl5twv6/T/#m625715f975b04cd7dd3d96276a7a83ace9f40d52>
>
>
> Have a lovely day!
> Alex
>
> --
> <https://www.alejandro-colomar.es>
> Use port 80 (that is, <...:80/>).
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 17:25 ` Thiago Macieira
@ 2025-10-31 17:31 ` Paul Eggert
2025-10-31 17:53 ` Thiago Macieira
2025-10-31 18:12 ` [musl] " Paul E. McKenney
1 sibling, 1 reply; 116+ messages in thread
From: Paul Eggert @ 2025-10-31 17:31 UTC (permalink / raw)
To: Thiago Macieira, Alejandro Colomar
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney
On 10/31/25 11:25, Thiago Macieira wrote:
> I think the Committee would balk at adding a function
> that takes a pointer to already-freed memory whose purpose is to allow the
> contents of the new object to be adjusted solely based on arithmetic.
Do you know of any platforms where this does not in fact work? Other
than sanitizing platforms that go to some lengths to impose the
Committee's rules even though the hardware would work fine?
If not, then perhaps we can convince the Committee that the mismatch
between the current rules and reality is causing real harm, and that
it'd be a win for C's users to change the standard to match reality better.
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 17:31 ` Paul Eggert
@ 2025-10-31 17:53 ` Thiago Macieira
2025-10-31 18:35 ` Andreas Schwab
` (2 more replies)
0 siblings, 3 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-10-31 17:53 UTC (permalink / raw)
To: Alejandro Colomar, Paul Eggert
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Oliver Hunt
[-- Attachment #1: Type: text/plain, Size: 2664 bytes --]
On Friday, 31 October 2025 10:31:54 Pacific Daylight Time Paul Eggert wrote:
> On 10/31/25 11:25, Thiago Macieira wrote:
> > I think the Committee would balk at adding a function
> > that takes a pointer to already-freed memory whose purpose is to allow the
> > contents of the new object to be adjusted solely based on arithmetic.
>
> Do you know of any platforms where this does not in fact work? Other
> than sanitizing platforms that go to some lengths to impose the
> Committee's rules even though the hardware would work fine?
>
> If not, then perhaps we can convince the Committee that the mismatch
> between the current rules and reality is causing real harm, and that
> it'd be a win for C's users to change the standard to match reality better.
Oliver, please comment on ARM64e if you can, for pointer authentication. Think
not just of statically-known pointers like vtables, but the general case of
pointer authentication.
But I think any such thing would be extremely fragile. Very low-level library
authors can probably get it right, but I wouldn't trust this feature to more
than a few dozen people on the planet. We're talking about something like:
void adjust_after_relocation(T *object, uintptr_t old) // or ptr_bits<T>
The temptation is too great to cast the old to a T* and dereference it, which
is UB because the memory has been freed, but will "happen to work" for
sufficiently many executions that it might go unnoticed. Then there's the issue
that the relocated T *object is itself in an inconsistent state and one must
avoid calling most functions on it.
This function for libstdc++'s std::string would look something like:
// can't call old->_M_is_local() or old->_M_local_data(), so we must
// use pointer arithmetic
uintptr_t old_local_data = old + offsetof(std::string, _M_local_buf);
if (uintptr_t(object->_M_data()) == old_local_data) {
// was using Small String Optimization
object->_M_data(object->_M_local_data());
}
I trust libstdc++ developers to know how to write this. I trust myself and
some colleagues for some Qt classes, and I trust developers in folly and
abseil for similar things. Like I said, a few dozen people on the planet.
And unlike Standard Library developers we'd probably have to err on the side
of caution with some compilers, and thus disallow the object being relocated
in the first place. For example, the offsetof() above is only implementation-
defined, for many types. That limits the usefulness of the feature.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 870 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 17:25 ` Thiago Macieira
2025-10-31 17:31 ` Paul Eggert
@ 2025-10-31 18:12 ` Paul E. McKenney
2025-10-31 19:15 ` Thiago Macieira
1 sibling, 1 reply; 116+ messages in thread
From: Paul E. McKenney @ 2025-10-31 18:12 UTC (permalink / raw)
To: Thiago Macieira
Cc: Alejandro Colomar, Paul Eggert, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely
On Fri, Oct 31, 2025 at 10:25:27AM -0700, Thiago Macieira wrote:
> On Friday, 31 October 2025 09:59:43 Pacific Daylight Time Paul Eggert wrote:
> > Reading the threads leading into this, the motivation for this seems to
> > be C++ and similar memory allocators that want a cheap way to grow an
> > object - if the object doesn't move they can skip some reinitialization
> > work, otherwise they have more work to do.
> >
> > With that in mind, the proposed API is not the best way to go about the
> > problem. What these users want is a function that acts just like
> > R=realloc(P,N) EXCEPT that it lets you compare R==P, and if the two
> > values are the same pointer you know the object did not move and you can
> > skip some work. This is simpler than realloci because it means that you
> > need only one call (not two) in the common case when realloci returns
> > the null pointer.
>
> If you meant a "resize-in-place or malloc new" function, I might agree. That
> would reduce the number of function calls in the example I posted in the other
> email, eliminating the malloc() call. Though it remains to be seen if that's
> actually better, because there are cases where the malloc()+move+free() is
> still needed, so the code could be shared across multiple uses and thus reduce
> codegen size. Then there's the case of shrinking, where knowing that the
> operation is guaranteed to be O(1) is beneficial.
>
> But not this:
>
> > In other words, these uses want the realloc function the way it was in
> > 7th Edition Unix
>
> From the C++ side, we *cannot* allow the object to be relocated in memory
> without its permission. We need to know the function *will not* move the
> memory region before calling it, because there's currently no way to
> generically adjust the relocated objects after the fact. That is why realloc()
> is currently little used in C++.
>
> That would require a new extension point for almost every class, which means
> it becomes an opt-in and only available for code written to support C++29 at
> the earliest.
>
> That's assuming it's even possible. How would one adjust sub-objects of an
> object? One thing that keeps coming up are the ARM64e authenticated
> pointers[1], and from my limited understanding of the feature, it might not be
> possible to write the new authenticated pointer to the new location without
> reading from the old. I think the Committee would balk at adding a function
> that takes a pointer to already-freed memory whose purpose is to allow the
> contents of the new object to be adjusted solely based on arithmetic.
In C++, presumably, only std::movable types should be passed to realloc(),
right? In C, yes, this is definitely an issue, but then again it has
been since the advent of realloc().
Thanx, Paul
> [1] https://clang.llvm.org/docs/PointerAuthentication.html
> --
> Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
> Principal Engineer - Intel Data Center - Platform & Sys. Eng.
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 17:53 ` Thiago Macieira
@ 2025-10-31 18:35 ` Andreas Schwab
2025-10-31 19:17 ` Thiago Macieira
2025-10-31 20:18 ` Paul Eggert
2025-11-01 3:47 ` [musl] " Oliver Hunt
2 siblings, 1 reply; 116+ messages in thread
From: Andreas Schwab @ 2025-10-31 18:35 UTC (permalink / raw)
To: Thiago Macieira
Cc: Alejandro Colomar, Paul Eggert, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney, Oliver Hunt
On Okt 31 2025, Thiago Macieira wrote:
> But I think any such thing would be extremely fragile. Very low-level library
> authors can probably get it right, but I wouldn't trust this feature to more
> than a few dozen people on the planet. We're talking about something like:
>
> void adjust_after_relocation(T *object, uintptr_t old) // or ptr_bits<T>
>
> The temptation is too great to cast the old to a T* and dereference it, which
> is UB because the memory has been freed, but will "happen to work" for
> sufficiently many executions that it might go unnoticed. Then there's the issue
> that the relocated T *object is itself in an inconsistent state and one must
> avoid calling most functions on it.
And it will not be thread safe. The freed memory can be allocated to
another thread any time.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 18:12 ` [musl] " Paul E. McKenney
@ 2025-10-31 19:15 ` Thiago Macieira
2025-10-31 19:49 ` Paul E. McKenney
0 siblings, 1 reply; 116+ messages in thread
From: Thiago Macieira @ 2025-10-31 19:15 UTC (permalink / raw)
To: paulmck
Cc: Alejandro Colomar, Paul Eggert, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely
[-- Attachment #1: Type: text/plain, Size: 1066 bytes --]
On Friday, 31 October 2025 11:12:10 Pacific Daylight Time Paul E. McKenney
wrote:
> In C++, presumably, only std::movable types should be passed to realloc(),
> right? In C, yes, this is definitely an issue, but then again it has
> been since the advent of realloc().
Correct (wrong terminology, see below). That would be extremely useful, but is
half of the problem. It would still be useful to apply the resize-in-place
operation to types that cannot be realloc()ed. Like libstdc++'s std::string.
Terminology-wise, the new term is "relocatable", not "movable", because
"movable" became used for something different in C++11 (something that had a
move constructor or move-assignment operator). Then in the lead-up to C++26
with paper P1144, it became "trivially relocatable": can be moved by memcpy().
It has since become "bitwise trivially relocatable" for reasons which do not
bear elaborating in this discussion.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 870 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 18:35 ` Andreas Schwab
@ 2025-10-31 19:17 ` Thiago Macieira
0 siblings, 0 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-10-31 19:17 UTC (permalink / raw)
To: Andreas Schwab
Cc: Alejandro Colomar, Paul Eggert, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney, Oliver Hunt
[-- Attachment #1: Type: text/plain, Size: 898 bytes --]
On Friday, 31 October 2025 11:35:10 Pacific Daylight Time Andreas Schwab wrote:
> > The temptation is too great to cast the old to a T* and dereference it,
> > which is UB because the memory has been freed, but will "happen to work"
> > for sufficiently many executions that it might go unnoticed. Then there's
> > the issue that the relocated T *object is itself in an inconsistent state
> > and one must avoid calling most functions on it.
>
> And it will not be thread safe. The freed memory can be allocated to
> another thread any time.
No argument there. It's already UB. What I meant is that it will "happen to
work" under quite a lot of software testing for the problem to go unnoticed,
and then present itself as a heisenbug to the downstream user.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 870 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 19:15 ` Thiago Macieira
@ 2025-10-31 19:49 ` Paul E. McKenney
0 siblings, 0 replies; 116+ messages in thread
From: Paul E. McKenney @ 2025-10-31 19:49 UTC (permalink / raw)
To: Thiago Macieira
Cc: Alejandro Colomar, Paul Eggert, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely
On Fri, Oct 31, 2025 at 12:15:51PM -0700, Thiago Macieira wrote:
> On Friday, 31 October 2025 11:12:10 Pacific Daylight Time Paul E. McKenney
> wrote:
> > In C++, presumably, only std::movable types should be passed to realloc(),
> > right? In C, yes, this is definitely an issue, but then again it has
> > been since the advent of realloc().
>
> Correct (wrong terminology, see below). That would be extremely useful, but is
> half of the problem. It would still be useful to apply the resize-in-place
> operation to types that cannot be realloc()ed. Like libstdc++'s std::string.
>
>
> Terminology-wise, the new term is "relocatable", not "movable", because
> "movable" became used for something different in C++11 (something that had a
> move constructor or move-assignment operator). Then in the lead-up to C++26
> with paper P1144, it became "trivially relocatable": can be moved by memcpy().
> It has since become "bitwise trivially relocatable" for reasons which do not
> bear elaborating in this discussion.
I stand corrected, thank you!
Thanx, Paul
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 16:59 ` Paul Eggert
2025-10-31 17:25 ` Thiago Macieira
@ 2025-10-31 20:13 ` Alejandro Colomar
2025-10-31 20:33 ` Paul Eggert
2025-10-31 21:06 ` Thiago Macieira
2025-11-01 12:57 ` Florian Weimer
2 siblings, 2 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-10-31 20:13 UTC (permalink / raw)
To: Paul Eggert
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Thiago Macieira
[-- Attachment #1: Type: text/plain, Size: 3740 bytes --]
Hi Paul,
On Fri, Oct 31, 2025 at 10:59:43AM -0600, Paul Eggert wrote:
> On 10/31/25 10:22, Alejandro Colomar wrote:
> > Paul, for context, this is a discussion for adding a function
> >
> > int realloci(void *p, size_t n);
> >
> > that changes the size of a memory block without moving it. (And thus,
> > fails rather often, for some implementations of allocators.)
>
> Reading the threads leading into this, the motivation for this seems to be
> C++ and similar memory allocators that want a cheap way to grow an object -
> if the object doesn't move they can skip some reinitialization work,
> otherwise they have more work to do.
Yes.
> With that in mind, the proposed API is not the best way to go about the
> problem. What these users want is a function that acts just like
> R=realloc(P,N) EXCEPT that it lets you compare R==P, and if the two values
> are the same pointer you know the object did not move and you can skip some
> work. This is simpler than realloci because it means that you need only one
> call (not two) in the common case when realloci returns the null pointer.
Consider that realloci() would be significantly cheaper than realloc(3),
so even if you have an extra function, it might be worth it if it
succeeds a non-negligible number of times.
From what Thiago says, it seems that it would be worth it for them.
I suspect it's because things like std::string often grow or shrink by
small amounts compared to the previous size, as operations on strings
don't often change their length significantly.
I've implemented realloci() in musl, and it's really trivial. After
all, it only checks the metadata to know if there's enough available
space after the block. It doesn't need to find a new block, which is
the hard part of realloc(3).
So, while it would be one more call in user code, that call is very
cheap, and the code is really simple to use too:
if (realloci(p, size) == -1)
fall_back_to_expensive_path();
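
To illustrate why such a call can be cheap, here is a toy sketch of a
metadata-only check. It is not musl's implementation or its metadata layout:
struct toy_hdr, toy_alloc(), toy_realloci() and toy_free() are hypothetical
names, and the "available space after the block" is modelled as slack that
was reserved up front.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy block header; NOT musl's real metadata. */
struct toy_hdr {
    size_t size;      /* bytes currently handed to the caller */
    size_t capacity;  /* bytes actually reserved for this block */
};

/* Allocate 'size' usable bytes with 'capacity' bytes reserved behind them. */
static void *toy_alloc(size_t size, size_t capacity)
{
    assert(size <= capacity);

    struct toy_hdr *h = malloc(sizeof *h + capacity);
    if (h == NULL)
        return NULL;
    h->size = size;
    h->capacity = capacity;
    return h + 1;   /* user memory starts right after the header */
}

/* Resize in place: only the header is inspected and updated.
 * Returns 0 on success, -1 if the block would have to move. */
static int toy_realloci(void *p, size_t size)
{
    assert(p != NULL);

    struct toy_hdr *h = (struct toy_hdr *)p - 1;
    if (size > h->capacity)
        return -1;  /* not enough room behind the block */
    h->size = size;
    return 0;
}

static void toy_free(void *p)
{
    if (p != NULL)
        free((struct toy_hdr *)p - 1);
}
```

The resize never searches for a new block and never copies data, which is
the property that makes the extra call inexpensive when it fails.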
> In other words, these uses want the realloc function the way it was in 7th
> Edition Unix, before sanitizers got in the way and insisted that it's an
> error to compare realloc's first argument with its result even if they
> happen to have the same value.
>
> There's an easy way to change the C standard to support these uses: just
> change the spec for realloc to support this usage. There is no need to
> change the C library, or musl, or any of the commonly used production C
> libraries. The only change you'd need to make is to the C standard and to
> picky sanitizers.
That would make sanitizers and static analyzers unable to verify lots of
code, for the benefit of just a few. I think a separate API would be
better, because it would let realloc(3) be stricter, which would provide
analyzers the ability to verify C code much better. Those that need to
be able to do weird things (C++), can opt in to the relaxed stuff by
using the niche function.
> This would be *much* better than adding a new, hard-to-explain realloci API.
I wouldn't categorize it as hard to explain:
    int realloci(void *p, size_t size);

    realloci() changes the size of the memory block pointed to by
    'p' to 'size' bytes. This is done in-place, that is, without
    changing its address.

    The contents of the memory will be unchanged in the range from
    the start of the region up to the minimum of the old and new
    sizes. If the new size is larger than the old size, the added
    memory will not be initialized.

    'p' must have been returned by an earlier call to malloc(3) or
    related functions.
Have a lovely night!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 17:53 ` Thiago Macieira
2025-10-31 18:35 ` Andreas Schwab
@ 2025-10-31 20:18 ` Paul Eggert
2025-11-01 3:47 ` [musl] " Oliver Hunt
2 siblings, 0 replies; 116+ messages in thread
From: Paul Eggert @ 2025-10-31 20:18 UTC (permalink / raw)
To: Thiago Macieira, Alejandro Colomar
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Oliver Hunt
On 10/31/25 11:53, Thiago Macieira wrote:
> The temptation is too great to cast the old to a T* and dereference it
That temptation exists no matter which of these APIs is used. It exists
even with C89 malloc/realloc/free. The temptation won't be affected by
changing the C standard now to better reflect how implementations behave.
> I wouldn't trust this feature to more
> than a few dozen people on the planet.
I think it's more than that. But whatever the number, it's the same
audience as the people who'd use a realloci API, or a
realloc-unmoved-or-malloc API. All these APIs are non-obvious and
suitable for experts only.
The 7th Edition Unix API has significant advantages:
(1) No changes are needed to existing mainline implementations.
(2) For the use cases shown so far, it's more efficient than the other
alternatives given.
(3) It matches naive C programmer expectations: lots of code already
assumes the API, even though the code doesn't conform to the standard.
If it's too much of a stretch for the committee to require the
traditional behavior, they could add a new macro (__STDC_REALLOC_REUSE__
say) which would guarantee the behavior. Implementations could then use
efficient code if __STDC_REALLOC_REUSE__ is defined, and
slower-but-portable-to-weird-platforms code otherwise.
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 20:13 ` Alejandro Colomar
@ 2025-10-31 20:33 ` Paul Eggert
2025-10-31 21:14 ` Thiago Macieira
2025-11-09 11:37 ` Alejandro Colomar
2025-10-31 21:06 ` Thiago Macieira
1 sibling, 2 replies; 116+ messages in thread
From: Paul Eggert @ 2025-10-31 20:33 UTC (permalink / raw)
To: Alejandro Colomar
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Thiago Macieira
On 10/31/25 14:13, Alejandro Colomar wrote:
> Consider that realloci() would be significantly cheaper than realloc(3),
Not in the case where the object doesn't move: they should be about the
same speed. And when the object grows so much that it does need to move,
the V7 realloc approach should be a bit faster because you need to make
just one call into the memory subsystem, not three (realloci + malloc +
free).
> That would make sanitizers and static analyzers unable to verify lots of
> code
No, just the opposite. Currently sanitizers etc. spend useless work
checking for C23 rules that don't correspond to any hardware or
correctness needs; they're simply rules imposed by the C committee. This
checking is counterproductive to real-world software development.
If we fixed the realloc spec to better match how actual production
hardware behaves, we could fix sanitizers to spend their time flagging
real bugs instead of wasting their time (and developers' time)
generating false alarms.
> I wouldn't categorize it as hard to explain:
Oh, it's not hard to specify a realloci API, or to implement it. What's
hard is explaining its motivation: why it's needed and what it's good
for. It's motivated by specialized applications that most programmers
don't know about and don't need to. And these specialized applications
would be better served by a 7th Edition Unix realloc.
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 20:13 ` Alejandro Colomar
2025-10-31 20:33 ` Paul Eggert
@ 2025-10-31 21:06 ` Thiago Macieira
2025-10-31 22:09 ` Alejandro Colomar
1 sibling, 1 reply; 116+ messages in thread
From: Thiago Macieira @ 2025-10-31 21:06 UTC (permalink / raw)
To: Paul Eggert, Alejandro Colomar
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney
[-- Attachment #1: Type: text/plain, Size: 1975 bytes --]
On Friday, 31 October 2025 13:13:42 Pacific Daylight Time Alejandro Colomar
wrote:
> Consider that realloci() would be significantly cheaper than realloc(3),
> so even if you have an extra function, it might be worth it if it
> succeeds a non-negligible number of times.
>
> From what Thiago says, it seems that it would be worth it for them.
> I suspect it's because things like std::string often grow or shrink by
> small amounts compared to the previous size, as operations on strings
> don't change their length significantly quite often.
Note I'm not talking about growing/shrinking std::strings themselves, but
growing/shrinking arrays (std::vector) of std::strings. The libstdc++
std::string is 32 bytes in size and is not relocatable via memcpy().
> So, while it would be one more call in user code, that call is very
> cheap, and the code is really simple to use too:
>
> if (realloci(p, size) == -1)
> fall_back_to_expensive_path();
That's what my example on Godbolt did too.
> I wouldn't categorize it as hard to explain:
>
> int realloci(void *p, size_t size);
>
> realloci() changes the size of the memory block pointed to by
> 'p' to 'size' bytes. This is done in-place, that is, without
> changing its address.
>
> The contents of the memory will be unchanged in the range from
> the start of the region up to the minimum of the old and new
> sizes. If the new size is larger than the old size, the added
> memory will not be initialized.
I'd add: if the new size is smaller than the old size, the bytes in that
storage are undefined, even if this function returned -1. That will allow an
implementation to MADV_DONTNEED the space, even if it can't officially change
the size of the allocation.
Would it be worth returning instead the new size, which may be bigger than the
requested size?
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 870 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 20:33 ` Paul Eggert
@ 2025-10-31 21:14 ` Thiago Macieira
2025-10-31 22:25 ` Paul Eggert
2025-11-09 11:37 ` Alejandro Colomar
1 sibling, 1 reply; 116+ messages in thread
From: Thiago Macieira @ 2025-10-31 21:14 UTC (permalink / raw)
To: Alejandro Colomar, Paul Eggert
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney
[-- Attachment #1: Type: text/plain, Size: 2754 bytes --]
On Friday, 31 October 2025 13:33:22 Pacific Daylight Time Paul Eggert wrote:
> On 10/31/25 14:13, Alejandro Colomar wrote:
> > Consider that realloci() would be significantly cheaper than realloc(3),
>
> Not in the case where the object doesn't move: they should be about the
> same speed. And when the object grows so much that it does need to move,
> the V7 realloc approach should be a bit faster because you need to make
> just one call into the memory subsystem, not three (realloci + malloc +
> free).
For data that can be moved in memory, I agree with you.
But that ignores the very common case of data not being allowed to move, which
precludes using realloc() in the first place, even if it would have kept the
pointers intact.
Movable (bitwise trivially relocatable) objects are the exception, not the
rule. I'd guesstimate that, in C++, for std::vector and similar classes, 99%
of the instantiations are done on non-relocatable types and 1% on relocatable
ones. It might be that this 1% of the case corresponds to 20% or even 50% of
all the memory allocations (by quantity or by data volume), but I'd be
surprised if it were larger than that.
The 1% may increase starting with C++29, as developers begin using the ability
to mark their types as bitwise trivially relocatable.
> > That would make sanitizers and static analyzers unable to verify lots of
> > code
>
> No, just the opposite. Currently sanitizers etc. spend useless work
> checking for C23 rules that don't correspond to any hardware or
> correctness needs; they're simply rules imposed by the C committee. This
> checking is counterproductive to real-world software development.
>
> If we fixed the realloc spec to better match how actual production
> hardware behaves, we could fix sanitizers to spend their time flagging
> real bugs instead of wasting their time (and developers' time)
> generating false alarms.
The two are not mutually exclusive.
> > I wouldn't categorize it as hard to explain:
>
> Oh, it's not hard to specify a realloci API, or to implement it. What's
> hard is explaining its motivation: why it's needed and what it's good
> for. It's motivated by specialized applications that most programmers
> don't know about and don't need to. And these specialized applications
> would be better served by a 7th Edition Unix realloc.
I'm telling you this would benefit 100% of C++ applications, or a number so
close to it as to be virtually indistinguishable. Most *developers* may never
notice they're using this, but they will still benefit from it, in their own
code via template instantiations and inlining.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 870 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 21:06 ` Thiago Macieira
@ 2025-10-31 22:09 ` Alejandro Colomar
2025-10-31 22:33 ` Joseph Myers
2025-10-31 23:48 ` Thiago Macieira
0 siblings, 2 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-10-31 22:09 UTC (permalink / raw)
To: Thiago Macieira
Cc: Paul Eggert, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney
[-- Attachment #1: Type: text/plain, Size: 3988 bytes --]
Hi Thiago,
On Fri, Oct 31, 2025 at 02:06:10PM -0700, Thiago Macieira wrote:
> On Friday, 31 October 2025 13:13:42 Pacific Daylight Time Alejandro Colomar
> wrote:
> > Consider that realloci() would be significantly cheaper than realloc(3),
> > so even if you have an extra function, it might be worth it if it
> > succeeds a non-negligible number of times.
> >
> > From what Thiago says, it seems that it would be worth it for them.
> > I suspect it's because things like std::string often grow or shrink by
> > small amounts compared to the previous size, as operations on strings
> > don't change their length significantly quite often.
>
> Note I'm not talking about growing/shrinking std::strings themselves, but
> growing/shrinking arrays (std::vector) of std::strings. The libstdc++
> std::string is 32 bytes in size and is not relocatable via memcpy().
Thanks for clarifying.
> > So, while it would be one more call in user code, that call is very
> > cheap, and the code is really simple to use too:
> >
> > if (realloci(p, size) == -1)
> > fall_back_to_expensive_path();
>
> That's what my example on Godbolt did too.
>
> > I wouldn't categorize it as hard to explain:
> >
> > int realloci(void *p, size_t size);
> >
> > realloci() changes the size of the memory block pointed to by
> > 'p' to 'size' bytes. This is done in-place, that is, without
> > changing its address.
> >
> > The contents of the memory will be unchanged in the range from
> > the start of the region up to the minimum of the old and new
> > sizes. If the new size is larger than the old size, the added
> > memory will not be initialized.
>
> I'd add: if the new size is smaller than the old size, the bytes in that
> storage are undefined, even if this function returned -1. That will allow an
> implementation to MADV_DONTNEED the space, even if it can't officially change
> the size of the allocation.
I'm not entirely sure. What would be the new size? Would it still be
the old one? So, the higher contents are undefined but you're still
able to write to them? It sounds weird.
I think if an allocator is unable to shrink, it likely is because it
really can't shrink, in which case I'd consider either accepting it and
not shrinking, or if I really want to shrink, call realloc(3) --or
malloc(3) and free-- as a fallback, accepting that I'd have to do the
work of reallocating.
> Would it be worth returning instead the new size, which may be bigger than the
> requested size?
Hmmmm, while I wouldn't like the idea of realloc(3) or malloc(3) telling
the underlying size, I think I agree to realloci() telling the
underlying size.
The rationale is that if a programmer uses realloci(), they're
explicitly expressing interest in minimizing realloc(3) calls, because
for some reason moving the contents is expensive. So, it would be nice
if realloci() would be generous, by giving more size than asked for, and
telling the user the actual size.
I'll revise the specification as:
Synopsis
    ssize_t realloci(void *p, size_t size);

Description
    realloci() changes the size of the memory block pointed to by
    'p' to at least 'size' bytes. This is done in-place, that is,
    without changing its address.

    The contents of the memory will be unchanged in the range from
    the start of the region up to the minimum of the old and new
    sizes. If the new size is larger than the old size, the added
    memory will not be initialized.

    This function may allocate more bytes than requested.

Return value
    This function returns the new size of the memory block, which
    might be larger than the requested size (but not smaller).

    On error, -1 is returned, and errno is set to indicate the
    error.

Errors
    ENOMEM
        Not enough contiguous memory.
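
As a rough illustration of the revised return value, the sketch below emulates it on top of malloc(3). All names (sized_malloc(), sized_free(), sized_realloci(), TOY_CLASS) are hypothetical, and the 64-byte size classes and the header in front of each block are assumptions of this toy, not a description of how musl or glibc store metadata.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>  /* ssize_t: POSIX, not ISO C */

#define TOY_CLASS 64    /* hypothetical size-class granularity */

/* Per-block metadata kept in front of the user pointer; the union keeps
 * the user data suitably aligned. */
union toy_hdr {
    size_t      cap;    /* usable bytes that follow the header */
    max_align_t align;
};

/* Allocate at least 'size' bytes, rounded up to a whole size class. */
void *sized_malloc(size_t size)
{
    union toy_hdr *h;
    size_t cap;

    if (size > SIZE_MAX / 2)
        return NULL;
    cap = size ? (size + TOY_CLASS - 1) / TOY_CLASS * TOY_CLASS : TOY_CLASS;
    h = malloc(sizeof(*h) + cap);
    if (h == NULL)
        return NULL;
    h->cap = cap;
    return h + 1;
}

void sized_free(void *p)
{
    if (p != NULL)
        free((union toy_hdr *)p - 1);
}

/* Stand-in for the revised realloci(): returns the usable size, which is
 * never smaller than 'size', or -1 with errno set to ENOMEM.
 * The block is never moved. */
ssize_t sized_realloci(void *p, size_t size)
{
    union toy_hdr *h = (union toy_hdr *)p - 1;

    if (size > h->cap) {
        errno = ENOMEM;
        return -1;
    }
    return (ssize_t)h->cap;
}
```

A container could store the returned value as its capacity and skip further allocator calls until that capacity is exhausted.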
Have a lovely night!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 21:14 ` Thiago Macieira
@ 2025-10-31 22:25 ` Paul Eggert
2025-10-31 23:27 ` Thiago Macieira
0 siblings, 1 reply; 116+ messages in thread
From: Paul Eggert @ 2025-10-31 22:25 UTC (permalink / raw)
To: Thiago Macieira, Alejandro Colomar
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney
On 10/31/25 15:14, Thiago Macieira wrote:
> that ignores the very common case of data not being allowed to move, which
> precludes using realloc() in the first place, even if it would have kept the
> pointers intact.
Using 7th Edition Unix realloc does not ignore that case. The idea is
that you call realloc; if it gives you the same pointer you're done,
otherwise you update the object's contents in place accordingly. It's the
same basic idea as realloci where, if realloci fails you malloc
something larger, and copy from the old object to the new while updating
the contents of the new object as needed. This is the same amount of
updating work either way; it's just that it's a simpler allocator API
and that simplicity is easier to document/implement/explain and is
likely to help performance a bit too.
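
A minimal sketch of that calling pattern follows, assuming a hypothetical struct text that keeps an interior pointer into its own heap block. The address is captured as an integer before the call so that the comparison itself never reads the old pointer. Note the caveat that the thread is arguing about: under the current standard, the branch where the block did not move still keeps using a pointer whose value is formally indeterminate.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical object with a pointer into its own heap block. */
struct text {
    char   *buf;     /* start of the heap block */
    char   *cursor;  /* points into buf; must follow the block if it moves */
    size_t  cap;
};

/* Grow t->buf to newcap bytes (newcap must be nonzero).
 * Returns 0 on success, -1 on failure (t is left unchanged). */
int text_grow(struct text *t, size_t newcap)
{
    uintptr_t old_addr = (uintptr_t)t->buf;  /* taken while buf is valid */
    size_t off = (size_t)(t->cursor - t->buf);
    char *q = realloc(t->buf, newcap);

    if (q == NULL)
        return -1;
    if ((uintptr_t)q != old_addr) {
        /* The block moved: rebase the interior pointer. */
        t->cursor = q + off;
    }
    /* Otherwise the block stayed put and, with V7-style semantics,
     * t->cursor would remain usable as it is. */
    t->buf = q;
    t->cap = newcap;
    return 0;
}
```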
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 22:09 ` Alejandro Colomar
@ 2025-10-31 22:33 ` Joseph Myers
2025-10-31 22:51 ` Alejandro Colomar
2025-10-31 23:48 ` Thiago Macieira
1 sibling, 1 reply; 116+ messages in thread
From: Joseph Myers @ 2025-10-31 22:33 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Thiago Macieira, Paul Eggert, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney
On Fri, 31 Oct 2025, Alejandro Colomar wrote:
> ssize_t realloci(void *p, size_t size);
There is no such type name as ssize_t in the C standard (and the convenor
is strongly opposed to adding such a type).
--
Joseph S. Myers
josmyers@redhat.com
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 22:33 ` Joseph Myers
@ 2025-10-31 22:51 ` Alejandro Colomar
0 siblings, 0 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-10-31 22:51 UTC (permalink / raw)
To: Joseph Myers
Cc: Thiago Macieira, Paul Eggert, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney
[-- Attachment #1: Type: text/plain, Size: 786 bytes --]
Hi Joseph,
On Fri, Oct 31, 2025 at 10:33:49PM +0000, Joseph Myers wrote:
> On Fri, 31 Oct 2025, Alejandro Colomar wrote:
>
> > ssize_t realloci(void *p, size_t size);
>
> There is no such type name as ssize_t in the C standard (and the convenor
> is strongly opposed to adding such a type).
I know, and I also know the convenor has no more voting power than any
other member of the committee.
If this API was implemented in glibc and musl, it would already be
useful to some C++ implementations, regardless of being non-standard.
Maybe POSIX could standardize it first. And if this API proved useful,
then a majority of the committee might accept it.
Have a lovely night!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 22:25 ` Paul Eggert
@ 2025-10-31 23:27 ` Thiago Macieira
2025-11-01 3:54 ` Paul Eggert
0 siblings, 1 reply; 116+ messages in thread
From: Thiago Macieira @ 2025-10-31 23:27 UTC (permalink / raw)
To: Alejandro Colomar, Paul Eggert
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney
[-- Attachment #1: Type: text/plain, Size: 1260 bytes --]
On Friday, 31 October 2025 15:25:20 Pacific Daylight Time Paul Eggert wrote:
> Using 7th Edition Unix realloc does not ignore that case. The idea is
> that you call realloc; if it gives you the same pointer you're done,
> otherwise you update the object's contents in place accordingly. It's the
> same basic idea as realloci where, if realloci fails you malloc
> something larger, and copy from the old object to the new while updating
> the contents of the new object as needed. This is the same amount of
> updating work either way; it's just that it's a simpler allocator API
> and that simplicity is easier to document/implement/explain and is
> likely to help performance a bit too.
I'm not sure I understand you.
Are you saying that 7th Edition Unix realloc() returned only one of two
possible values?
NULL on failure
the same ptr that was passed as input on success
I don't think you are because imposing this requirement would imply it will
never memcpy() the data to a new location and that would break quite a lot of
applications that depend on the ability to grow a block so long as there's heap
available.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 870 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 16:02 ` Thiago Macieira
2025-10-31 16:22 ` Alejandro Colomar
@ 2025-10-31 23:46 ` Morten Welinder
1 sibling, 0 replies; 116+ messages in thread
From: Morten Welinder @ 2025-10-31 23:46 UTC (permalink / raw)
To: musl
Cc: Alejandro Colomar, Thorsten Glaser, libc-alpha, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely
> The conclusion among C++ developers is that using the previous pointer in any
> way is UB. Therefore, you simply cannot know if the area was moved or not.
With a bit of effort you can travel from UB to implementation-defined.
The trick is to use the pointer before the realloc. Something like

    sprintf(buffer1, "%p", (void *)p);
    q = realloc(p, newsize);
    sprintf(buffer2, "%p", (void *)q);
    int moved = strcmp(buffer1, buffer2) != 0;

Clearly not UB. I just can't convince myself that the standards
actually guarantee that a changed pointer doesn't have the same %p
representation.
I don't know of any platforms where a collision would actually happen, though.
M.
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 22:09 ` Alejandro Colomar
2025-10-31 22:33 ` Joseph Myers
@ 2025-10-31 23:48 ` Thiago Macieira
2025-11-01 0:47 ` Alejandro Colomar
1 sibling, 1 reply; 116+ messages in thread
From: Thiago Macieira @ 2025-10-31 23:48 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Paul Eggert, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney
[-- Attachment #1: Type: text/plain, Size: 2843 bytes --]
On Friday, 31 October 2025 15:09:46 Pacific Daylight Time Alejandro Colomar
wrote:
> > I'd add: if the new size is smaller than the old size, the bytes in that
> > storage are undefined, even if this function returned -1. That will allow
> > an implementation to MADV_DONTNEED the space, even if it can't officially
> > change the size of the allocation.
>
> I'm not entirely sure. What would be the new size? Would it still be
> the old one? So, the higher contents are undefined but you're still
> able to write to them? It sounds weird.
I think this needs some discussion.
I'm thinking of allocators like jemalloc that cannot reuse the space freed by
shrinkage. Imagine shrinking a block of 256 kB to 64 B (e.g., 8192
std::strings to 2).
What does realloci() return?
It could return -1, indicating no shrinking happened. In that case, the higher
layer is allowed to presume the data it had there is still there, which
prevents the allocator from doing madvise(MADV_DONTNEED).
Or it could return 0, indicating it did shrink and may have done
MADV_DONTNEED. But in that case, the higher layer will update its book-keeping
of the capacity, causing it to call realloci() again if it needs to grow
again. Though this will probably be fast: the allocator will probably just
return 0 for any size value that is less than the slab size and -1 for any
that is bigger. The drawback of this is that there's a minimum granularity of
one page, so the example above of shrinking to 64 B is keeping 4032 bytes
"hostage" in overhead.
> The rationale is that if a programmer uses realloci(), they're
> explicitly expressing interest in minimizing realloc(3) calls, because
> for some reason moving the contents is expensive. So, it would be nice
> if realloci() would be generous, by giving more size than asked for, and
> telling the user the actual size.
True, but the same rationale applies to the first allocation with malloc() as
well.
There's precedent for this: jemalloc provides nallocx() to calculate the block
ahead of time. Most implementations have one way or another of asking how big
the block really is, after the allocation.
>
> I'll revise the specification as:
>
> Synopsis
> ssize_t realloci(void *p, size_t size);
By the way, this looks like the same functionality as jemalloc's xallocx(),
which is documented as follows:

    The xallocx() function resizes the allocation at ptr in place to be at
    least size bytes, and returns the real size of the allocation. If extra
    is non-zero, an attempt is made to resize the allocation to be at least
    (size + extra) bytes, though inability to allocate the extra byte(s)
    will not by itself result in failure to resize.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 870 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 23:48 ` Thiago Macieira
@ 2025-11-01 0:47 ` Alejandro Colomar
0 siblings, 0 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-01 0:47 UTC (permalink / raw)
To: Thiago Macieira
Cc: Paul Eggert, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney
[-- Attachment #1: Type: text/plain, Size: 4254 bytes --]
Hi Thiago,
On Fri, Oct 31, 2025 at 04:48:36PM -0700, Thiago Macieira wrote:
> On Friday, 31 October 2025 15:09:46 Pacific Daylight Time Alejandro Colomar
> wrote:
> > > I'd add: if the new size is smaller than the old size, the bytes in that
> > > storage are undefined, even if this function returned -1. That will allow
> > > an implementation to MADV_DONTNEED the space, even if it can't officially
> > > change the size of the allocation.
> >
> > I'm not entirely sure. What would be the new size? Would it still be
> > the old one? So, the higher contents are undefined but you're still
> > able to write to them? It sounds weird.
>
> I think this needs some discussion.
Sure.
> I'm thinking of allocators like jemalloc that cannot reuse the space freed by
> shrinkage. Imagine shrinking a block of 256 kB to 64 B (e.g., 8192
> std::strings to 2).
>
> What does realloci() return?
If it can't reuse the space, I think the most sensible thing to return
would be the original large size. That would give the user the
most information.
> It could return -1, indicating no shrinking happened.
I wouldn't do that. I would reserve -1 for indicating a hard error,
such as not being able to grow.
Consider that users might fall back to realloc(3) as soon as they see a -1.
They may not remember the original size, so they may not know they're
trying to shrink.
> In that case, the higher
> layer is allowed to presume the data it had there is still there, which
> prevents the allocator from doing madvise(MADV_DONTNEED).
>
> Or it could return 0, indicating it did shrink and may have done
> MADV_DONTNEED.
Yep. Although with the new specification, it'd return the large size.
> But in that case, the higher layer will update its book-keeping
> of the capacity, causing it to call realloci() again if it needs to grow
> again. Though this will probably be fast: the allocator will probably just
> return 0 for any size value that is less than the slab size and -1 for any
> that is bigger. The drawback of this is that there's a minimum granularity of
> one page, so the example above of shrinking to 64 B is keeping 4032 bytes
> "hostage" in overhead.
Yep. I guess not too bad. If they want to release it, they're always
free to call realloc(3). Of course, they'll need to know that this size
is hostage, so returning the actual size is useful.
> > The rationale is that if a programmer uses realloci(), they're
> > explicitly expressing interest in minimizing realloc(3) calls, because
> > for some reason moving the contents is expensive. So, it would be nice
> > if realloci() would be generous, by giving more size than asked for, and
> > telling the user the actual size.
>
> True, but the same rationale applies to the first allocation with malloc() as
> well.
You could immediately follow malloc(3) by realloci(), if you want this
behavior:
    void *p;
    ssize_t size = 1024;

    p = malloc(size);
    if (p == NULL)
        goto fail;

    size = realloci(p, size);
    if (size == -1)
        goto fail;

    // And now we know the actual size.

realloci() would be essentially a no-op, and it shouldn't fail, and
would be negligible compared to malloc(3).
> There's precedent for this: jemalloc provides nallocx() to calculate the block
> ahead of time. Most implementations have one way or another of asking how big
> the block really is, after the allocation.
>
> > I'll revise the specification as:
> >
> > Synopsis
> > ssize_t realloci(void *p, size_t size);
>
> By the way, looks like this is the same functionality as jemalloc's xallocx,
> which
>
> The xallocx() function resizes the allocation at ptr in place to be at
> least size bytes, and returns the real size of the allocation. If extra
> is non-zero, an attempt is made to resize the allocation to be at least
> (size + extra) bytes, though inability to allocate the extra byte(s)
> will not by itself result in failure to resize.
Yup, it sounds like it. I guess we can take that as prior art. :)
Have a lovely night!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] realloci(): A realloc() variant that works in-place
2025-10-31 12:16 ` Thorsten Glaser
@ 2025-11-01 1:03 ` Rich Felker
0 siblings, 0 replies; 116+ messages in thread
From: Rich Felker @ 2025-11-01 1:03 UTC (permalink / raw)
To: Thorsten Glaser
Cc: musl, libc-alpha, Arthur O'Dwyer, Jonathan Wakely,
Thiago Macieira
On Fri, Oct 31, 2025 at 12:16:39PM +0000, Thorsten Glaser wrote:
> Alejandro Colomar dixit:
>
> >A discussion within the C++ std-proposals@ mailing list triggered the
> >discussion about the need for a realloc() variant that works in-place,
> >that is, that doesn't move the address of the memory, and thus that
> >doesn't invalidate existing pointers derived from it.
>
> How is that supposed to work if you want to grow the
> allocation?
>
> This seems like increasing burden on the implementation
> for everyone, just for niche corner use cases.
Asymptotically, in-place realloc never works.
Rich
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] realloci(): A realloc() variant that works in-place
2025-10-31 17:53 ` Thiago Macieira
2025-10-31 18:35 ` Andreas Schwab
2025-10-31 20:18 ` Paul Eggert
@ 2025-11-01 3:47 ` Oliver Hunt
2025-11-01 14:18 ` Florian Weimer
2 siblings, 1 reply; 116+ messages in thread
From: Oliver Hunt @ 2025-11-01 3:47 UTC (permalink / raw)
To: Thiago Macieira
Cc: Alejandro Colomar, Paul Eggert, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney
[-- Attachment #1: Type: text/plain, Size: 3712 bytes --]
> On Oct 31, 2025, at 10:53 AM, Thiago Macieira <thiago@macieira.org> wrote:
>
> On Friday, 31 October 2025 10:31:54 Pacific Daylight Time Paul Eggert wrote:
>> On 10/31/25 11:25, Thiago Macieira wrote:
>>> I think the Committee would balk at adding a function
>>> that takes a pointer to already-freed memory whose purpose is to allow the
>>> contents of the new object to be adjusted solely based on arithmetic.
>>
>> Do you know of any platforms where this does not in fact work? Other
>> than sanitizing platforms that go to some lengths to impose the
>> Committee's rules even though the hardware would work fine?
>>
>> If not, then perhaps we can convince the Committee that the mismatch
>> between the current rules and reality is causing real harm, and that
>> it'd be a win for C's users to change the standard to match reality better.
>
> Oliver, please comment on ARM64e if you can, for pointer authentication. Think
> not just of statically-known pointers like vtables, but the general case of
> pointer authentication.
I don’t believe ptrauth would really play into this, but MTE does.
At an _extremely_ high level you can consider MTE as providing memory access permissions at a granularity in the order of bytes - I think 16 bytes on arm.
Under MTE an allocator can ensure a previously valid pointer is simply invalid and _cannot_ be used.
Like ptrauth this is largely probabilistic, but the allocator use case folk have presented is deterministic - unallocated memory cannot be read.
Just to be clear MTE isn’t an area I’m an expert in. MTE is another “oh look there are free bits at the top of a pointer” based mitigation. While pointer authentication uses those bits to validate the value of the pointer, MTE uses those bits to protect the validity of the pointer.
MTE does this by saying the high bits of a pointer contain a tag. To protect a region of memory with MTE, two regions of memory are allocated: the region that you wish to protect, and the tag space. The protected region is divided into atoms of some number of bytes (again I think 16 bytes on ARM), giving N atoms. The tag region then consists of N tags of some number of bits; the Nth atom in the protected region corresponds to the Nth tag in the tag region.
When you dereference a pointer the MMU tests for the existence of a tag; if a tag is present the MMU looks up the tag associated with that address. Then, if the tag does not match, it triggers a fault (in this week's episode of “what is a trivial operation?” :D)
When the allocator (or whatever is involved) wishes to allocate from the protected region, it first does the usual allocator things needed to find the memory that will be returned. The allocator then chooses a tag (how it's chosen is entirely up to the allocator), sets the upper bits of the pointer to that tag, and updates the tag space such that the tags for all of the atoms covered by the allocation are set to that tag.
On freeing an object the allocator can choose to invalidate _all_ existing copies of the pointer by simply replacing the tags for the protected region. At that point any attempt to dereference those pointers will fail.
On systems with MTE, you lose the ability to make any assumptions about how much memory can be “safely” accessed outside of a live object: where previously a program may have been able to get away with an OoB access because it fell within the bounds of the size class containing the object, that access may now fail.
Similarly, even if the allocator has returned sequential allocations, accessing object n+1 through `object pointer n + sizeof(type)` may now fail.
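To make the tag-check mechanism described above concrete, here is a minimal software model in C. It is only a sketch: it is not real MTE and not any allocator's code. The 16-byte atom size is the guess from the message, and `tag_space`, `toy_alloc`, `toy_free` and `toy_access_ok` are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE   16u   /* bytes per atom; assumed, as guessed in the message */
#define NGRANULES 8u    /* size of the toy protected region, in atoms */

/* The tag space: one tag per atom of the protected region. */
static uint8_t tag_space[NGRANULES];

/* Toy pointer: an offset into the protected region plus the tag that a
   real pointer would carry in its high bits. */
struct tagged_ptr {
	size_t  off;
	uint8_t tag;
};

/* Set the tag of every atom overlapping [off, off + len). */
static void set_tags(size_t off, size_t len, uint8_t tag)
{
	assert(len > 0 && off % GRANULE == 0);
	assert(off + len <= (size_t)GRANULE * NGRANULES);
	for (size_t g = off / GRANULE; g * GRANULE < off + len; g++)
		tag_space[g] = tag;
}

/* Allocator side: tag the atoms and return a pointer carrying that tag. */
static struct tagged_ptr toy_alloc(size_t off, size_t len, uint8_t tag)
{
	set_tags(off, len, tag);
	return (struct tagged_ptr){ off, tag };
}

/* Allocator side: retag on free so that every stale copy stops matching. */
static void toy_free(struct tagged_ptr p, size_t len, uint8_t new_tag)
{
	assert(new_tag != p.tag);
	set_tags(p.off, len, new_tag);
}

/* "Hardware" side: an access at p + delta is allowed only on a tag match. */
static bool toy_access_ok(struct tagged_ptr p, size_t delta)
{
	size_t addr = p.off + delta;

	if (addr >= (size_t)GRANULE * NGRANULES)
		return false;
	return tag_space[addr / GRANULE] == p.tag;
}
```

In this model, two adjacent 32-byte allocations with different tags reject an access that runs from the first into the second, and a pointer kept across `toy_free()` stops matching as soon as its atoms are retagged.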
—Oliver
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 23:27 ` Thiago Macieira
@ 2025-11-01 3:54 ` Paul Eggert
2025-11-01 13:38 ` Thorsten Glaser
2025-11-11 12:04 ` Brooks Davis
0 siblings, 2 replies; 116+ messages in thread
From: Paul Eggert @ 2025-11-01 3:54 UTC (permalink / raw)
To: Thiago Macieira, Alejandro Colomar
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney
On 10/31/25 17:27, Thiago Macieira wrote:
> I don't think you are because imposing this requirement would imply it will
> never memcpy() the data to a new location and that would break quite a lot of
> applications that depend on the ability to grow a block so long as there's heap
> available.
You're right I'm not saying that. All I'm saying is that when
R=realloc(P,N) succeeds, you can assume that you can adjust old pointers
into the object addressed by P by adding R-P to them. The C standard
says this results in undefined behavior; all that we need to do is fix
the C standard to say it's well-defined (because it is on practical
platforms).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 16:59 ` Paul Eggert
2025-10-31 17:25 ` Thiago Macieira
2025-10-31 20:13 ` Alejandro Colomar
@ 2025-11-01 12:57 ` Florian Weimer
2025-11-01 15:11 ` Thiago Macieira
2 siblings, 1 reply; 116+ messages in thread
From: Florian Weimer @ 2025-11-01 12:57 UTC (permalink / raw)
To: Paul Eggert
Cc: Alejandro Colomar, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney, Thiago Macieira
* Paul Eggert:
> On 10/31/25 10:22, Alejandro Colomar wrote:
>> Paul, for context, this is a discussion for adding a function
>>
>> int realloci(void *p, size_t n);
>>
>> that changes the size of a memory block without moving it. (And thus,
>> fails rather often, for some implementations of allocators.)
>
> Reading the threads leading into this, the motivation for this seems to
> be C++ and similar memory allocators that want a cheap way to grow an
> object - if the object doesn't move they can skip some reinitialization
> work, otherwise they have more work to do.
>
> With that in mind, the proposed API is not the best way to go about the
> problem. What these users want is a function that acts just like
> R=realloc(P,N) EXCEPT that it lets you compare R==P, and if the two
> values are the same pointer you know the object did not move and you can
> skip some work. This is simpler than realloci because it means that you
> need only one call (not two) in the common case when realloci returns
> the null pointer.
This would not help C++ because C++ doesn't have this kind of in-place
adjust-all-pointers operation. There's just copy and move, and both
need non-overlapping old and new storage at the same time.
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-30 23:15 [musl] realloci(): A realloc() variant that works in-place Alejandro Colomar
` (2 preceding siblings ...)
2025-10-31 13:43 ` [musl] " Alejandro Colomar
@ 2025-11-01 13:05 ` Florian Weimer
2025-11-01 15:03 ` Thiago Macieira
2025-11-01 15:22 ` Alejandro Colomar
3 siblings, 2 replies; 116+ messages in thread
From: Florian Weimer @ 2025-11-01 13:05 UTC (permalink / raw)
To: Alejandro Colomar
Cc: libc-alpha, musl, Arthur O'Dwyer, Jonathan Wakely,
Thiago Macieira
* Alejandro Colomar:
> A discussion within the C++ std-proposals@ mailing list triggered the
> discussion about the need for a realloc() variant that works in-place,
> that is, that doesn't move the address of the memory, and thus that
> doesn't invalidate existing pointers derived from it.
> void *realloci(void *p, size_t size);
The caller won't have sufficient information to determine good values
for size.
For the std::vector case at least, what applications want is some form
of non-moving realloc that allows the application to specify an
arithmetic progression and an interval, and the realloc variant should
change the size of the allocation to an element of the arithmetic
progression that resides within the specified interval, or fail.
With this interface, std::vector would not have to know the size
classes of the allocator. On failure, std::vector resizing would have
to fall back to malloc/free and moving objects one by one. But that
is kind of inevitable.
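As a rough illustration of the selection rule described above, the following C sketch picks a size from an arithmetic progression inside an interval. It is not a proposed API: `pick_in_place_size` and all of its parameters are hypothetical, and `avail` stands for whatever maximum in-place size an allocator would know internally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: choose the largest size of the form
 * base + k*step (k >= 0) that lies in [lo, hi] and does not exceed
 * 'avail', the number of bytes the block could occupy without moving.
 * Returns false if no such size exists (the caller then falls back to
 * malloc/free and moving the objects).
 */
static bool pick_in_place_size(size_t base, size_t step,
                               size_t lo, size_t hi,
                               size_t avail, size_t *out)
{
	assert(step > 0 && lo <= hi && out != NULL);

	size_t cap = hi < avail ? hi : avail;   /* upper bound we may use */

	if (cap < lo || cap < base)
		return false;

	/* Largest progression element <= cap; cannot overflow. */
	size_t cand = base + (cap - base) / step * step;

	if (cand < lo)
		return false;
	*out = cand;
	return true;
}
```

For a vector of 24-byte elements that needs room for at least 10 and would like up to 20, the call would use `base = 0`, `step = 24`, `lo = 240`, `hi = 480`; the caller never needs to know the allocator's size classes.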
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 3:54 ` Paul Eggert
@ 2025-11-01 13:38 ` Thorsten Glaser
2025-11-01 14:55 ` Thiago Macieira
2025-11-11 12:04 ` Brooks Davis
1 sibling, 1 reply; 116+ messages in thread
From: Thorsten Glaser @ 2025-11-01 13:38 UTC (permalink / raw)
To: musl
Cc: Thiago Macieira, Alejandro Colomar, libc-alpha, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney
Paul Eggert dixit:
> You're right I'm not saying that. All I'm saying is that when
> R=realloc(P,N) succeeds, you can assume that you can adjust old
> pointers into the object addressed by P by adding R-P to them. The C
> standard says this results in undefined behavior; all that we need to
> do is fix the C standard to say it's well-defined (because it is on
> practical platforms).
That won’t fix old standard versions retroactively though.
You can already do this now by casting the old pointer to
ptraddr_t (ifdef __CHERI__) / uintptr_t (else) before the
realloc and the new one afterwards for comparison and the
arithmetic.
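A minimal sketch of that idiom in C, assuming a single pointer into the block needs fixing up. The helper name `grow_and_rebase` is made up for illustration, and only the uintptr_t branch is shown (on CHERI one would use ptraddr_t as noted above).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: grow *buf to n bytes and rebase *cursor, which
 * must point into *buf (or one past its end).  The addresses are
 * converted to integers while the old pointers are still valid, so no
 * arithmetic is ever done on a pointer to freed memory.
 * Returns 0 on success, -1 on failure (nothing is changed).
 */
static int grow_and_rebase(char **buf, char **cursor, size_t n)
{
	uintptr_t old_base = (uintptr_t)*buf;
	uintptr_t old_cur  = (uintptr_t)*cursor;

	assert(n > 0 && old_cur >= old_base);
	assert((size_t)(old_cur - old_base) <= n);

	char *r = realloc(*buf, n);
	if (r == NULL)
		return -1;

	/* Integer comparison tells whether the block moved. */
	if ((uintptr_t)r != old_base) {
		/* Derive the new cursor from the new pointer. */
		*cursor = r + (old_cur - old_base);
	}
	*buf = r;
	return 0;
}
```

The rebased pointer is derived from the pointer that realloc() returned and only the integer offset comes from the old block.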
bye,
//mirabilos
--
13:28⎜«neurodamage:#cvs» you're a handy guy to have around for systems stuff ☺
16:06⎜<Draget:#cvs> Thank god I found you =) 20:03│«bioe007:#cvs» mira2k: ty
17:14⎜<ldiain:#cvs> Thanks big help you are :-) <bioe007> mira|nwt: ty again
18:36⎜«ThunderChicken:#cvs» mirabilos FTW! 23:03⎜«mithraic:#cvs» aaah. thanks
* Re: [musl] realloci(): A realloc() variant that works in-place
2025-11-01 3:47 ` [musl] " Oliver Hunt
@ 2025-11-01 14:18 ` Florian Weimer
2025-11-02 1:11 ` Oliver Hunt
0 siblings, 1 reply; 116+ messages in thread
From: Florian Weimer @ 2025-11-01 14:18 UTC (permalink / raw)
To: Oliver Hunt
Cc: Thiago Macieira, Alejandro Colomar, Paul Eggert, libc-alpha, musl,
A. Wilcox, Lénárd Szolnoki, Collin Funk,
Arthur O'Dwyer, Jonathan Wakely, Paul E. McKenney
* Oliver Hunt:
>> On Oct 31, 2025, at 10:53 AM, Thiago Macieira <thiago@macieira.org> wrote:
>>
>> On Friday, 31 October 2025 10:31:54 Pacific Daylight Time Paul Eggert wrote:
>>> On 10/31/25 11:25, Thiago Macieira wrote:
>>>> I think the Committee would balk at adding a function
>>>> that takes a pointer to already-freed memory whose purpose is to allow the
>>>> contents of the new object to be adjusted solely based on arithmetic.
>>>
>>> Do you know of any platforms where this does not in fact work? Other
>>> than sanitizing platforms that go to some lengths to impose the
>>> Committee's rules even though the hardware would work fine?
>>>
>>> If not, then perhaps we can convince the Committee that the mismatch
>>> between the current rules and reality is causing real harm, and that
>>> it'd be a win for C's users to change the standard to match reality better.
>>
>> Oliver, please comment on ARM64e if you can, for pointer authentication. Think
>> not just of statically-known pointers like vtables, but the general case of
>> pointer authentication.
>
>
> I don’t believe ptrauth would really play into this, but MTE does.
I think MTE still works because if realloc changes the tag, the
pointer changes. The application then has to do the offset-based
adjustment, which happens to change the tag only.
(I'm not saying this malloc change is a good idea. I don't know of
its implications, and if it can be integrated safely with the other
parts of the languages.)
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 13:38 ` Thorsten Glaser
@ 2025-11-01 14:55 ` Thiago Macieira
0 siblings, 0 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-11-01 14:55 UTC (permalink / raw)
To: musl, Thorsten Glaser
Cc: Alejandro Colomar, libc-alpha, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney
On Saturday, 1 November 2025 06:38:41 Pacific Daylight Time Thorsten Glaser
wrote:
> Paul Eggert dixit:
> > You're right I'm not saying that. All I'm saying is that when
> > R=realloc(P,N) succeeds, you can assume that you can adjust old
> > pointers into the object addressed by P by adding R-P to them. The C
> > standard says this results in undefined behavior; all that we need to
> > do is fix the C standard to say it's well-defined (because it is on
> > practical platforms).
>
> That won’t fix old standard versions retroactively though.
>
> You can already do this now by casting the old pointer to
> ptraddr_t (ifdef __CHERI__) / uintptr_t (else) before the
> realloc and the new one afterwards for comparison and the
> arithmetic.
Strictly speaking, you need core language changes in both C and C++ to make
the arithmetic in question valid, instead of UB. I wouldn't get my hopes
up, in spite of it just "working everywhere".
Anyway, I now understand Paul's proposal. It is orthogonal to realloci(). I am
not sure if I would support it, but I wouldn't oppose it in *addition* to
realloci(). I would definitely oppose it if it is *instead of* realloci().
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 13:05 ` Florian Weimer
@ 2025-11-01 15:03 ` Thiago Macieira
2025-11-01 15:14 ` Florian Weimer
2025-11-01 15:22 ` Alejandro Colomar
1 sibling, 1 reply; 116+ messages in thread
From: Thiago Macieira @ 2025-11-01 15:03 UTC (permalink / raw)
To: Alejandro Colomar, Florian Weimer
Cc: libc-alpha, musl, Arthur O'Dwyer, Jonathan Wakely
On Saturday, 1 November 2025 06:05:57 Pacific Daylight Time Florian Weimer
wrote:
> For the std::vector case at least, what applications want is some form
> of non-moving realloc that allows the application to specify an
> arithmetic progression and an interval, and the realloc variant should
> change the size of the allocation to an element of the arithmetic
> progression that resides within the specified interval, or fail.
That would indeed be better. When growing, the class doesn't know how many
more items are coming, so a best effort from the runtime is welcome. There's a
chance that the smaller allocation that the runtime was able to fulfill will
suffice and thus we'll have avoided an extra relocation of the elements.
Wouldn't this be the xallocx() interface from jemalloc? It allows the caller
to pass the number of elements/bytes it really needs and the number of
elements/bytes it speculates it will need.
However, it's not necessary.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 12:57 ` Florian Weimer
@ 2025-11-01 15:11 ` Thiago Macieira
0 siblings, 0 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-11-01 15:11 UTC (permalink / raw)
To: Paul Eggert, Florian Weimer
Cc: Alejandro Colomar, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney
On Saturday, 1 November 2025 05:57:56 Pacific Daylight Time Florian Weimer
wrote:
> This would not help C++ because C++ doesn't have this kind of in-place
> adjust-all-pointers operation. There's just copy and move, and both
> need non-overlapping old and new storage at the same time.
Indeed. It's not unprecedented: we are discussing adding an extension point
for the non-trivial relocation operation (that is, for those class types that
need an out-of-line, complex operation to atomically move the contents from
source to destination and destroy the source).
However, this is yet another extension point because it doesn't obviate the
need for the above. I don't think it will happen because of the complexity of
the task and how few developers will get it right. And that's in addition to
the update to the core language rules on pointer arithmetic.
That's why I said I don't oppose this change to the pointer rules in
principle, only if it is done *instead of* realloci().
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 15:03 ` Thiago Macieira
@ 2025-11-01 15:14 ` Florian Weimer
2025-11-01 15:42 ` Thiago Macieira
0 siblings, 1 reply; 116+ messages in thread
From: Florian Weimer @ 2025-11-01 15:14 UTC (permalink / raw)
To: Thiago Macieira
Cc: Alejandro Colomar, libc-alpha, musl, Arthur O'Dwyer,
Jonathan Wakely
* Thiago Macieira:
> On Saturday, 1 November 2025 06:05:57 Pacific Daylight Time Florian Weimer
> wrote:
>> For the std::vector case at least, what applications want is some form
>> of non-moving realloc that allows the application to specify an
>> arithmetic progression and an interval, and the realloc variant should
>> change the size of the allocation to an element of the arithmetic
>> progression that resides within the specified interval, or fail.
>
> That would indeed be better. When growing, the class doesn't know
> how many more items are coming, so a best effort from the runtime is
> welcome. There's a chance that the smaller allocation that the
> runtime was able to fulfill will suffice and thus we'll have avoided
> an extra relocation of the elements.
>
> Wouldn't this be the xallocx() interface from jemalloc? It allows
> the caller to pass the number of elements/bytes it really needs and
> the number of elements/bytes it speculates it will need.
>
> However, it's not necessary.
I assume the document here is current? <https://jemalloc.net/jemalloc.3.html>
The description is not very precise. I think for avoiding
fragmentation, it would be desirable for xallocx to return values
greater than size + extra if there's a tail that cannot be used by
another allocation. It's unclear whether that's permitted.
But with a few clarifications, xallocx might indeed be a simpler
interface for this.
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 13:05 ` Florian Weimer
2025-11-01 15:03 ` Thiago Macieira
@ 2025-11-01 15:22 ` Alejandro Colomar
2025-11-01 18:10 ` Rich Felker
2025-11-01 19:27 ` Laurent Bercot
1 sibling, 2 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-01 15:22 UTC (permalink / raw)
To: Florian Weimer
Cc: libc-alpha, musl, Arthur O'Dwyer, Jonathan Wakely,
Thiago Macieira
Hi Florian,
On Sat, Nov 01, 2025 at 02:05:57PM +0100, Florian Weimer wrote:
> * Alejandro Colomar:
>
> > A discussion within the C++ std-proposals@ mailing list triggered the
> > discussion about the need for a realloc() variant that works in-place,
> > that is, that doesn't move the address of the memory, and thus that
> > doesn't invalidate existing pointers derived from it.
>
> > void *realloci(void *p, size_t size);
>
> The caller won't have sufficient information to determine good values
> for size.
>
> For the std::vector case at least, what applications want is some form
> of non-moving realloc that allows the application to specify an
> arithmetic progression and an interval, and the realloc variant should
> change the size of the allocation to an element of the arithmetic
> progression that resides within the specified interval, or fail.
Would this work?:
ssize_t realloci(void *p, size_t size);
Where realloci() allocates at least 'size' bytes (but possibly more),
and returns the actual usable size of the block. So, you could
realloci(p, 3000);
and it would return for example 4096, which would be the usable size of
the block. Or it would return -1 if it is unable to grow that much.
realloci() would never fail when shrinking, as it could just return a
larger size and be done with it.
Have a lovely day!
Alex
> With this interface, std::vector would not have to know the size
> classes of the allocator. On failure, std::vector resizing would have
> to fall back to malloc/free and moving objects one by one. But that
> is kind of inevitable.
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 15:14 ` Florian Weimer
@ 2025-11-01 15:42 ` Thiago Macieira
2025-11-01 16:14 ` Alejandro Colomar
0 siblings, 1 reply; 116+ messages in thread
From: Thiago Macieira @ 2025-11-01 15:42 UTC (permalink / raw)
To: Florian Weimer
Cc: Alejandro Colomar, libc-alpha, musl, Arthur O'Dwyer,
Jonathan Wakely
On Saturday, 1 November 2025 08:14:05 Pacific Daylight Time Florian Weimer
wrote:
> I assume the document here is current?
> <https://jemalloc.net/jemalloc.3.html>
>
> The description is not very precise. I think for avoiding
> fragmentation, it would be desirable for xallocx to return values
> greater than size + extra if there's a tail that cannot be used by
> another allocation. It's unclear whether that's permitted.
> But with a few clarifications, xallocx might indeed be a simpler
> interface for this.
Agreed. It would be very much implementation-dependent, though.
In other words, xallocx() may return any value greater than the *current*
size, which may be smaller than the new desired minimum.
In one scenario, let's say you have a block of currently 2304 bytes and you're
asking for 2560 bytes with an extra 512 "if it won't bother you". But this
already is in the 4096-byte slab, so all allocations are of that size and
don't fit the 2048 one. So it will return 4096.
In another, let's say you have a block of 1920 bytes and you make the same
request. As it can't grow past 2048 without memcpy(), it will return 2048,
which is less than the requested minimum of 2560. In this case, it's up to the
caller to decide to malloc() a new block, as it really needs that new minimum.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 15:42 ` Thiago Macieira
@ 2025-11-01 16:14 ` Alejandro Colomar
2025-11-01 19:40 ` Thiago Macieira
0 siblings, 1 reply; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-01 16:14 UTC (permalink / raw)
To: Thiago Macieira
Cc: Florian Weimer, libc-alpha, musl, Arthur O'Dwyer,
Jonathan Wakely
Hi Thiago, Florian,
On Sat, Nov 01, 2025 at 08:42:30AM -0700, Thiago Macieira wrote:
> On Saturday, 1 November 2025 08:14:05 Pacific Daylight Time Florian Weimer
> wrote:
> > I assume the document here is current?
> > <https://jemalloc.net/jemalloc.3.html>
> >
> > The description is not very precise. I think for avoiding
> > fragmentation, it would be desirable for xallocx to return values
> > greater than size + extra if there's a tail that cannot be used by
> > another allocation. It's unclear whether that's permitted.
> > But with a few clarifications, xallocx might indeed be a simpler
> > interface for this.
>
> Agreed. It would be very much implementation-dependent, though.
>
> In other words, xallocx() may return any value greater than the *current*
> size, which may be smaller than the new desired minimum.
>
> In one scenario, let's say you have a block of currently 2304 bytes and you're
> asking for 2560 bytes with an extra 512 "if it won't bother you". But this
> already is in the 4096-byte slab, so all allocations are of that size and
> don't fit the 2048 one. So it will return 4096.
>
> In another, let's say you have a block of 1920 bytes and you make the same
> request. As it can't grow past 2048 without memcpy(), it will return 2048,
> which is less than the requested minimum of 2560. In this case, it's up to the
> caller to decide to malloc() a new block, as it really needs that new minimum.
Hmmm, I would simplify for realloci(). Perhaps we could have the
following semantics:
size_t realloci(void *p, size_t size);
- It will never fail. It always allocates a size >=MIN(oldsize, size).
- When shrinking, either it does actually shrink (if the space can be
reused by others), or returns a larger size if that space would anyway
be wasted.
- When growing, it grows to the first step at least as large as the
requested size if possible, or it grows as much as possible. Then
it's up to the caller to judge if that's enough. For example:
actual_size = realloci(p, requested_size);
if (actual_size < needed_size)
do_actual_realloc();
Does this sound good for std::vector?
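To show how a caller might use these semantics, here is a toy model in C. Everything in it is an assumption made for illustration: the `toy_*` names are hypothetical, the allocator simply rounds each request up to a power-of-two slot, and `toy_realloci()` ignores the hint and exposes the whole slot. It is not the musl implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Header kept in front of every toy allocation. */
union toy_hdr {
	struct { size_t slot; size_t size; } s;
	max_align_t align;      /* keeps the user pointer suitably aligned */
};

/* Toy size classes: powers of two, starting at 16 bytes. */
static size_t round_up_pow2(size_t n)
{
	size_t c = 16;

	while (c < n)
		c *= 2;
	return c;
}

static void *toy_malloc(size_t n)
{
	assert(n > 0 && n <= ((size_t)1 << 30));

	size_t slot = round_up_pow2(n);
	union toy_hdr *h = malloc(sizeof *h + slot);

	if (h == NULL)
		return NULL;
	h->s.slot = slot;
	h->s.size = n;
	return h + 1;
}

static void toy_free(void *p)
{
	if (p != NULL)
		free((union toy_hdr *)p - 1);
}

/* Never fails and never moves: the block becomes as large as its slot. */
static size_t toy_realloci(void *p, size_t hint)
{
	union toy_hdr *h = (union toy_hdr *)p - 1;

	(void)hint;             /* the hint is ignored in this model */
	h->s.size = h->s.slot;
	return h->s.size;
}

/* Caller pattern: try in place first, move only if that is not enough. */
static void *toy_reserve(void *p, size_t used, size_t needed)
{
	if (toy_realloci(p, needed) >= needed)
		return p;

	void *q = toy_malloc(needed);
	if (q == NULL)
		return NULL;
	memcpy(q, p, used);
	toy_free(p);
	return q;
}
```

With this model, the two scenarios quoted above behave as described: a 2304-byte block already sits in a 4096-byte slot, so a request for 2560 bytes is satisfied in place, while a 1920-byte block tops out at 2048 and the caller has to move.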
Have a lovely day!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 15:22 ` Alejandro Colomar
@ 2025-11-01 18:10 ` Rich Felker
2025-11-01 18:17 ` Thorsten Glaser
` (2 more replies)
2025-11-01 19:27 ` Laurent Bercot
1 sibling, 3 replies; 116+ messages in thread
From: Rich Felker @ 2025-11-01 18:10 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Florian Weimer, libc-alpha, musl, Arthur O'Dwyer,
Jonathan Wakely, Thiago Macieira
On Sat, Nov 01, 2025 at 04:22:30PM +0100, Alejandro Colomar wrote:
> Hi Florian,
>
> On Sat, Nov 01, 2025 at 02:05:57PM +0100, Florian Weimer wrote:
> > * Alejandro Colomar:
> >
> > > A discussion within the C++ std-proposals@ mailing list triggered the
> > > discussion about the need for a realloc() variant that works in-place,
> > > that is, that doesn't move the address of the memory, and thus that
> > > doesn't invalidate existing pointers derived from it.
> >
> > > void *realloci(void *p, size_t size);
> >
> > The caller won't have sufficient information to determine good values
> > for size.
> >
> > For the std::vector case at least, what applications want is some form
> > of non-moving realloc that allows the application to specify an
> > arithmetic progression and an interval, and the realloc variant should
> > change the size of the allocation to an element of the arithmetic
> > progression that resides within the specified interval, or fail.
>
> Would this work?:
>
> ssize_t realloci(void *p, size_t size);
>
> Where realloci() allocates at least 'size' bytes (but possibly more),
> and returns the actual usable size of the block. So, you could
>
> realloci(p, 3000);
>
> and it would return for example 4096, which would be the usable size of
> the block. Or it would return -1 if it is unable to grow that much.
> realloci() would never fail when shrinking, as it could just return a
> larger size and be done with it.
ssize_t is POSIX-only and unlikely to be adopted by the C standard. It
could return (size_t)-1 instead.
Actually returning a value larger than n seems bad (makes it
impossible to detect OOB writes beyond the actually requested size)
though so this seems like a dubious feature.
Rich
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 18:10 ` Rich Felker
@ 2025-11-01 18:17 ` Thorsten Glaser
2025-11-01 18:20 ` Collin Funk
2025-11-01 19:14 ` Alejandro Colomar
2 siblings, 0 replies; 116+ messages in thread
From: Thorsten Glaser @ 2025-11-01 18:17 UTC (permalink / raw)
To: musl
Cc: Alejandro Colomar, Florian Weimer, libc-alpha, Arthur O'Dwyer,
Jonathan Wakely, Thiago Macieira
On Sat, 1 Nov 2025, Rich Felker wrote:
>Actually returning a value larger than n seems bad (makes it
>impossible to detect OOB writes beyond the actually requested size)
>though so this seems like a dubious feature.
But it allows the application to write beyond the requested size
through the returned one, saving it from needing to call the new
realloc-like function too often if it stores that value as size,
e.g. for growing string buffers.
bye,
//mirabilos
--
(gnutls can also be used, but if you are compiling lynx for your own use,
there is no reason to consider using that package)
-- Thomas E. Dickey on the Lynx mailing list, about OpenSSL
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 18:10 ` Rich Felker
2025-11-01 18:17 ` Thorsten Glaser
@ 2025-11-01 18:20 ` Collin Funk
2025-11-01 19:14 ` Alejandro Colomar
2 siblings, 0 replies; 116+ messages in thread
From: Collin Funk @ 2025-11-01 18:20 UTC (permalink / raw)
To: Rich Felker
Cc: Alejandro Colomar, Florian Weimer, libc-alpha, musl,
Arthur O'Dwyer, Jonathan Wakely, Thiago Macieira
Rich Felker <dalias@libc.org> writes:
>> Where realloci() allocates at least 'size' bytes (but possibly more),
>> and returns the actual usable size of the block. So, you could
>>
>> realloci(p, 3000);
>>
>> and it would return for example 4096, which would be the usable size of
>> the block. Or it would return -1 if it is unable to grow that much.
>> realloci() would never fail when shrinking, as it could just return a
>> larger size and be done with it.
>
> ssize_t is POSIX-only and unlikely to be adopted by the C standard. It
> could return (size_t)-1 instead.
ptrdiff_t probably makes more sense for allocation sizes anyways.
Collin
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 18:10 ` Rich Felker
2025-11-01 18:17 ` Thorsten Glaser
2025-11-01 18:20 ` Collin Funk
@ 2025-11-01 19:14 ` Alejandro Colomar
2 siblings, 0 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-01 19:14 UTC (permalink / raw)
To: Rich Felker
Cc: Florian Weimer, libc-alpha, musl, Arthur O'Dwyer,
Jonathan Wakely, Thiago Macieira
Hi Rich,
On Sat, Nov 01, 2025 at 02:10:17PM -0400, Rich Felker wrote:
> > Would this work?:
> >
> > ssize_t realloci(void *p, size_t size);
> >
> > Where realloci() allocates at least 'size' bytes (but possibly more),
> > and returns the actual usable size of the block. So, you could
> >
> > realloci(p, 3000);
> >
> > and it would return for example 4096, which would be the usable size of
> > the block. Or it would return -1 if it is unable to grow that much.
> > realloci() would never fail when shrinking, as it could just return a
> > larger size and be done with it.
>
[...]
>
> Actually returning a value larger than n seems bad (makes it
> impossible to detect OOB writes beyond the actually requested size)
You could still detect OOB writes beyond the requested size after
malloc(3) and realloc(3). It would only be memory grown with realloci()
that you couldn't detect OOB writes beyond the requested size.
However, I don't see this as a problem. If we consider that not as
a requested size, but as a hint, then we can consider that the size is
the value returned, and OOB writes beyond that size would still be
detected.
In the draft for v2 I'll send soon, I have this, which entirely ignores
the requested (hint) size:
+size_t realloci(void *p, size_t n)
+{
+ struct meta *g = get_meta(p);
+ int idx = get_slot_index(p);
+ size_t stride = get_stride(g);
+ unsigned char *start = g->mem->storage + stride*idx;
+ unsigned char *end = start + stride - IB;
+ size_t avail_size = end-(unsigned char *)p;
+
+ set_size(p, end, avail_size);
+ return avail_size;
+}
As you can see, it sets the size as 'avail_size', so OOB detectors still
work as expected. It's just that the size is not 'n'.
> though so this seems like a dubious feature.
Have a lovely night!
Alex
>
> Rich
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 15:22 ` Alejandro Colomar
2025-11-01 18:10 ` Rich Felker
@ 2025-11-01 19:27 ` Laurent Bercot
2025-11-01 19:38 ` Thorsten Glaser
` (2 more replies)
1 sibling, 3 replies; 116+ messages in thread
From: Laurent Bercot @ 2025-11-01 19:27 UTC (permalink / raw)
To: musl; +Cc: libc-alpha, musl, Arthur O'Dwyer, Jonathan Wakely,
Thiago Macieira
>Where realloci() allocates at least 'size' bytes (but possibly more),
>and returns the actual usable size of the block.
If you're set on doing something like this, it would be simpler to
provide a function that just returns the usable size of the block,
under which realloc() is guaranteed not to relocate the object, and
above which it is guaranteed to relocate.
This sounds like it would not work with multithreading, but neither
would your new realloci approach that returns a supposedly usable size.
All in all I don't see why C should be polluted with functions that
are unusable by C programmers just to help optimize C++ implementations.
This sounds like a bad trade-off, and I feel like a C++-specific issue
should be solved in a C++-specific way.
--
Laurent
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 19:27 ` Laurent Bercot
@ 2025-11-01 19:38 ` Thorsten Glaser
2025-11-01 20:02 ` Thiago Macieira
2025-11-01 20:50 ` Demi Marie Obenour
2 siblings, 0 replies; 116+ messages in thread
From: Thorsten Glaser @ 2025-11-01 19:38 UTC (permalink / raw)
To: musl; +Cc: libc-alpha, Arthur O'Dwyer, Jonathan Wakely, Thiago Macieira
On Sat, 1 Nov 2025, Laurent Bercot wrote:
>> Where realloci() allocates at least 'size' bytes (but possibly more),
>> and returns the actual usable size of the block.
>
> If you're set on doing something like this, it would be simpler to
> provide a function that just returns the usable size of the block,
> under which realloc() is guaranteed not to relocate the object, and
> above which it is guaranteed to relocate.
The size of that block may depend on the size requested (and possibly
the availability at the time of requesting), so… no, not really simpler.
bye,
//mirabilos
--
When he found out that the m68k port was in a pretty bad shape, he did
not, like many before him, shrug and move on; instead, he took it upon
himself to start compiling things, just so he could compile his shell.
How's that for dedication. -- Wouter, about my Debian/m68k revival
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 16:14 ` Alejandro Colomar
@ 2025-11-01 19:40 ` Thiago Macieira
2025-11-02 13:31 ` Alejandro Colomar
0 siblings, 1 reply; 116+ messages in thread
From: Thiago Macieira @ 2025-11-01 19:40 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Florian Weimer, libc-alpha, musl, Arthur O'Dwyer,
Jonathan Wakely
On Saturday, 1 November 2025 09:14:24 Pacific Daylight Time Alejandro Colomar
wrote:
> - It will never fail. It always allocates a size >=MIN(oldsize, size).
>
> - When shrinking, either it does actually shrink (if the space can be
> reused by others), or returns a large size if that space would anyway
> be wasted.
>
> - When growing, it grows to the first step at least as large as the
> requested size if possible, or it grows as much as possible. Then
> it's up to the caller to judge if that's enough. For example:
>
> actual_size = realloci(p, requested_size);
> if (actual_size < needed_size)
> do_actual_realloc();
>
> Does this sound good for std::vector?
Yes.
I'm pondering whether to also add the "extra" parameter from xallocx(), thus
making it nearly the same API. I can't think of a good reason, because like
your proposal, it's documented to
"The xallocx() function returns the real size of the resulting resized
allocation pointed to by ptr, which is a value less than size if the
allocation could not be adequately grown in place. "
This means it always returns a value between
cursize
and
ROUND_UP(newsize+extra, blocksize)
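A minimal sketch of that upper bound in C, assuming ROUND_UP means "round up to the next multiple of blocksize". The names round_up() and xallocx_upper_bound() are hypothetical, not jemalloc API, and overflow in the addition is deliberately not handled here:

```c
#include <stddef.h>

/* Round n up to the next multiple of block (block must be nonzero). */
static size_t round_up(size_t n, size_t block)
{
	size_t rem = n % block;

	return rem ? n + (block - rem) : n;
}

/* Hypothetical helper: largest size the call could report, per the
 * bound quoted above.  Does not guard against newsize + extra wrapping. */
static size_t xallocx_upper_bound(size_t newsize, size_t extra,
                                  size_t blocksize)
{
	return round_up(newsize + extra, blocksize);
}
```

With a 16-byte block size, asking for 37 kiB plus 3 kiB extra gives an upper bound of exactly 40 kiB, since 40960 is already a multiple of 16.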
Under what circumstances would it make any use of the separation of the two
values? Is it to make upper layers simpler, by having a constant in the extra
parameter? Or is it maybe to avoid them having to deal with overflow in the
addition or multiplication? Does anyone know? We should probably ping Jason
Evans.
I can see where it's used in the source code, but I haven't spent enough time
to understand what decisions it may do differently.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 19:27 ` Laurent Bercot
2025-11-01 19:38 ` Thorsten Glaser
@ 2025-11-01 20:02 ` Thiago Macieira
2025-11-01 20:58 ` Thorsten Glaser
2025-11-01 22:12 ` Re[2]: " Laurent Bercot
2025-11-01 20:50 ` Demi Marie Obenour
2 siblings, 2 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-11-01 20:02 UTC (permalink / raw)
To: musl, Laurent Bercot
Cc: libc-alpha, musl, Arthur O'Dwyer, Jonathan Wakely
On Saturday, 1 November 2025 12:27:00 Pacific Daylight Time Laurent Bercot
wrote:
> >Where realloci() allocates at least 'size' bytes (but possibly more),
> >and returns the actual usable size of the block.
>
> If you're set on doing something like this, it would be simpler to
> provide a function that just returns the usable size of the block,
> under which realloc() is guaranteed not to relocate the object, and
> above which it is guaranteed to relocate.
What would this function return for an implementation that has a linear heap
and the present block is the last one? PTRDIFF_MAX?
In any case, the inability to use it in a multithreaded environment is a
showstopper.
> This sounds like it would not work with multithreading, but neither
> would your new realloci approach that returns a supposedly usable size.
realloci() would be usable in multithreaded environments because it can
perform a lock and be sure that the requested size did fit, or some other size
did, before unlocking and returning.
> All in all I don't see why C should be polluted with functions that
> are unusable by C programmers just to help optimize C++ implementations.
> This sounds like a bad trade-off, and I feel like a C++-specific issue
> should be solved in a C++-specific way.
Just because this discussion started from the C++ side does not mean it's only
usable from C++. Everything one can do in C++ one can do in C, even if it
takes writing a bit more code.
The godbolt example I posted:
https://godbolt.org/z/ET6M6hW6q
is effectively entirely C, except for the actual element in the vector (a
std::string), which I chose only to make it realistic, to show that a problem
exists today to be solved. But it can be any C struct whose pointer address
must remain stable, such as when it's inserted in a linked list. Think of for
example:
struct my_data_structure {
	struct list_head list; // The embedded list node
	int value;
	char name[]; // C99 Flexible Array Member
};
Suppose I want to update the name stored in this element and the new name is
bigger than the current one. With realloci() we could query the allocator to
see if it can be extended without moving, which removes the need to update the
pointers to this element in the list. If this element is stored in a lock-free
list, updating the previous and next elements may be an expensive operation
we'd prefer to avoid.
Even if it were only a C++ problem and thus not a Standard C or POSIX problem, it
would be a problem for the *C Library implementations* to resolve anyway. The
alternative would be that the C++ Standard Libraries deploy their own
replacements for malloc() & free() that could be extended in place,
duplicating implementations and thus adding more complexity to running
applications, and being unable to share heaps. As a user, I would prefer not
to see that happen.
It would not be the first time the C library implementations need to solve C++
problems (think __cxa_thread_atexit()) and won't be the last. If the C
committee doesn't agree on the value, so be it. The work for the people in
this thread probably remains the same.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 19:27 ` Laurent Bercot
2025-11-01 19:38 ` Thorsten Glaser
2025-11-01 20:02 ` Thiago Macieira
@ 2025-11-01 20:50 ` Demi Marie Obenour
2 siblings, 0 replies; 116+ messages in thread
From: Demi Marie Obenour @ 2025-11-01 20:50 UTC (permalink / raw)
To: musl, Laurent Bercot
Cc: libc-alpha, Arthur O'Dwyer, Jonathan Wakely, Thiago Macieira
On 11/1/25 15:27, Laurent Bercot wrote:
>> Where realloci() allocates at least 'size' bytes (but possibly more),
>> and returns the actual usable size of the block.
>
> If you're set on doing something like this, it would be simpler to
> provide a function that just returns the usable size of the block,
> under which realloc() is guaranteed not to relocate the object, and
> above which it is guaranteed to relocate.
>
> This sounds like it would not work with multithreading, but neither
> would your new realloci approach that returns a supposedly usable size.
>
> All in all I don't see why C should be polluted with functions that
> are unusable by C programmers just to help optimize C++ implementations.
> This sounds like a bad trade-off, and I feel like a C++-specific issue
> should be solved in a C++-specific way.
C++ depends on C for allocation, and the same problem could also happen
in C. It's just more common in C++.
--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 20:02 ` Thiago Macieira
@ 2025-11-01 20:58 ` Thorsten Glaser
2025-11-01 22:12 ` Re[2]: " Laurent Bercot
1 sibling, 0 replies; 116+ messages in thread
From: Thorsten Glaser @ 2025-11-01 20:58 UTC (permalink / raw)
To: musl; +Cc: Laurent Bercot, libc-alpha, Arthur O'Dwyer, Jonathan Wakely
On Sat, 1 Nov 2025, Thiago Macieira wrote:
>Everything one can do in C++ one can do in C, even if it
>takes writing a bit more code.
But in C, you don’t have the problem that data types are
unmovable by realloc. You can just use offsets instead of
pointers.
unconvinced,
//mirabilos
--
Solange man keine schmutzigen Tricks macht, und ich meine *wirklich*
schmutzige Tricks, wie bei einer doppelt verketteten Liste beide
Pointer XORen und in nur einem Word speichern, funktioniert Boehm ganz
hervorragend. -- Andreas Bogk über boehm-gc in d.a.s.r
* Re[2]: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 20:02 ` Thiago Macieira
2025-11-01 20:58 ` Thorsten Glaser
@ 2025-11-01 22:12 ` Laurent Bercot
1 sibling, 0 replies; 116+ messages in thread
From: Laurent Bercot @ 2025-11-01 22:12 UTC (permalink / raw)
To: musl; +Cc: libc-alpha, musl, Arthur O'Dwyer, Jonathan Wakely
>realloci() would be usable in multithreaded environments because it can
>perform a lock and be sure that the requested size did fit, or some other size did, before unlocking and returning.
Indeed. The proposed modified realloci, however, always reserving the
maximum available space in the block if the user requested something
larger than is available, sounds like it would be suboptimal because
chances are the user will need a relocation anyway.
>struct my_data_structure {
> struct list_head list; // The embedded list node
> int value;
> char name[]; // C99 Flexible Array Member
>};
So... a linked list, living in the heap with one malloc per node, of a
structure containing a FAM. Yeah, in this case I can understand why
you'd need realloci. But honestly, if you know you will need to increase
the length of the name array, this is a pretty poor choice of data
structure, painting yourself into a corner. In this case, I would
certainly make name a char * field, which would be the only part that
needs to be reallocated, and the list pointers would never need
modification.
If you want to argue that the double indirection is costly in terms of
malloc, I agree; and since struct my_data_structure now has a fixed
size,
it is easy to implement the linked list differently, e.g. as an array
(potentially dynamically-sized, but reallocs here would be occasional
and relocation of the list would not be an issue), using indices instead
of pointers, which cuts down the amount of heap manipulation compared to
one-malloc-per-node way more than using a FAM would.
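A minimal sketch of such an index-linked list in C, under the assumption that all nodes live in one growable array. The names struct node, struct list, list_push() and list_sum() are hypothetical, and the capacity doubling does not check for multiplication overflow:

```c
#include <stddef.h>
#include <stdlib.h>

#define NIL ((size_t)-1)   /* "no node" marker, plays the role of NULL */

/* Links are array indices, not pointers, so they stay valid even when
 * realloc() moves the whole array. */
struct node {
	size_t next;   /* index of the next node, or NIL */
	int    value;
};

struct list {
	struct node *nodes;   /* one allocation holding every node */
	size_t len, cap;
	size_t head;          /* index of the first node, or NIL */
};

static void list_init(struct list *l)
{
	l->nodes = NULL;
	l->len = l->cap = 0;
	l->head = NIL;
}

/* Push a value at the front; returns 0 on success, -1 on failure. */
static int list_push(struct list *l, int value)
{
	if (l->len == l->cap) {
		size_t ncap = l->cap ? l->cap * 2 : 4;
		struct node *n = realloc(l->nodes, ncap * sizeof *n);

		if (n == NULL)
			return -1;
		l->nodes = n;   /* may have moved; indices are unaffected */
		l->cap = ncap;
	}
	l->nodes[l->len].value = value;
	l->nodes[l->len].next = l->head;
	l->head = l->len++;
	return 0;
}

/* Walk the index links and sum the values. */
static long list_sum(const struct list *l)
{
	long sum = 0;

	for (size_t i = l->head; i != NIL; i = l->nodes[i].next)
		sum += l->nodes[i].value;
	return sum;
}
```

Relocation by realloc() only changes l->nodes; no link inside the list has to be patched.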
My point is that relocation issues are _a solved problem_ in C already.
If a realloci function appears, I will never use it, because realloc()
covers all my use cases; and never should any C application developer.
My concern is that it would be a footgun, encouraging bad design
practices, and C is not in shortage of that.
--
Laurent
* Re: [musl] realloci(): A realloc() variant that works in-place
2025-11-01 14:18 ` Florian Weimer
@ 2025-11-02 1:11 ` Oliver Hunt
0 siblings, 0 replies; 116+ messages in thread
From: Oliver Hunt @ 2025-11-02 1:11 UTC (permalink / raw)
To: Florian Weimer
Cc: Thiago Macieira, Alejandro Colomar, Paul Eggert, libc-alpha, musl,
A. Wilcox, Lénárd Szolnoki, Collin Funk,
Arthur O'Dwyer, Jonathan Wakely, Paul E. McKenney
> On Nov 1, 2025, at 7:18 AM, Florian Weimer <fw@deneb.enyo.de> wrote:
>
> * Oliver Hunt:
>
>>> On Oct 31, 2025, at 10:53 AM, Thiago Macieira <thiago@macieira.org> wrote:
>>>
>>> On Friday, 31 October 2025 10:31:54 Pacific Daylight Time Paul Eggert wrote:
>>>> On 10/31/25 11:25, Thiago Macieira wrote:
>>>>> I think the Committee would balk at adding a function
>>>>> that takes a pointer to already-freed memory whose purpose is to allow the
>>>>> contents of the new object to be adjusted solely based on arithmetic.
>>>>
>>>> Do you know of any platforms where this does not in fact work? Other
>>>> than sanitizing platforms that go to some lengths to impose the
>>>> Committee's rules even though the hardware would work fine?
>>>>
>>>> If not, then perhaps we can convince the Committee that the mismatch
>>>> between the current rules and reality is causing real harm, and that
>>>> it'd be a win for C's users to change the standard to match reality better.
>>>
>>> Oliver, please comment on ARM64e if you can, for pointer authentication. Think
>>> not just of statically-known pointers like vtables, but the general case of
>>> pointer authentication.
>>
>>
>> I don’t believe ptrauth would really play into this, but MTE does.
>
> I think MTE still works because if realloc changes the tag, the
> pointer changes. The application then has to do the offset-based
> adjustment, which happens to change the tag only.
That’s possible - I’ll have to re-read the thread to get a more thorough understanding of exactly what is being proposed (why hasn’t someone written a formal proposal yet? :D :D )
I only did a very high level scan of the thread after Thiago pinged me, and it seemed like the intent was to be able to access outside of the officially live region on the basis of “knowing” that the memory was available.
My assumption would be an implementation that did do an in place reallocation would update the tags for only the newly active region (so their tags matched the existing one), but it’s interesting as a debugging/testing idea that realloc would always replace the tags even if it does do an inplace realloc so that it is never possible to reuse the old pointer.
For a change I’m not thinking of anything security related, this is my recollection from when I was a TA at uni - a common mistake students would make is to do realloc and keep using their old pointer, because it often worked, and then get confused when it didn’t (this was very early in the “learning C” course work so they were coming from Java, and hadn’t yet actually learned how memory “works”). Having it be an “always fails” path seems useful in such a case.
> (I'm not saying this malloc change is a good idea. I don't know of
> its implications, and if it can be integrated safely with the other
> parts of the languages.)
I’m dubious of it being a good idea, but that’s just based on my general reticence about providing mechanisms that functionally expose internal mechanisms of the allocator that operate independently of the abstract machine’s ideas about memory :D As above I’ll need to re-read the thread more thoroughly to understand the exact semantics being proposed.
—Oliver
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 19:40 ` Thiago Macieira
@ 2025-11-02 13:31 ` Alejandro Colomar
2025-11-02 23:10 ` Thiago Macieira
0 siblings, 1 reply; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-02 13:31 UTC (permalink / raw)
To: Thiago Macieira
Cc: Florian Weimer, libc-alpha, musl, Arthur O'Dwyer,
Jonathan Wakely
Hi Thiago,
On Sat, Nov 01, 2025 at 12:40:41PM -0700, Thiago Macieira wrote:
> On Saturday, 1 November 2025 09:14:24 Pacific Daylight Time Alejandro Colomar
> wrote:
> > - It will never fail. It always allocates a size >=MIN(oldsize, size).
> >
> > - When shrinking, either it does actually shrink (if the space can be
> > reused by others), or returns a large size if that space would anyway
> > be wasted.
> >
> > - When growing, it grows to the first step at least as large as the
> > requested size if possible, or it grows as much as possible. Then
> > it's up to the caller to judge if that's enough. For example:
> >
> > actual_size = realloci(p, requested_size);
> > if (actual_size < needed_size)
> > do_actual_realloc();
> >
> > Does this sound good for std::vector?
>
> Yes.
>
> I'm pondering whether to also add the "extra" parameter from xallocx(), thus
> making it nearly the same API. I can't think of a good reason, because like
> your proposal, it's documented to
>
> "The xallocx() function returns the real size of the resulting resized
> allocation pointed to by ptr, which is a value less than size if the
> allocation could not be adequately grown in place. "
>
> This means it always returns a value between
> cursize
> and
> ROUND_UP(newsize+extra, blocksize)
>
> Under what circumstances would it make any use of the separation of the two
> values? Is it to make upper layers simpler, by having a constant in the extra
> parameter? Or is it maybe to avoid them having to deal with overflow in the
> addition or multiplication? Does anyone know? We should probably ping Jason
> Evans.
>
> I can see where it's used in the source code, but I haven't spent enough time
> to understand what decisions it may do differently.
I'm writing a formal proposal for WG14, and while writing the wording,
I think I don't see a reason for overallocating this extra.
The purpose of realloci() is being extremely cheap. So, why would one
ask for extra size? You could just keep calling realloci() every time,
and let the allocator grow in small steps. That would simplify the
implementation of the caller, which doesn't need to have code for growing
in large steps.
If the caller needs 37 kiB, it should ask for exactly 37 kiB. The
allocator would likely give 40 kiB. Then the caller will ask again when
it needs 41 kiB. Why would the caller want to allocate something like
64 kiB (just as an example), but then be happy with 37? This call is
much cheaper than an actual move with realloc(3) or malloc(3)+free(3).
So, why not require the caller to not ask too much? We could go back to
reporting an error if there's not enough memory.
Of course, it would still guarantee no errors when shrinking, but
I think we could error out when growing.
Unless there's a reason for the user to attempt to overallocate that I'm
not seeing.
#define realloci(p, size) reallocarrayi(p, size, 1)
ssize_t reallocarrayi(void *p, size_t n, size_t eltsize);
The returned value of reallocarrayi() would be the number of elements
actually allocated, which would be guaranteed to be either >= n, or -1
on error. There would be a further guarantee that it wouldn't error
when n <= oldsize.
	n = reallocarrayi(p, 37, sizeof(T));
	if (n == -1)
		goto fail;
	...
	n = reallocarrayi(p, 41, sizeof(T));
	if (n == -1)
		goto fail;
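To make the return-value contract concrete, here is a rough mock in C. It is only a sketch under loud assumptions: reallocarrayi_mock() is a hypothetical name, it never actually grows or shrinks anything, and it relies on the non-standard malloc_usable_size() (available in glibc and musl) to decide whether the request already fits in the block. A real implementation would live inside the allocator.

```c
#include <malloc.h>      /* malloc_usable_size(): non-standard extension */
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>   /* ssize_t */

/* Hypothetical mock: report how many elements of size eltsize fit in the
 * block at p, or -1 if n elements do not fit (or n * eltsize overflows).
 * It never moves or resizes the block; it only inspects it. */
static ssize_t reallocarrayi_mock(void *p, size_t n, size_t eltsize)
{
	size_t usable = malloc_usable_size(p);

	if (eltsize == 0)
		return -1;
	if (n > SIZE_MAX / eltsize)   /* n * eltsize would overflow */
		return -1;
	if (n * eltsize > usable)     /* cannot satisfy the request in place */
		return -1;
	return (ssize_t)(usable / eltsize);   /* always >= n here */
}
```

On success the result is at least n, matching the "either >= n, or -1" guarantee described above.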
Have a lovely day!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-02 13:31 ` Alejandro Colomar
@ 2025-11-02 23:10 ` Thiago Macieira
2025-11-02 23:55 ` Arthur O'Dwyer
2025-11-02 23:58 ` Alejandro Colomar
0 siblings, 2 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-11-02 23:10 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Florian Weimer, libc-alpha, musl, Arthur O'Dwyer,
Jonathan Wakely
On Sunday, 2 November 2025 05:31:59 Pacific Standard Time Alejandro Colomar
wrote:
> The purpose of realloci() is being extremely cheap. So, why would one
> ask for extra size?
Speculative growth. When the container is being added to, it knows it needs at
least one more element, but it can't predict the future to know how many more.
So it asks "pretty please" for a few more.
However, we can keep the code and simply ask for a bit more in the same
parameter, instead of two. With the interface as you've described, it's not a
failure to be unable to return as many bytes as requested.
> If the caller needs 37 kiB, it should ask for exactly 37 kiB. The
> allocator would likely give 40 kiB.
With ptmalloc inside glibc, the rounding up is likely only to be to the
smallest allocation unit, that is, 16 bytes. So it's unlikely to return 40 kB,
aside of some special circumstances (e.g., an mmap()-backed allocation).
However, the upper layer can ask for 40 kB if it speculates it might need that
and only be given 37.5 kB.
> Then the caller will ask again when
> it needs 41 kiB. Why would the caller want to allocate something like
> 64 kiB (just as an example), but then be happy with 37? This call is
> much cheaper than an actual move with realloc(3) or malloc(3)+free(3).
So it is. I don't know yet whether it's going to be too expensive to call it
for every single element growth - I expect it will be. So the container is
probably going to request more than a single element when growing, but
probably not double as it does today.
All this will need fine-tuning once implementations exist.
> So, why not require the caller to not ask too much? We could go back to
> reporting an error if there's not enough memory.
>
> Of course, it would still guarantee no errors when shrinking, but
> I think we could error out when growing.
I'd prefer no errors either way. If there isn't memory to grow the underlying
space (a brk() system call returns ENOMEM), then realloci() returns as much as
it could get but not more.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-02 23:10 ` Thiago Macieira
@ 2025-11-02 23:55 ` Arthur O'Dwyer
2025-11-03 0:27 ` Rich Felker
2025-11-03 0:56 ` Thiago Macieira
2025-11-02 23:58 ` Alejandro Colomar
1 sibling, 2 replies; 116+ messages in thread
From: Arthur O'Dwyer @ 2025-11-02 23:55 UTC (permalink / raw)
To: Thiago Macieira
Cc: Alejandro Colomar, Florian Weimer, libc-alpha, musl,
Jonathan Wakely
On Sun, Nov 2, 2025 at 6:10 PM Thiago Macieira <thiago@macieira.org> wrote:
> On Sunday, 2 November 2025 05:31:59 Pacific Standard Time Alejandro
> Colomar
> wrote:
> > The purpose of realloci() is being extremely cheap. So, why would one
> > ask for extra size?
>
> Speculative growth. When the container is being added to, it knows it
> needs at
> least one more element, but it can't predict the future to know how many
> more.
> So it asks "pretty please" for a few more.
>
I'll just chime in to mention that I recently had cause to look into what
various STL implementations (libc++, libstdc++, Microsoft) do when you
write something like:
std::vector<char> v;
v.resize(VERY_LARGE_NUMBER);
v.push_back(1);
Naturally every STL implementation will ask the vector's
std::allocator<char> for basically 2*VERY_LARGE_NUMBER bytes of memory at
this point.
If that much memory is available, then we're on the happy path and
everything's great. If less memory is available, the allocator throws a
std::bad_alloc exception.
Now for the interesting part: `vector::push_back` really only *needs* a
*single* additional byte — VERY_LARGE_NUMBER+1 bytes — in order to do its
job. Does any STL implementation actually catch the std::bad_alloc
exception and retry with a smaller allocation, in order to diligently do
the job it was asked to do?
Answer: *No.* In practice, *no* STL implementation retries allocations
inside `vector::push_back`. In practice, anytime the allocator throws, that
exception propagates out and we're done. So that means that std::vector has
basically "one chance" — it gets to make of the allocator a *single*
request, and so (in theory) it must choose that request wisely. (In
practice, running out of memory is rare and nobody cares if you run out a
little earlier than you would've otherwise.)
If there *were* a way to ask a C++ allocator for "2*VERY_LARGE_NUMBER
bytes, but, if you can't do that, I'll settle for as few as
VERY_LARGE_NUMBER+1 bytes," then presumably `vector::push_back` is exactly
the place we'd see that API getting used.
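The "ask for a lot, settle for less" policy can be sketched in plain C on top of malloc(). This is only an illustration: alloc_with_fallback() is a hypothetical helper, not an existing API, and on systems with memory overcommit malloc() may rarely report failure at all.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: try to allocate `want` bytes; if that fails,
 * retry once with the smaller `need`.  On return, *got holds the size
 * actually obtained (0 if both attempts failed). */
static void *alloc_with_fallback(size_t need, size_t want, size_t *got)
{
	void *p = NULL;

	if (want > need)
		p = malloc(want);     /* speculative request */
	if (p != NULL) {
		*got = want;
		return p;
	}
	p = malloc(need);             /* what the caller really needs */
	*got = (p != NULL) ? need : 0;
	return p;
}
```

A push_back-like caller would pass need = old_size + 1 and want = 2 * old_size, and use *got as the new capacity.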
But this whole (bunch of) thread(s) started because of Thiago's throwaway
comment along the lines of "C++ doesn't care about realloc because realloc
has a bad API," and I think this thread (these threads) are just driving
that point home. If I wanted to reach a good allocator API, "I wouldn't
start from here." I don't think one can design a good allocator API by
making a ton of tiny patches on top of `malloc` and `realloc`. You have to
design the API *first*, and then show how to implement `malloc` and
`realloc` in terms of it.
(Also, C++ couldn't use it without also redesigning `std::allocator`, which
is almost just a thin wrapper around `malloc` and `free`. `std::allocator`
doesn't even have a counterpart to `realloc` at the moment.)
–Arthur
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-02 23:10 ` Thiago Macieira
2025-11-02 23:55 ` Arthur O'Dwyer
@ 2025-11-02 23:58 ` Alejandro Colomar
2025-11-03 0:28 ` Rich Felker
2025-11-03 0:41 ` Thiago Macieira
1 sibling, 2 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-02 23:58 UTC (permalink / raw)
To: Thiago Macieira
Cc: Florian Weimer, libc-alpha, musl, Arthur O'Dwyer,
Jonathan Wakely
Hi Thiago,
On Sun, Nov 02, 2025 at 03:10:45PM -0800, Thiago Macieira wrote:
> On Sunday, 2 November 2025 05:31:59 Pacific Standard Time Alejandro Colomar
> wrote:
> > The purpose of realloci() is being extremely cheap. So, why would one
> > ask for extra size?
>
> Speculative growth. When the container is being added to, it knows it needs at
> least one more element, but it can't predict the future to know how many more.
> So it asks "pretty please" for a few more.
realloci() would still give a few more. The difference is that if you
speculate from std::vector, you're speculating, while realloci() gives
you more without speculation.
> However, we can keep the code and simply ask for a bit more in the same
> parameter, instead of two. With the interface as you've described, it's not a
> failure to be unable to return as many bytes as requested.
Actually, read until the end of this mail. I'm starting to understand
the extra parameter.
> > If the caller needs 37 kiB, it should ask for exactly 37 kiB. The
> > allocator would likely give 40 kiB.
>
> With ptmalloc inside glibc, the rounding up is likely only to be to the
> smallest allocation unit, that is, 16 bytes.
Hmmm, it seems you're right; when dealing with tens of kiB, granularity
is at 16 B.
alx@devuan:~/tmp$ cat r.c
#include <malloc.h>
#include <stdio.h>
#include <stdlib.h>
int
main(void)
{
	void *p;
	p = NULL;
	p = realloc(p, 37 * 1024);
	printf("37 * 1024: %zu\n", malloc_usable_size(p));
	for (ptrdiff_t i = -10; i < 10; i++) {
		p = realloc(p, 41 * 1024 + i);
		printf("41 * 1024 + %td: %zu\n", i, malloc_usable_size(p));
	}
}
alx@devuan:~/tmp$ gcc -Wall -Wextra r.c
alx@devuan:~/tmp$ ./a.out
37 * 1024: 37896
41 * 1024 + -10: 41976
41 * 1024 + -9: 41976
41 * 1024 + -8: 41976
41 * 1024 + -7: 41992
41 * 1024 + -6: 41992
41 * 1024 + -5: 41992
41 * 1024 + -4: 41992
41 * 1024 + -3: 41992
41 * 1024 + -2: 41992
41 * 1024 + -1: 41992
41 * 1024 + 0: 41992
41 * 1024 + 1: 41992
41 * 1024 + 2: 41992
41 * 1024 + 3: 41992
41 * 1024 + 4: 41992
41 * 1024 + 5: 41992
41 * 1024 + 6: 41992
41 * 1024 + 7: 41992
41 * 1024 + 8: 41992
41 * 1024 + 9: 42008
> So it's unlikely to return 40 kB,
> aside of some special circumstances (e.g., an mmap()-backed allocation).
Now that I checked, it seems musl's mmap(2) threshold is in the hundreds
of kiB. I was expecting it to be lower.
> However, the upper layer can ask for 40 kB if it speculates it might need that
> and only be given 37.5 kB.
How about two consecutive calls?
	size = realloci(p, speculative_size);
	if (size != -1)
		goto done;
	size = realloci(p, needed_size);
	if (size != -1)
		goto done;
	do_actual_realloc();
> > Then the caller will ask again when
> > it needs 41 kiB. Why would the caller want to allocate something like
> > 64 kiB (just as an example), but then be happy with 37? This call is
> > much cheaper than an actual move with realloc(3) or malloc(3)+free(3).
>
> So it is. I don't know yet whether it's going to be too expensive to call it
> for every single element growth - I expect it will be. So the container is
> probably going to request more than a single element when growing, but
> probably not double as it does today.
Hmmm, could be.
> All this will need fine-tuning once implementations exist.
>
> > So, why not require the caller to not ask too much? We could go back to
> > reporting an error if there's not enough memory.
> >
> > Of course, it would still guarantee no errors when shrinking, but
> > I think we could error out when growing.
>
> I'd prefer no errors either way. If there isn't memory to grow the underlying
> space (a brk() system call returns ENOMEM), then realloci() returns as much as
> it could get but not more.
The problem is that this is asking the implementation to speculate.
Consider the case that a realloci() implementation knows that the
requested size fails. Let's put some arbitrary numbers:

    old_size = 10000;
    requested_size = 30000;

It knows the block can grow to somewhere between 10000 (which it
currently has) and 30000 (the system reported ENOMEM), but now it has
the task of allocating as much as it can get. Should it do a binary
search of the size? Try 20000, then if it fails try 15000, etc.?
That's speculation, and it would make this function too slow.
I suspect that's why xallocx() has the extra parameter. It removes the
need for speculating. However, what that parameter is doing is hiding
two calls in a single one. Maybe it would be better to ask the user to
do the two calls explicitly.
Let's compare:

    speculative_size = needed_size + EXTRA_GROWTH;

    size = realloci(p, speculative_size);
    if (size != -1)
        goto done;

    size = realloci(p, needed_size);
    if (size != -1)
        goto done;

    do_actual_realloc();

with:

    size = realloci2(p, needed_size, EXTRA_GROWTH);
    if (size != -1)
        goto done;

    do_actual_realloc();

Hmmmmm, if people are going to do this every time they need realloci(),
then I guess it makes sense to have it more compact. Also, you can have
a constant EXTRA_GROWTH that you use all the time, so you don't need to
calculate the speculative size unnecessarily (wasting another line of
code, plus one for the declaration of the variable).
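
To make the comparison concrete, here is a minimal sketch of how the
three-argument form could wrap the two explicit calls. Nothing in it exists
in any libc: toy_realloci() is a made-up stand-in that pretends the block can
grow in place up to toy_limit bytes, and toy_realloci2() is only one possible
reading of the compact interface, assuming it returns the new size or -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical stand-in for the proposed realloci(): it pretends that the
 * block at p can be resized in place up to toy_limit bytes.  It returns the
 * new size on success, or -1 if the request does not fit. */
static size_t toy_limit = 12000;

static ssize_t
toy_realloci(void *p, size_t size)
{
    assert(p != NULL);
    return size <= toy_limit ? (ssize_t) size : -1;
}

/* Hypothetical compact form: try needed + extra first, then fall back to
 * exactly needed.  It hides the two explicit calls behind a single one. */
static ssize_t
toy_realloci2(void *p, size_t needed, size_t extra)
{
    ssize_t  size;

    if (extra <= SIZE_MAX - needed) {  /* skip the attempt on overflow */
        size = toy_realloci(p, needed + extra);
        if (size != -1)
            return size;
    }
    return toy_realloci(p, needed);
}
```

With toy_limit at 12000, asking for 10000 plus 4096 extra falls back to
10000, while asking for 13000 fails outright and the caller would move on to
do_actual_realloc().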
I'm starting to like the interface of xallocx(). Prior art often has
reasons. :)
Have a lovely night!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-02 23:55 ` Arthur O'Dwyer
@ 2025-11-03 0:27 ` Rich Felker
2025-11-03 0:56 ` Thiago Macieira
1 sibling, 0 replies; 116+ messages in thread
From: Rich Felker @ 2025-11-03 0:27 UTC (permalink / raw)
To: Arthur O'Dwyer
Cc: Thiago Macieira, Alejandro Colomar, Florian Weimer, libc-alpha,
musl, Jonathan Wakely
On Sun, Nov 02, 2025 at 06:55:53PM -0500, Arthur O'Dwyer wrote:
> On Sun, Nov 2, 2025 at 6:10 PM Thiago Macieira <thiago@macieira.org> wrote:
>
> > On Sunday, 2 November 2025 05:31:59 Pacific Standard Time Alejandro
> > Colomar
> > wrote:
> > > The purpose of realloci() is being extremely cheap. So, why would one
> > > ask for extra size?
> >
> > Speculative growth. When the container is being added to, it knows it
> > needs at
> > least one more element, but it can't predict the future to know how many
> > more.
> > So it asks "pretty please" for a few more.
> >
>
> I'll just chime in to mention that I recently had cause to look into what
> various STL implementations (libc++, libstdc++, Microsoft) do when you
> write something like:
> std::vector<char> v;
> v.resize(VERY_LARGE_NUMBER);
> v.push_back(1);
> Naturally every STL implementation will ask the vector's
> std::allocator<char> for basically 2*VERY_LARGE_NUMBER bytes of memory at
> this point.
> If that much memory is available, then we're on the happy path and
> everything's great. If less memory is available, the allocator throws a
> std::bad_alloc exception.
> Now for the interesting part: `vector::push_back` really only *needs* a
> *single* additional byte — VERY_LARGE_NUMBER+1 bytes — in order to do its
> job. Does any STL implementation actually catch the std::bad_alloc
> exception and retry with a smaller allocation, in order to diligently do
> the job it was asked to do?
> Answer: *No.* In practice, *no* STL implementation retries allocations
> inside `vector::push_back`. In practice, anytime the allocator throws, that
> exception propagates out and we're done. So that means that std::vector has
> basically "one chance" — it gets to make of the allocator a *single*
> request, and so (in theory) it must choose that request wisely. (In
> practice, running out of memory is rare and nobody cares if you run out a
> little earlier than you would've otherwise.)
>
> If there *were* a way to ask a C++ allocator for "2*VERY_LARGE_NUMBER
> bytes, but, if you can't do that, I'll settle for as few as
> VERY_LARGE_NUMBER+1 bytes," then presumably `vector::push_back` is exactly
> the place we'd see that API getting used.
At this point this sounds completely orthogonal to the realloci
discussion. Asymptotically, in-place realloc is *never* going to
succeed in doubling the size of an object. The only times it will
succeed are in non-size-segregating allocators when the old object is
either at the "top of the heap" (only one such object exists in the
process at any given time) or where you just happened to free an
object that just happened to be positioned right above it. This
happens so rarely that there is utterly no point in optimizing for it.
If the program doesn't work or performs badly when this optimization
can't succeed (which is most of the time) then the program needs to be
fixed not to do whatever it was trying to do.
Rich
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-02 23:58 ` Alejandro Colomar
@ 2025-11-03 0:28 ` Rich Felker
2025-11-03 9:36 ` Alejandro Colomar
2025-11-03 0:41 ` Thiago Macieira
1 sibling, 1 reply; 116+ messages in thread
From: Rich Felker @ 2025-11-03 0:28 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Thiago Macieira, Florian Weimer, libc-alpha, musl,
Arthur O'Dwyer, Jonathan Wakely
On Mon, Nov 03, 2025 at 12:58:39AM +0100, Alejandro Colomar wrote:
> > All this will need fine-tuning once implementations exist.
> >
> > > So, why not require the caller to not ask too much? We could go back to
> > > reporting an error if there's not enough memory.
> > >
> > > Of course, it would still guarantee no errors when shrinking, but
> > > I think we could error out when growing.
> >
> > I'd prefer no errors either way. If there isn't memory to grow the underlying
> > space (a brk() system call returns ENOMEM), then realloci() returns as much as
> > it could get but not more.
>
> The problem is that this is asking the implementation to speculate.
>
> Consider the case that a realloci() implementation knows that the
> requested size fails. Let's put some arbitrary numbers:
>
> old_size = 10000;
> requested_size = 30000;
>
> It knows the block can grow to somewhere between 10000 (which it
> currently has) and 30000 (the system reported ENOMEM), but now it has
> the task of allocating as much as it can get. Should it do a binary
> search of the size? Try 20000, then if it fails try 15000, etc.?
> That's speculation, and it would make this function too slow.
I don't see any plausible implementation in which this involved a
binary search. Either you have fixed-size slots in which case you just
look at the size of the slot to see what the max obtainable is, or you
have a dlmalloc-like situation where you check the size of the
adjacent free block (if any) to determine the max obtainable. These
are O(1) operations.
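
The two cases above can be sketched with toy data structures. This is only
an illustration under simplifying assumptions: struct toy_slot and
struct toy_chunk are hypothetical and do not reflect the metadata layout of
musl, glibc, or dlmalloc. The point is that the maximum obtainable size is a
field lookup in both models, with no searching.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model 1: size-segregated allocator with fixed-size slots. */
struct toy_slot {
    size_t  slot_size;  /* usable bytes in the slot, fixed at allocation */
};

/* The object can grow in place up to the slot size and no further. */
static size_t
toy_slot_max(const struct toy_slot *s)
{
    assert(s != NULL);
    return s->slot_size;
}

/* Toy model 2: dlmalloc-like chunks laid out back to back. */
struct toy_chunk {
    size_t  size;    /* bytes spanned by this chunk */
    int     in_use;  /* nonzero if the chunk is currently allocated */
};

/* The object can absorb the adjacent chunk only if that chunk is free.
 * next is NULL when there is no following chunk. */
static size_t
toy_chunk_max(const struct toy_chunk *c, const struct toy_chunk *next)
{
    assert(c != NULL);
    if (next != NULL && !next->in_use)
        return c->size + next->size;
    return c->size;
}
```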
Rich
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-02 23:58 ` Alejandro Colomar
2025-11-03 0:28 ` Rich Felker
@ 2025-11-03 0:41 ` Thiago Macieira
1 sibling, 0 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-11-03 0:41 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Florian Weimer, libc-alpha, musl, Arthur O'Dwyer,
Jonathan Wakely
On Sunday, 2 November 2025 15:58:39 Pacific Standard Time Alejandro Colomar
wrote:
> realloci() would still give a few more. The difference is that if you
> speculate from std::vector, you're speculating, while realloci() gives
> you more without speculation.
Clearly we need to avoid the double speculation. I don't think that will be
prescribed in the C Standard, so speculation would be implementation-defined
behaviour. That leaves room for us to fine-tune with prototype
implementations, then make a recommendation for implementors.
> > So it's unlikely to return 40 kB,
> > aside of some special circumstances (e.g., an mmap()-backed allocation).
>
> Now that I checked, it seems musl's mmap(2) threshold is in the hundreds
> of kiB. I was expecting it to be lower.
mmap() is more expensive than manipulating bits in memory you already own. And
unless you've dealt with low-level page tables, you may not realise that
munmap() is actually more expensive, because it needs to perform a TLB flush
(or equivalent) in every thread of the process that is currently running. That
explains the mmap threshold.
> How about two consecutive calls?
>
> size = realloci(p, speculative_size);
> if (size != -1)
> goto done;
>
> size = realloci(p, needed_size);
> if (size != -1)
> goto done;
>
> do_actual_realloc();
That works, but I'd prefer if the first call returned to me the maximum that it
could satisfy. And locked it in place, because multi-threading is a thing.
> The problem is that this is asking the implementation to speculate.
>
> Consider the case that a realloci() implementation knows that the
> requested size fails. Let's put some arbitrary numbers:
>
> old_size = 10000;
> requested_size = 30000;
>
> It knows the block can grow to somewhere between 10000 (which it
> currently has) and 30000 (the system reported ENOMEM), but now it has
> the task of allocating as much as it can get. Should it do a binary
> search of the size? Try 20000, then if it fails try 15000, etc.?
> That's speculation, and it would make this function too slow.
That's a good question. To be verified later, but my guess is that the
implementation could return the available limit in this heap. For systems that
use a brk() call like Linux, that's simply the amount of memory to the brk.
For an mmap()-backed arena, that's the available memory to the end of the
arena.
In fact, an mmap()-backed arena usually cannot grow. On Linux, it is possible
to attempt to allocate more pages at the next virtual address using
MAP_FIXED_NOREPLACE, which will fail if either there's no memory or if it
would overlap with an existing mapping. This means the ENOMEM scenario is
indistinguishable from the one where there's something already using the next
address.
This does mean we could have a situation where
realloci(ptr, 30000) = 10240
but maybe
realloci(ptr, 20000) = 20480
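
As a rough, Linux-only sketch of the MAP_FIXED_NOREPLACE attempt described
above: extend_arena_in_place() is a hypothetical helper name, and the code
assumes that base is page-aligned and that cur_len and extra are multiples of
the page size. It is not taken from any allocator.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Try to map `extra` more bytes directly after [base, base + cur_len).
 * Returns 0 if the arena now extends in place, -1 otherwise. */
static int
extend_arena_in_place(void *base, size_t cur_len, size_t extra)
{
    void  *want, *got;

    assert(base != NULL);
    want = (char *) base + cur_len;
    got = mmap(want, extra, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE, -1, 0);
    if (got == MAP_FAILED)
        return -1;  /* range occupied or out of memory: cannot grow here */
    if (got != want) {
        /* Kernels that predate the flag treat the address as a hint. */
        munmap(got, extra);
        return -1;
    }
    return 0;
}
```

Either failure path leaves the caller in the same position: the arena did not
grow, and a moving reallocation is the fallback.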
> Hmmmmm, if people are going to do this every time they need realloci(),
> then I guess it makes sense to have it more compact. Also, you can have
> a constant EXTRA_GROWTH that you use all the time, so you don't need to
> calculate the speculative size unnecessarily (wasting another line of
> code, plus one for the declaration of the variable).
I don't know if it will be. Again to test with fine-tuning, but my initial idea
is to round up to the next power of two elements or 4 kB, whichever is
smaller.
MIN(ROUND_POWER2(count) * sizeof(T), count * sizeof(T) + 4096)
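
Spelled out as a small sketch, assuming ROUND_POWER2(n) means the smallest
power of two that is not less than n (the formula above does not say so
explicitly). The names round_power2() and speculative_bytes() are invented
for this illustration, and the multiplications are not checked for overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Smallest power of two >= n (returns 1 for n == 0).  The second loop
 * condition stops the doubling before it could overflow. */
static size_t
round_power2(size_t n)
{
    size_t  p = 1;

    while (p < n && p <= SIZE_MAX / 2)
        p *= 2;
    return p;
}

/* Bytes to request speculatively for `count` elements of `elem_size` bytes:
 * the next power of two elements, or 4096 extra bytes, whichever is smaller.
 * The caller must ensure the products fit in size_t. */
static size_t
speculative_bytes(size_t count, size_t elem_size)
{
    size_t  by_pow2, by_4k;

    assert(elem_size != 0);
    by_pow2 = round_power2(count) * elem_size;
    by_4k = count * elem_size + 4096;
    return by_pow2 < by_4k ? by_pow2 : by_4k;
}
```

For 48-byte elements, 1000 elements give 49152 bytes (1024 elements), while
3000 elements give 148096 bytes, because rounding up to 4096 elements would
ask for far more than 4 kB extra.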
> I'm starting to like the interface of xallocx(). Prior art often has
> reasons. :)
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-02 23:55 ` Arthur O'Dwyer
2025-11-03 0:27 ` Rich Felker
@ 2025-11-03 0:56 ` Thiago Macieira
1 sibling, 0 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-11-03 0:56 UTC (permalink / raw)
To: Arthur O'Dwyer
Cc: Alejandro Colomar, Florian Weimer, libc-alpha, musl,
Jonathan Wakely
On Sunday, 2 November 2025 15:55:53 Pacific Standard Time Arthur O'Dwyer wrote:
> But this whole (bunch of) thread(s) started because of Thiago's throwaway
> comment along the lines of "C++ doesn't care about realloc because realloc
> has a bad API," and I think this thread (these threads) are just driving
> that point home. If I wanted to reach a good allocator API, "I wouldn't
> start from here." I don't think one can design a good allocator API by
> making a ton of tiny patches on top of `malloc` and `realloc`. You have to
> design the API first, and then show how to implement `malloc` and `realloc`
> in terms of it. (Also, C++ couldn't use it without also redesigning
> `std::allocator`, which is almost just a thin wrapper around `malloc` and
> `free`. `std::allocator` doesn't even have a counterpart to `realloc` at
> the moment.)
Indeed, and that is something the C++ Committee will need to figure out. As it
stands, it won't be able to benefit from realloci. I have a few ideas, which we
can discuss back in std-proposals/discussion.
But QVector, QString and QByteArray will, because we *have* a different
allocator interface that would allow retrying with smaller values and fine-
tuning. In fact, QString and QByteArray already use realloc(), because we know
for sure the container's element type is most trivially relocatable: char and
char16_t. Other C++ libraries not using std::allocator will benefit from this
too. As will C libraries operating on arrays.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-03 0:28 ` Rich Felker
@ 2025-11-03 9:36 ` Alejandro Colomar
2025-11-03 21:28 ` Rich Felker
0 siblings, 1 reply; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-03 9:36 UTC (permalink / raw)
To: Rich Felker
Cc: Thiago Macieira, Florian Weimer, libc-alpha, musl,
Arthur O'Dwyer, Jonathan Wakely
Hi Rich,
On Sun, Nov 02, 2025 at 07:28:57PM -0500, Rich Felker wrote:
> On Mon, Nov 03, 2025 at 12:58:39AM +0100, Alejandro Colomar wrote:
> > > All this will need fine-tuning once implementations exist.
> > >
> > > > So, why not require the caller to not ask too much? We could go back to
> > > > reporting an error if there's not enough memory.
> > > >
> > > > Of course, it would still guarantee no errors when shrinking, but
> > > > I think we could error out when growing.
> > >
> > > I'd prefer no errors either way. If there isn't memory to grow the underlying
> > > space (a brk() system call returns ENOMEM), then realloci() returns as much as
> > > it could get but not more.
> >
> > The problem is that this is asking the implementation to speculate.
> >
> > Consider the case that a realloci() implementation knows that the
> > requested size fails. Let's put some arbitrary numbers:
> >
> > old_size = 10000;
> > requested_size = 30000;
> >
> > It knows the block can grow to somewhere between 10000 (which it
> > currently has) and 30000 (the system reported ENOMEM), but now it has
> > the task of allocating as much as it can get. Should it do a binary
> > search of the size? Try 20000, then if it fails try 15000, etc.?
> > That's speculation, and it would make this function too slow.
>
> I don't see any plausible implementation in which this involved a
> binary search. Either you have fixed-size slots in which case you just
> look at the size of the slot to see what the max obtainable is, or you
> have a dlmalloc-like situation where you check the size of the
> adjacent free block (if any) to determine the max obtainable. These
> are O(1) operations.
I was thinking of mremap(2) without MREMAP_MAYMOVE.
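
For reference, a minimal Linux-only sketch of that call. Without
MREMAP_MAYMOVE, mremap(2) either resizes the mapping at the same address or
fails, and on failure it does not say how far the mapping could have grown,
which is presumably where the binary-search worry comes from.
resize_mapping_in_place() is a hypothetical helper name; p must be the start
of an existing mapping and both lengths should be page multiples.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Resize the mapping at p from old_len to new_len without moving it.
 * Returns 0 on success, -1 if the kernel could not do it in place. */
static int
resize_mapping_in_place(void *p, size_t old_len, size_t new_len)
{
    void  *q;

    assert(p != NULL);
    q = mremap(p, old_len, new_len, 0);  /* flags 0: no MREMAP_MAYMOVE */
    if (q == MAP_FAILED)
        return -1;  /* e.g. ENOMEM: cannot expand at the current address */
    return 0;       /* the mapping is still at p */
}
```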
Have a lovely day!
Alex
> Rich
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-03 9:36 ` Alejandro Colomar
@ 2025-11-03 21:28 ` Rich Felker
2025-11-03 23:51 ` The 8472
0 siblings, 1 reply; 116+ messages in thread
From: Rich Felker @ 2025-11-03 21:28 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Thiago Macieira, Florian Weimer, libc-alpha, musl,
Arthur O'Dwyer, Jonathan Wakely
On Mon, Nov 03, 2025 at 10:36:07AM +0100, Alejandro Colomar wrote:
> Hi Rich,
>
> On Sun, Nov 02, 2025 at 07:28:57PM -0500, Rich Felker wrote:
> > On Mon, Nov 03, 2025 at 12:58:39AM +0100, Alejandro Colomar wrote:
> > > > All this will need fine-tuning once implementations exist.
> > > >
> > > > > So, why not require the caller to not ask too much? We could go back to
> > > > > reporting an error if there's not enough memory.
> > > > >
> > > > > Of course, it would still guarantee no errors when shrinking, but
> > > > > I think we could error out when growing.
> > > >
> > > > I'd prefer no errors either way. If there isn't memory to grow the underlying
> > > > space (a brk() system call returns ENOMEM), then realloci() returns as much as
> > > > it could get but not more.
> > >
> > > The problem is that this is asking the implementation to speculate.
> > >
> > > Consider the case that a realloci() implementation knows that the
> > > requested size fails. Let's put some arbitrary numbers:
> > >
> > > old_size = 10000;
> > > requested_size = 30000;
> > >
> > > It knows the block can grow to somewhere between 10000 (which it
> > > currently has) and 30000 (the system reported ENOMEM), but now it has
> > > the task of allocating as much as it can get. Should it do a binary
> > > search of the size? Try 20000, then if it fails try 15000, etc.?
> > > That's speculation, and it would make this function too slow.
> >
> > I don't see any plausible implementation in which this involved a
> > binary search. Either you have fixed-size slots in which case you just
> > look at the size of the slot to see what the max obtainable is, or you
> > have a dlmalloc-like situation where you check the size of the
> > adjacent free block (if any) to determine the max obtainable. These
> > are O(1) operations.
>
> I was thinking of mremap(2) without MREMAP_MAYMOVE.
OK, this whole conversation is mixing up unrelated things:
1. In-place realloc to avoid relatively-expensive memcpy
2. In-place realloc to avoid updating pointers
The case where mremap would be used is utterly irrelevant to (1). And
further, the cost of the mremap operation is so high (syscall
overhead, page table/TLB synchronization) that any cost of updating
pointers because the object moved is dwarfed and thereby irrelevant
too.
So I don't see why anyone should care about this case.
Moreover, I see (2) as entirely misguided. The whole provenance model
makes it broken to try to rely on pointer values not changing, and no
code should be trying to do that. A new allocator interface should not
be pandering to this very fragile, very likely to be broken by
compiler transformations, utterly backwards practice. Just treat the
old pointer as invalid and always update like you're supposed to,
regardless of whether the value is different.
Rich
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-03 21:28 ` Rich Felker
@ 2025-11-03 23:51 ` The 8472
2025-11-04 10:31 ` Szabolcs Nagy
2025-11-04 21:01 ` Rich Felker
0 siblings, 2 replies; 116+ messages in thread
From: The 8472 @ 2025-11-03 23:51 UTC (permalink / raw)
To: Rich Felker, Alejandro Colomar
Cc: Thiago Macieira, Florian Weimer, libc-alpha, musl,
Arthur O'Dwyer, Jonathan Wakely
Hello,
On 03/11/2025 22:28, Rich Felker wrote:
> On Mon, Nov 03, 2025 at 10:36:07AM +0100, Alejandro Colomar wrote:
>> Hi Rich,
>>
>> On Sun, Nov 02, 2025 at 07:28:57PM -0500, Rich Felker wrote:
>>> On Mon, Nov 03, 2025 at 12:58:39AM +0100, Alejandro Colomar wrote:
>>>>> All this will need fine-tuning once implementations exist.
>>>>>
>>>>>> So, why not require the caller to not ask too much? We could go back to
>>>>>> reporting an error if there's not enough memory.
>>>>>>
>>>>>> Of course, it would still guarantee no errors when shrinking, but
>>>>>> I think we could error out when growing.
>>>>>
>>>>> I'd prefer no errors either way. If there isn't memory to grow the underlying
>>>>> space (a brk() system call returns ENOMEM), then realloci() returns as much as
>>>>> it could get but not more.
>>>>
>>>> The problem is that this is asking the implementation to speculate.
>>>>
>>>> Consider the case that a realloci() implementation knows that the
>>>> requested size fails. Let's put some arbitrary numbers:
>>>>
>>>> old_size = 10000;
>>>> requested_size = 30000;
>>>>
>>>> It knows the block can grow to somewhere between 10000 (which it
>>>> currently has) and 30000 (the system reported ENOMEM), but now it has
>>>> the task of allocating as much as it can get. Should it do a binary
>>>> search of the size? Try 20000, then if it fails try 15000, etc.?
>>>> That's speculation, and it would make this function too slow.
>>>
>>> I don't see any plausible implementation in which this involved a
>>> binary search. Either you have fixed-size slots in which case you just
>>> look at the size of the slot to see what the max obtainable is, or you
>>> have a dlmalloc-like situation where you check the size of the
>>> adjacent free block (if any) to determine the max obtainable. These
>>> are O(1) operations.
>>
>> I was thinking of mremap(2) without MREMAP_MAYMOVE.
>
> OK, this whole conversation is mixing up unrelated things:
>
> 1. In-place realloc to avoid relatively-expensive memcpy
> 2. In-place realloc to avoid updating pointers
>
> The case where mremap would be used is utterly irrelevant to (1). And
> further, the cost of the mremap operation is so high (syscall
> overhead, page table/TLB synchronization) that any cost of updating
> pointers because the object moved is dwarfed and thereby irrelevant
> too.
>
> So I don't see why anyone should care about this case.
>
> Moreover, I see (2) as entirely misguided. The whole provenance model
> makes it broken to try to rely on pointer values not changing, and no
> code should be trying to do that. A new allocator interface should not
> be pandering to this very fragile, very likely to be broken by
> compiler transformations, utterly backwards practice. Just treat the
> old pointer as invalid and always update like you're supposed to,
> regardless of whether the value is different.
>
> Rich
>
On the Rust side we have uses for both these scenarios, and more.
A) A strictly in-place realloc is useful for collections and
arenas that have outstanding borrows (thus cannot move)
but want to try growing in-place before they have to allocate
another chunk.
For those a metadata-update or mremap without MAYMOVE is fine.
B) Collections that want to resize and can change their pointer
but need custom data movement, i.e. not a plain memcpy from the old
to the new location. A VecDeque that needs to copy its front and
tail to different locations after a resize. A Vec wants to copy
fewer bytes than its allocated size.
In these cases mremap(MREMAP_MAYMOVE) is fine but memcpy should
be avoided and we would fallback to malloc + custom copy operations
+ free.
C) Alignment-changing reallocations, for example to go from a Box<[u8]>
to Box<[f32]>. In those cases mremap and memcpy are both fine but for
large allocations the former would be preferred.
Combinations are also possible, for example converting
a Box<str> to an Arc<str> requires alignment changes and custom
data rearrangement (B + C).
To cover those different cases, jemalloc's API[0] seems
like a good starting point:
void *rallocx(void *ptr, size_t size, int flags);
size_t xallocx(void *ptr, size_t size, size_t extra, int flags);
xallocx covers the strict in-place reallocation, including making
best-effort extension requests.
rallocx generally allows moving and already allows alignment changes
through MALLOCX_ALIGN flags. The flags could be further extended
with MAY_MOVE (mremap) and MAY_COPY (memcpy) flags.
The 8472
[0] https://jemalloc.net/jemalloc.3.html
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-03 23:51 ` The 8472
@ 2025-11-04 10:31 ` Szabolcs Nagy
2025-11-04 17:24 ` Thiago Macieira
2025-11-04 21:01 ` Rich Felker
1 sibling, 1 reply; 116+ messages in thread
From: Szabolcs Nagy @ 2025-11-04 10:31 UTC (permalink / raw)
To: The 8472
Cc: Rich Felker, Alejandro Colomar, Thiago Macieira, Florian Weimer,
libc-alpha, musl, Arthur O'Dwyer, Jonathan Wakely
* The 8472 <the8472.rs@infinite-source.de> [2025-11-04 00:51:16 +0100]:
> On 03/11/2025 22:28, Rich Felker wrote:
> >
> > OK, this whole conversation is mixing up unrelated things:
> >
> > 1. In-place realloc to avoid relatively-expensive memcpy
> > 2. In-place realloc to avoid updating pointers
> >
> > The case where mremap would be used is utterly irrelevant to (1). And
> > further, the cost of the mremap operation is so high (syscall
> > overhead, page table/TLB synchronization) that any cost of updating
> > pointers because the object moved is dwarfed and thereby irrelevant
> > too.
> >
> > So I don't see why anyone should care about this case.
> >
> > Moreover, I see (2) as entirely misguided. The whole provenance model
> > makes it broken to try to rely on pointer values not changing, and no
> > code should be trying to do that. A new allocator interface should not
> > be pandering to this very fragile, very likely to be broken by
> > compiler transformations, utterly backwards practice. Just treat the
> > old pointer as invalid and always update like you're supposed to,
> > regardless of whether the value is different.
> >
> > Rich
> >
>
> On the Rust side we have uses for both these scenarios, and more.
>
> A) A strictly in-place realloc is useful for collections and
> arenas that have outstanding borrows (thus cannot move)
> but want to try growing in-place before they have to allocate
> another chunk.
>
> For those a metadata-update or mremap without MAYMOVE is fine.
how useful?
> B) Collections that want to resize and can change their pointer
> but need custom data movement, i.e. not a plain memcpy from the old
> to the new location. A VecDeque that needs to copy its front and
> tail to different locations after a resize. A Vec wants to copy
> fewer bytes than its allocated size.
>
> In these cases mremap(MREMAP_MAYMOVE) is fine but memcpy should
> be avoided and we would fallback to malloc + custom copy operations
> + free.
does it actually help, or does it always fall back?
> C) Alignment-changing reallocations, for example to go from a Box<[u8]>
> to Box<[f32]>. In those cases mremap and memcpy are both fine but for
> large allocations the former would be preferred.
>
>
> Combinations are also possible, for example converting
> a Box<str> to an Arc<str> requires alignment changes and custom
> data rearrangement (B + C).
are these common operations?
> To cover those different cases, jemalloc's API[0] seems
> like a good starting point:
>
> void *rallocx(void *ptr, size_t size, int flags);
> size_t xallocx(void *ptr, size_t size, size_t extra, int flags);
>
>
> xallocx covers the strict in-place reallocation, including making
> best-effort extension requests.
>
> rallocx generally allows moving and already allows alignment changes
> through MALLOCX_ALIGN flags. The flags could be further extended
> with MAY_MOVE (mremap) and MAY_COPY (memcpy) flags.
is there an analysis with actual numbers so one
can see the tradeoffs instead of speculating?
we know that users think a new api would be
useful, but some evidence is needed.
>
>
> The 8472
>
>
> [0] https://jemalloc.net/jemalloc.3.html
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-04 10:31 ` Szabolcs Nagy
@ 2025-11-04 17:24 ` Thiago Macieira
2025-11-04 20:46 ` Thiago Macieira
0 siblings, 1 reply; 116+ messages in thread
From: Thiago Macieira @ 2025-11-04 17:24 UTC (permalink / raw)
To: The 8472, Rich Felker, Alejandro Colomar, Thiago Macieira,
Florian Weimer, libc-alpha, musl, Arthur O'Dwyer,
Jonathan Wakely
On Tuesday, 4 November 2025 02:31:38 Pacific Standard Time Szabolcs Nagy wrote:
> is there an analysis with actual numbers so one
> can see the tradeoffs instead of speculating?
> we know that users think a new api would be
> useful, but some evidence is needed.
I'm modifying Qt to insert a "reallocateInPlace" call where it would
eventually be, with an out-of-line function that currently simply fails all
the time because I neither have xallocx() nor realloci(). In spite of that,
disassembly shows the compiler is not inlining it even in LTO mode.
In fact, just looking at the places it is called from reveals
the data types that could benefit from this, and some that could benefit from
realloc() in the first place, if only we could declare and detect the type in
question as bitwise trivially relocatable.
A quick check with the qtdiag application shows it did get called:
Thread 1 "qtdiag" hit Breakpoint 1, QArrayData::reallocateInPlace
(data=0x5555555bdc30, objectSize=48, alignment=16, capacity=3,
option=QArrayData::Grow)
Thread 1 "qtdiag" hit Breakpoint 1, QArrayData::reallocateInPlace
(data=0x5555555c8920, objectSize=48, alignment=16, capacity=6,
option=QArrayData::Grow)
Thread 1 "qtdiag" hit Breakpoint 1, QArrayData::reallocateInPlace
(data=0x5555555ccf50, objectSize=48, alignment=16, capacity=11,
option=QArrayData::Grow)
Thread 1 "qtdiag" hit Breakpoint 1, QArrayData::reallocateInPlace
(data=0x5555555cd600, objectSize=48, alignment=16, capacity=22,
option=QArrayData::Grow)
I'll have more information for a more complex application once I finish
rebuilding everything.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-04 17:24 ` Thiago Macieira
@ 2025-11-04 20:46 ` Thiago Macieira
0 siblings, 0 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-11-04 20:46 UTC (permalink / raw)
To: The 8472, Rich Felker, Alejandro Colomar, Florian Weimer,
libc-alpha, musl, Arthur O'Dwyer, Jonathan Wakely
On Tuesday, 4 November 2025 09:24:25 Pacific Standard Time Thiago Macieira
wrote:
> I'll have more information for a more complex application once I finish
> rebuilding everything.
Here are some results with Qt Creator, which is the most complex Qt
application I can test at short notice.
It called the new function 119 times just to print --help. When I launched it
and immediately closed its window, the function was called 1152 times.
Looking at those first 119 calls, my previous observation stands: quite a few
of those *could* have been realloc(), but aren't because we lack in C++ a way
to detect that the type could be memcpy'ed about and still be valid (and won't
have one until C++29). Qt has a mechanism to opt-in, but the authors of these
types haven't.
This means this count does not include the reallocations that did use
realloc() because the types used the Qt opt-in, and that includes very common
containers like QStringList. So while this new function would be useful for
Qt-based applications, it would be *more* useful for non-Qt C++ ones
(presuming the C++ Standard finds a way to do so), especially because
std::string is not guaranteed to work after being memcpy()ed around.
What I can't tell is whether a realloci() call would have succeeded. These
calls happen when the container is growing, meaning it's about to create one
or more non-trivial objects in the allocated memory, and the chances are
really good that such objects will themselves allocate memory. For the first
couple of growths, the new elements' memory use will likely prevent the
container from extending in size. However, once the container has grown past a
certain size and freed previous allocations, there may be heap space for the
new objects to be created without occupying memory next to the container.
Similarly, once the heap is sufficiently fragmented in a running application, an
array of sufficient size will necessitate using a sufficiently large free region
in the heap, but the objects' allocations can fit other, smaller free spaces.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-03 23:51 ` The 8472
2025-11-04 10:31 ` Szabolcs Nagy
@ 2025-11-04 21:01 ` Rich Felker
2025-11-05 0:37 ` Demi Marie Obenour
1 sibling, 1 reply; 116+ messages in thread
From: Rich Felker @ 2025-11-04 21:01 UTC (permalink / raw)
To: The 8472
Cc: Alejandro Colomar, Thiago Macieira, Florian Weimer, libc-alpha,
musl, Arthur O'Dwyer, Jonathan Wakely
On Tue, Nov 04, 2025 at 12:51:16AM +0100, The 8472 wrote:
> Hello,
>
> On 03/11/2025 22:28, Rich Felker wrote:
> > On Mon, Nov 03, 2025 at 10:36:07AM +0100, Alejandro Colomar wrote:
> > > Hi Rich,
> > >
> > > On Sun, Nov 02, 2025 at 07:28:57PM -0500, Rich Felker wrote:
> > > > On Mon, Nov 03, 2025 at 12:58:39AM +0100, Alejandro Colomar wrote:
> > > > > > All this will need fine-tuning once implementations exist.
> > > > > >
> > > > > > > So, why not require the caller to not ask too much? We could go back to
> > > > > > > reporting an error if there's not enough memory.
> > > > > > >
> > > > > > > Of course, it would still guarantee no errors when shrinking, but
> > > > > > > I think we could error out when growing.
> > > > > >
> > > > > > I'd prefer no errors either way. If there isn't memory to grow the underlying
> > > > > > space (a brk() system call returns ENOMEM), then realloci() returns as much as
> > > > > > it could get but not more.
> > > > >
> > > > > The problem is that this is asking the implementation to speculate.
> > > > >
> > > > > Consider the case that a realloci() implementation knows that the
> > > > > requested size fails. Let's put some arbitrary numbers:
> > > > >
> > > > > old_size = 10000;
> > > > > requested_size = 30000;
> > > > >
> > > > > It knows the block can grow to somewhere between 10000 (which it
> > > > > currently has) and 30000 (the system reported ENOMEM), but now it has
> > > > > the task of allocating as much as it can get. Should it do a binary
> > > > > search of the size? Try 20000, then if it fails try 15000, etc.?
> > > > > That's speculation, and it would make this function too slow.
> > > >
> > > > I don't see any plausible implementation in which this involved a
> > > > binary search. Either you have fixed-size slots in which case you just
> > > > look at the size of the slot to see what the max obtainable is, or you
> > > > have a dlmalloc-like situation where you check the size of the
> > > > adjacent free block (if any) to determine the max obtainable. These
> > > > are O(1) operations.
> > >
> > > I was thinking of mremap(2) without MREMAP_MAYMOVE.
> >
> > OK, this whole conversation is mixing up unrelated things:
> >
> > 1. In-place realloc to avoid relatively-expensive memcpy
> > 2. In-place realloc to avoid updating pointers
> >
> > The case where mremap would be used is utterly irrelevant to (1). And
> > further, the cost of the mremap operation is so high (syscall
> > overhead, page table/TLB synchronization) that any cost of updating
> > pointers because the object moved is dwarfed and thereby irrelevant
> > too.
> >
> > So I don't see why anyone should care about this case.
> >
> > Moreover, I see (2) as entirely misguided. The whole provenance model
> > makes it broken to try to rely on pointer values not changing, and no
> > code should be trying to do that. A new allocator interface should not
> > be pandering to this very fragile, very likely to be broken by
> > compiler transformations, utterly backwards practice. Just treat the
> > old pointer as invalid and always update like you're supposed to,
> > regardless of whether the value is different.
> >
> > Rich
> >
>
> On the Rust side we have uses for both these scenarios, and more.
>
> A) A strictly in-place realloc is useful for collections and
> arenas that have outstanding borrows (thus cannot move)
> but want to try growing in-place before they have to allocate
> another chunk.
This "useful" needs to be quantified. Only in very very rare cases
will in-place expansion even be possible. The vast majority of the
time, you must allocate another discontiguous chunk to meet the above
contractual obligation anyway.
> For those a metadata-update or mremap without MAYMOVE is fine.
>
> B) Collections that want to resize and can change their pointer
> but need custom data movement, i.e. not a plain memcpy from the old
> to the new location. A VecDeque that needs to copy its front and
> tail to different locations after a resize. A Vec wants to copy
> fewer bytes than its allocated size.
>
> In these cases mremap(MREMAP_MAYMOVE) is fine but memcpy should
> be avoided and we would fallback to malloc + custom copy operations
> + free.
malloc + custom copy + free sounds like it's always the right
solution. Simply ignore the existence of realloc entirely. It's not
useful.
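To make the "malloc + custom copy + free" pattern concrete for a deque-like container, here is a minimal C sketch. The names struct ring and ring_grow() are hypothetical and only serve the illustration; the point is that the two halves of a wrapped buffer land at different offsets in the new block, which a plain realloc() copy cannot express.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical byte ring buffer: 'len' live bytes starting at index
 * 'head', wrapping around at 'cap'. */
struct ring {
	unsigned char *buf;
	size_t cap, head, len;
};

/* Grow to new_cap bytes, unwrapping the contents so they start at
 * index 0.  Returns 0 on success, -1 on failure (ring untouched). */
static int
ring_grow(struct ring *r, size_t new_cap)
{
	unsigned char *nb;
	size_t first;

	if (new_cap == 0 || new_cap < r->len)
		return -1;
	nb = malloc(new_cap);
	if (nb == NULL)
		return -1;

	/* Bytes from 'head' up to the end of the old buffer. */
	first = r->cap - r->head;
	if (first > r->len)
		first = r->len;
	if (first > 0)
		memcpy(nb, r->buf + r->head, first);
	/* Bytes that had wrapped around to the start of the old buffer. */
	if (r->len > first)
		memcpy(nb + first, r->buf, r->len - first);

	free(r->buf);
	r->buf = nb;
	r->cap = new_cap;
	r->head = 0;
	return 0;
}
```

Only r->len bytes are copied, not the whole old capacity, which also covers the "Vec wants to copy fewer bytes than its allocated size" case.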
> C) Alignment-changing reallocations, for example to go from a Box<[u8]>
> to Box<[f32]>. In those cases mremap and memcpy are both fine but for
> large allocations the former would be preferred.
Why is alignment-changing reallocation a thing? Why would the type of
an object change like this? Even if it does, all allocations are
always inherently aligned sufficiently for the alignment requirement
of any non-over-aligned type.
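The alignment claim can be checked with a few lines of C. This is only a sketch: is_aligned() and reuse_as_floats() are hypothetical helper names, and the check relies on the usual (implementation-defined) pointer-to-uintptr_t mapping.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonzero if p is aligned to 'align' bytes ('align' a power of two). */
static int
is_aligned(const void *p, size_t align)
{
	return ((uintptr_t) p & (align - 1)) == 0;
}

/* Reuse a malloc()ed byte buffer as float storage without
 * reallocating.  Returns NULL if the alignment check fails, which is
 * not expected for memory obtained from malloc(). */
static float *
reuse_as_floats(void *bytes)
{
	return is_aligned(bytes, _Alignof(float)) ? bytes : NULL;
}
```

For a block returned by malloc(64), both is_aligned(p, _Alignof(float)) and is_aligned(p, _Alignof(max_align_t)) should hold, so no alignment-changing reallocation is needed for ordinary types.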
Rich
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-04 21:01 ` Rich Felker
@ 2025-11-05 0:37 ` Demi Marie Obenour
2025-11-05 4:56 ` Rich Felker
2025-11-06 18:03 ` James Y Knight
0 siblings, 2 replies; 116+ messages in thread
From: Demi Marie Obenour @ 2025-11-05 0:37 UTC (permalink / raw)
To: musl, Rich Felker, The 8472
Cc: Alejandro Colomar, Thiago Macieira, Florian Weimer, libc-alpha,
Arthur O'Dwyer, Jonathan Wakely
[-- Attachment #1.1.1: Type: text/plain, Size: 5500 bytes --]
On 11/4/25 16:01, Rich Felker wrote:
> On Tue, Nov 04, 2025 at 12:51:16AM +0100, The 8472 wrote:
>> Hello,
>>
>> On 03/11/2025 22:28, Rich Felker wrote:
>>> On Mon, Nov 03, 2025 at 10:36:07AM +0100, Alejandro Colomar wrote:
>>>> Hi Rich,
>>>>
>>>> On Sun, Nov 02, 2025 at 07:28:57PM -0500, Rich Felker wrote:
>>>>> On Mon, Nov 03, 2025 at 12:58:39AM +0100, Alejandro Colomar wrote:
>>>>>>> All this will need fine-tuning once implementations exist.
>>>>>>>
>>>>>>>> So, why not require the caller to not ask too much? We could go back to
>>>>>>>> reporting an error if there's not enough memory.
>>>>>>>>
>>>>>>>> Of course, it would still guarantee no errors when shrinking, but
>>>>>>>> I think we could error out when growing.
>>>>>>>
>>>>>>> I'd prefer no errors either way. If there isn't memory to grow the underlying
>>>>>>> space (a brk() system call returns ENOMEM), then realloci() returns as much as
>>>>>>> it could get but not more.
>>>>>>
>>>>>> The problem is that this is asking the implementation to speculate.
>>>>>>
>>>>>> Consider the case that a realloci() implementation knows that the
>>>>>> requested size fails. Let's put some arbitrary numbers:
>>>>>>
>>>>>> old_size = 10000;
>>>>>> requested_size = 30000;
>>>>>>
>>>>>> It knows the block can grow to somewhere between 10000 (which it
>>>>>> currently has) and 30000 (the system reported ENOMEM), but now it has
>>>>>> the task of allocating as much as it can get. Should it do a binary
>>>>>> search of the size? Try 20000, then if it fails try 15000, etc.?
>>>>>> That's speculation, and it would make this function too slow.
>>>>>
>>>>> I don't see any plausible implementation in which this involved a
>>>>> binary search. Either you have fixed-size slots in which case you just
>>>>> look at the size of the slot to see what the max obtainable is, or you
>>>>> have a dlmalloc-like situation where you check the size of the
>>>>> adjacent free block (if any) to determine the max obtainable. These
>>>>> are O(1) operations.
>>>>
>>>> I was thinking of mremap(2) without MREMAP_MAYMOVE.
>>>
>>> OK, this whole conversation is mixing up unrelated things:
>>>
>>> 1. In-place realloc to avoid relatively-expensive memcpy
>>> 2. In-place realloc to avoid updating pointers
>>>
>>> The case where mremap would be used is utterly irrelevant to (1). And
>>> further, the cost of the mremap operation is so high (syscall
>>> overhead, page table/TLB synchronization) that any cost of updating
>>> pointers because the object moved is dwarfed and thereby irrelevant
>>> too.
>>>
>>> So I don't see why anyone should care about this case.
>>>
>>> Moreover, I see (2) as entirely misguided. The whole provenance model
>>> makes it broken to try to rely on pointer values not changing, and no
>>> code should be trying to do that. A new allocator interface should not
>>> be pandering to this very fragile, very likely to be broken by
>>> compiler transformations, utterly backwards practice. Just treat the
>>> old pointer as invalid and always update like you're supposed to,
>>> regardless of whether the value is different.
>>>
>>> Rich
>>>
>>
>> On the Rust side we have uses for both these scenarios, and more.
>>
>> A) A strictly in-place realloc is useful for collections and
>> arenas that have outstanding borrows (thus cannot move)
>> but want to try growing in-place before they have to allocate
>> another chunk.
>
> This "useful" needs to be quantified. Only in very very rare cases
> will in-place expansion even be possible. The vast majority of the
> time, you must allocate another discontiguous chunk to meet the above
> contractual obligation anyway.
Would it be better to provide an allocation API that returns the amount
of memory actually allocated? That would at least allow any padding at
the end of the allocation to be used instead of being wasted.
>> For those a metadata-update or mremap without MAYMOVE is fine.
>>
>> B) Collections that want to resize and can change their pointer
>> but need custom data movement, i.e. not a plain memcpy from the old
>> to the new location. A VecDeque that needs to copy its front and
>> tail to different locations after a resize. A Vec wants to copy
>> fewer bytes than its allocated size.
>>
>> In these cases mremap(MREMAP_MAYMOVE) is fine but memcpy should
>> be avoided and we would fallback to malloc + custom copy operations
>> + free.
>
> malloc + custom copy + free sounds like it's always the right
> solution. Simply ignore the existence of realloc entirely. It's not
> useful.
For large allocations, one can do this via mremap(MREMAP_MAYMOVE),
which uses page table swizzling rather than a copy. However, there
might need to be some trickery to avoid TLB shootdowns. In theory,
the TLB shootdown can be delayed until the virtual address that was
moved from is reused. I don’t know if Linux implements this.
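For reference, a minimal Linux-only sketch of the mremap(2) call discussed here. map_resize() is a hypothetical wrapper name; passing may_move == 0 corresponds to the strictly in-place case, and a nonzero value to MREMAP_MAYMOVE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Resize a mapping obtained from mmap().  With may_move == 0 the call
 * only succeeds if the mapping can be resized at its current address;
 * otherwise the kernel may relocate it by remapping pages rather than
 * copying bytes.  Returns NULL on failure (old mapping still valid). */
static void *
map_resize(void *old, size_t old_size, size_t new_size, int may_move)
{
	void *p;

	assert(new_size > 0);
	p = mremap(old, old_size, new_size, may_move ? MREMAP_MAYMOVE : 0);
	return p == MAP_FAILED ? NULL : p;
}
```

Whether this beats malloc + copy + free depends on the size of the mapping and on the TLB costs mentioned above; the sketch does not measure that.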
>> C) Alignment-changing reallocations, for example to go from a Box<[u8]>
>> to Box<[f32]>. In those cases mremap and memcpy are both fine but for
>> large allocations the former would be preferred.
>
> Why is alignment-changing reallocation a thing? Why would the type of
> an object change like this? Even if it does, all allocations are
> always inherently aligned sufficiently for the alignment requirement
> of any non-over-aligned type.
>
> Rich
--
Sincerely,
Demi Marie Obenour (she/her/hers)
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-05 0:37 ` Demi Marie Obenour
@ 2025-11-05 4:56 ` Rich Felker
2025-11-05 11:24 ` Alejandro Colomar
2025-11-06 18:03 ` James Y Knight
1 sibling, 1 reply; 116+ messages in thread
From: Rich Felker @ 2025-11-05 4:56 UTC (permalink / raw)
To: Demi Marie Obenour
Cc: musl, The 8472, Alejandro Colomar, Thiago Macieira,
Florian Weimer, libc-alpha, Arthur O'Dwyer, Jonathan Wakely
On Tue, Nov 04, 2025 at 07:37:41PM -0500, Demi Marie Obenour wrote:
> On 11/4/25 16:01, Rich Felker wrote:
> > On Tue, Nov 04, 2025 at 12:51:16AM +0100, The 8472 wrote:
> >> Hello,
> >>
> >> On 03/11/2025 22:28, Rich Felker wrote:
> >>> On Mon, Nov 03, 2025 at 10:36:07AM +0100, Alejandro Colomar wrote:
> >>>> Hi Rich,
> >>>>
> >>>> On Sun, Nov 02, 2025 at 07:28:57PM -0500, Rich Felker wrote:
> >>>>> On Mon, Nov 03, 2025 at 12:58:39AM +0100, Alejandro Colomar wrote:
> >>>>>>> All this will need fine-tuning once implementations exist.
> >>>>>>>
> >>>>>>>> So, why not require the caller to not ask too much? We could go back to
> >>>>>>>> reporting an error if there's not enough memory.
> >>>>>>>>
> >>>>>>>> Of course, it would still guarantee no errors when shrinking, but
> >>>>>>>> I think we could error out when growing.
> >>>>>>>
> >>>>>>> I'd prefer no errors either way. If there isn't memory to grow the underlying
> >>>>>>> space (a brk() system call returns ENOMEM), then realloci() returns as much as
> >>>>>>> it could get but not more.
> >>>>>>
> >>>>>> The problem is that this is asking the implementation to speculate.
> >>>>>>
> >>>>>> Consider the case that a realloci() implementation knows that the
> >>>>>> requested size fails. Let's put some arbitrary numbers:
> >>>>>>
> >>>>>> old_size = 10000;
> >>>>>> requested_size = 30000;
> >>>>>>
> >>>>>> It knows the block can grow to somewhere between 10000 (which it
> >>>>>> currently has) and 30000 (the system reported ENOMEM), but now it has
> >>>>>> the task of allocating as much as it can get. Should it do a binary
> >>>>>> search of the size? Try 20000, then if it fails try 15000, etc.?
> >>>>>> That's speculation, and it would make this function too slow.
> >>>>>
> >>>>> I don't see any plausible implementation in which this involved a
> >>>>> binary search. Either you have fixed-size slots in which case you just
> >>>>> look at the size of the slot to see what the max obtainable is, or you
> >>>>> have a dlmalloc-like situation where you check the size of the
> >>>>> adjacent free block (if any) to determine the max obtainable. These
> >>>>> are O(1) operations.
> >>>>
> >>>> I was thinking of mremap(2) without MREMAP_MAYMOVE.
> >>>
> >>> OK, this whole conversation is mixing up unrelated things:
> >>>
> >>> 1. In-place realloc to avoid relatively-expensive memcpy
> >>> 2. In-place realloc to avoid updating pointers
> >>>
> >>> The case where mremap would be used is utterly irrelevant to (1). And
> >>> further, the cost of the mremap operation is so high (syscall
> >>> overhead, page table/TLB synchronization) that any cost of updating
> >>> pointers because the object moved is dwarfed and thereby irrelevant
> >>> too.
> >>>
> >>> So I don't see why anyone should care about this case.
> >>>
> >>> Moreover, I see (2) as entirely misguided. The whole provenance model
> >>> makes it broken to try to rely on pointer values not changing, and no
> >>> code should be trying to do that. A new allocator interface should not
> >>> be pandering to this very fragile, very likely to be broken by
> >>> compiler transformations, utterly backwards practice. Just treat the
> >>> old pointer as invalid and always update like you're supposed to,
> >>> regardless of whether the value is different.
> >>>
> >>> Rich
> >>>
> >>
> >> On the Rust side we have uses for both these scenarios, and more.
> >>
> >> A) A strictly in-place realloc is useful for collections and
> >> arenas that have outstanding borrows (thus cannot move)
> >> but want to try growing in-place before they have to allocate
> >> another chunk.
> >
> > This "useful" needs to be quantified. Only in very very rare cases
> > will in-place expansion even be possible. The vast majority of the
> > time, you must allocate another discontiguous chunk to meet the above
> > contractual obligation anyway.
>
> Would it be better to provide an allocation API that returns the amount
> of memory actually allocated? That would at least allow any padding at
> the end of the allocation to be used instead of being wasted.
No, that was a mistake made long ago with malloc_usable_size and
wrongly equating "amount actually allocated" with "amount we could
enlarge the allocation up to without running into something else".
Equating them is wrong because (1) the compiler will rightly treat
accesses beyond the requested size at allocation time, even if
malloc_usable_size reports more and the allocator implementation lets
you use more, as UB, and (2) allowing the use of "extra space" here
precludes detecting overflows. It really is necessary, if you want to
allow the application to use extra space, to have some interface by
which it's requested. However, I don't think anyone has made a
compelling case yet that doing this is useful enough to be worth the
trouble and possible unforeseen bad consequences -- keep in mind the
bad consequences of malloc_usable_size were entirely unseen at the
time it was sloppily introduced.
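To illustrate the gap between the two notions, here is a small C sketch that only observes the difference and never touches the extra bytes. usable_slack() is a hypothetical helper; malloc_usable_size() is a non-standard extension declared in <malloc.h> on glibc and musl.

```c
#include <malloc.h>   /* malloc_usable_size(): non-standard extension */
#include <stddef.h>
#include <stdlib.h>

/* Report how many bytes beyond 'request' the allocator says are
 * usable.  Returns (size_t) -1 if malloc() fails.  The block is never
 * accessed past 'request' bytes; doing so is the practice argued
 * against above. */
static size_t
usable_slack(size_t request)
{
	void *p = malloc(request);
	size_t usable;

	if (p == NULL)
		return (size_t) -1;
	usable = malloc_usable_size(p);
	free(p);
	return usable - request;
}
```

A nonzero result shows that the allocator rounded the request up, but it does not by itself make the extra bytes part of the object as far as the compiler or fortification tools are concerned.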
Rich
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-05 4:56 ` Rich Felker
@ 2025-11-05 11:24 ` Alejandro Colomar
2025-11-05 17:38 ` Thiago Macieira
0 siblings, 1 reply; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-05 11:24 UTC (permalink / raw)
To: Rich Felker
Cc: Demi Marie Obenour, musl, The 8472, Thiago Macieira,
Florian Weimer, libc-alpha, Arthur O'Dwyer, Jonathan Wakely
[-- Attachment #1: Type: text/plain, Size: 2387 bytes --]
Hi Rich, Demi,
On Tue, Nov 04, 2025 at 11:56:54PM -0500, Rich Felker wrote:
> On Tue, Nov 04, 2025 at 07:37:41PM -0500, Demi Marie Obenour wrote:
> > > This "useful" needs to be quantified. Only in very very rare cases
> > > will in-place expansion even be possible. The vast majority of the
> > > time, you must allocate another discontiguous chunk to meet the above
> > > contractual obligation anyway.
> >
> > Would it be better to provide an allocation API that returns the amount
> > of memory actually allocated? That would at least allow any padding at
> > the end of the allocation to be used instead of being wasted.
>
> No, that was a mistake made long ago with malloc_usable_size and
> wrongly equating "amount actually allocated" with "amount we could
> enlarge the allocation up to without running into something else".
Agree. If an API tells the size of the block, it must internally set
the size of the block to that. That way, there's an opt-in to the
decreased safety. For most users, you want to use only what you asked
for, which makes _FORTIFY_SOURCE and other tools provide safety for your
code.
> Equating them is wrong because (1) the compiler will rightly treat
> accesses beyond the requested size at allocation time, even if
> malloc_usable_size reports more and the allocator implementation lets
> you use more, as UB, and (2) allowing the use of "extra space" here
> precludes detecting overflows. It really is necessary, if you want to
> allow the application to use extra space, to have some interface by
> which it's requested. However, I don't think anyone has made a
> compelling case yet that doing this is useful enough to be worth the
> trouble and possible unforeseen bad consequences -- keep in mind the
> bad consequences of malloc_usable_size were entirely unseen at the
> time it was sloppily introduced.
Agree. Before adding realloci(), I'd like to see numbers. An
interesting thing to do would be to see some application that uses
realloc(3) currently, and check how often realloc(3) doesn't move the
object. (To test that without UB, one needs to do the trick with
uintptr_t.)
Otherwise, we should presume that realloci() would fail always, and
wouldn't be useful.
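A minimal sketch of the measurement with the uintptr_t trick: the address is converted to an integer before the call, so the old pointer value is never inspected after realloc(3) may have ended its lifetime. counting_realloc() and the two counters are hypothetical names for this illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical counters: successful resizes, and those that kept the
 * same address. */
static unsigned long n_calls, n_in_place;

/* Wrapper around realloc(3) that counts how often a resize leaves the
 * block at the same address.  Calls with p == NULL behave like
 * malloc(3) and are not counted. */
static void *
counting_realloc(void *p, size_t size)
{
	int had_block = (p != NULL);
	uintptr_t old = (uintptr_t) p;   /* capture before the call */
	void *q = realloc(p, size);

	if (q != NULL && had_block) {
		n_calls++;
		if ((uintptr_t) q == old)
			n_in_place++;
	}
	return q;
}
```

Routing an application's realloc(3) calls through such a wrapper and printing n_in_place / n_calls at exit would give the ratio asked for here. It says nothing about calls that currently cannot use realloc(3) at all.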
Have a lovely day!
Alex
>
> Rich
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-05 11:24 ` Alejandro Colomar
@ 2025-11-05 17:38 ` Thiago Macieira
2025-11-06 21:53 ` Alejandro Colomar
0 siblings, 1 reply; 116+ messages in thread
From: Thiago Macieira @ 2025-11-05 17:38 UTC (permalink / raw)
To: Rich Felker, Alejandro Colomar
Cc: Demi Marie Obenour, musl, The 8472, Florian Weimer, libc-alpha,
Arthur O'Dwyer, Jonathan Wakely
[-- Attachment #1: Type: text/plain, Size: 1699 bytes --]
On Wednesday, 5 November 2025 03:24:14 Pacific Standard Time Alejandro Colomar
wrote:
> Agree. Before adding realloci(), I'd like to see numbers. An
> interesting thing to do would be to see some application that uses
> realloc(3) currently, and check how often realloc(3) doesn't move the
> object. (To test that without UB, one needs to do the trick with
> uintptr_t.)
I can give you some numbers for realloc() success, with the understanding that
they are NOT AT ALL the cases where realloci() would be used. I can't emit a
realloc() call that MAY relocate an array of objects, if the objects don't
allow relocating. That would crash the application on the first time realloc()
did relocate. So I can only do it for the cases where the implementation would
keep using realloc() in the future.
The only way to test how often realloci() would succeed is to have realloci().
Anyway, my test is running qtcreator. Remember that it called the placeholder
function that would call realloci() 119 times just for running --help.
It also called QArrayData::reallocateUnaligned() with objectSize > 4 a total
of 1577 times, in 115 of which realloc() returned the same pointer (7.3%). In
fact, a quick glimpse of the gdb output shows that the same pointer succeeded in
growing more than once in a row: at least four times I see a pointer twice in a
row, with an increased capacity parameter, suggesting that it's the same array.
In the full run (without --help), realloc() extended in place 1045 out of
10088 calls, increasing to 10.3% extension success rate.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-05 0:37 ` Demi Marie Obenour
2025-11-05 4:56 ` Rich Felker
@ 2025-11-06 18:03 ` James Y Knight
2025-11-06 21:49 ` Alejandro Colomar
1 sibling, 1 reply; 116+ messages in thread
From: James Y Knight @ 2025-11-06 18:03 UTC (permalink / raw)
To: musl
Cc: Rich Felker, The 8472, Alejandro Colomar, Thiago Macieira,
Florian Weimer, libc-alpha, Arthur O'Dwyer, Jonathan Wakely
On Tue, Nov 4, 2025 at 7:38 PM Demi Marie Obenour <demiobenour@gmail.com> wrote:
>
> Would it be better to provide an allocation API that returns the amount
> of memory actually allocated? That would at least allow any padding at
> the end of the allocation to be used instead of being wasted.
Yes, a new malloc variant with direct size feedback would be a much
better idea than a new non-moving realloc variant. And, for people who
care about usefulness for C++: that's already the direction C++
started to go.
Crucially, the new function must return both the pointer and the
newly-allocated-size (which must be at least the requested size but
could be more). Because this is a new API, it is explicitly opt-in for
callers who know they _want_ to be able to grow into the remainder of
an implementation's rounded-up allocation-bucket size, and therefore
would not trigger UB concerns or reduce the ability to detect
overflows from "normal" fixed-size allocations done with malloc.
This has been covered before in C++ standards proposals:
https://wg21.link/p0401 is already in C++23. It adds an API
`std::allocation_result<T*, std::size_t> allocate_at_least(std::size_t
n);" to the allocator template interface. libcxx defines this, and its
standard-library containers e.g. vector call it.
Unfortunately, the default allocate_at_least currently only ever
returns the exact requested-size, because there's currently no
standard allocation API to request size-feedback from the default
system allocator. But, if you provide a user-defined allocator
template argument to the container (e.g. via "std::vector<int,
MyCustomAllocator>"), you can use this functionality today.
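For illustration only, a C interface of this shape could look like the sketch below. struct alloc_result and alloc_at_least() are hypothetical names, not an existing libc API, and the body mirrors the conservative default described above: with no size feedback from the allocator, it reports exactly the requested size.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical result type: the pointer plus the size the caller is
 * allowed to use. */
struct alloc_result {
	void   *ptr;
	size_t size;
};

/* Allocate at least n bytes.  Without allocator support for size
 * feedback, the only safe answer is the exact requested size. */
static struct alloc_result
alloc_at_least(size_t n)
{
	struct alloc_result r = { malloc(n), 0 };

	if (r.ptr != NULL)
		r.size = n;
	return r;
}
```

A libc with real size feedback would set r.size to the rounded-up bucket size instead, and a growing container would use r.size as its capacity.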
To address the desire to have it by default, there is a second
proposal: https://wg21.link/p0901, which proposes to add a new
"operator new" overload (again, takes a requested size, returns
pointer and actually-allocated size), that the default
"allocate_at_least" should call. Unfortunately, that is NOT in C++
yet. (I believe the last action was that there was mostly support for
the feature but the proposal needed some tweaks).
It would be useful to have a C API for this if P0901 ends up in the
C++ spec, because the default C++ standard library "operator new"
implementations typically just call libc malloc.
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-06 18:03 ` James Y Knight
@ 2025-11-06 21:49 ` Alejandro Colomar
2025-11-06 23:10 ` Michael Winterberg
2025-11-11 20:36 ` James Y Knight
0 siblings, 2 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-06 21:49 UTC (permalink / raw)
To: James Y Knight
Cc: musl, Rich Felker, The 8472, Thiago Macieira, Florian Weimer,
libc-alpha, Arthur O'Dwyer, Jonathan Wakely
[-- Attachment #1: Type: text/plain, Size: 3871 bytes --]
Hi James,
On Thu, Nov 06, 2025 at 01:03:52PM -0500, James Y Knight wrote:
> On Tue, Nov 4, 2025 at 7:38 PM Demi Marie Obenour <demiobenour@gmail.com> wrote:
> >
> > Would it be better to provide an allocation API that returns the amount
> > of memory actually allocated? That would at least allow any padding at
> > the end of the allocation to be used instead of being wasted.
>
> Yes, a new malloc variant with direct size feedback would be a much
> better idea than a new non-moving realloc variant. And, for people who
> care about usefulness for C++: that's already the direction C++
> started to go.
I disagree. Most users don't need this, and shouldn't use this.
malloc(3) is already a special case of realloc(3), but it's useful
enough that it warrants a separate function.
However, for the case of realloci(), I don't see the usefulness of a
hypothetical malloci() to be enough to justify it.
After all, you can do this:
	p = malloc(size);
	if (p == NULL)
		goto fail;
	actual_size = realloci(p, size);
And the realloci() call should be cheap compared to malloc(3). There's
absolutely no need to conflate this into a new function. realloci()
should be enough.
If you'll do that often enough, feel free to wrap it yourself:
	void *
	malloci(size_t *size)
	{
		ssize_t  s;
		void     *p;

		p = malloc(*size);
		if (p == NULL)
			return NULL;

		/* Claim whatever extra space is available in place. */
		s = realloci(p, *size);
		if (s != -1)
			*size = s;

		return p;
	}
But, there's no need to rush the growth. It should be fine to wait
until you need to grow and then call realloci().
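A sketch of that "grow when needed" pattern follows. It assumes the semantics used in the snippets above, namely that realloci() returns the resulting size or -1. Since realloci() does not exist yet, realloci_stub() is a hypothetical stand-in that always reports failure, and grow() is likewise a made-up name.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/* HYPOTHETICAL stand-in for the proposed realloci(): always reports
 * that the block could not be resized in place. */
static ssize_t
realloci_stub(void *p, size_t size)
{
	(void) p;
	(void) size;
	return -1;
}

/* Grow a block of old_size bytes to at least *new_size bytes.  Try in
 * place first; otherwise fall back to malloc + copy + free.  On
 * success, *new_size holds the size the caller may use.  Requires
 * p != NULL and old_size <= *new_size. */
static void *
grow(void *p, size_t old_size, size_t *new_size)
{
	ssize_t s;
	void *q;

	s = realloci_stub(p, *new_size);
	if (s != -1 && (size_t) s >= *new_size) {
		*new_size = (size_t) s;
		return p;   /* same address, existing pointers stay valid */
	}

	q = malloc(*new_size);
	if (q == NULL)
		return NULL;   /* old block is left untouched */
	memcpy(q, p, old_size);
	free(p);
	return q;
}
```

With the stub, every call takes the fallback path; swapping in a real realloci() would change only the first branch.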
> Crucially, the new function must return both the pointer and the
> newly-allocated-size (which must be at least the requested size but
> could be more). Because this is a new API, it is explicitly opt-in for
> callers who know they _want_ to be able to grow into the remainder of
> an implementation's rounded-up allocation-bucket size, and therefore
> would not trigger UB concerns or reduce the ability to detect
> overflows from "normal" fixed-size allocations done with malloc.
>
> This has been covered before in C++ standards proposals:
> https://wg21.link/p0401 is already in C++23. It adds an API
> `std::allocation_result<T*, std::size_t> allocate_at_least(std::size_t
> n);" to the allocator template interface. libcxx defines this, and its
> standard-library containers e.g. vector call it.
>
> Unfortunately, the default allocate_at_least currently only ever
> returns the exact requested-size, because there's currently no
> standard allocation API to request size-feedback from the default
> system allocator. But, if you provide a user-defined allocator
> template argument to the container (e.g. via "std::vector<int,
> MyCustomAllocator>"), you can use this functionality today.
>
> To address the desire to have it by default, there is a second
> proposal: https://wg21.link/p0901, which proposes to add a new
> "operator new" overload (again, takes a requested size, returns
> pointer and actually-allocated size), that the default
> "allocate_at_least" should call. Unfortunately, that is NOT in C++
> yet. (I believe the last action was that there was mostly support for
> the feature but the proposal needed some tweaks).
>
> It would be useful to have a C API for this if P0901 ends up in the
> C++ spec, because the default C++ standard library "operator new"
> implementations typically just call libc malloc.
I don't see this being used in C, so I think the C implementation should
be just enough to allow C++ to do their thing. For that, realloci()
would be enough. C++ can call it immediately after malloc(3) if needed,
and can wrap it with the malloci() from above if they want.
Have a lovely night!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-05 17:38 ` Thiago Macieira
@ 2025-11-06 21:53 ` Alejandro Colomar
0 siblings, 0 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-06 21:53 UTC (permalink / raw)
To: Thiago Macieira
Cc: Rich Felker, Demi Marie Obenour, musl, The 8472, Florian Weimer,
libc-alpha, Arthur O'Dwyer, Jonathan Wakely
[-- Attachment #1: Type: text/plain, Size: 1991 bytes --]
Hi Thiago,
On Wed, Nov 05, 2025 at 09:38:42AM -0800, Thiago Macieira wrote:
> On Wednesday, 5 November 2025 03:24:14 Pacific Standard Time Alejandro Colomar
> wrote:
> > Agree. Before adding realloci(), I'd like to see numbers. An
> > interesting thing to do would be to see some application that uses
> > realloc(3) currently, and check how often realloc(3) doesn't move the
> > object. (To test that without UB, one needs to do the trick with
> > uintptr_t.)
>
> I can give you some numbers for realloc() success, with the understanding that
> they are NOT AT ALL the cases where realloci() would be used. I can't emit a
> realloc() call that MAY relocate an array of objects, if the objects don't
> allow relocating. That would crash the application on the first time realloc()
> did relocate. So I can only do it for the cases where the implementation would
> keep using realloc() in the future.
>
> The only way to test how often realloci() would succeed is to have realloci().
>
> Anyway, my test is running qtcreator. Remember that it called the placeholder
> function that would call realloci() 119 times just for running --help.
>
> It also called QArrayData::reallocateUnaligned() with objectSize > 4 a total
> of 1577 times, in 115 of which realloc() returned the same pointer (7.3%). In
> fact, a quick glimpse of the gdb output shows that the same pointer succeeded in
> growing more than once in a row: I see a pointer at least 4x twice in a row,
> with an increased capacity parameter, suggesting that it's the same array.
>
> In the full run (without --help), realloc() extended in place 1045 out of
> 10088 calls, increasing to 10.3% extension success rate.
Thanks! 10% sounds useful enough IMO. That's motivation enough for me
to write the proposal for ISO C, and continue working on the musl
patches.
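
The uintptr_t trick mentioned above could be sketched roughly as follows.
This is only an illustration: counted_realloc() and struct realloc_stats
are hypothetical names, not part of any proposal or library. The address
is converted to an integer before the call, so the stale pointer is never
inspected afterwards. Callers are assumed to pass a non-null p and n > 0.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical counters for an instrumentation experiment. */
struct realloc_stats {
	unsigned long calls;     /* successful realloc() calls */
	unsigned long in_place;  /* calls that returned the same address */
};

/* Wrap realloc() and record whether the block kept its address. */
void *
counted_realloc(struct realloc_stats *st, void *p, size_t n)
{
	/* Capture the address as an integer while p is still valid. */
	uintptr_t before = (uintptr_t) p;
	void *q = realloc(p, n);

	if (q == NULL)
		return NULL;  /* failure: p is still valid, nothing counted */

	st->calls++;
	if ((uintptr_t) q == before)
		st->in_place++;
	return q;
}
```

Dividing in_place by calls at exit would give a hit rate comparable to
the figures quoted above, for whatever workload is being measured.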
Have a lovely night!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-06 21:49 ` Alejandro Colomar
@ 2025-11-06 23:10 ` Michael Winterberg
2025-11-07 15:33 ` Rich Felker
2025-11-11 20:36 ` James Y Knight
1 sibling, 1 reply; 116+ messages in thread
From: Michael Winterberg @ 2025-11-06 23:10 UTC (permalink / raw)
To: musl
> But, there's no need to rush the growth. It should be fine to wait
> until you need to grow and then call realloci().
>
How many extant allocators actually "grow" beyond their result for
malloc_usable_size?
i.e. if Thiago replaced initial allocations with this,
void* malloc_size_feedback(size_t size, size_t* actual) {
*actual = 0;
void* p = malloc(size);
if (p != 0) {
*actual = malloc_usable_size(p);
p = realloc(p, *actual);
}
return p;
}
would there still be a 10% hit rate on reuse?
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-06 23:10 ` Michael Winterberg
@ 2025-11-07 15:33 ` Rich Felker
0 siblings, 0 replies; 116+ messages in thread
From: Rich Felker @ 2025-11-07 15:33 UTC (permalink / raw)
To: Michael Winterberg; +Cc: musl
On Thu, Nov 06, 2025 at 03:10:03PM -0800, Michael Winterberg wrote:
> > But, there's no need to rush the growth. It should be fine to wait
> > until you need to grow and then call realloci().
> >
>
> How many extant allocators actually "grow" beyond their result for
> malloc_usable_size?
>
> i.e. if Thiago replaced initial allocations with this,
>
> void* malloc_size_feedback(size_t size, size_t* actual) {
> *actual = 0;
> void* p = malloc(size);
> if (p != 0) {
> *actual = malloc_usable_size(p);
> p = realloc(p, *actual);
> }
> return p;
> }
>
> would there still be a 10% hit rate on reuse?
I can't speak for others, but on musl/mallocng, *actual==size is
guaranteed and the realloc above is a no-op.
If malloc_usable_size(p) returns anything greater than size and the
caller attempts to access past size, compilers *will* see this, note
the UB, and optimize accordingly. This is (among other less critical
reasons) why we make the guarantee.
Rich
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-10-31 20:33 ` Paul Eggert
2025-10-31 21:14 ` Thiago Macieira
@ 2025-11-09 11:37 ` Alejandro Colomar
2025-11-09 15:31 ` Paul Eggert
1 sibling, 1 reply; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-09 11:37 UTC (permalink / raw)
To: Paul Eggert
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Thiago Macieira
[-- Attachment #1: Type: text/plain, Size: 2026 bytes --]
Hi Paul,
On Fri, Oct 31, 2025 at 02:33:22PM -0600, Paul Eggert wrote:
> On 10/31/25 14:13, Alejandro Colomar wrote:
>
> > Consider that realloci() would be significantly cheaper than realloc(3),
>
> Not in the case where the object doesn't move: they should be about the same
> speed. And when the object grows so much that it does need to move, the V7
> realloc approach should be a bit faster because you need to make just one
> call into the memory subsystem, not three (realloci + malloc + free).
>
> > That would make sanitizers and static analyzers unable to verify lots of
> > code
> No, just the opposite. Currently sanitizers etc. spend useless work checking
> for C23 rules that don't correspond to any hardware or correctness needs;
> they're simply rules imposed by the C committee. This checking is
> counterproductive to real-world software development.
I'm worried that it might decrease the ability of static analyzers to
detect memory leaks. Currently, a static analyzer (such as GCC's
-fanalyzer) can see calls to [[gnu::malloc(realloc, 1)]] functions and
assume that realloc(3) frees them. If realloc(3) would only free(3)
conditionally, then you couldn't apply that attribute, which would make
analysis more difficult.
Have a lovely day!
Alex
> If we fixed the realloc spec to better match how actual production hardware
> behaves, we could fix sanitizers to spend their time flagging real bugs
> instead of wasting their time (and developers' time) generating false
> alarms.
>
> > I wouldn't categorize it as hard to explain:
> Oh, it's not hard to specify a realloci API, or to implement it. What's hard
> is explaining its motivation: why it's needed and what it's good for. It's
> motivated by specialized applications that most programmers don't know about
> and don't need to. And these specialized applications would be better served
> by a 7th Edition Unix realloc.
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-09 11:37 ` Alejandro Colomar
@ 2025-11-09 15:31 ` Paul Eggert
2025-11-09 17:38 ` Alejandro Colomar
0 siblings, 1 reply; 116+ messages in thread
From: Paul Eggert @ 2025-11-09 15:31 UTC (permalink / raw)
To: Alejandro Colomar
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Thiago Macieira
On 2025-11-09 03:37, Alejandro Colomar wrote:
>>> That would make sanitizers and static analyzers unable to verify lots of
>>> code
>> No, just the opposite. Currently sanitizers etc. spend useless work checking
>> for C23 rules that don't correspond to any hardware or correctness needs;
>> they're simply rules imposed by the C committee. This checking is
>> counterproductive to real-world software development.
> I'm worried that it might decrease the ability of static analyzers to
> detect memory leaks. Currently, a static analyzer (such as GCC's
> -fanalyzer) can see calls to [[gnu::malloc(realloc, 1)]] functions and
> assume that realloc(3) frees them. If realloc(3) would only free(3)
> conditionally, then you couldn't apply that attribute, which would make
> analysis more difficult.
Again, this is backwards. If the spec for P=realloc(Q,R) is changed so
that it's valid to check P==Q afterwards (which it is on every practical
production platform), then static analyzers can and should be changed
accordingly. The P==Q situation will not count as a memory leak, and
other situations will still count. This will be an improvement over the
current situation, where static analyzers issue false alarms about such
code.
Static analyzers should be our servants, not our masters.
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-09 15:31 ` Paul Eggert
@ 2025-11-09 17:38 ` Alejandro Colomar
2025-11-09 18:11 ` Rich Felker
2025-11-09 18:16 ` Alejandro Colomar
0 siblings, 2 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-09 17:38 UTC (permalink / raw)
To: Paul Eggert
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Thiago Macieira
[-- Attachment #1: Type: text/plain, Size: 2644 bytes --]
Hi Paul,
On Sun, Nov 09, 2025 at 07:31:37AM -0800, Paul Eggert wrote:
> On 2025-11-09 03:37, Alejandro Colomar wrote:
> > > > That would make sanitizers and static analyzers unable to verify lots of
> > > > code
> > > No, just the opposite. Currently sanitizers etc. spend useless work checking
> > > for C23 rules that don't correspond to any hardware or correctness needs;
> > > they're simply rules imposed by the C committee. This checking is
> > > counterproductive to real-world software development.
> > I'm worried that it might decrease the ability of static analyzers to
> > detect memory leaks. Currently, a static analyzer (such as GCC's
> > -fanalyzer) can see calls to [[gnu::malloc(realloc, 1)]] functions and
> > assume that realloc(3) frees them. If realloc(3) would only free(3)
> > conditionally, then you couldn't apply that attribute, which would make
> > analysis more difficult.
>
> Again, this is backwards. If the spec for P=realloc(Q,R) is changed so that
> it's valid to check P==Q afterwards (which it is on every practical
> production platform), then static analyzers can and should be changed
> accordingly. The P==Q situation will not count as a memory leak, and other
> situations will still count. This will be an improvement over the current
> situation, where static analyzers issue false alarms about such code.
My point was that it's easier to consider the lifetime of P ends at
every realloc(3) call than to consider it to end only if Q!=P.
Right now, I can do:
void
my_free(void *p)
{
free(p);
}
[[gnu::malloc(free)]]
void *
my_realloc(void *p, size_t n)
{
return realloc(p, n);
}
[[gnu::malloc(realloc, 1)]] [[gnu::malloc(free)]]
void *
my_malloc(size_t n)
{
return malloc(n);
}
If realloc(3) wouldn't create a new lifetime, this use of attributes
wouldn't be legal, and so I couldn't tell the static analyzer how to
analyze this code. We'd need significantly more complex rules to
describe the relationship between these functions.
Or maybe this would still be valid... Since
[[gnu::malloc(deallocator)]] doesn't imply [[gnu::malloc]], then this
could still be valid.
But because old pointers would conditionally be valid, it would be
more difficult to determine whether the pointer is free(3)d or not, as
the old pointers could be used for free(3)ing. So, V7 Unix semantics
could reduce the value of this attribute.
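
A variant of the declarations above in which the wrappers name each
other, rather than the libc functions, as deallocators might look like
this. It is only a sketch. It assumes GCC 11 or later for the form of the
attribute that takes arguments, and it uses the __attribute__ spelling
instead of [[gnu::malloc(...)]]. DEALLOC is a hypothetical helper macro
that expands to nothing on other compilers.

```c
#include <stddef.h>
#include <stdlib.h>

/* Assumption: malloc(deallocator, ptr-index) needs GCC >= 11. */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 11
# define DEALLOC(f, i)  __attribute__((malloc(f, i)))
#else
# define DEALLOC(f, i)
#endif

/* Plain prototypes first, so the attributes below can refer to them. */
void my_free(void *p);
void *my_realloc(void *p, size_t n);

/* Pointers returned by my_realloc() are released by my_realloc() or
 * my_free(), passing the pointer as argument 1. */
DEALLOC(my_realloc, 1) DEALLOC(my_free, 1)
void *my_realloc(void *p, size_t n);

/* The same pairing for pointers returned by my_malloc(). */
DEALLOC(my_realloc, 1) DEALLOC(my_free, 1)
void *my_malloc(size_t n);

void
my_free(void *p)
{
	free(p);
}

void *
my_realloc(void *p, size_t n)
{
	return realloc(p, n);
}

void *
my_malloc(size_t n)
{
	return malloc(n);
}
```

With these in place, an analyzer that understands the attribute can flag
a pointer from my_malloc() that is passed to an unrelated deallocator.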
Have a lovely night!
Alex
> Static analyzers should be our servants, not our masters.
>
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-09 17:38 ` Alejandro Colomar
@ 2025-11-09 18:11 ` Rich Felker
2025-11-09 19:03 ` Paul Eggert
2025-11-09 18:16 ` Alejandro Colomar
1 sibling, 1 reply; 116+ messages in thread
From: Rich Felker @ 2025-11-09 18:11 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Paul Eggert, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney, Thiago Macieira
On Sun, Nov 09, 2025 at 06:38:25PM +0100, Alejandro Colomar wrote:
> Hi Paul,
>
> On Sun, Nov 09, 2025 at 07:31:37AM -0800, Paul Eggert wrote:
> > On 2025-11-09 03:37, Alejandro Colomar wrote:
> > > > > That would make sanitizers and static analyzers unable to verify lots of
> > > > > code
> > > > No, just the opposite. Currently sanitizers etc. spend useless work checking
> > > > for C23 rules that don't correspond to any hardware or correctness needs;
> > > > they're simply rules imposed by the C committee. This checking is
> > > > counterproductive to real-world software development.
> > > I'm worried that it might decrease the ability of static analyzers to
> > > detect memory leaks. Currently, a static analyzer (such as GCC's
> > > -fanalyzer) can see calls to [[gnu::malloc(realloc, 1)]] functions and
> > > > assume that realloc(3) frees them. If realloc(3) would only free(3)
> > > conditionally, then you couldn't apply that attribute, which would make
> > > analysis more difficult.
> >
> > Again, this is backwards. If the spec for P=realloc(Q,R) is changed so that
> > it's valid to check P==Q afterwards (which it is on every practical
> > production platform), then static analyzers can and should be changed
> > accordingly. The P==Q situation will not count as a memory leak, and other
> > situations will still count. This will be an improvement over the current
> > situation, where static analyzers issue false alarms about such code.
>
> My point was that it's easier to consider the lifetime of P ends at
> every realloc(3) call than to consider it to end only if Q!=P.
I agree with this. Moreover, checking if P!=Q *should* remain
undefined. Anything you can do with the result is wrong.
Rich
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-09 17:38 ` Alejandro Colomar
2025-11-09 18:11 ` Rich Felker
@ 2025-11-09 18:16 ` Alejandro Colomar
1 sibling, 0 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-09 18:16 UTC (permalink / raw)
To: Paul Eggert
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Thiago Macieira
[-- Attachment #1: Type: text/plain, Size: 3229 bytes --]
On Sun, Nov 09, 2025 at 06:38:30PM +0100, Alejandro Colomar wrote:
> Hi Paul,
>
> On Sun, Nov 09, 2025 at 07:31:37AM -0800, Paul Eggert wrote:
> > On 2025-11-09 03:37, Alejandro Colomar wrote:
> > > > > That would make sanitizers and static analyzers unable to verify lots of
> > > > > code
> > > > No, just the opposite. Currently sanitizers etc. spend useless work checking
> > > > for C23 rules that don't correspond to any hardware or correctness needs;
> > > > they're simply rules imposed by the C committee. This checking is
> > > > counterproductive to real-world software development.
> > > I'm worried that it might decrease the ability of static analyzers to
> > > detect memory leaks. Currently, a static analyzer (such as GCC's
> > > -fanalyzer) can see calls to [[gnu::malloc(realloc, 1)]] functions and
> > > > assume that realloc(3) frees them. If realloc(3) would only free(3)
> > > conditionally, then you couldn't apply that attribute, which would make
> > > analysis more difficult.
> >
> > Again, this is backwards. If the spec for P=realloc(Q,R) is changed so that
> > it's valid to check P==Q afterwards (which it is on every practical
> > production platform), then static analyzers can and should be changed
> > accordingly. The P==Q situation will not count as a memory leak, and other
> > situations will still count. This will be an improvement over the current
> > situation, where static analyzers issue false alarms about such code.
>
> My point was that it's easier to consider the lifetime of P ends at
> every realloc(3) call than to consider it to end only if Q!=P.
>
> Right now, I can do:
>
> void
> my_free(void *p)
> {
> free(p);
> }
>
> [[gnu::malloc(free)]]
> void *
> my_realloc(void *p, size_t n)
> {
> return realloc(p, n);
> }
Self-correction:
void *my_realloc(void *p, size_t n);
[[gnu::malloc(my_realloc, 1)]] [[gnu::malloc(my_free)]]
void *my_realloc(void *p, size_t n);
(I need to repeat the prototype, to be able to refer to itself.)
>
> [[gnu::malloc(realloc, 1)]] [[gnu::malloc(free)]]
And these attributes should use my_realloc and my_free.
> void *
> my_malloc(size_t n)
> {
> return malloc(n);
> }
>
> If realloc(3) wouldn't create a new lifetime, this use of attributes
> wouldn't be legal, and so I couldn't tell the static analyzer how to
> analyze this code. We'd need significantly more complex rules to
> describe the relationship between these functions.
>
> Or maybe this would still be valid... Since
> [[gnu::malloc(deallocator)]] doesn't imply [[gnu::malloc]], then this
> could still be valid.
>
> But because old pointers would conditionally be valid, it would be
> more difficult to determine whether the pointer is free(3)d or not, as
> the old pointers could be used for free(3)ing. So, V7 Unix semantics
> could reduce the value of this attribute.
>
>
> Have a lovely night!
> Alex
>
> > Static analyzers should be our servants, not our masters.
> >
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-09 18:11 ` Rich Felker
@ 2025-11-09 19:03 ` Paul Eggert
2025-11-09 19:16 ` Alejandro Colomar
2025-11-10 1:20 ` Rich Felker
0 siblings, 2 replies; 116+ messages in thread
From: Paul Eggert @ 2025-11-09 19:03 UTC (permalink / raw)
To: Rich Felker
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Thiago Macieira, Alejandro Colomar
On 2025-11-09 10:11, Rich Felker wrote:
> On Sun, Nov 09, 2025 at 06:38:25PM +0100, Alejandro Colomar wrote:
>> My point was that it's easier to consider the lifetime of P ends at
>> every realloc(3) call than to consider it to end only if Q!=P.
>
> I agree with this.
? The lifetime of P does not necessarily end after Q=realloc(P,N), even
in C23. So the situation is already more complicated than Alejandro's
incorrect summary, and for good reason. And there should be nothing
wrong with adjusting this part of the standard to better reflect how
real-world implementations behave.
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-09 19:03 ` Paul Eggert
@ 2025-11-09 19:16 ` Alejandro Colomar
2025-11-10 1:20 ` Rich Felker
1 sibling, 0 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-09 19:16 UTC (permalink / raw)
To: Paul Eggert
Cc: Rich Felker, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney, Thiago Macieira
[-- Attachment #1: Type: text/plain, Size: 1375 bytes --]
Hi Paul,
On Sun, Nov 09, 2025 at 11:03:52AM -0800, Paul Eggert wrote:
> On 2025-11-09 10:11, Rich Felker wrote:
> > On Sun, Nov 09, 2025 at 06:38:25PM +0100, Alejandro Colomar wrote:
>
> > > My point was that it's easier to consider the lifetime of P ends at
> > > every realloc(3) call than to consider it to end only if Q!=P.
> >
> > I agree with this.
>
> ? The lifetime of P does not necessarily end after Q=realloc(P,N), even in
> C23. So the situation is already more complicated than Alejandro's incorrect
> summary, and for good reason. And there should be nothing wrong with
> adjusting this part of the standard to better reflect how real-world
> implementations behave.
My reading of ISO C23 is that the lifetime of both the object and the
pointer are unconditionally terminated by realloc(3). A new object is
created, whose contents are the same as those of the old object, but
it's an entirely new object, with a new lifetime.
The realloc function
deallocates the old object pointed to by ptr
and returns a pointer to a new object
that has the size specified by size.
The contents of the new object
shall be the same as that of the old object prior to deallocation,
up to the lesser of the new and old sizes.
Have a lovely night!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-09 19:03 ` Paul Eggert
2025-11-09 19:16 ` Alejandro Colomar
@ 2025-11-10 1:20 ` Rich Felker
2025-11-10 2:47 ` Paul Eggert
1 sibling, 1 reply; 116+ messages in thread
From: Rich Felker @ 2025-11-10 1:20 UTC (permalink / raw)
To: Paul Eggert
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Thiago Macieira, Alejandro Colomar
On Sun, Nov 09, 2025 at 11:03:52AM -0800, Paul Eggert wrote:
> On 2025-11-09 10:11, Rich Felker wrote:
> > On Sun, Nov 09, 2025 at 06:38:25PM +0100, Alejandro Colomar wrote:
>
> > > My point was that it's easier to consider the lifetime of P ends at
> > > every realloc(3) call than to consider it to end only if Q!=P.
> >
> > I agree with this.
>
> ? The lifetime of P does not necessarily end after Q=realloc(P,N), even in
> C23. So the situation is already more complicated than Alejandro's incorrect
> summary, and for good reason. And there should be nothing wrong with
> adjusting this part of the standard to better reflect how real-world
> implementations behave.
The only way the lifetime of P does not end is if realloc returns a
null pointer indicating failure. On success, it is as if
malloc+memcpy+free.
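
The "as if malloc+memcpy+free" model can be written out as a small
sketch. model_realloc() is a hypothetical name, and unlike the real
realloc() it has to be told the old size. It assumes a non-null old
pointer and a nonzero new size. On failure the old object is left alone;
on success its lifetime always ends, whether or not a real allocator
would have reused the address.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of the abstract semantics, not a real allocator. */
void *
model_realloc(void *old, size_t old_size, size_t new_size)
{
	void *new_obj = malloc(new_size);

	if (new_obj == NULL)
		return NULL;  /* failure: the old object is untouched and live */

	/* Contents are preserved up to the lesser of the two sizes. */
	memcpy(new_obj, old, old_size < new_size ? old_size : new_size);

	free(old);  /* success: the old object's lifetime ends here */
	return new_obj;
}
```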
Rich
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-10 1:20 ` Rich Felker
@ 2025-11-10 2:47 ` Paul Eggert
2025-11-10 10:07 ` Alejandro Colomar
` (2 more replies)
0 siblings, 3 replies; 116+ messages in thread
From: Paul Eggert @ 2025-11-10 2:47 UTC (permalink / raw)
To: Rich Felker
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Thiago Macieira, Alejandro Colomar
On 2025-11-09 17:20, Rich Felker wrote:
> The only way the lifetime of P does not end is if realloc returns a
> null pointer indicating failure.
Yes, and my point was that Alejandro's summary of the situation (which
you went along with) got this detail wrong. And once one gets this
detail right (which static analyzers of course can do), that discredits
the idea that static analyzers are so dumb that they can't handle
conditional results from functions like realloc. On the contrary, static
analyzers do that sort of thing routinely, and they could continue to do
so if the standard were changed slightly in the direction I proposed.
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-10 2:47 ` Paul Eggert
@ 2025-11-10 10:07 ` Alejandro Colomar
2025-11-10 14:51 ` Zack Weinberg
2025-11-10 15:11 ` Rich Felker
2 siblings, 0 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-10 10:07 UTC (permalink / raw)
To: Paul Eggert
Cc: Rich Felker, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney, Thiago Macieira
[-- Attachment #1: Type: text/plain, Size: 1514 bytes --]
Hi Paul, Rich,
On Sun, Nov 09, 2025 at 06:47:54PM -0800, Paul Eggert wrote:
> On 2025-11-09 17:20, Rich Felker wrote:
> > The only way the lifetime of P does not end is if realloc returns a
> > null pointer indicating failure.
>
> Yes, and my point was that Alejandro's summary of the situation (which you
> went along with) got this detail wrong. And once one gets this detail right
> (which static analyzers of course can do), that discredits the idea that
> static analyzers are so dumb that they can't handle conditional results from
> functions like realloc. On the contrary, static analyzers do that sort of
> thing routinely, and they could continue to do so if the standard were
> changed slightly in the direction I proposed.
I agree I was wrong in my wording. And considering that
[[gnu::malloc(free)]] doesn't imply [[gnu::malloc]], then your proposed
semantics are also easy to express. I guess this could do it:
void *eggert_realloc(void *, size_t);
[[gnu::malloc(eggert_realloc, 1)]] [[gnu::malloc(free)]]
void *eggert_realloc(void *, size_t);
void *current_realloc(void *, size_t);
[[gnu::malloc(current_realloc, 1)]] [[gnu::malloc(free)]]
[[gnu::malloc]]
void *current_realloc(void *, size_t);
And since analyzers already need to consider when it fails, I guess
you're right that adding p==q to the logic of the analyzer wouldn't hurt
so much.
Have a lovely day!
Alex
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-10 2:47 ` Paul Eggert
2025-11-10 10:07 ` Alejandro Colomar
@ 2025-11-10 14:51 ` Zack Weinberg
2025-11-10 15:11 ` Rich Felker
2 siblings, 0 replies; 116+ messages in thread
From: Zack Weinberg @ 2025-11-10 14:51 UTC (permalink / raw)
To: Paul Eggert; +Cc: GNU libc development, musl
On Sun, Nov 9, 2025, at 9:47 PM, Paul Eggert wrote:
> On 2025-11-09 17:20, Rich Felker wrote:
>> The only way the lifetime of P does not end is if realloc returns a
>> null pointer indicating failure.
>
> Yes, and my point was that Alejandro's summary of the situation (which
> you went along with) got this detail wrong. And once one gets this
> detail right (which static analyzers of course can do), that
> discredits the idea that static analyzers are so dumb that they can't
> handle conditional results from functions like realloc. On the
> contrary, static analyzers do that sort of thing routinely, and they
> could continue to do so if the standard were changed slightly in the
> direction I proposed.
I've been trying to stay out of this argument, but I'm fed up with how
circular it's gotten.
Paul: Your proposed change to the C standard is never, ever going to
happen. It's exactly the same situation as the pointer aliasing rules:
The wording you don't like has been there since C1989. People have been
complaining about it since 1991 or so. The C committee cannot help but
be aware of the complaints. They have repeatedly chosen to take no
action on those complaints. The logical conclusion is that the standard
says what the committee wants it to say and further complaining is
futile. Please drop this line of argument.
In addition, it has been repeatedly explained that the C++ use cases for
realloci() require a function whose _contract_ is that it will never
move the block to be resized; a standard-licensed mechanism for fixing
up pointers inside a moved block is _not good enough_ for C++. (We're
never going to get that mechanism either, but it's _moot_ in this case!)
So please drop that line of argument as well.
zw
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-10 2:47 ` Paul Eggert
2025-11-10 10:07 ` Alejandro Colomar
2025-11-10 14:51 ` Zack Weinberg
@ 2025-11-10 15:11 ` Rich Felker
2025-11-10 15:18 ` Alejandro Colomar
2 siblings, 1 reply; 116+ messages in thread
From: Rich Felker @ 2025-11-10 15:11 UTC (permalink / raw)
To: Paul Eggert
Cc: libc-alpha, musl, A. Wilcox, Lénárd Szolnoki,
Collin Funk, Arthur O'Dwyer, Jonathan Wakely,
Paul E. McKenney, Thiago Macieira, Alejandro Colomar
On Sun, Nov 09, 2025 at 06:47:54PM -0800, Paul Eggert wrote:
> On 2025-11-09 17:20, Rich Felker wrote:
> > The only way the lifetime of P does not end is if realloc returns a
> > null pointer indicating failure.
>
> Yes, and my point was that Alejandro's summary of the situation (which you
> went along with) got this detail wrong. And once one gets this detail right
> (which static analyzers of course can do), that discredits the idea that
> static analyzers are so dumb that they can't handle conditional results from
> functions like realloc. On the contrary, static analyzers do that sort of
> thing routinely, and they could continue to do so if the standard were
> changed slightly in the direction I proposed.
It's not a "conditional result" unless the condition you mean is
failure to allocate. realloc *always frees the old object* on success.
Rich
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-10 15:11 ` Rich Felker
@ 2025-11-10 15:18 ` Alejandro Colomar
0 siblings, 0 replies; 116+ messages in thread
From: Alejandro Colomar @ 2025-11-10 15:18 UTC (permalink / raw)
To: Rich Felker
Cc: Paul Eggert, libc-alpha, musl, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney, Thiago Macieira
[-- Attachment #1: Type: text/plain, Size: 1147 bytes --]
Hi Rich,
On Mon, Nov 10, 2025 at 10:11:35AM -0500, Rich Felker wrote:
> On Sun, Nov 09, 2025 at 06:47:54PM -0800, Paul Eggert wrote:
> > On 2025-11-09 17:20, Rich Felker wrote:
> > > The only way the lifetime of P does not end is if realloc returns a
> > > null pointer indicating failure.
> >
> > Yes, and my point was that Alejandro's summary of the situation (which you
> > went along with) got this detail wrong. And once one gets this detail right
> > (which static analyzers of course can do), that discredits the idea that
> > static analyzers are so dumb that they can't handle conditional results from
> > functions like realloc. On the contrary, static analyzers do that sort of
> > thing routinely, and they could continue to do so if the standard were
> > changed slightly in the direction I proposed.
>
> It's not a "conditional result" unless the condition you mean is
> failure to allocate. realloc *always frees the old object* on success.
AFAIU, he meant that, a failure to allocate.
Have a lovely day!
Alex
>
> Rich
--
<https://www.alejandro-colomar.es>
Use port 80 (that is, <...:80/>).
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-01 3:54 ` Paul Eggert
2025-11-01 13:38 ` Thorsten Glaser
@ 2025-11-11 12:04 ` Brooks Davis
2025-11-11 13:55 ` Florian Weimer
1 sibling, 1 reply; 116+ messages in thread
From: Brooks Davis @ 2025-11-11 12:04 UTC (permalink / raw)
To: musl
Cc: Thiago Macieira, Alejandro Colomar, libc-alpha, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney
On Fri, Oct 31, 2025 at 09:54:52PM -0600, Paul Eggert wrote:
> On 10/31/25 17:27, Thiago Macieira wrote:
> > I don't think you are because imposing this requirement would imply it will
> > never memcpy() the data to a new location and that would break quite a lot of
> > applications that depend on the ability to grow a block so long as there's heap
> > available.
>
> You're right I'm not saying that. All I'm saying is that when R=realloc(P,N)
> succeeds, you can assume that you can adjust old pointers into the object
> addressed by P by adding R-P to them. The C standard says this results in
> undefined behavior; all that we need to do is fix the C standard to say it's
> well-defined (because it is on practical platforms).
As described (adding R-P), this is broken on CHERI systems. You must
derive new pointers from R as pointers derived from P will be out of
bounds (and with revocation will become invalid at some indeterminate
point.) In CheriBSD, we do take care to preserve the invariant that
either the pointer is unchanged including bounds or we've done a
malloc-memcpy-free.
MTE is likely to have similar issues with pointer updates unless the
implementer ensures that realloc returns pointers of the same color.
-- Brooks
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-11 12:04 ` Brooks Davis
@ 2025-11-11 13:55 ` Florian Weimer
2025-11-12 10:31 ` Brooks Davis
0 siblings, 1 reply; 116+ messages in thread
From: Florian Weimer @ 2025-11-11 13:55 UTC (permalink / raw)
To: Brooks Davis
Cc: musl, Thiago Macieira, Alejandro Colomar, libc-alpha, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney
* Brooks Davis:
> On Fri, Oct 31, 2025 at 09:54:52PM -0600, Paul Eggert wrote:
>> On 10/31/25 17:27, Thiago Macieira wrote:
>> > I don't think you are because imposing this requirement would imply it will
>> > never memcpy() the data to a new location and that would break quite a lot of
>> > applications that depend the ability to grow a block so long as there's heap
>> > available.
>>
>> You're right I'm not saying that. All I'm saying is that when R=realloc(P,N)
>> succeeds, you can assume that you can adjust old pointers into the object
>> addressed by P by adding R-P to them. The C standard says this results in
>> undefined behavior; all that we need to do is fix the C standard to say it's
>> well-defined (because it is on practical platforms).
>
> As described (adding R-P), this is broken on CHERI systems. You must
> derive new pointers from R as pointers derived from P will be out of
> bounds (and with revocation will become invalid at some indeterminate
> point.) In CheriBSD, we do take care to preserve the invariant that
> either the pointer is unchanged including bounds or we've done a
> malloc-memcpy-free.
Would this work?
new->field = (decltype(new->field)) ((char *) new
+ ((uintptr_t) new->field
- (uintptr_t) old));
That should have correct provenance.
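
A minimal sketch of the same idea in plain C, under stated assumptions:
struct rec, rec_new() and rec_resize() are hypothetical names made up for
illustration, and the variant below records the field's offset before
calling realloc() instead of subtracting the old address afterwards. Either
way, the updated pointer is derived from the pointer realloc() returned,
never from the old one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record that keeps a pointer into its own allocation. */
struct rec {
    char *field;   /* points somewhere inside data[] */
    size_t cap;    /* number of bytes in data[] */
    char data[];
};

/* Allocate a record whose field points field_off bytes into data[]. */
static struct rec *rec_new(size_t cap, size_t field_off)
{
    assert(field_off <= cap);
    struct rec *r = malloc(sizeof(*r) + cap);
    if (r == NULL)
        return NULL;
    r->cap = cap;
    r->field = r->data + field_off;
    return r;
}

/* Resize the record and rebase its interior pointer onto the new block. */
static struct rec *rec_resize(struct rec *old, size_t new_cap)
{
    /* Offset from the start of the allocation, taken while 'old' is
       still a valid pointer. */
    size_t off = (size_t) (old->field - (char *) old);
    assert(off <= sizeof(*old) + new_cap);

    struct rec *newp = realloc(old, sizeof(*newp) + new_cap);
    if (newp == NULL)
        return NULL;   /* 'old' is untouched and still valid */

    newp->cap = new_cap;
    /* Derive the updated pointer from 'newp', not from 'old'. */
    newp->field = (char *) newp + off;
    return newp;
}
```

Saving the offset up front avoids reading the stale pointer value after the
old block may have been freed, at the cost of one extra local per pointer
that needs fixing up.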
> MTE is likely to have similar issues with pointer updates unless the
> implementer ensures that realloc returns pointers of the same color.
Only if pointer additions wrap around, but that would be a problem even
without MTE: ptrdiff_t cannot represent all potential pointer offsets
anyway.
Thanks,
Florian
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-06 21:49 ` Alejandro Colomar
2025-11-06 23:10 ` Michael Winterberg
@ 2025-11-11 20:36 ` James Y Knight
2025-11-11 20:51 ` Rich Felker
2025-11-11 21:56 ` Thiago Macieira
1 sibling, 2 replies; 116+ messages in thread
From: James Y Knight @ 2025-11-11 20:36 UTC (permalink / raw)
To: Alejandro Colomar
Cc: musl, Rich Felker, The 8472, Thiago Macieira, Florian Weimer,
libc-alpha, Arthur O'Dwyer, Jonathan Wakely
On Thu, Nov 6, 2025 at 4:49 PM Alejandro Colomar <alx@kernel.org> wrote:
> > Yes, a new malloc variant with direct size feedback would be a much
> > better idea than a new non-moving realloc variant. And, for people who
> > care about usefulness for C++: that's already the direction C++
> > started to go.
>
> I disagree. Most users don't need this, and should use this.
>
> malloc(3) is already a special case of realloc(3), but it's useful
> enough that it warrants a separate function.
>
> However, for the case of realloci(), I don't see the usefulness of a
> hypothetical malloci() to be enough to justify it.
>
> After all, you can do this:
>
> p = malloc(size);
> if (p == NULL)
> goto fail;
>
> actual_size = realloci(p, size);
>
> And the realloci() call should be cheap compared to malloc(3). There's
> absolutely no need to conflate this into a new function. realloci()
> should be enough.
Yes, one could write something like that if realloci was added -- so
long as it had the semantics of potentially allocating a larger size
than was requested (which I don't think was the actual proposal?).
But, it would be somewhat less efficient and less useful for C++ than
size-feedback-malloc would be. My point was that it'd be better to
add size-feedback-malloc INSTEAD of realloci -- not in addition.
As has been stated multiple times, realloc on most modern malloc
implementations can't typically grow an allocation, beyond the small
amount of growth into the padding space for the implementation's
chosen size bucket. And when you're talking about small growth like
that, it is substantially less expensive to have an API to tell the
allocator to give you that otherwise-wasted memory up-front, from the
initial allocation.
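
To make "size feedback" concrete, here is a rough sketch in C. Everything in
it is an assumption for illustration: struct sized_alloc, alloc_at_least()
and the 16-byte toy size classes are hypothetical and do not describe any
real allocator's interface or bucket layout. The point is only the shape of
the API: the caller asks for n bytes and is told how many bytes it actually
got.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Result of a size-feedback allocation: the pointer plus the number of
   bytes the caller is allowed to use (always >= the requested size). */
struct sized_alloc {
    void *ptr;
    size_t size;
};

/* Toy size classes: round up to a multiple of 16, minimum 16.
   Returns 0 if rounding would overflow. */
static size_t toy_size_class(size_t n)
{
    if (n == 0)
        n = 1;
    if (n > SIZE_MAX - 15)
        return 0;
    return (n + 15) & ~(size_t) 15;
}

/* Allocate at least n bytes and report the full size of the bucket. */
static struct sized_alloc alloc_at_least(size_t n)
{
    struct sized_alloc r = { NULL, 0 };
    size_t rounded = toy_size_class(n);

    if (rounded == 0)
        return r;
    assert(rounded >= n);

    r.ptr = malloc(rounded);
    if (r.ptr != NULL)
        r.size = rounded;
    return r;
}
```

A container built on such a call would set its capacity from r.size rather
than from the size it asked for, so the padding is usable from the start.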
> And the realloci() call should be cheap compared to malloc(3).
realloci will not be cheap enough to be ignorable. At the very least,
it needs to look up the size bucket for the pointer passed to it,
which can be a significant overhead by itself. That's the entire
reason C++ added a sized delete, see (from 2013)
https://wg21.link/n3536.
> But, there's no need to rush the growth.
Please see the section in the doc,
https://wg21.link/p0401r6#reallocation which explicitly discusses and
rejects an 'in-place realloc'. If you want to defer growth, you need
to add a new branch to the control flow for growing the buffer. That
would be a significant overhead compared to simply setting a slightly
larger capacity up-front, which can be provided for free by the malloc
implementation at the point of allocation.
> I don't see this being used in C, so I think the C implementation should
> be just enough to allow C++ to do their thing. For that, realloci()
> would be enough. C++ can call it immediately after malloc(3) if needed,
> and can wrap it with the malloci() from above if they want.
Er, what?
One _could_ do that, but it would be exceedingly silly to add a brand
new realloci API with the justification that it's "useful for
C++"...when it's not. Only to wrap it -- with a substantial
performance cost! -- to act like size-feedback-malloc, which is the
API that would actually be useful.
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-11 20:36 ` James Y Knight
@ 2025-11-11 20:51 ` Rich Felker
2025-11-11 21:07 ` Thorsten Glaser
` (2 more replies)
2025-11-11 21:56 ` Thiago Macieira
1 sibling, 3 replies; 116+ messages in thread
From: Rich Felker @ 2025-11-11 20:51 UTC (permalink / raw)
To: James Y Knight
Cc: Alejandro Colomar, musl, The 8472, Thiago Macieira,
Florian Weimer, libc-alpha, Arthur O'Dwyer, Jonathan Wakely
On Tue, Nov 11, 2025 at 03:36:33PM -0500, James Y Knight wrote:
> On Thu, Nov 6, 2025 at 4:49 PM Alejandro Colomar <alx@kernel.org> wrote:
> > > Yes, a new malloc variant with direct size feedback would be a much
> > > better idea than a new non-moving realloc variant. And, for people who
> > > care about usefulness for C++: that's already the direction C++
> > > started to go.
> >
> > I disagree. Most users don't need this, and should use this.
> >
> > malloc(3) is already a special case of realloc(3), but it's useful
> > enough that it warrants a separate function.
> >
> > However, for the case of realloci(), I don't see the usefulness of a
> > hypothetical malloci() to be enough to justify it.
> >
> > After all, you can do this:
> >
> > p = malloc(size);
> > if (p == NULL)
> > goto fail;
> >
> > actual_size = realloci(p, size);
> >
> > And the realloci() call should be cheap compared to malloc(3). There's
> > absolutely no need to conflate this into a new function. realloci()
> > should be enough.
>
> Yes, one could write something like that if realloci was added -- so
> long as it had the semantics of potentially allocating a larger size
> than was requested (which I don't think was the actual proposal?).
>
> But, it would be somewhat less efficient and less useful for C++ than
> size-feedback-malloc would be. My point was that it'd be better to
> add size-feedback-malloc INSTEAD of realloci -- not in addition.
>
> As has been stated multiple times, realloc on most modern malloc
> implementations can't typically grow an allocation, beyond the small
> amount of growth into the padding space for the implementation's
> chosen size bucket. And when you're talking about small growth like
> that, it is substantially less expensive to have an API to tell the
> allocator to give you that otherwise-wasted memory up-front, from the
> initial allocation.
>
> > And the realloci() call should be cheap compared to malloc(3).
>
> realloci will not be cheap enough to be ignorable. At the very least,
> it needs to look up the size bucket for the pointer passed to it,
> which can be a significant overhead by itself. That's the entire
> reason C++ added a sized delete, see (from 2013)
> https://wg21.link/n3536.
>
> > But, there's no need to rush the growth.
>
> Please see the section in the doc,
> https://wg21.link/p0401r6#reallocation which explicitly discusses and
> rejects an 'in-place realloc'. If you want to defer growth, you need
> to add a new branch to the control flow for growing the buffer. That
> would be a significant overhead compared to simply setting a slightly
> larger capacity up-front, which can be provided for free by the malloc
> implementation at the point of allocation.
>
> > I don't see this being used in C, so I think the C implementation should
> > be just enough to allow C++ to do their thing. For that, realloci()
> > would be enough. C++ can call it immediately after malloc(3) if needed,
> > and can wrap it with the malloci() from above if they want.
>
> Er, what?
>
> One _could_ do that, but it would be exceedingly silly to add a brand
> new realloci API with the justification that it's "useful for
> C++"...when it's not. Only to wrap it -- with a substantial
> performance cost! -- to act like size-feedback-malloc, which is the
> API that would actually be useful.
I'm still unclear why "size-feedback-malloc" is supposed to be useful
enough to justify all of this. It seems like at most it's saving a
fairly small % of space at small sizes and an even smaller % (a fixed
maximum, the page size) at larger sizes serviced by direct mmap.
What is the actual problem folks are trying to solve?
Has that ever been stated clearly?
Is it a single problem or multiple ones that keep getting raised
because they seem vaguely related to the topic?
Multiple people who are not participants in this thread but who have
seen it have told me it looks like something out of the Simple
Sabotage Field Manual and are frustrated to see myself and others
having our time wasted on it.
Rich
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-11 20:51 ` Rich Felker
@ 2025-11-11 21:07 ` Thorsten Glaser
2025-11-11 22:19 ` Jeffrey Walton
2025-11-11 21:59 ` Thiago Macieira
2025-11-11 23:17 ` James Y Knight
2 siblings, 1 reply; 116+ messages in thread
From: Thorsten Glaser @ 2025-11-11 21:07 UTC (permalink / raw)
To: musl
Cc: James Y Knight, Alejandro Colomar, The 8472, Thiago Macieira,
Florian Weimer, libc-alpha, Arthur O'Dwyer, Jonathan Wakely
On Tue, 11 Nov 2025, Rich Felker wrote:
>Is it a single problem or multiple ones that keep getting raised
>because they seem vaguely related to the topic?
I think the C++ people have a problem that they think they can
solve with this. I’m not sure whether that prompted Alex’ work
on a suggestion or whether he’s just invested into changing
realloc and they jumped on. It does seem like there are also
other people chiming in, thinking they could possibly use a
part of this anywhere, but so far, we seem to be getting only
concrete info out of C++ people.
>I'm still unclear why "size-feedback-malloc" is supposed to be useful
>enough to justify all of this. It seems like at most it's saving a
>fairly small % of space at small sizes and an even smaller % (a fixed
>maximum, the page size) at larger sizes serviced by direct mmap.
It seems like, if C++ has a need that must be served by libc’s
malloc implementation, then that’s an implementation thing, and
we don’t need to change the C standard for it, but instead ask
libc implementors to add a function specifically for the use by
C++ runtimes (which they can use once they (compile-time) tested
its availability, over the existing way).
I’ve not yet seen any use of this for C code that doesn’t have
other downsides that make it improbable to impractical.
Just my impressions,
//mirabilos
--
When he found out that the m68k port was in a pretty bad shape, he did
not, like many before him, shrug and move on; instead, he took it upon
himself to start compiling things, just so he could compile his shell.
How's that for dedication. -- Wouter, about my Debian/m68k revival
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-11 20:36 ` James Y Knight
2025-11-11 20:51 ` Rich Felker
@ 2025-11-11 21:56 ` Thiago Macieira
1 sibling, 0 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-11-11 21:56 UTC (permalink / raw)
To: Alejandro Colomar, James Y Knight
Cc: musl, Rich Felker, The 8472, Florian Weimer, libc-alpha,
Arthur O'Dwyer, Jonathan Wakely
[-- Attachment #1: Type: text/plain, Size: 2811 bytes --]
On Tuesday, 11 November 2025 12:36:33 Pacific Standard Time James Y Knight
wrote:
> As has been stated multiple times, realloc on most modern malloc
> implementations can't typically grow an allocation, beyond the small
> amount of growth into the padding space for the implementation's
> chosen size bucket. And when you're talking about small growth like
> that, it is substantially less expensive to have an API to tell the
> allocator to give you that otherwise-wasted memory up-front, from the
> initial allocation.
I've already provided evidence to the contrary, showing realloc() did grow
in-place 7.3% of the time, doubling the size of the allocation. See my email
from last Wednesday.
> > But, there's no need to rush the growth.
>
> Please see the section in the doc,
> https://wg21.link/p0401r6#reallocation which explicitly discusses and
> rejects an 'in-place realloc'.
I don't see that conclusion there. The second bullet point of that section is
the one that talks about realloci() and doesn't reject it. It only says that
it could probe with a growing size to see how big it can get, but I suppose
the API the authors were thinking of returned a boolean, not a size. With a
size indicating how big it did get, there would be no need to probe.
The paragraph below the bullet points talks about Allocators, which are an
important but not the only use-case for this. If realloci() did exist, there
would need to be a paper to allow Standard Library containers to use the
function, and that could be based on the P0401 paper. For everyone else who
doesn't use Allocators, we can get the immediate benefit directly from
realloci().
> If you want to defer growth, you need
> to add a new branch to the control flow for growing the buffer. That
> would be a significant overhead compared to simply setting a slightly
> larger capacity up-front, which can be provided for free by the malloc
> implementation at the point of allocation.
You can do both. The compiler must emit both code paths anyway (aside from
very limited scenarios I've never observed), so the only difference is what
happens at runtime. Growing a buffer by 100% to add one element amortises the
cost of adding the next N, but is a cost if it doesn't add nearly that many.
Besides, fine-tuning this can be done in the container, as a Quality of
Implementation metric.
> One _could_ do that, but it would be exceedingly silly to add a brand
> new realloci API with the justification that it's "useful for
> C++"...when it's not. Only to wrap it -- with a substantial
> performance cost! -- to act like size-feedback-malloc, which is the
> API that would actually be useful.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-11 20:51 ` Rich Felker
2025-11-11 21:07 ` Thorsten Glaser
@ 2025-11-11 21:59 ` Thiago Macieira
2025-11-11 22:45 ` Rich Felker
2025-11-11 23:17 ` James Y Knight
2 siblings, 1 reply; 116+ messages in thread
From: Thiago Macieira @ 2025-11-11 21:59 UTC (permalink / raw)
To: James Y Knight, Rich Felker
Cc: Alejandro Colomar, musl, The 8472, Florian Weimer, libc-alpha,
Arthur O'Dwyer, Jonathan Wakely
[-- Attachment #1: Type: text/plain, Size: 1074 bytes --]
On Tuesday, 11 November 2025 12:51:28 Pacific Standard Time Rich Felker wrote:
> What is the actual problem folks are trying to solve?
>
> Has that ever been stated clearly?
Yes, more than once: growing a container (array, usually) that holds objects
with non-trivial copy/move operations. realloc() is unsuitable for that
because the elements can't be memmove()d, so containers today must allocate a
new buffer, perform the expensive move, then free the old buffer.
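
The growth path described above can be sketched in C as follows. This is an
illustration under assumptions, not anyone's actual implementation: struct
elem, struct vec, vec_reserve(), vec_push() and the inplace_fn hook are
hypothetical names, and the hook merely stands in for a realloci()-like
call (return the same pointer on success, NULL on failure). The element
type holds a pointer to itself, which is why it cannot simply be
memmove()d.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Element with a self-pointer: a bytewise copy would leave 'self' stale. */
struct elem {
    int id;
    struct elem *self;
};

/* Stand-in for an in-place grow request: returns p on success, NULL if
   the block cannot be extended without moving. */
typedef void *(*inplace_fn)(void *p, size_t size);

struct vec {
    struct elem *buf;
    size_t len, cap;
    unsigned long moves;   /* element moves performed so far */
};

/* The "non-trivial move": fix up the self-pointer at the destination. */
static void elem_move(struct elem *dst, const struct elem *src)
{
    dst->id = src->id;
    dst->self = dst;
}

static int vec_reserve(struct vec *v, size_t new_cap, inplace_fn try_inplace)
{
    if (new_cap <= v->cap)
        return 0;
    if (new_cap > SIZE_MAX / sizeof(*v->buf))
        return -1;

    /* Cheap path: extend the existing block, no element is touched. */
    if (v->buf != NULL && try_inplace != NULL
        && try_inplace(v->buf, new_cap * sizeof(*v->buf)) != NULL) {
        v->cap = new_cap;
        return 0;
    }

    /* Expensive path: allocate, move every element, free the old block. */
    struct elem *nb = malloc(new_cap * sizeof(*nb));
    if (nb == NULL)
        return -1;
    for (size_t i = 0; i < v->len; i++) {
        elem_move(&nb[i], &v->buf[i]);
        v->moves++;
    }
    free(v->buf);
    v->buf = nb;
    v->cap = new_cap;
    return 0;
}

static int vec_push(struct vec *v, int id, inplace_fn try_inplace)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? 2 * v->cap : 4;
        if (vec_reserve(v, new_cap, try_inplace) != 0)
            return -1;
    }
    v->buf[v->len].id = id;
    v->buf[v->len].self = &v->buf[v->len];
    v->len++;
    return 0;
}

/* Hook that always refuses, modelling an allocator that cannot extend. */
static void *never_inplace(void *p, size_t size)
{
    (void) p;
    (void) size;
    return NULL;
}
```

With never_inplace() every capacity change after the first allocation pays
for a full round of element moves; a hook that succeeds skips that loop
entirely.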
> Is it a single problem or multiple ones that keep getting raised
> because they seem vaguely related to the topic?
There are a few incidental requests, like finding out how big a buffer really
is when the allocation happens.
> Multiple people who are not participants in this thread but who have
> seen it have told me it looks like something out of the Simple
> Sabotage Field Manual and are frustrated to see myself and others
> having our time wasted on it.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-11 21:07 ` Thorsten Glaser
@ 2025-11-11 22:19 ` Jeffrey Walton
2025-11-12 3:56 ` Thiago Macieira
0 siblings, 1 reply; 116+ messages in thread
From: Jeffrey Walton @ 2025-11-11 22:19 UTC (permalink / raw)
To: musl
Cc: James Y Knight, Alejandro Colomar, The 8472, Thiago Macieira,
Florian Weimer, libc-alpha, Arthur O'Dwyer, Jonathan Wakely
On Tue, Nov 11, 2025 at 4:07 PM Thorsten Glaser <tg@evolvis.org> wrote:
>
> On Tue, 11 Nov 2025, Rich Felker wrote:
>
> >Is it a single problem or multiple ones that keep getting raised
> >because they seem vaguely related to the topic?
>
> I think the C++ people have a problem that they think they can
> solve with this. I’m not sure whether that prompted Alex’ work
> on a suggestion or whether he’s just invested into changing
> realloc and they jumped on. It does seem like there are also
> other people chiming in, thinking they could possibly use a
> part of this anywhere, but so far, we seem to be getting only
> concrete info out of C++ people.
>
> >I'm still unclear why "size-feedback-malloc" is supposed to be useful
> >enough to justify all of this. It seems like at most it's saving a
> >fairly small % of space at small sizes and an even smaller % (a fixed
> >maximum, the page size) at larger sizes serviced by direct mmap.
>
> It seems like, if C++ has a need that must be served by libc’s
> malloc implementation, then that’s an implementation thing, and
> we don’t need to change the C standard for it, but instead ask
> libc implementors to add a function specifically for the use by
> C++ runtimes (which they can use once they (compile-time) tested
> its availability, over the existing way).
My apologies for being one of "those" people, but...
I thought that is what reserve(..) is for in C++ containers, like
std::string and std::vector:
    // after reserve(): size is 0 and capacity is at least 1024
    std::string s;
    s.reserve(1024);

    // size is now 26, no reallocations
    for (char ch = 'A'; ch <= 'Z'; ++ch)
        s.push_back(ch);
Now the C++ folks have a growable string that does not need reallocations.
> I’ve not yet seen any use of this for C code that doesn’t have
> other downsides that make it improbable to impractical.
Jeff
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-11 21:59 ` Thiago Macieira
@ 2025-11-11 22:45 ` Rich Felker
2025-11-12 3:50 ` Thiago Macieira
0 siblings, 1 reply; 116+ messages in thread
From: Rich Felker @ 2025-11-11 22:45 UTC (permalink / raw)
To: Thiago Macieira
Cc: James Y Knight, Alejandro Colomar, musl, The 8472, Florian Weimer,
libc-alpha, Arthur O'Dwyer, Jonathan Wakely
On Tue, Nov 11, 2025 at 01:59:19PM -0800, Thiago Macieira wrote:
> On Tuesday, 11 November 2025 12:51:28 Pacific Standard Time Rich Felker wrote:
> > What is the actual problem folks are trying to solve?
> >
> > Has that ever been stated clearly?
>
> Yes, more than once: growing a container (array, usually) that holds objects
> with non-trivial copy/move operations. realloc() is unsuitable for that
> because the elements can't be memmove()d, so containers today must allocate a
> new buffer, perform the expensive move, then free the old buffer.
And that's still going to happen. The only difference is the exact
thresholds at which it happens, and having slightly less memory
overhead along the way. If the problem is "moving objects is expensive
and we don't want to do that", nothing on this path does anything to
solve it.
Rich
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-11 20:51 ` Rich Felker
2025-11-11 21:07 ` Thorsten Glaser
2025-11-11 21:59 ` Thiago Macieira
@ 2025-11-11 23:17 ` James Y Knight
2025-11-12 2:04 ` Rich Felker
2 siblings, 1 reply; 116+ messages in thread
From: James Y Knight @ 2025-11-11 23:17 UTC (permalink / raw)
To: Rich Felker, Thorsten Glaser
Cc: Alejandro Colomar, musl, The 8472, Thiago Macieira,
Florian Weimer, libc-alpha, Arthur O'Dwyer, Jonathan Wakely
On Tue, Nov 11, 2025 at 3:51 PM Rich Felker <dalias@libc.org> wrote:
> I'm still unclear why "size-feedback-malloc" is supposed to be useful
> enough to justify all of this. It seems like at most it's saving a
> fairly small % of space at small sizes and an even smaller % (a fixed
> maximum, the page size) at larger sizes serviced by direct mmap.
>
> What is the actual problem folks are trying to solve?
>
> Has that ever been stated clearly?
(I'd just note that size-feedback-malloc wasn't the topic for most of
this thread -- and I'm not sure if realloci proponents have the same
goal.)
The point of the size-feedback allocation proposals in C++ is, yes, to
save memory, mostly for relatively-small-sized growable containers.
As it turns out, C++ programs tend to have a lot of such smallish
strings and vectors running around. Depending on the starting size,
growth strategy, and malloc implementation details, you can end up
with remarkably high memory overhead in practice for these small
containers -- and the size-returning allocation APIs help to ameliorate
that. Yes, this doesn't result in a _HUGE_ % of total memory usage
savings, but it is noticeable.
Just to take a concrete example, GNU libstdc++'s std::string allocates
the exact size requested, initially. So, appending a char will always
initially grow the container, by doubling the size
(amortized-linear-time append is required by the spec, so an
exponential size-growth is required). Thus, taking initial capacity of
16 bytes (which reserves 24 on glibc malloc), and appending 1 byte
will immediately grow it to 32 bytes (reserving 40 on glibc) -- even
though the original 24-byte allocation would've sufficed. Or, creating
a string sized 30 bytes (which reserves 48 bytes in glibc malloc), and
appending 4 bytes will grow it to a capacity of 60, which ends up
reserving 72 -- even though the original 48-byte allocation would've
sufficed.
There will, of course, _always_ be cases where you need to actually
grow the container (and move the objects). But, having a
malloc-variant that returns the size can substantially reduce the
memory used, on average, for these small containers, under real world
usage patterns, because it will avoid doubling the size of the
container when there was still space left in the current allocation.
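
The arithmetic in the example above can be written down as a small C
function. This is only a sketch: next_capacity() is a hypothetical helper,
and the capacity/reserved pairs used with it (16/24 and 30/48) are simply
the figures quoted above, passed in as inputs rather than queried from any
allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pick the capacity to use when 'needed' bytes must fit.
 *
 *   cap     capacity the container currently believes it has
 *   usable  bytes the allocator actually reserved for the block
 *           (pass cap when no size feedback is available)
 */
static size_t next_capacity(size_t cap, size_t usable, size_t needed)
{
    assert(usable >= cap);

    if (needed <= cap)
        return cap;      /* already fits */
    if (needed <= usable)
        return usable;   /* the slack absorbs the growth, no reallocation */
    if (cap > SIZE_MAX / 2)
        return needed;   /* doubling would overflow */

    size_t grown = cap * 2;   /* exponential growth for amortized append */
    return grown > needed ? grown : needed;
}
```

Without feedback, next_capacity(16, 16, 17) yields 32 and forces a new
allocation; with feedback, next_capacity(16, 24, 17) yields 24 and the
existing block is kept. Likewise next_capacity(30, 30, 34) yields 60, while
next_capacity(30, 48, 34) yields 48.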
On Tue, Nov 11, 2025 at 4:07 PM Thorsten Glaser <tg@evolvis.org> wrote:
> It seems like, if C++ has a need that must be served by libc’s
> malloc implementation, then that’s an implementation thing, and
> we don’t need to change the C standard for it, but instead ask
> libc implementors to add a function specifically for the use by
> C++ runtimes (which they can use once they (compile-time) tested
> its availability, over the existing way).
I don't believe the C++ proposals ever asked for a Standard C API to
be added -- they do indeed treat it as an implementation detail, so
there's not a requirement to add a new malloc-related API to C for
C++'s purposes.
But...if C _does_ wish to add an API with a goal of it being useful to
C++...then I'd say that realloci is not the API to add.
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-11 23:17 ` James Y Knight
@ 2025-11-12 2:04 ` Rich Felker
2025-11-12 18:35 ` James Y Knight
0 siblings, 1 reply; 116+ messages in thread
From: Rich Felker @ 2025-11-12 2:04 UTC (permalink / raw)
To: James Y Knight
Cc: Thorsten Glaser, Alejandro Colomar, musl, The 8472,
Thiago Macieira, Florian Weimer, libc-alpha, Arthur O'Dwyer,
Jonathan Wakely
On Tue, Nov 11, 2025 at 06:17:33PM -0500, James Y Knight wrote:
> On Tue, Nov 11, 2025 at 3:51 PM Rich Felker <dalias@libc.org> wrote:
> > I'm still unclear why "size-feedback-malloc" is supposed to be useful
> > enough to justify all of this. It seems like at most it's saving a
> > fairly small % of space at small sizes and an even smaller % (a fixed
> > maximum, the page size) at larger sizes serviced by direct mmap.
> >
> > What is the actual problem folks are trying to solve?
> >
> > Has that ever been stated clearly?
>
> (I'd just note that size-feedback-malloc wasn't the topic for most of
> this thread -- and I'm not sure if realloci proponents have the same
> goal.)
>
> The point of the size-feedback allocation proposals in C++ is, yes, to
> save memory, mostly for relatively-small-sized growable containers.
>
> As it turns out, C++ programs tend to have a lot of such smallish
> strings and vectors running around. Depending on the starting size,
> growth strategy, and malloc implementation details, you can end up
> with remarkably high memory overhead in practice for these small
> containers -- and the size-returning allocation APIs help to ameliorate
> that. Yes, this doesn't result in a _HUGE_ % of total memory usage
> savings, but it is noticeable.
>
> Just to take a concrete example, GNU libstdc++'s std::string allocates
> the exact size requested, initially. So, appending a char will always
> initially grow the container, by doubling the size
But in order to utilize a new interface, libstdc++ would have to be
modified to do so, and there would be all sorts of conditional support
and namespacing issues to deal with.
If you're stuck modifying libstdc++ anyway, you might as well just
make it use a better strategy than initially using exact size, then
doubling. This works everywhere and does not require any new contracts
or feature detection.
Rich
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-11 22:45 ` Rich Felker
@ 2025-11-12 3:50 ` Thiago Macieira
0 siblings, 0 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-11-12 3:50 UTC (permalink / raw)
To: Rich Felker
Cc: James Y Knight, Alejandro Colomar, musl, The 8472, Florian Weimer,
libc-alpha, Arthur O'Dwyer, Jonathan Wakely
[-- Attachment #1: Type: text/plain, Size: 1234 bytes --]
On Tuesday, 11 November 2025 14:45:45 Pacific Standard Time Rich Felker wrote:
> And that's still going to happen. The only difference is the exact
> thresholds at which it happens, and having slightly less memory
> overhead along the way. If the problem is "moving objects is expensive
> and we don't want to do that", nothing on this path does anything to
> solve it.
It may postpone it, possibly indefinitely.
There are two unknowns with the algorithm dealing with container growth:
- the number of elements the user will ultimately add
- how much realloci() may grow
Because we don't and can't know those, it's very hard to predict how useful
this will be in general circumstances. I have experimented with giving a
number for the first, but I can't simulate the second without an actual
implementation.
The point is that there is a non-zero chance that in expanding the block we
save one or more malloc+move+free sequences. The most common would be the last
operation, when the user stops adding elements before the next reallocation,
which is also the costliest of all reallocations.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-11 22:19 ` Jeffrey Walton
@ 2025-11-12 3:56 ` Thiago Macieira
0 siblings, 0 replies; 116+ messages in thread
From: Thiago Macieira @ 2025-11-12 3:56 UTC (permalink / raw)
To: musl, noloader
Cc: James Y Knight, Alejandro Colomar, The 8472, Florian Weimer,
libc-alpha, Arthur O'Dwyer, Jonathan Wakely
[-- Attachment #1: Type: text/plain, Size: 1205 bytes --]
On Tuesday, 11 November 2025 14:19:27 Pacific Standard Time Jeffrey Walton
wrote:
> Now the C++ folks have a growable string that does not need reallocations.
If you know the end size, even a rough estimate of it, by all means use it. An
overestimation is probably worth the trade-off for not needing to reallocate.
The problem is when the user doesn't know the end size, such as when parsing
some input that doesn't come with a size prepended. Or when the user simply
fails to call such a function because it isn't *necessary*, just an
optimisation.
BTW, that reminds me of the dual of container growth: the shrink_to_fit()
operation. Some developers may reserve() too much and then shrink_to_fit() the
memory. If the contained type can't be memmove()d around, the container
implementation may opt to not give the memory back in the first place. If we
did have realloci(), we could signal to the memory allocator that a portion of
the memory is now free for new allocations - whether that implementation can
make use of it will be implementation-dependent.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Principal Engineer - Intel Data Center - Platform & Sys. Eng.
^ permalink raw reply [flat|nested] 116+ messages in thread
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-11 13:55 ` Florian Weimer
@ 2025-11-12 10:31 ` Brooks Davis
2025-11-12 11:26 ` Florian Weimer
0 siblings, 1 reply; 116+ messages in thread
From: Brooks Davis @ 2025-11-12 10:31 UTC (permalink / raw)
To: Florian Weimer
Cc: musl, Thiago Macieira, Alejandro Colomar, libc-alpha, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney
On Tue, Nov 11, 2025 at 02:55:03PM +0100, Florian Weimer wrote:
> * Brooks Davis:
>
> > On Fri, Oct 31, 2025 at 09:54:52PM -0600, Paul Eggert wrote:
> >> On 10/31/25 17:27, Thiago Macieira wrote:
> >> > I don't think you are because imposing this requirement would imply it will
> >> > never memcpy() the data to a new location and that would break quite a lot of
> >> > applications that depend the ability to grow a block so long as there's heap
> >> > available.
> >>
> >> You're right I'm not saying that. All I'm saying is that when R=realloc(P,N)
> >> succeeds, you can assume that you can adjust old pointers into the object
> >> addressed by P by adding R-P to them. The C standard says this results in
> >> undefined behavior; all that we need to do is fix the C standard to say it's
> >> well-defined (because it is on practical platforms).
> >
> > As described (adding R-P), this is broken on CHERI systems. You must
> > derive new pointers from R as pointers derived from P will be out of
> > bounds (and with revocation will become invalid at some indeterminate
> > point.) In CheriBSD, we do take care to preserve the invariant that
> > either the pointer is unchanged including bounds or we've done a
> > malloc-memcpy-free.
>
> Would this work?
>
> new->field = (decltype(new->field)) ((char *) new
> + ((uintptr_t) new->field
> - (uintptr_t) old));
>
> That should have correct provenance.
Yes, this works (I'd tend to write that with ptraddr_t casts instead
of uintptr_t casts as the provenance is slightly ambiguous, but those
aren't standard yet (hopefully coming in C++2Y)). I think people have
historically chosen the other option because it saves a subtraction in
a loop over values that require updating. In our heap temporal safety
work we've taken care to ensure that this works even if the old values
are invalidated during the update.
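A minimal sketch of that rebasing in C, for illustration only. The
struct buf type and the buf_new()/buf_grow() helpers are hypothetical
names, not taken from any libc or from CheriBSD. The sketch records
the integer addresses before calling realloc(), so that no stale
pointer is read afterwards, and derives the updated pointer from the
new block; only the integer offset comes from the old addresses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical example type: 'cursor' points into 'data' of the same
 * allocation, so it must be fixed up whenever the block moves. */
struct buf {
	char   *cursor;
	size_t  cap;
	char    data[];
};

/* Allocate a buffer with 'cap' data bytes; cursor starts at data[0]. */
static struct buf *
buf_new(size_t cap)
{
	struct buf *b;

	if (cap > SIZE_MAX - sizeof(*b))
		return NULL;
	b = malloc(sizeof(*b) + cap);
	if (b == NULL)
		return NULL;
	b->cursor = b->data;
	b->cap = cap;
	return b;
}

/* Resize *bp to hold new_cap data bytes.  Returns 0 on success, or -1
 * on failure, in which case *bp is untouched and still valid.
 * The caller must keep the cursor within the new capacity. */
static int
buf_grow(struct buf **bp, size_t new_cap)
{
	struct buf *oldp = *bp;
	struct buf *newp;
	uintptr_t   old_addr, cur_addr;

	assert(oldp != NULL && oldp->cursor != NULL);
	if (new_cap > SIZE_MAX - sizeof(*oldp))
		return -1;

	/* Capture plain integer addresses while the old block is live. */
	old_addr = (uintptr_t) oldp;
	cur_addr = (uintptr_t) oldp->cursor;

	newp = realloc(oldp, sizeof(*newp) + new_cap);
	if (newp == NULL)
		return -1;

	/* Derive the updated pointer from newp, never from oldp. */
	newp->cursor = (char *) newp + (cur_addr - old_addr);
	newp->cap = new_cap;
	*bp = newp;
	return 0;
}
```

The fix-up costs one subtraction per updated pointer, which is the
cost mentioned above; a caller with many interior pointers would repeat
the last step for each of them.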
> > MTE is likely to have similar issues with pointer updates unless the
> > implementer ensures that realloc returns pointers of the same color.
>
> Only if pointer additions wrap around, but that would be a problem even
> without MTE: ptrdiff_t cannot represent all potential pointer offsets
> anyway.
The issue is that if new and old are different colors, you must derive
the updates from new and not old or you'll have the wrong color.
-- Brooks
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-12 10:31 ` Brooks Davis
@ 2025-11-12 11:26 ` Florian Weimer
0 siblings, 0 replies; 116+ messages in thread
From: Florian Weimer @ 2025-11-12 11:26 UTC (permalink / raw)
To: Brooks Davis
Cc: musl, Thiago Macieira, Alejandro Colomar, libc-alpha, A. Wilcox,
Lénárd Szolnoki, Collin Funk, Arthur O'Dwyer,
Jonathan Wakely, Paul E. McKenney
* Brooks Davis:
>> > MTE is likely to have similar issues with pointer updates unless the
>> > implementer ensures that realloc returns pointers of the same color.
>>
>> Only if pointer additions wrap around, but that would be a problem even
>> without MTE: ptrdiff_t cannot represent all potential pointer offsets
>> anyway.
>
> The issue is that if new and old are different colors, you must derive
> the updates from new and not old or you'll have the wrong color.
The tag change would end up in ptrdiff_t. I think MTE still uses 64-bit
arithmetic for pointers.
(And what I wrote above is wrong for most (all?) Linux 64-bit
architectures: offsets between object pointers are always representable
in ptrdiff_t.)
Thanks,
Florian
* Re: [musl] Re: realloci(): A realloc() variant that works in-place
2025-11-12 2:04 ` Rich Felker
@ 2025-11-12 18:35 ` James Y Knight
0 siblings, 0 replies; 116+ messages in thread
From: James Y Knight @ 2025-11-12 18:35 UTC (permalink / raw)
To: Rich Felker
Cc: Thorsten Glaser, Alejandro Colomar, musl, The 8472,
Thiago Macieira, Florian Weimer, libc-alpha, Arthur O'Dwyer,
Jonathan Wakely
On Tue, Nov 11, 2025 at 9:04 PM Rich Felker <dalias@libc.org> wrote:
> > Just to take a concrete example, GNU libstdc++'s std::string allocates
> > the exact size requested, initially. So, appending a char will always
> > initially grow the container, by doubling the size
>
> But in order to utilize a new interface, libstdc++ would have to be
> modified to do so, and there would be all sorts of conditional support
> and namespacing issues to deal with.
>
> If you're stuck modifying libstdc++ anyway, you might as well just
> make it use a better strategy than initially using exact size, then
> doubling. This works everywhere and does not require any new contracts
> or feature detection.
The "better" strategy depends heavily on the malloc
implementation's internal details.
Many strings are initialized with particular data and then never
grown, so you want to avoid reserving more memory for the initial
allocation -- but enable using the remainder of the already-reserved
space if there is any.
We can look at llvm libc++'s std::string implementation as an example.
It was initially written to round up its requested allocations to a
multiple of 16, exactly to avoid wasting reserved space. Presumably
the malloc implementation the author used at the time it was written
was rounding internally to multiples of 16 bytes. But of course, other
allocators have different behaviors. E.g. on glibc, rounding to 16 is
nearly _pessimal_, as glibc malloc only allocates buckets of size
24+16*n (so, 24, 40, 56, 72, 88, etc). So on glibc, requesting sizes
that are multiples of 16 bytes guarantees 8 wasted bytes in every
allocation. And every other allocator of course has its own unique
behaviors.
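To make the arithmetic concrete, here is a toy model of the 24+16*n
buckets described above. It is not glibc's code; model_class_24_16()
and model_waste() are hypothetical names, and the model only encodes
the sizes quoted in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of the size classes described above: 24 + 16*n bytes.
 * This is NOT glibc's implementation, only the arithmetic. */
static size_t
model_class_24_16(size_t req)
{
	/* Guard the rounding below against overflow. */
	assert(req <= SIZE_MAX - 40);

	if (req <= 24)
		return 24;
	/* Round (req - 24) up to the next multiple of 16. */
	return 24 + ((req - 24 + 15) / 16) * 16;
}

/* Bytes reserved but not requested under that model. */
static size_t
model_waste(size_t req)
{
	return model_class_24_16(req) - req;
}
```

Under this model, a request for 32 bytes lands in the 40-byte bucket
and leaves 8 bytes unused, while a request for 40 bytes leaves none.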
So now, libc++ rounds allocation requests to 8 bytes instead. This
avoids pessimizing commonly-used allocators which aren't rounding to
multiples of 16, and still preserves the ability to use some of the
slack space -- with the assumption that allocators are likely to at
least round sizes to a multiple of 8. But, that's still a compromise
which leaves memory on the table for any allocator which has
size classes which are spaced more than 8 bytes apart (which is most
of them).
Of course, libc++ could attempt to hardcode the allocation bucket
strategy of each libc, e.g. on glibc, always request 24+16*n
allocations. That would be effective as long as the implementations
match -- which cannot be guaranteed as the behavior of libc malloc
could change, or the user could replace the allocator. C++ allocators
are explicitly user-replaceable, and C malloc is typically
user-replaceable as an implementation extension. So, it's
significantly better to use an API which has the allocator directly
provide the required information than to try to guess a heuristic. So
you could call the nonstandard API, `size_t nallocx(size_t)`, which
returns the size which would be reserved for a given requested size.
That can be useful, but is suboptimal, as described by
http://wg21.link/p0901r11#nallocx
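As a rough sketch of the query-then-allocate pattern, assuming a
callback that plays the role of an nallocx-like function: the
size_query_fn type, query_round16() and struct str below are all
hypothetical, and query_round16() merely pretends that size classes
are multiples of 16. As far as I know, jemalloc's real nallocx() also
takes a flags argument, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an nallocx-like query: given a requested
 * size, return the size the allocator would actually reserve. */
typedef size_t (*size_query_fn)(size_t req);

/* Example query for illustration: pretends the size classes are
 * multiples of 16.  Real allocators differ. */
static size_t
query_round16(size_t req)
{
	assert(req <= SIZE_MAX - 15);
	return (req + 15) / 16 * 16;
}

struct str {
	char   *data;
	size_t  len;
	size_t  cap;   /* usable bytes, as reported by the query */
};

/* Initialize s with a copy of src[0..len), asking 'query' how much
 * space the allocation would occupy and recording all of it as
 * capacity.  Returns 0 on success, -1 on failure. */
static int
str_init(struct str *s, const char *src, size_t len, size_query_fn query)
{
	size_t cap;

	if (len == SIZE_MAX)
		return -1;
	cap = query(len + 1);           /* +1 for the terminating NUL */
	if (cap < len + 1)              /* defensive: reject a bad query */
		return -1;

	s->data = malloc(cap);
	if (s->data == NULL)
		return -1;
	memcpy(s->data, src, len);
	s->data[len] = '\0';
	s->len = len;
	s->cap = cap;
	return 0;
}
```

With this shape, a 5-character string initialized through
query_round16() records a capacity of 16, so later appends can use the
slack without reallocating.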
Thread overview: 116+ messages
2025-10-30 23:15 [musl] realloci(): A realloc() variant that works in-place Alejandro Colomar
2025-10-30 23:25 ` A. Wilcox
2025-10-30 23:35 ` Collin Funk
2025-10-31 6:47 ` Lénárd Szolnoki
2025-10-31 12:16 ` Thorsten Glaser
2025-11-01 1:03 ` Rich Felker
2025-10-31 13:43 ` [musl] " Alejandro Colomar
2025-10-31 14:13 ` Laurent Bercot
2025-10-31 14:36 ` Thorsten Glaser
2025-10-31 15:14 ` Alejandro Colomar
2025-10-31 15:45 ` Thorsten Glaser
2025-10-31 16:02 ` Thiago Macieira
2025-10-31 16:22 ` Alejandro Colomar
2025-10-31 16:59 ` Paul Eggert
2025-10-31 17:25 ` Thiago Macieira
2025-10-31 17:31 ` Paul Eggert
2025-10-31 17:53 ` Thiago Macieira
2025-10-31 18:35 ` Andreas Schwab
2025-10-31 19:17 ` Thiago Macieira
2025-10-31 20:18 ` Paul Eggert
2025-11-01 3:47 ` [musl] " Oliver Hunt
2025-11-01 14:18 ` Florian Weimer
2025-11-02 1:11 ` Oliver Hunt
2025-10-31 18:12 ` [musl] " Paul E. McKenney
2025-10-31 19:15 ` Thiago Macieira
2025-10-31 19:49 ` Paul E. McKenney
2025-10-31 20:13 ` Alejandro Colomar
2025-10-31 20:33 ` Paul Eggert
2025-10-31 21:14 ` Thiago Macieira
2025-10-31 22:25 ` Paul Eggert
2025-10-31 23:27 ` Thiago Macieira
2025-11-01 3:54 ` Paul Eggert
2025-11-01 13:38 ` Thorsten Glaser
2025-11-01 14:55 ` Thiago Macieira
2025-11-11 12:04 ` Brooks Davis
2025-11-11 13:55 ` Florian Weimer
2025-11-12 10:31 ` Brooks Davis
2025-11-12 11:26 ` Florian Weimer
2025-11-09 11:37 ` Alejandro Colomar
2025-11-09 15:31 ` Paul Eggert
2025-11-09 17:38 ` Alejandro Colomar
2025-11-09 18:11 ` Rich Felker
2025-11-09 19:03 ` Paul Eggert
2025-11-09 19:16 ` Alejandro Colomar
2025-11-10 1:20 ` Rich Felker
2025-11-10 2:47 ` Paul Eggert
2025-11-10 10:07 ` Alejandro Colomar
2025-11-10 14:51 ` Zack Weinberg
2025-11-10 15:11 ` Rich Felker
2025-11-10 15:18 ` Alejandro Colomar
2025-11-09 18:16 ` Alejandro Colomar
2025-10-31 21:06 ` Thiago Macieira
2025-10-31 22:09 ` Alejandro Colomar
2025-10-31 22:33 ` Joseph Myers
2025-10-31 22:51 ` Alejandro Colomar
2025-10-31 23:48 ` Thiago Macieira
2025-11-01 0:47 ` Alejandro Colomar
2025-11-01 12:57 ` Florian Weimer
2025-11-01 15:11 ` Thiago Macieira
2025-10-31 17:07 ` Thiago Macieira
2025-10-31 17:29 ` Paul E. McKenney
2025-10-31 23:46 ` Morten Welinder
2025-10-31 15:35 ` Thiago Macieira
2025-11-01 13:05 ` Florian Weimer
2025-11-01 15:03 ` Thiago Macieira
2025-11-01 15:14 ` Florian Weimer
2025-11-01 15:42 ` Thiago Macieira
2025-11-01 16:14 ` Alejandro Colomar
2025-11-01 19:40 ` Thiago Macieira
2025-11-02 13:31 ` Alejandro Colomar
2025-11-02 23:10 ` Thiago Macieira
2025-11-02 23:55 ` Arthur O'Dwyer
2025-11-03 0:27 ` Rich Felker
2025-11-03 0:56 ` Thiago Macieira
2025-11-02 23:58 ` Alejandro Colomar
2025-11-03 0:28 ` Rich Felker
2025-11-03 9:36 ` Alejandro Colomar
2025-11-03 21:28 ` Rich Felker
2025-11-03 23:51 ` The 8472
2025-11-04 10:31 ` Szabolcs Nagy
2025-11-04 17:24 ` Thiago Macieira
2025-11-04 20:46 ` Thiago Macieira
2025-11-04 21:01 ` Rich Felker
2025-11-05 0:37 ` Demi Marie Obenour
2025-11-05 4:56 ` Rich Felker
2025-11-05 11:24 ` Alejandro Colomar
2025-11-05 17:38 ` Thiago Macieira
2025-11-06 21:53 ` Alejandro Colomar
2025-11-06 18:03 ` James Y Knight
2025-11-06 21:49 ` Alejandro Colomar
2025-11-06 23:10 ` Michael Winterberg
2025-11-07 15:33 ` Rich Felker
2025-11-11 20:36 ` James Y Knight
2025-11-11 20:51 ` Rich Felker
2025-11-11 21:07 ` Thorsten Glaser
2025-11-11 22:19 ` Jeffrey Walton
2025-11-12 3:56 ` Thiago Macieira
2025-11-11 21:59 ` Thiago Macieira
2025-11-11 22:45 ` Rich Felker
2025-11-12 3:50 ` Thiago Macieira
2025-11-11 23:17 ` James Y Knight
2025-11-12 2:04 ` Rich Felker
2025-11-12 18:35 ` James Y Knight
2025-11-11 21:56 ` Thiago Macieira
2025-11-03 0:41 ` Thiago Macieira
2025-11-01 15:22 ` Alejandro Colomar
2025-11-01 18:10 ` Rich Felker
2025-11-01 18:17 ` Thorsten Glaser
2025-11-01 18:20 ` Collin Funk
2025-11-01 19:14 ` Alejandro Colomar
2025-11-01 19:27 ` Laurent Bercot
2025-11-01 19:38 ` Thorsten Glaser
2025-11-01 20:02 ` Thiago Macieira
2025-11-01 20:58 ` Thorsten Glaser
2025-11-01 22:12 ` Re[2]: " Laurent Bercot
2025-11-01 20:50 ` Demi Marie Obenour