mailing list of musl libc
* Replacing malloc
@ 2014-08-08 12:15 Alexander Monakov
  2014-08-08 13:57 ` Szabolcs Nagy
  2014-08-08 17:19 ` Rich Felker
  0 siblings, 2 replies; 3+ messages in thread
From: Alexander Monakov @ 2014-08-08 12:15 UTC (permalink / raw)
  To: musl

[changing topic, subject adjusted]

On Fri, 8 Aug 2014, Rich Felker wrote:
> The fourth issue is much bigger: replacing malloc is UB and does not
> work, especially not on musl. :-)

Whoa.  Let me ask for further clarifications.

You probably don't need me to tell you that most people expect that replacing
malloc would in fact work, with at least two use cases in mind: a tracking
wrapper around libc malloc (obtained via dlsym), or an entire custom allocator
that obtains fresh memory via mmap.  So you can LD_PRELOAD a "malloc
debugging library" or preload or even link against an alternative allocator.

Of course it's not without inherent issues.  If the alternative allocator
provides malloc/realloc/calloc/free, it's going to see an unexpected but
legitimate free when the application passes a pointer obtained via
posix_memalign.  Or when the application obtains a free()-able pointer via
other libc functionality such as asprintf and the libc is linked in such a way
that internal malloc calls are not interposable.  Or on glibc a malloc wrapper
needs to handle malloc->dlsym->malloc recursion.

I hope above you didn't mean to say that anybody wishing to use malloc
wrappers or custom mmap-based malloc replacements on musl should abandon all
hope, period; but merely that it is not for production use, and attempting to
do so should be done with care; for instance, if the gnash library uses a
custom malloc, it may not return pointers to that memory to be free()'d by the
main executable (calling libc's free).  But it would be like that on any libc.  So
I have to wonder what "especially not on musl" stands for.

But so far every time I speak about a problem with musl the problem is
deeper than I initially think -- so please clarify :)

Alexander



* Re: Replacing malloc
  2014-08-08 12:15 Replacing malloc Alexander Monakov
@ 2014-08-08 13:57 ` Szabolcs Nagy
  2014-08-08 17:19 ` Rich Felker
  1 sibling, 0 replies; 3+ messages in thread
From: Szabolcs Nagy @ 2014-08-08 13:57 UTC (permalink / raw)
  To: musl

* Alexander Monakov <amonakov@ispras.ru> [2014-08-08 16:15:29 +0400]:
> On Fri, 8 Aug 2014, Rich Felker wrote:
> > The fourth issue is much bigger: replacing malloc is UB and does not
> > work, especially not on musl. :-)
> 
> Whoa.  Let me ask for further clarifications.
> 
> You probably don't need me to tell you that most people expect that replacing
> malloc would in fact work, with at least two use cases in mind: a tracking
> wrapper around libc malloc (obtained via dlsym), or an entire custom allocator
> that obtains fresh memory via mmap.  So you can LD_PRELOAD a "malloc
> debugging library" or preload or even link against an alternative allocator.
> 
> Of course it's not without inherent issues.  If the alternative allocator
> provides malloc/realloc/calloc/free, it's going to see an unexpected but
> legitimate free when the application passes a pointer obtained via
> posix_memalign.  Or when the application obtains a free()-able pointer via
> other libc functionality such as asprintf and the libc is linked in such a way
> that internal malloc calls are not interposable.  Or on glibc a malloc wrapper
> needs to handle malloc->dlsym->malloc recursion.

i think the point is that all these issues are libc internals:
the application shouldn't know about the details, and libc
should be free to change them without prior notice

> I hope above you didn't mean to say that anybody wishing to use malloc
> wrappers or custom mmap-based malloc replacements on musl should abandon all
> hope, period; but merely that it is not for production use, and attempting to
> do so should be done with care; for instance, if the gnash library uses a
> custom malloc, it may not return pointers to that memory to be free()'d by the
> main executable (calling libc's free).  But it would be like that on any libc.  So
> I have to wonder what "especially not on musl" stands for.
> 

musl is linked with -Bsymbolic-functions, so internally it uses
its own malloc. this avoids a lot of issues (it can internally
rely on its own malloc behaviour), but can cause problems when
external and internal allocation are mixed, as you noted
(i.e. pointers returned by strdup, strndup, wcsdup, posix_memalign,
aligned_alloc, valloc, asprintf, vasprintf, .. are passed to free;
the list may change in future, e.g. reallocarray was proposed)

an external allocator must not use brk (it will collide with the
internal brk usage), and on some systems brk is the only way to access
most of the heap memory (iirc on arm with default kernel settings and
default RLIMIT_DATA half of the address space is not available to mmap)
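
As a rough illustration of an external allocator that stays away from brk,
here is a minimal sketch that takes every allocation straight from mmap.
The names page_alloc and page_free and the header layout are assumptions
made up for this example; nothing here is musl's or any real allocator's
implementation, and one mapping per allocation is far too wasteful for
real use.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc and musl */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical header placed at the start of each mapping. The union
 * pads it to max_align_t so the user pointer stays suitably aligned. */
union map_hdr {
    size_t map_len;      /* length given to mmap, needed again by munmap */
    max_align_t align;
};

/* Allocate n bytes from a fresh anonymous mapping; brk is never touched. */
void *page_alloc(size_t n)
{
    if (n > SIZE_MAX - sizeof(union map_hdr))
        return NULL;                       /* size computation would overflow */
    size_t len = n + sizeof(union map_hdr);
    void *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    union map_hdr *h = base;
    h->map_len = len;
    return h + 1;                          /* user memory follows the header */
}

/* Release a pointer obtained from page_alloc. Returns munmap's result. */
int page_free(void *p)
{
    if (!p)
        return 0;
    union map_hdr *h = (union map_hdr *)p - 1;
    assert(h->map_len >= sizeof *h);       /* sanity check on the header */
    return munmap(h, h->map_len);
}
```

A pointer from page_alloc must only ever go to page_free; handing it to
libc's free (or a libc pointer to page_free) is exactly the mixing problem
described above.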

theoretically a libc implementation is allowed to handle the 'malloc'
symbol specially so usual linking rules do not apply (eg musl could
#define malloc __musl_malloc_v1 in the headers but leave free as is)

theoretically the libc may use malloc to implement mmap, dlsym etc
(although mmap is used in async-signal-safe contexts in practice..)

> But so far every time I speak about a problem with musl the problem is
> deeper than I initially think -- so please clarify :)
> 
> Alexander



* Re: Replacing malloc
  2014-08-08 12:15 Replacing malloc Alexander Monakov
  2014-08-08 13:57 ` Szabolcs Nagy
@ 2014-08-08 17:19 ` Rich Felker
  1 sibling, 0 replies; 3+ messages in thread
From: Rich Felker @ 2014-08-08 17:19 UTC (permalink / raw)
  To: musl

On Fri, Aug 08, 2014 at 04:15:29PM +0400, Alexander Monakov wrote:
> [changing topic, subject adjusted]
> 
> On Fri, 8 Aug 2014, Rich Felker wrote:
> > The fourth issue is much bigger: replacing malloc is UB and does not
> > work, especially not on musl. :-)
> 
> Whoa.  Let me ask for further clarifications.
> 
> You probably don't need me to tell you that most people expect that replacing
> malloc would in fact work, with at least two use cases in mind: a tracking
> wrapper around libc malloc (obtained via dlsym), or an entire custom allocator
> that obtains fresh memory via mmap.  So you can LD_PRELOAD a "malloc
> debugging library" or preload or even link against an alternative allocator.

A wrapper "works" as long as it does not expect to see all calls (it
won't see ones internal to libc). So if it's just for keeping stats on
application memory usage, it's probably fine. But if it's trying to
actually encapsulate the allocations (e.g. increase N and add its own
headers around them), you'll run into serious problems when you pass a
pointer from a wrapped malloc to an unwrapped free, or vice versa.
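
To make the header problem concrete, here is a minimal sketch of an
"encapsulating" wrapper. The names wrap_malloc, wrap_free and struct
wrap_hdr are hypothetical; a real preload library would export the
functions as malloc and free and would look up libc's versions itself
(for example with dlsym), which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define WRAP_MAGIC 0x57524150u   /* arbitrary marker for wrapped blocks */

/* Hypothetical bookkeeping header stored in front of every user block. */
struct wrap_hdr {
    size_t size;      /* size the caller asked for */
    unsigned magic;   /* set by wrap_malloc, checked by wrap_free */
};

/* Header size rounded up so the user pointer keeps malloc's alignment. */
#define WRAP_HDR_SPACE \
    ((sizeof(struct wrap_hdr) + _Alignof(max_align_t) - 1) \
     / _Alignof(max_align_t) * _Alignof(max_align_t))

static size_t wrap_live_bytes;   /* bytes currently allocated via the wrapper */

void *wrap_malloc(size_t n)
{
    if (n > SIZE_MAX - WRAP_HDR_SPACE)
        return NULL;
    unsigned char *base = malloc(n + WRAP_HDR_SPACE);  /* the "increased N" */
    if (!base)
        return NULL;
    struct wrap_hdr *h = (struct wrap_hdr *)base;
    h->size = n;
    h->magic = WRAP_MAGIC;
    wrap_live_bytes += n;
    return base + WRAP_HDR_SPACE;    /* NOT the pointer libc's free expects */
}

void wrap_free(void *p)
{
    if (!p)
        return;
    unsigned char *base = (unsigned char *)p - WRAP_HDR_SPACE;
    struct wrap_hdr *h = (struct wrap_hdr *)base;
    /* For a pointer that did not come from wrap_malloc this read is
     * already out of bounds; the check is best effort only. */
    assert(h->magic == WRAP_MAGIC);
    wrap_live_bytes -= h->size;
    free(base);
}
```

The pointer returned by wrap_malloc sits WRAP_HDR_SPACE bytes past the
block the underlying malloc handed out, so passing it to an unwrapped free
hands free an address it never returned. In the other direction, a pointer
from an unwrapped allocation (say, one made inside libc) passed to
wrap_free makes the wrapper step back into memory that holds no header.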

> Of course it's not without inherent issues.  If the alternative allocator
> provides malloc/realloc/calloc/free, it's going to see an unexpected but
> legitimate free when the application passes a pointer obtained via
> posix_memalign.  Or when the application obtains a free()-able pointer via
> other libc functionality such as asprintf and the libc is linked in such a way
> that internal malloc calls are not interposable.  Or on glibc a malloc wrapper
> needs to handle malloc->dlsym->malloc recursion.

Right. I've raised similar issues with glibc, which attempts to
support malloc replacements but really doesn't (because of similar
issues). The new C11 aligned_alloc is again a problem since many/most
malloc replacements will fail to provide it. 

My view is that if they want to support this, they need to provide
detailed documentation for what's required to make it work, including
a list of symbols that need to be provided. This is complicated by
namespace clash issues. For instance providing mallinfo wrongly
imposes on the standard namespace, but not providing it might lead to
applications which want mallinfo calling the "wrong one".

Of course future-proofing this against new malloc-subsystem functions
yet to be added is also very difficult, and I don't know the right way
to do it. I just know that the current way is a time-bomb...

> I hope above you didn't mean to say that anybody wishing to use malloc
> wrappers or custom mmap-based malloc replacements on musl should abandon all
> hope, period; but merely that it is not for production use, and attempting to

It's definitely not for production use, at least not with arbitrary
versions of musl. It may be possible, with some testing and analysis,
when doing static linking with a known version of musl or in an
environment you have full control over (like an embedded system).

> do so should be done with care; for instance, if the gnash library uses a
> custom malloc, it may not return pointers to that memory to be free()'d by the
> main executable (calling libc's free).  But it would be like that on any libc.  So
> I have to wonder what "especially not on musl" stands for.

The situation used to be much worse, since many custom mallocs attempt
to use sbrk and thereby corrupt the state of the internal malloc's
heap. A while back we had a big discussion about this which ended in
disabling sbrk (making it always-fail with nonzero arguments). That
caught a lot of issues where applications were previously corrupting
the heap (now they immediately fail with ENOMEM messages, or they
fall back to mmap and work).
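
The "fall back to mmap" behaviour might look roughly like the sketch
below. The helper grab_chunk and the chunk_source enum are hypothetical
names invented for this example, and the code shows the pattern found in
some custom allocators rather than a recommendation: where sbrk fails for
nonzero arguments, only the mmap branch is ever taken.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes sbrk and MAP_ANONYMOUS on glibc and musl */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Records where a chunk came from so the caller can release it correctly. */
enum chunk_source { FROM_NONE, FROM_SBRK, FROM_MMAP };

/* Hypothetical "get more memory" path of a custom allocator: try sbrk
 * first, and use an anonymous mapping if sbrk reports failure. */
void *grab_chunk(size_t len, enum chunk_source *src)
{
    assert(len > 0);
    *src = FROM_NONE;

    if (len <= (size_t)INTPTR_MAX) {
        void *old_break = sbrk((intptr_t)len);
        if (old_break != (void *)-1) {      /* sbrk signals failure with -1 */
            *src = FROM_SBRK;
            return old_break;               /* alignment is not guaranteed */
        }
    }

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *src = FROM_MMAP;
    return p;
}
```

An allocator written this way keeps working when sbrk is disabled; one
that treats an sbrk failure as fatal is the kind that now reports ENOMEM
immediately.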

The remaining aspects to "especially not on musl" are related to the
fact that, unlike glibc, musl does not go out of its way to attempt to
support malloc replacements:

- For dynamic linking, musl always calls malloc/free directly, not
  symbolically, so it will not use your replacements. This matters
  especially for the dynamic linker that uses malloc before anything
  else is loaded (glibc's dynamic linker instead uses a temporary
  malloc at load time and switches later) and which relies on malloc
  internals to "donate back" unused writable memory from dynamic
  linking for use by malloc.

- For static linking, musl will use whatever malloc symbol appears
  first in the link order (there's no way around this). So the
  inconsistency in behavior may be surprising. Also, this means that
  if you call any of the memalign-type functions (including: if they
  get called internally! but I don't think they do) and your malloc
  replacement does not provide them, the internal memalign's
  assumptions about the heap structure will probably corrupt your
  malloc's state.

- Since sbrk intentionally does not work, some malloc implementations
  might not work at all.
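
The memalign point in the second item can be sketched as follows: a
replacement that wants aligned allocations to be safe has to provide them
itself, on top of its own heap and in a form its own free understands. The
names repl_malloc, repl_aligned_alloc and repl_free are hypothetical, and
plain malloc/free stand in here for the replacement's backing store; this
is one possible layout under those assumptions, not a description of any
particular allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header written just below every user pointer, so a single
 * repl_free can release both ordinary and aligned allocations. */
struct repl_hdr {
    void *base;   /* pointer originally returned by the backing store */
};

/* Aligned allocation; align must be a nonzero power of two. */
void *repl_aligned_alloc(size_t align, size_t size)
{
    if (align == 0 || (align & (align - 1)) != 0)
        return NULL;
    if (align - 1 > SIZE_MAX - sizeof(struct repl_hdr))
        return NULL;
    size_t slack = align - 1 + sizeof(struct repl_hdr);
    if (size > SIZE_MAX - slack)
        return NULL;
    unsigned char *base = malloc(size + slack);   /* stand-in backing store */
    if (!base)
        return NULL;
    /* First suitably aligned address that leaves room for the header. */
    uintptr_t user = ((uintptr_t)base + sizeof(struct repl_hdr) + align - 1)
                     & ~(uintptr_t)(align - 1);
    struct repl_hdr h = { base };
    memcpy((void *)(user - sizeof h), &h, sizeof h);
    return (void *)user;
}

/* Ordinary allocation goes through the same path with default alignment. */
void *repl_malloc(size_t size)
{
    return repl_aligned_alloc(_Alignof(max_align_t), size);
}

void repl_free(void *p)
{
    if (!p)
        return;
    struct repl_hdr h;
    memcpy(&h, (unsigned char *)p - sizeof h, sizeof h);
    assert((uintptr_t)h.base <= (uintptr_t)p - sizeof h);  /* sanity check */
    free(h.base);
}
```

Because every pointer carries the same header, it does not matter which of
the two entry points produced it. A replacement that supplies only malloc
and free leaves the aligned entry points to someone else's implementation,
which is the mismatch described in the list above.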

I can't think of any other reasons it's _especially_ bad to try this
on musl right now, but there may be more.

> But so far every time I speak about a problem with musl the problem is
> deeper than I initially think -- so please clarify :)

:)

Basically, each implementation has its own manifestation of the UB of
replacing malloc. I think musl's current manifestation might be less
similar to application expectations than other implementations'. It's
also not a behavior that we document, support, or try to preserve
across versions; it's the way it is because the way it is currently
makes implementation internals the simplest.

Rich



Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/
