mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH v2] MT fork
Date: Mon, 9 Nov 2020 13:44:56 -0500	[thread overview]
Message-ID: <20201109184456.GB534@brightrain.aerifal.cx> (raw)
In-Reply-To: <C6YXH9CO8YXT.1ASOTS17EZ872@mussels>

On Mon, Nov 09, 2020 at 03:01:24PM -0300, Érico Nogueira wrote:
> On Mon Nov 9, 2020 at 9:07 AM -03, Rich Felker wrote:
> > On Sun, Nov 08, 2020 at 05:12:15PM +0100, Szabolcs Nagy wrote:
> > > * Rich Felker <dalias@libc.org> [2020-11-05 22:36:17 -0500]:
> > > > On Fri, Oct 30, 2020 at 11:31:17PM -0400, Rich Felker wrote:
> > > > > One thing I know is potentially problematic is interaction with malloc
> > > > > replacement -- locking of any of the subsystems locked at fork time
> > > > > necessarily takes place after application atfork handlers, so if the
> > > > > malloc replacement registers atfork handlers (as many do), it could
> > > > > deadlock. I'm exploring whether malloc use in these systems can be
> > > > > eliminated. A few are almost-surely better just using direct mmap
> > > > > anyway, but for some it's borderline. I'll have a better idea sometime
> > > > > in the next few days.
> > > > 
> > > > OK, here's a summary of the affected locks (where there's a lock order
> > > > conflict between them and application-replaced malloc):
> > > 
> > > if malloc replacements take internal locks in atfork
> > > handlers then it's not just libc internal locks that
> > > can cause problems but locks taken by other atfork
> > > handlers that were registered before the malloc one.
> >
> > No other locks taken there could impede forward process in libc. The
> > only reason malloc is special here is because we allowed the
> > application to redefine a family of functions used by libc. For
> > MT-fork with malloc inside libc, we do the malloc_atfork code last so
> > that the lock isn't held while other libc components' locks are being
> > taken. But we can't start taking libc-internal locks until the
> > application atfork handlers run, or the application could see a
> > deadlocked libc state (e.g. if something in the atfork handlers used
> > time functions, maybe to log time of fork, or gettext functions, maybe
> > to print a localized message, etc.).
> >
> > > i don't think there is a clean solution to this. i
> > > think using mmap is ugly (uless there is some reason
> > > to prefer that other than the atfork issue).
> >
> > When the size is exactly a 4k page, it's preferable in terms of memory
> > usage to use mmap instead of malloc, except on wacky archs with large
> > pages (which I don't think it makes sense to memory-usage-optimize
> > for; they're inherently and unfixably memory-inefficient).
> >
> > But for the most part, yes, it's ugly and I don't want to make such a
> > change.
> >
> > > > - atexit: uses calloc to allocate more handler slots of the builtin 32
> > > >   are exhausted. Could reasonably be changed to just mmap a whole page
> > > >   of slots in this case.
> >
> > Not sure on this. Since it's only used in the extremely rare case
> > where a huge number of atexit handlers are registered, it's probably
> > nicer to use mmap anyway -- it avoids linking a malloc that will
> > almost surely never be used by atfork.
> >
> > > > - dlerror: the lock is just for a queue of buffers to be freed on
> > > >   future calls, since they can't be freed at thread exit time because
> > > >   the calling context (thread that's "already exited") is not valid to
> > > >   call application code, and malloc might be replaced. one plausible
> > > >   solution here is getting rid of the free queue hack (and thus the
> > > >   lock) entirely and instead calling libc's malloc/free via dlsym
> > > >   rather than using the potentially-replaced symbol. but this would
> > > >   not work for static linking (same dlerror is used even though dlopen
> > > >   always fails; in future it may work) so it's probably not a good
> > > >   approach. mmap is really not a good option here because it's
> > > >   excessive mem usage. It's probably possible to just repeatedly
> > > >   unlock/relock around performing each free so that only one lock is
> > > >   held at once.
> >
> > I think the repeated unlock/relock works fine here, but it would also
> > work to just null out the list in the child (i.e. never free any
> > buffers that were queued to be freed in the parent before fork). Fork
> > of MT process is inherently leaky anyway (you can never free thread
> > data from the other parent threads). I'm not sure which approach is
> > nicer. The repeated unlock/relock is less logically invasive I think.
> >
> > > > - gettext: bindtextdomain calls calloc while holding the lock on list
> > > >   of bindings. It could drop the lock, allocate, retake it, recheck
> > > >   for an existing binding, and free in that case, but this is
> > > >   undesirable because it introduces a dependency on free in
> > > >   static-linked programs. Otherwise all memory gettext allocates is
> > > >   permanent. Because of this we could just mmap an area and bump
> > > >   allocate it, but that's wasteful because most programs will only use
> > > >   one tiny binding. We could also just leak on the rare possibility of
> > > >   concurrent binding allocations; the number of such leaks is bounded
> > > >   by nthreads*ndomains, and we could make it just nthreads by keeping
> > > >   and reusing abandoned ones.
> >
> > Thoughts here?
> >
> > > > - sem_open: a one-time calloc of global semtab takes place with the
> > > >   lock held. On 32-bit archs this table is exactly 4k; on 64-bit it's
> > > >   6k. So it seems very reasonable to just mmap instead of calloc.
> >
> > This should almost surely just be changed to mmap. With mallocng, an
> > early malloc(4k) will take over 5k, and this table is permanent so the
> > excess will never be released. mmap avoids that entirely.
> >
> > > > - timezone: The tz core allocates memory to remember the last-seen
> > > >   value of TZ env var to react if it changes. Normally it's small, so
> > > >   perhaps we could just use a small (e.g. 32 byte) static buffer and
> > > >   replace it with a whole mmapped page if a value too large for that
> > > >   is seen.
> > > > 
> > > > Also, somehow I failed to find one of the important locks MT-fork
> > > > needs to be taking: locale_map.c has a lock for the records of mapped
> > > > locales. Allocation also takes place with it held, and for the same
> > > > reason as gettext it really shouldn't be changed to allocate
> > > > differently. It could possibly do the allocation without the lock held
> > > > though and leak it (or save it for reuse later if needed) when another
> > > > thread races to load the locale.
> > > 
> > > yeah this sounds problematic.
> > > 
> > > if malloc interposers want to do something around fork
> > > then libc may need to expose some better api than atfork.
> >
> > Most of them use existing pthread_atfork if they're intended to be
> > usable with MT-fork. I don't think inventing a new musl-specific API
> > is a solution here. Their pthread_atfork approach already fully works
> > with glibc because glibc *doesn't* make anything but malloc work in
> > the MT-forked child. All the other subsystems mentioned here will
> > deadlock or blow up in the child with glibc, but only with 0.01%
> > probability since it's rare for them to be under hammering at fork
> > time.
> >
> > One solution you might actually like: getting rid of
> > application-provided-malloc use inside libc. This could be achieved by
> > making malloc a thin wrapper for __libc_malloc or whatever, which
> > could be called by everything in libc that doesn't actually have a
> > contract to return "as-if-by-malloc" memory. Only a few functions like
> > getdelim would be left still calling malloc.
> 
> This code block in glob() uses strdup(), which I'd assume would have to
> use the application provided malloc. Wouldn't that have to be worked
> around somehow?
> 
> 	if (*pat) {
> 		char *p = strdup(pat);
> 		if (!p) return GLOB_NOSPACE;
> 		buf[0] = 0;
> 		size_t pos = 0;
> 		char *s = p;
> 		if ((flags & (GLOB_TILDE | GLOB_TILDE_CHECK)) && *p == '~')
> 			error = expand_tilde(&s, buf, &pos);
> 		if (!error)
> 			error = do_glob(buf, pos, 0, s, flags, errfunc, &tail);
> 		free(p);
> 	}

It could either be left using public malloc (imo fine since this is
not an "internal component of libc" but generic library code with no
tie-in to libc) or use of strdup could be replaced with a trivial
alternate version that uses __libc_malloc instead. My leaning would be
towards the former -- only using libc malloc in places where calling
the application-provided malloc could lead to recursive locking of
libc-internal locks (because the caller already holds a libc-internal
lock) or other "inconsistent state" issues (like dlerror buffers at
pthread_exit time).

Rich

  reply	other threads:[~2020-11-09 18:45 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-26  0:50 [musl] Status report and " Rich Felker
2020-10-26  0:59 ` Rich Felker
2020-10-26  3:29   ` Rich Felker
2020-10-26 18:44     ` Érico Nogueira
2020-10-26 19:52       ` Rich Felker
2020-10-26 20:11         ` Érico Nogueira
2020-10-27 21:17   ` Rich Felker
2020-10-28 18:56     ` [musl] [PATCH v2] " Rich Felker
2020-10-28 23:06       ` Milan P. Stanić
2020-10-29 16:13         ` Szabolcs Nagy
2020-10-29 16:20           ` Rich Felker
2020-10-29 20:55           ` Milan P. Stanić
2020-10-29 22:21             ` Szabolcs Nagy
2020-10-29 23:00               ` Milan P. Stanić
2020-10-29 23:27                 ` Rich Felker
2020-10-30  0:13                   ` Rich Felker
2020-10-30  7:47                   ` Milan P. Stanić
2020-10-30 18:52                     ` Milan P. Stanić
2020-10-30 18:57                       ` Rich Felker
2020-10-30 21:31                         ` Ariadne Conill
2020-10-31  3:31                           ` Rich Felker
2020-11-06  3:36                             ` Rich Felker
2020-11-08 16:12                               ` Szabolcs Nagy
2020-11-09 17:07                                 ` Rich Felker
2020-11-09 18:01                                   ` Érico Nogueira
2020-11-09 18:44                                     ` Rich Felker [this message]
2020-11-09 18:54                                       ` Érico Nogueira
2020-11-09 21:59                                   ` Szabolcs Nagy
2020-11-09 22:23                                     ` Rich Felker
2020-11-11  0:52                                       ` Rich Felker
2020-11-11  6:35                                         ` Alexey Izbyshev
2020-11-11 11:25                                           ` Szabolcs Nagy
2020-11-11 14:56                                             ` Rich Felker
2020-11-11 16:35                                         ` Rich Felker
2020-10-31  7:22                           ` Timo Teras
2020-10-31 13:29                             ` Szabolcs Nagy
2020-10-31 13:35                               ` Timo Teras
2020-10-31 14:41                                 ` Ariadne Conill
2020-10-31 14:49                                   ` Rich Felker
2020-10-31 14:48                                 ` Rich Felker
2020-10-31 14:47                             ` Rich Felker
2020-10-29 23:32                 ` Szabolcs Nagy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201109184456.GB534@brightrain.aerifal.cx \
    --to=dalias@libc.org \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).