From: Rich Felker <dalias@libc.org>
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH v2] MT fork
Date: Mon, 9 Nov 2020 13:44:56 -0500 [thread overview]
Message-ID: <20201109184456.GB534@brightrain.aerifal.cx> (raw)
In-Reply-To: <C6YXH9CO8YXT.1ASOTS17EZ872@mussels>
On Mon, Nov 09, 2020 at 03:01:24PM -0300, Érico Nogueira wrote:
> On Mon Nov 9, 2020 at 9:07 AM -03, Rich Felker wrote:
> > On Sun, Nov 08, 2020 at 05:12:15PM +0100, Szabolcs Nagy wrote:
> > > * Rich Felker <dalias@libc.org> [2020-11-05 22:36:17 -0500]:
> > > > On Fri, Oct 30, 2020 at 11:31:17PM -0400, Rich Felker wrote:
> > > > > One thing I know is potentially problematic is interaction with malloc
> > > > > replacement -- locking of any of the subsystems locked at fork time
> > > > > necessarily takes place after application atfork handlers, so if the
> > > > > malloc replacement registers atfork handlers (as many do), it could
> > > > > deadlock. I'm exploring whether malloc use in these systems can be
> > > > > eliminated. A few are almost-surely better just using direct mmap
> > > > > anyway, but for some it's borderline. I'll have a better idea sometime
> > > > > in the next few days.
> > > >
> > > > OK, here's a summary of the affected locks (where there's a lock order
> > > > conflict between them and application-replaced malloc):
> > >
> > > if malloc replacements take internal locks in atfork
> > > handlers then it's not just libc internal locks that
> > > can cause problems but locks taken by other atfork
> > > handlers that were registered before the malloc one.
> >
> > No other locks taken there could impede forward process in libc. The
> > only reason malloc is special here is because we allowed the
> > application to redefine a family of functions used by libc. For
> > MT-fork with malloc inside libc, we do the malloc_atfork code last so
> > that the lock isn't held while other libc components' locks are being
> > taken. But we can't start taking libc-internal locks until the
> > application atfork handlers run, or the application could see a
> > deadlocked libc state (e.g. if something in the atfork handlers used
> > time functions, maybe to log time of fork, or gettext functions, maybe
> > to print a localized message, etc.).
> >
> > > i don't think there is a clean solution to this. i
> > > think using mmap is ugly (uless there is some reason
> > > to prefer that other than the atfork issue).
> >
> > When the size is exactly a 4k page, it's preferable in terms of memory
> > usage to use mmap instead of malloc, except on wacky archs with large
> > pages (which I don't think it makes sense to memory-usage-optimize
> > for; they're inherently and unfixably memory-inefficient).
> >
> > But for the most part, yes, it's ugly and I don't want to make such a
> > change.
> >
> > > > - atexit: uses calloc to allocate more handler slots of the builtin 32
> > > > are exhausted. Could reasonably be changed to just mmap a whole page
> > > > of slots in this case.
> >
> > Not sure on this. Since it's only used in the extremely rare case
> > where a huge number of atexit handlers are registered, it's probably
> > nicer to use mmap anyway -- it avoids linking a malloc that will
> > almost surely never be used by atfork.
> >
> > > > - dlerror: the lock is just for a queue of buffers to be freed on
> > > > future calls, since they can't be freed at thread exit time because
> > > > the calling context (thread that's "already exited") is not valid to
> > > > call application code, and malloc might be replaced. one plausible
> > > > solution here is getting rid of the free queue hack (and thus the
> > > > lock) entirely and instead calling libc's malloc/free via dlsym
> > > > rather than using the potentially-replaced symbol. but this would
> > > > not work for static linking (same dlerror is used even though dlopen
> > > > always fails; in future it may work) so it's probably not a good
> > > > approach. mmap is really not a good option here because it's
> > > > excessive mem usage. It's probably possible to just repeatedly
> > > > unlock/relock around performing each free so that only one lock is
> > > > held at once.
> >
> > I think the repeated unlock/relock works fine here, but it would also
> > work to just null out the list in the child (i.e. never free any
> > buffers that were queued to be freed in the parent before fork). Fork
> > of MT process is inherently leaky anyway (you can never free thread
> > data from the other parent threads). I'm not sure which approach is
> > nicer. The repeated unlock/relock is less logically invasive I think.
> >
> > > > - gettext: bindtextdomain calls calloc while holding the lock on list
> > > > of bindings. It could drop the lock, allocate, retake it, recheck
> > > > for an existing binding, and free in that case, but this is
> > > > undesirable because it introduces a dependency on free in
> > > > static-linked programs. Otherwise all memory gettext allocates is
> > > > permanent. Because of this we could just mmap an area and bump
> > > > allocate it, but that's wasteful because most programs will only use
> > > > one tiny binding. We could also just leak on the rare possibility of
> > > > concurrent binding allocations; the number of such leaks is bounded
> > > > by nthreads*ndomains, and we could make it just nthreads by keeping
> > > > and reusing abandoned ones.
> >
> > Thoughts here?
> >
> > > > - sem_open: a one-time calloc of global semtab takes place with the
> > > > lock held. On 32-bit archs this table is exactly 4k; on 64-bit it's
> > > > 6k. So it seems very reasonable to just mmap instead of calloc.
> >
> > This should almost surely just be changed to mmap. With mallocng, an
> > early malloc(4k) will take over 5k, and this table is permanent so the
> > excess will never be released. mmap avoids that entirely.
> >
> > > > - timezone: The tz core allocates memory to remember the last-seen
> > > > value of TZ env var to react if it changes. Normally it's small, so
> > > > perhaps we could just use a small (e.g. 32 byte) static buffer and
> > > > replace it with a whole mmapped page if a value too large for that
> > > > is seen.
> > > >
> > > > Also, somehow I failed to find one of the important locks MT-fork
> > > > needs to be taking: locale_map.c has a lock for the records of mapped
> > > > locales. Allocation also takes place with it held, and for the same
> > > > reason as gettext it really shouldn't be changed to allocate
> > > > differently. It could possibly do the allocation without the lock held
> > > > though and leak it (or save it for reuse later if needed) when another
> > > > thread races to load the locale.
> > >
> > > yeah this sounds problematic.
> > >
> > > if malloc interposers want to do something around fork
> > > then libc may need to expose some better api than atfork.
> >
> > Most of them use existing pthread_atfork if they're intended to be
> > usable with MT-fork. I don't think inventing a new musl-specific API
> > is a solution here. Their pthread_atfork approach already fully works
> > with glibc because glibc *doesn't* make anything but malloc work in
> > the MT-forked child. All the other subsystems mentioned here will
> > deadlock or blow up in the child with glibc, but only with 0.01%
> > probability since it's rare for them to be under hammering at fork
> > time.
> >
> > One solution you might actually like: getting rid of
> > application-provided-malloc use inside libc. This could be achieved by
> > making malloc a thin wrapper for __libc_malloc or whatever, which
> > could be called by everything in libc that doesn't actually have a
> > contract to return "as-if-by-malloc" memory. Only a few functions like
> > getdelim would be left still calling malloc.
>
> This code block in glob() uses strdup(), which I'd assume would have to
> use the application provided malloc. Wouldn't that have to be worked
> around somehow?
>
> if (*pat) {
> char *p = strdup(pat);
> if (!p) return GLOB_NOSPACE;
> buf[0] = 0;
> size_t pos = 0;
> char *s = p;
> if ((flags & (GLOB_TILDE | GLOB_TILDE_CHECK)) && *p == '~')
> error = expand_tilde(&s, buf, &pos);
> if (!error)
> error = do_glob(buf, pos, 0, s, flags, errfunc, &tail);
> free(p);
> }
It could either be left using public malloc (imo fine since this is
not an "internal component of libc" but generic library code with no
tie-in to libc) or use of strdup could be replaced with a trivial
alternate version that uses __libc_malloc instead. My leaning would be
towards the former -- only using libc malloc in places where calling
the application-provided malloc could lead to recursive locking of
libc-internal locks (because the caller already holds a libc-internal
lock) or other "inconsistent state" issues (like dlerror buffers at
pthread_exit time).
Rich
next prev parent reply other threads:[~2020-11-09 18:45 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-26 0:50 [musl] Status report and " Rich Felker
2020-10-26 0:59 ` Rich Felker
2020-10-26 3:29 ` Rich Felker
2020-10-26 18:44 ` Érico Nogueira
2020-10-26 19:52 ` Rich Felker
2020-10-26 20:11 ` Érico Nogueira
2020-10-27 21:17 ` Rich Felker
2020-10-28 18:56 ` [musl] [PATCH v2] " Rich Felker
2020-10-28 23:06 ` Milan P. Stanić
2020-10-29 16:13 ` Szabolcs Nagy
2020-10-29 16:20 ` Rich Felker
2020-10-29 20:55 ` Milan P. Stanić
2020-10-29 22:21 ` Szabolcs Nagy
2020-10-29 23:00 ` Milan P. Stanić
2020-10-29 23:27 ` Rich Felker
2020-10-30 0:13 ` Rich Felker
2020-10-30 7:47 ` Milan P. Stanić
2020-10-30 18:52 ` Milan P. Stanić
2020-10-30 18:57 ` Rich Felker
2020-10-30 21:31 ` Ariadne Conill
2020-10-31 3:31 ` Rich Felker
2020-11-06 3:36 ` Rich Felker
2020-11-08 16:12 ` Szabolcs Nagy
2020-11-09 17:07 ` Rich Felker
2020-11-09 18:01 ` Érico Nogueira
2020-11-09 18:44 ` Rich Felker [this message]
2020-11-09 18:54 ` Érico Nogueira
2020-11-09 21:59 ` Szabolcs Nagy
2020-11-09 22:23 ` Rich Felker
2020-11-11 0:52 ` Rich Felker
2020-11-11 6:35 ` Alexey Izbyshev
2020-11-11 11:25 ` Szabolcs Nagy
2020-11-11 14:56 ` Rich Felker
2020-11-11 16:35 ` Rich Felker
2020-10-31 7:22 ` Timo Teras
2020-10-31 13:29 ` Szabolcs Nagy
2020-10-31 13:35 ` Timo Teras
2020-10-31 14:41 ` Ariadne Conill
2020-10-31 14:49 ` Rich Felker
2020-10-31 14:48 ` Rich Felker
2020-10-31 14:47 ` Rich Felker
2020-10-29 23:32 ` Szabolcs Nagy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201109184456.GB534@brightrain.aerifal.cx \
--to=dalias@libc.org \
--cc=musl@lists.openwall.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).