Date: Fri, 13 Nov 2020 00:26:55 +0100 From: Szabolcs Nagy To: Rich Felker Cc: musl@lists.openwall.com Reply-To: musl@lists.openwall.com Message-ID: <20201112232655.GJ1370092@port70.net> References: <20201111205728.GA19755@brightrain.aerifal.cx> In-Reply-To: <20201111205728.GA19755@brightrain.aerifal.cx> Subject: Re: [musl] [PATCH v3] MT-fork (series) * Rich Felker [2020-11-11 15:57:31 -0500]: > >From cbecda0b506c7d49a2f7fe3dc44e0e3dcf663764 Mon Sep 17 00:00:00 2001 > From: Rich Felker > Date: Tue, 10 Nov 2020 14:29:05 -0500 > Subject: [PATCH 1/5] dlerror: don't gratuitously hold freebuf_queue lock while > freeing > > thread-local buffers allocated for dlerror need to be queued for free > at a later time when the owning thread exits, since malloc may be > replaced by application code and the exiting context is not valid to > call application code from. 
the code to process the queue of pending > frees, introduced in commit aa5a9d15e09851f7b4a1668e9dbde0f6234abada, > gratuitously held the lock for the entire duration of queue > processing, updating the global queue pointer after each free, despite > there being no logical requirement that all frees finish before > another thread can access the queue. > > instead, immediately claim the whole queue for freeing and release the > lock, then walk the list and perform frees without the lock held. the > change is unlikely to make any meaningful difference to performance, > but it eliminates one point where the allocator is called under an > internal lock. since the allocator may be application-provided, such > calls are undesirable because they allow application code to impede > forward progress of libc functions in other threads arbitrarily long, > and to induce deadlock if it calls a libc function that requires the > same lock. > > the change also eliminates a lock ordering consideration that's an > impediment to upcoming work with multithreaded fork. > --- > src/ldso/dlerror.c | 13 ++++++++----- > 1 file changed, 8 insertions(+), 5 deletions(-) > > diff --git a/src/ldso/dlerror.c b/src/ldso/dlerror.c > index 3fcc7779..d8bbfc03 100644 > --- a/src/ldso/dlerror.c > +++ b/src/ldso/dlerror.c > @@ -35,13 +35,16 @@ void __dl_thread_cleanup(void) > hidden void __dl_vseterr(const char *fmt, va_list ap) > { > LOCK(freebuf_queue_lock); > - while (freebuf_queue) { > - void **p = freebuf_queue; > - freebuf_queue = *p; > - free(p); > - } > + void **q = freebuf_queue; > + freebuf_queue = 0; > UNLOCK(freebuf_queue_lock); > > + while (q) { > + void **p = *q; > + free(q); > + q = p; > + } > + this looks good. 
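for reference, the claim-then-free pattern in this hunk can be sketched outside libc roughly as below. this is only an illustration under assumptions: queue_lock, queue_head, queue_push and queue_drain are made-up names, and a pthread mutex stands in for the libc-internal LOCK/UNLOCK.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for freebuf_queue and its internal lock. */
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static void **queue_head;

/* Queue a malloc'd buffer for a later free; its first word is
 * reused as the "next" link, as in the dlerror code. */
static void queue_push(void **buf)
{
	assert(buf);
	pthread_mutex_lock(&queue_lock);
	*buf = queue_head;
	queue_head = buf;
	pthread_mutex_unlock(&queue_lock);
}

/* Claim the whole list while holding the lock, then walk it and
 * call free() only after the lock has been released.
 * Returns the number of buffers freed. */
static size_t queue_drain(void)
{
	pthread_mutex_lock(&queue_lock);
	void **q = queue_head;
	queue_head = 0;
	pthread_mutex_unlock(&queue_lock);

	size_t n = 0;
	while (q) {
		void **next = *q;
		free(q);
		q = next;
		n++;
	}
	return n;
}
```

the point of the shape is that free(), which may be application code, never runs while queue_lock is held, so a replacement allocator cannot stall or deadlock other threads on that lock.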
> va_list ap2; > va_copy(ap2, ap); > pthread_t self = __pthread_self(); > -- > 2.21.0 > > >From c1e5d243b7e39b2fbfb17144608ce045575d8e95 Mon Sep 17 00:00:00 2001 > From: Rich Felker > Date: Tue, 10 Nov 2020 19:32:09 -0500 > Subject: [PATCH 2/5] drop use of getdelim/stdio in dynamic linker > > the only place stdio was used here was for reading the ldso path file, > taking advantage of getdelim to automatically allocate and resize the > buffer. the motivation for use here was that, with shared libraries, > stdio is already available anyway and free to use. this has long been > a nuisance to users because getdelim's use of realloc here triggered a > valgrind bug, but removing it doesn't really fix that; on some archs > even calling the valgrind-interposed malloc at this point will crash. > > the actual motivation for this change is moving towards getting rid of > use of application-provided malloc in parts of libc where it would be > called with libc-internal locks held, leading to the possibility of > deadlock if the malloc implementation doesn't follow unwritten rules > about which libc functions are safe for it to call. since getdelim is > required to produce a pointer as if by malloc (i.e. that can be passed > to realloc or free), it necessarily must use the public malloc. > > instead of performing a realloc loop as the path file is read, first > query its size with fstat and allocate only once. this produces > slightly different truncation behavior when racing with writes to a > file, but neither behavior is or could be made safe anyway; on a live > system, ldso path files should be replaced by atomic rename only. the > change should also reduce memory waste. 
> --- > ldso/dynlink.c | 27 ++++++++++++++++++++++----- > 1 file changed, 22 insertions(+), 5 deletions(-) > > diff --git a/ldso/dynlink.c b/ldso/dynlink.c > index f9ac0100..502e52c5 100644 > --- a/ldso/dynlink.c > +++ b/ldso/dynlink.c > @@ -1,6 +1,5 @@ > #define _GNU_SOURCE > #define SYSCALL_NO_TLS 1 > -#include > #include > #include > #include > @@ -556,6 +555,20 @@ static void reclaim_gaps(struct dso *dso) > } > } > > +static ssize_t read_loop(int fd, void *p, size_t n) > +{ > + for (size_t i=0; i<n; ) { > + ssize_t l = read(fd, (char *)p+i, n-i); > + if (l<0) { > + if (errno==EINTR) continue; > + else return -1; > + } > + if (l==0) return i; > + i += l; > + } > + return n; > +} > + > static void *mmap_fixed(void *p, size_t n, int prot, int flags, int fd, off_t off) > { > static int no_map_fixed; > @@ -1060,13 +1073,17 @@ static struct dso *load_library(const char *name, struct dso *needed_by) > snprintf(etc_ldso_path, sizeof etc_ldso_path, > "%.*s/etc/ld-musl-" LDSO_ARCH ".path", > (int)prefix_len, prefix); > - FILE *f = fopen(etc_ldso_path, "rbe"); > - if (f) { > - if (getdelim(&sys_path, (size_t[1]){0}, 0, f) <= 0) { > + fd = open(etc_ldso_path, O_RDONLY|O_CLOEXEC); > + if (fd>=0) { > + size_t n = 0; > + if (!fstat(fd, &st)) n = st.st_size; > + if ((sys_path = malloc(n+1))) > + sys_path[n] = 0; > + if (!sys_path || read_loop(fd, sys_path, n)<0) { should this handle the short read case? i assume we only want to support atomic updates to the path file so there should not be a short read, but i think rejecting read_loop(,,n)!=n is safer. 
> free(sys_path); > sys_path = ""; > } > - fclose(f); > + close(fd); > } else if (errno != ENOENT) { > sys_path = ""; > } > -- > 2.21.0 > > >From 8d37958d58cf36f53d5fcc7a8aa6d633da6071b2 Mon Sep 17 00:00:00 2001 > From: Rich Felker > Date: Wed, 11 Nov 2020 00:22:34 -0500 > Subject: [PATCH 3/5] give libc access to its own malloc even if public malloc > is interposed > > allowing the application to replace malloc (since commit > c9f415d7ea2dace5bf77f6518b6afc36bb7a5732) has brought multiple > headaches where it's used from various critical sections in libc > components. for example: > > - the thread-local message buffers allocated for dlerror can't be > freed at thread exit time because application code would then run in > the context of a nonexistent thread. this was handled in commit > aa5a9d15e09851f7b4a1668e9dbde0f6234abada by queuing them for free > later. > > - the dynamic linker has to be careful not to pass memory allocated at > early startup time (necessarily using its own malloc) to realloc or > free after redoing relocations with the application and all > libraries present. bugs in this area were fixed several times, at > least in commits 0c5c8f5da6e36fe4ab704bee0cd981837859e23f and > 2f1f51ae7b2d78247568e7fdb8462f3c19e469a4 and possibly others. > > - by calling the allocator from contexts where libc-internal locks are > held, we impose undocumented requirements on alternate malloc > implementations not to call into any libc function that might > attempt to take these locks; if they do, deadlock results. > > - work to make fork of a multithreaded parent give the child an > unrestricted execution environment is blocked by lock order issues > as long as the application-provided allocator can be called with > libc-internal locks held. > > these problems are all fixed by giving libc internals access to the > original, non-replaced allocator, for use where needed. 
it can't be > used everywhere, as some interfaces like str[n]dup, open_[w]memstream, > getline/getdelim, etc. are required to provide the caller memory > obtained as if by (the public) malloc. and there are a number of libc > interfaces that are "pure library" code, not part of some internal > singleton, and where using the application's choice of malloc > implementation is preferable -- things like glob, regex, etc. > > one might expect there to be significant cost to static-linked > programs, pulling in two malloc implementations, one of them > mostly-unused, if malloc is replaced. however, in almost all of the > places where malloc is used internally, care has been taken already > not to pull in realloc/free (i.e. to link with just the bump > allocator). this size optimization carries over automatically. > > the newly-exposed internal allocator functions are obtained by > renaming the actual definitions, then adding new wrappers around them > with the public names. technically __libc_realloc and __libc_free > could be aliases rather than needing a layer of wrapper, but this > would almost surely break certain instrumentation (valgrind) and the > size and performance difference is negligible. __libc_calloc needs to > be handled specially since calloc is designed to work with either the > internal or the replaced malloc. > > as a bonus, this change also eliminates the longstanding ugly > dependency of the static bump allocator on order of object files in > libc.a, by making it so there's only one definition of the malloc > function and having it in the same source file as the bump allocator. 
> --- > src/include/stdlib.h | 6 ++++++ > src/malloc/free.c | 6 ++++++ > src/malloc/libc_calloc.c | 4 ++++ > src/malloc/lite_malloc.c | 14 +++++++++++++- > src/malloc/mallocng/glue.h | 4 ++++ > src/malloc/oldmalloc/malloc.c | 4 ++++ > src/malloc/realloc.c | 6 ++++++ > 7 files changed, 43 insertions(+), 1 deletion(-) > create mode 100644 src/malloc/free.c > create mode 100644 src/malloc/libc_calloc.c > create mode 100644 src/malloc/realloc.c > > diff --git a/src/include/stdlib.h b/src/include/stdlib.h > index d38a5417..e9da2015 100644 > --- a/src/include/stdlib.h > +++ b/src/include/stdlib.h > @@ -9,4 +9,10 @@ hidden int __mkostemps(char *, int, int); > hidden int __ptsname_r(int, char *, size_t); > hidden char *__randname(char *); > > +hidden void *__libc_malloc(size_t); > +hidden void *__libc_malloc_impl(size_t); > +hidden void *__libc_calloc(size_t, size_t); > +hidden void *__libc_realloc(void *, size_t); > +hidden void __libc_free(void *); > + > #endif > diff --git a/src/malloc/free.c b/src/malloc/free.c > new file mode 100644 > index 00000000..f17a952c > --- /dev/null > +++ b/src/malloc/free.c > @@ -0,0 +1,6 @@ > +#include > + > +void free(void *p) > +{ > + return __libc_free(p); > +} > diff --git a/src/malloc/libc_calloc.c b/src/malloc/libc_calloc.c > new file mode 100644 > index 00000000..d25eabea > --- /dev/null > +++ b/src/malloc/libc_calloc.c > @@ -0,0 +1,4 @@ > +#define calloc __libc_calloc > +#define malloc __libc_malloc > + > +#include "calloc.c" > diff --git a/src/malloc/lite_malloc.c b/src/malloc/lite_malloc.c > index f8931ba5..0f461617 100644 > --- a/src/malloc/lite_malloc.c > +++ b/src/malloc/lite_malloc.c > @@ -100,4 +100,16 @@ static void *__simple_malloc(size_t n) > return p; > } > > -weak_alias(__simple_malloc, malloc); > +weak_alias(__simple_malloc, __libc_malloc_impl); > + > +void *__libc_malloc(size_t n) > +{ > + return __libc_malloc_impl(n); > +} > + > +static void *default_malloc(size_t n) > +{ > + return __libc_malloc_impl(n); > +} > + > 
+weak_alias(default_malloc, malloc); maybe i'm missing something but i thought it would be enough to do weak_alias(__simple_malloc, __libc_malloc); static void *default_malloc(size_t n) { return __libc_malloc(n); } weak_alias(default_malloc, malloc); here and have strong __libc_malloc symbol in the malloc implementation. > diff --git a/src/malloc/mallocng/glue.h b/src/malloc/mallocng/glue.h > index 16acd1ea..8d7d9a3b 100644 > --- a/src/malloc/mallocng/glue.h > +++ b/src/malloc/mallocng/glue.h > @@ -20,6 +20,10 @@ > #define is_allzero __malloc_allzerop > #define dump_heap __dump_heap > > +#define malloc __libc_malloc_impl > +#define realloc __libc_realloc > +#define free __libc_free > + > #if USE_REAL_ASSERT > #include > #else > diff --git a/src/malloc/oldmalloc/malloc.c b/src/malloc/oldmalloc/malloc.c > index c0997ad8..0c082bce 100644 > --- a/src/malloc/oldmalloc/malloc.c > +++ b/src/malloc/oldmalloc/malloc.c > @@ -10,6 +10,10 @@ > #include "pthread_impl.h" > #include "malloc_impl.h" > > +#define malloc __libc_malloc > +#define realloc __libc_realloc > +#define free __libc_free > + > #if defined(__GNUC__) && defined(__PIC__) > #define inline inline __attribute__((always_inline)) > #endif > diff --git a/src/malloc/realloc.c b/src/malloc/realloc.c > new file mode 100644 > index 00000000..fb0e8b7c > --- /dev/null > +++ b/src/malloc/realloc.c > @@ -0,0 +1,6 @@ > +#include > + > +void *realloc(void *p, size_t n) > +{ > + return __libc_realloc(p, n); > +} > -- > 2.21.0 > > >From 34952fe5de44a833370cbe87b63fb8eec61466d7 Mon Sep 17 00:00:00 2001 > From: Rich Felker > Date: Wed, 11 Nov 2020 13:08:42 -0500 > Subject: [PATCH 4/5] convert malloc use under libc-internal locks to use > internal allocator > > this change lifts undocumented restrictions on calls by replacement > mallocs to libc functions that might take these locks, and sets the > stage for lifting restrictions on the child execution environment > after multithreaded fork. 
> > care is taken to #define macros to replace all four functions (malloc, > calloc, realloc, free) even if not all of them will be used, using an > undefined symbol name for the ones intended not to be used so that any > inadvertent future use will be caught at compile time rather than > directed to the wrong implementation. > --- > ldso/dynlink.c | 5 +++++ > src/aio/aio.c | 5 +++++ > src/exit/atexit.c | 5 +++++ > src/ldso/dlerror.c | 5 +++++ > src/locale/dcngettext.c | 5 +++++ > src/locale/locale_map.c | 6 ++++++ > src/thread/sem_open.c | 5 +++++ > src/time/__tz.c | 5 +++++ > 8 files changed, 41 insertions(+) this patch looks good. > > diff --git a/ldso/dynlink.c b/ldso/dynlink.c > index 502e52c5..61714f40 100644 > --- a/ldso/dynlink.c > +++ b/ldso/dynlink.c > @@ -23,6 +23,11 @@ > #include "libc.h" > #include "dynlink.h" > > +#define malloc __libc_malloc > +#define calloc __libc_calloc > +#define realloc __libc_realloc > +#define free __libc_free > + > static void error(const char *, ...); > > #define MAXP2(a,b) (-(-(a)&-(b))) > diff --git a/src/aio/aio.c b/src/aio/aio.c > index b488e3d6..e004f98b 100644 > --- a/src/aio/aio.c > +++ b/src/aio/aio.c > @@ -11,6 +11,11 @@ > #include "pthread_impl.h" > #include "aio_impl.h" > > +#define malloc __libc_malloc > +#define calloc __libc_calloc > +#define realloc __libc_realloc > +#define free __libc_free > + > /* The following is a threads-based implementation of AIO with minimal > * dependence on implementation details. 
Most synchronization is > * performed with pthread primitives, but atomics and futex operations > diff --git a/src/exit/atexit.c b/src/exit/atexit.c > index 160d277a..fcd940fa 100644 > --- a/src/exit/atexit.c > +++ b/src/exit/atexit.c > @@ -3,6 +3,11 @@ > #include "libc.h" > #include "lock.h" > > +#define malloc __libc_malloc > +#define calloc __libc_calloc > +#define realloc undef > +#define free undef > + > /* Ensure that at least 32 atexit handlers can be registered without malloc */ > #define COUNT 32 > > diff --git a/src/ldso/dlerror.c b/src/ldso/dlerror.c > index d8bbfc03..c782ca6c 100644 > --- a/src/ldso/dlerror.c > +++ b/src/ldso/dlerror.c > @@ -5,6 +5,11 @@ > #include "dynlink.h" > #include "lock.h" > > +#define malloc __libc_malloc > +#define calloc __libc_calloc > +#define realloc __libc_realloc > +#define free __libc_free > + > char *dlerror() > { > pthread_t self = __pthread_self(); > diff --git a/src/locale/dcngettext.c b/src/locale/dcngettext.c > index 4c304393..39a98e83 100644 > --- a/src/locale/dcngettext.c > +++ b/src/locale/dcngettext.c > @@ -11,6 +11,11 @@ > #include "pleval.h" > #include "lock.h" > > +#define malloc __libc_malloc > +#define calloc __libc_calloc > +#define realloc undef > +#define free undef > + > struct binding { > struct binding *next; > int dirlen; > diff --git a/src/locale/locale_map.c b/src/locale/locale_map.c > index e7eede62..94f1b04e 100644 > --- a/src/locale/locale_map.c > +++ b/src/locale/locale_map.c > @@ -1,10 +1,16 @@ > #include > #include > #include > +#include > #include "locale_impl.h" > #include "libc.h" > #include "lock.h" > > +#define malloc __libc_malloc > +#define calloc undef > +#define realloc undef > +#define free undef > + > const char *__lctrans_impl(const char *msg, const struct __locale_map *lm) > { > const char *trans = 0; > diff --git a/src/thread/sem_open.c b/src/thread/sem_open.c > index 6fb0c5b2..dad8f177 100644 > --- a/src/thread/sem_open.c > +++ b/src/thread/sem_open.c > @@ -13,6 +13,11 @@ > 
#include > #include "lock.h" > > +#define malloc __libc_malloc > +#define calloc __libc_calloc > +#define realloc undef > +#define free undef > + > static struct { > ino_t ino; > sem_t *sem; > diff --git a/src/time/__tz.c b/src/time/__tz.c > index 49a7371e..3044d206 100644 > --- a/src/time/__tz.c > +++ b/src/time/__tz.c > @@ -7,6 +7,11 @@ > #include "libc.h" > #include "lock.h" > > +#define malloc __libc_malloc > +#define calloc undef > +#define realloc undef > +#define free undef > + > long __timezone = 0; > int __daylight = 0; > char *__tzname[2] = { 0, 0 }; > -- > 2.21.0 > > >From 167390f05564e0a4d3fcb4329377fd7743267560 Mon Sep 17 00:00:00 2001 > From: Rich Felker > Date: Wed, 11 Nov 2020 13:37:33 -0500 > Subject: [PATCH] lift child restrictions after multi-threaded fork > > as the outcome of Austin Group tracker issue #62, future editions of > POSIX have dropped the requirement that fork be AS-safe. this allows > but does not require implementations to synchronize fork with internal > locks and give forked children of multithreaded parents a partly or > fully unrestricted execution environment where they can continue to > use the standard library (per POSIX, they can only portably use > AS-safe functions). > > up until recently, taking this allowance did not seem desirable. > however, commit 8ed2bd8bfcb4ea6448afb55a941f4b5b2b0398c0 exposed the > extent to which applications and libraries are depending on the > ability to use malloc and other non-AS-safe interfaces in MT-forked > children, by converting latent very-low-probability catastrophic state > corruption into predictable deadlock. dealing with the fallout has > been a huge burden for users/distros. > > while it looks like most of the non-portable usage in applications > could be fixed given sufficient effort, at least some of it seems to > occur in language runtimes which are exposing the ability to run > unrestricted code in the child as part of the contract with the > programmer. 
any attempt at fixing such contracts is not just a > technical problem but a social one, and is probably not tractable. > > this patch extends the fork function to take locks for all libc > singletons in the parent, and release or reset those locks in the > child, so that when the underlying fork operation takes place, the > state protected by these locks is consistent and ready for the child > to use. locking is skipped in the case where the parent is > single-threaded so as not to interfere with legacy AS-safety property > of fork in single-threaded programs. lock order is mostly arbitrary, > but the malloc locks (including bump allocator in case it's used) must > be taken after the locks on any subsystems that might use malloc, and > non-AS-safe locks cannot be taken while the thread list lock is held, > imposing a requirement that it be taken last. > --- > ldso/dynlink.c | 19 ++++++++++ > src/exit/at_quick_exit.c | 2 + > src/exit/atexit.c | 2 + > src/internal/fork_impl.h | 19 ++++++++++ > src/ldso/dlerror.c | 2 + > src/locale/dcngettext.c | 5 ++- > src/locale/locale_map.c | 5 ++- > src/malloc/lite_malloc.c | 5 ++- > src/malloc/mallocng/glue.h | 14 ++++++- > src/malloc/oldmalloc/malloc.c | 19 ++++++++++ > src/misc/syslog.c | 2 + > src/prng/random.c | 2 + > src/process/fork.c | 70 +++++++++++++++++++++++++++++++++++ > src/stdio/ofl.c | 2 + > src/thread/sem_open.c | 2 + > src/thread/vmlock.c | 2 + > src/time/__tz.c | 2 + > 17 files changed, 170 insertions(+), 4 deletions(-) > create mode 100644 src/internal/fork_impl.h > > diff --git a/ldso/dynlink.c b/ldso/dynlink.c > index 61714f40..6b868c84 100644 > --- a/ldso/dynlink.c > +++ b/ldso/dynlink.c > @@ -20,6 +20,7 @@ > #include > #include > #include "pthread_impl.h" > +#include "fork_impl.h" > #include "libc.h" > #include "dynlink.h" > > @@ -1426,6 +1427,17 @@ void __libc_exit_fini() > } > } > > +void __ldso_atfork(int who) > +{ > + if (who<0) { > + pthread_rwlock_wrlock(&lock); > + 
pthread_mutex_lock(&init_fini_lock); > + } else { > + pthread_mutex_unlock(&init_fini_lock); > + pthread_rwlock_unlock(&lock); > + } > +} > + > static struct dso **queue_ctors(struct dso *dso) > { > size_t cnt, qpos, spos, i; > @@ -1484,6 +1496,13 @@ static struct dso **queue_ctors(struct dso *dso) > } > queue[qpos] = 0; > for (i=0; i<qpos; i++) queue[i]->mark = 0; > + for (i=0; i<qpos; i++) > + if (queue[i]->ctor_visitor && queue[i]->ctor_visitor->tid < 0) { > + error("State of %s is inconsistent due to multithreaded fork\n", > + queue[i]->name); > + free(queue); > + if (runtime) longjmp(*rtld_fail, 1); > + } hm since fork takes the init_fini_lock i guess the ctors could be finished in the child if necessary. or is there some problem with that? otherwise the patch looks good to me. > > return queue; > } > diff --git a/src/exit/at_quick_exit.c b/src/exit/at_quick_exit.c > index d3ce6522..e4b5d78d 100644 > --- a/src/exit/at_quick_exit.c > +++ b/src/exit/at_quick_exit.c > @@ -1,12 +1,14 @@ > #include > #include "libc.h" > #include "lock.h" > +#include "fork_impl.h" > > #define COUNT 32 > > static void (*funcs[COUNT])(void); > static int count; > static volatile int lock[1]; > +volatile int *const __at_quick_exit_lockptr = lock; > > void __funcs_on_quick_exit() > { > diff --git a/src/exit/atexit.c b/src/exit/atexit.c > index fcd940fa..854e9fdd 100644 > --- a/src/exit/atexit.c > +++ b/src/exit/atexit.c > @@ -2,6 +2,7 @@ > #include > #include "libc.h" > #include "lock.h" > +#include "fork_impl.h" > > #define malloc __libc_malloc > #define calloc __libc_calloc > @@ -20,6 +21,7 @@ static struct fl > > static int slot; > static volatile int lock[1]; > +volatile int *const __atexit_lockptr = lock; > > void __funcs_on_exit() > { > diff --git a/src/internal/fork_impl.h b/src/internal/fork_impl.h > new file mode 100644 > index 00000000..5892c13b > --- /dev/null > +++ b/src/internal/fork_impl.h > @@ -0,0 +1,19 @@ > +#include > + > +extern hidden volatile int *const __at_quick_exit_lockptr; > +extern hidden 
volatile int *const __atexit_lockptr; > +extern hidden volatile int *const __dlerror_lockptr; > +extern hidden volatile int *const __gettext_lockptr; > +extern hidden volatile int *const __locale_lockptr; > +extern hidden volatile int *const __random_lockptr; > +extern hidden volatile int *const __sem_open_lockptr; > +extern hidden volatile int *const __stdio_ofl_lockptr; > +extern hidden volatile int *const __syslog_lockptr; > +extern hidden volatile int *const __timezone_lockptr; > + > +extern hidden volatile int *const __bump_lockptr; > + > +extern hidden volatile int *const __vmlock_lockptr; > + > +hidden void __malloc_atfork(int); > +hidden void __ldso_atfork(int); > diff --git a/src/ldso/dlerror.c b/src/ldso/dlerror.c > index c782ca6c..afe59253 100644 > --- a/src/ldso/dlerror.c > +++ b/src/ldso/dlerror.c > @@ -4,6 +4,7 @@ > #include "pthread_impl.h" > #include "dynlink.h" > #include "lock.h" > +#include "fork_impl.h" > > #define malloc __libc_malloc > #define calloc __libc_calloc > @@ -24,6 +25,7 @@ char *dlerror() > > static volatile int freebuf_queue_lock[1]; > static void **freebuf_queue; > +volatile int *const __dlerror_lockptr = freebuf_queue_lock; > > void __dl_thread_cleanup(void) > { > diff --git a/src/locale/dcngettext.c b/src/locale/dcngettext.c > index 39a98e83..d1e6c6d1 100644 > --- a/src/locale/dcngettext.c > +++ b/src/locale/dcngettext.c > @@ -10,6 +10,7 @@ > #include "atomic.h" > #include "pleval.h" > #include "lock.h" > +#include "fork_impl.h" > > #define malloc __libc_malloc > #define calloc __libc_calloc > @@ -39,9 +40,11 @@ static char *gettextdir(const char *domainname, size_t *dirlen) > return 0; > } > > +static volatile int lock[1]; > +volatile int *const __gettext_lockptr = lock; > + > char *bindtextdomain(const char *domainname, const char *dirname) > { > - static volatile int lock[1]; > struct binding *p, *q; > > if (!domainname) return 0; > diff --git a/src/locale/locale_map.c b/src/locale/locale_map.c > index 94f1b04e..fa51f2e3 
100644 > --- a/src/locale/locale_map.c > +++ b/src/locale/locale_map.c > @@ -5,6 +5,7 @@ > #include "locale_impl.h" > #include "libc.h" > #include "lock.h" > +#include "fork_impl.h" > > #define malloc __libc_malloc > #define calloc undef > @@ -27,9 +28,11 @@ static const char envvars[][12] = { > "LC_MESSAGES", > }; > > +static volatile int lock[1]; > +volatile int *const __locale_lockptr = lock; > + > const struct __locale_map *__get_locale(int cat, const char *val) > { > - static volatile int lock[1]; > static void *volatile loc_head; > const struct __locale_map *p; > struct __locale_map *new = 0; > diff --git a/src/malloc/lite_malloc.c b/src/malloc/lite_malloc.c > index 0f461617..43a988fb 100644 > --- a/src/malloc/lite_malloc.c > +++ b/src/malloc/lite_malloc.c > @@ -6,6 +6,7 @@ > #include "libc.h" > #include "lock.h" > #include "syscall.h" > +#include "fork_impl.h" > > #define ALIGN 16 > > @@ -31,10 +32,12 @@ static int traverses_stack_p(uintptr_t old, uintptr_t new) > return 0; > } > > +static volatile int lock[1]; > +volatile int *const __bump_lockptr = lock; > + > static void *__simple_malloc(size_t n) > { > static uintptr_t brk, cur, end; > - static volatile int lock[1]; > static unsigned mmap_step; > size_t align=1; > void *p; > diff --git a/src/malloc/mallocng/glue.h b/src/malloc/mallocng/glue.h > index 8d7d9a3b..151c48b8 100644 > --- a/src/malloc/mallocng/glue.h > +++ b/src/malloc/mallocng/glue.h > @@ -60,7 +60,8 @@ __attribute__((__visibility__("hidden"))) > extern int __malloc_lock[1]; > > #define LOCK_OBJ_DEF \ > -int __malloc_lock[1]; > +int __malloc_lock[1]; \ > +void __malloc_atfork(int who) { malloc_atfork(who); } > > static inline void rdlock() > { > @@ -77,5 +78,16 @@ static inline void unlock() > static inline void upgradelock() > { > } > +static inline void resetlock() > +{ > + __malloc_lock[0] = 0; > +} > + > +static inline void malloc_atfork(int who) > +{ > + if (who<0) rdlock(); > + else if (who>0) resetlock(); > + else unlock(); > +} > > 
#endif > diff --git a/src/malloc/oldmalloc/malloc.c b/src/malloc/oldmalloc/malloc.c > index 0c082bce..53f5f959 100644 > --- a/src/malloc/oldmalloc/malloc.c > +++ b/src/malloc/oldmalloc/malloc.c > @@ -9,6 +9,7 @@ > #include "atomic.h" > #include "pthread_impl.h" > #include "malloc_impl.h" > +#include "fork_impl.h" > > #define malloc __libc_malloc > #define realloc __libc_realloc > @@ -531,3 +532,21 @@ void __malloc_donate(char *start, char *end) > c->csize = n->psize = C_INUSE | (end-start); > __bin_chunk(c); > } > + > +void __malloc_atfork(int who) > +{ > + if (who<0) { > + lock(mal.split_merge_lock); > + for (int i=0; i<64; i++) > + lock(mal.bins[i].lock); > + } else if (!who) { > + for (int i=0; i<64; i++) > + unlock(mal.bins[i].lock); > + unlock(mal.split_merge_lock); > + } else { > + for (int i=0; i<64; i++) > + mal.bins[i].lock[0] = mal.bins[i].lock[1] = 0; > + mal.split_merge_lock[1] = 0; > + mal.split_merge_lock[0] = 0; > + } > +} > diff --git a/src/misc/syslog.c b/src/misc/syslog.c > index 13d4b0a6..7dc0c1be 100644 > --- a/src/misc/syslog.c > +++ b/src/misc/syslog.c > @@ -10,6 +10,7 @@ > #include > #include > #include "lock.h" > +#include "fork_impl.h" > > static volatile int lock[1]; > static char log_ident[32]; > @@ -17,6 +18,7 @@ static int log_opt; > static int log_facility = LOG_USER; > static int log_mask = 0xff; > static int log_fd = -1; > +volatile int *const __syslog_lockptr = lock; > > int setlogmask(int maskpri) > { > diff --git a/src/prng/random.c b/src/prng/random.c > index 633a17f6..d3780fa7 100644 > --- a/src/prng/random.c > +++ b/src/prng/random.c > @@ -1,6 +1,7 @@ > #include > #include > #include "lock.h" > +#include "fork_impl.h" > > /* > this code uses the same lagged fibonacci generator as the > @@ -23,6 +24,7 @@ static int i = 3; > static int j = 0; > static uint32_t *x = init+1; > static volatile int lock[1]; > +volatile int *const __random_lockptr = lock; > > static uint32_t lcg31(uint32_t x) { > return (1103515245*x + 12345) & 
0x7fffffff; > diff --git a/src/process/fork.c b/src/process/fork.c > index 8d34a9c4..54bc2892 100644 > --- a/src/process/fork.c > +++ b/src/process/fork.c > @@ -1,15 +1,85 @@ > #include > #include > #include "libc.h" > +#include "lock.h" > +#include "pthread_impl.h" > +#include "fork_impl.h" > + > +static volatile int *const dummy_lockptr = 0; > + > +weak_alias(dummy_lockptr, __at_quick_exit_lockptr); > +weak_alias(dummy_lockptr, __atexit_lockptr); > +weak_alias(dummy_lockptr, __dlerror_lockptr); > +weak_alias(dummy_lockptr, __gettext_lockptr); > +weak_alias(dummy_lockptr, __locale_lockptr); > +weak_alias(dummy_lockptr, __random_lockptr); > +weak_alias(dummy_lockptr, __sem_open_lockptr); > +weak_alias(dummy_lockptr, __stdio_ofl_lockptr); > +weak_alias(dummy_lockptr, __syslog_lockptr); > +weak_alias(dummy_lockptr, __timezone_lockptr); > +weak_alias(dummy_lockptr, __bump_lockptr); > + > +weak_alias(dummy_lockptr, __vmlock_lockptr); > + > +static volatile int *const *const atfork_locks[] = { > + &__at_quick_exit_lockptr, > + &__atexit_lockptr, > + &__dlerror_lockptr, > + &__gettext_lockptr, > + &__locale_lockptr, > + &__random_lockptr, > + &__sem_open_lockptr, > + &__stdio_ofl_lockptr, > + &__syslog_lockptr, > + &__timezone_lockptr, > + &__bump_lockptr, > +}; > > static void dummy(int x) { } > weak_alias(dummy, __fork_handler); > +weak_alias(dummy, __malloc_atfork); > +weak_alias(dummy, __ldso_atfork); > + > +static void dummy_0(void) { } > +weak_alias(dummy_0, __tl_lock); > +weak_alias(dummy_0, __tl_unlock); > > pid_t fork(void) > { > + sigset_t set; > __fork_handler(-1); > + __block_app_sigs(&set); > + int need_locks = libc.need_locks > 0; > + if (need_locks) { > + __ldso_atfork(-1); > + __inhibit_ptc(); > + for (int i=0; i<sizeof atfork_locks/sizeof *atfork_locks; i++) > + if (*atfork_locks[i]) LOCK(*atfork_locks[i]); > + __malloc_atfork(-1); > + __tl_lock(); > + } > + pthread_t self=__pthread_self(), next=self->next; > pid_t ret = _Fork(); > int errno_save = errno; > + if (need_locks) { > + if (!ret) { > + for 
(pthread_t td=next; td!=self; td=td->next) > + td->tid = -1; > + if (__vmlock_lockptr) { > + __vmlock_lockptr[0] = 0; > + __vmlock_lockptr[1] = 0; > + } > + } > + __tl_unlock(); > + __malloc_atfork(!ret); > + for (int i=0; i<sizeof atfork_locks/sizeof *atfork_locks; i++) > + if (*atfork_locks[i]) > + if (ret) UNLOCK(*atfork_locks[i]); > + else **atfork_locks[i] = 0; > + __release_ptc(); > + __ldso_atfork(!ret); > + } > + __restore_sigs(&set); > __fork_handler(!ret); > if (ret<0) errno = errno_save; > return ret; > diff --git a/src/stdio/ofl.c b/src/stdio/ofl.c > index f2d3215a..aad3d171 100644 > --- a/src/stdio/ofl.c > +++ b/src/stdio/ofl.c > @@ -1,8 +1,10 @@ > #include "stdio_impl.h" > #include "lock.h" > +#include "fork_impl.h" > > static FILE *ofl_head; > static volatile int ofl_lock[1]; > +volatile int *const __stdio_ofl_lockptr = ofl_lock; > > FILE **__ofl_lock() > { > diff --git a/src/thread/sem_open.c b/src/thread/sem_open.c > index dad8f177..0ad29de9 100644 > --- a/src/thread/sem_open.c > +++ b/src/thread/sem_open.c > @@ -12,6 +12,7 @@ > #include > #include > #include "lock.h" > +#include "fork_impl.h" > > #define malloc __libc_malloc > #define calloc __libc_calloc > @@ -24,6 +25,7 @@ static struct { > int refcnt; > } *semtab; > static volatile int lock[1]; > +volatile int *const __sem_open_lockptr = lock; > > #define FLAGS (O_RDWR|O_NOFOLLOW|O_CLOEXEC|O_NONBLOCK) > > diff --git a/src/thread/vmlock.c b/src/thread/vmlock.c > index 75f3cb76..fa0a8e3c 100644 > --- a/src/thread/vmlock.c > +++ b/src/thread/vmlock.c > @@ -1,6 +1,8 @@ > #include "pthread_impl.h" > +#include "fork_impl.h" > > static volatile int vmlock[2]; > +volatile int *const __vmlock_lockptr = vmlock; > > void __vm_wait() > { > diff --git a/src/time/__tz.c b/src/time/__tz.c > index 3044d206..dd2c42c0 100644 > --- a/src/time/__tz.c > +++ b/src/time/__tz.c > @@ -6,6 +6,7 @@ > #include > #include "libc.h" > #include "lock.h" > +#include "fork_impl.h" > > #define malloc __libc_malloc > #define calloc undef > @@ -35,6 +36,7 @@ static char *old_tz 
= old_tz_buf; > static size_t old_tz_size = sizeof old_tz_buf; > > static volatile int lock[1]; > +volatile int *const __timezone_lockptr = lock; > > static int getint(const char **p) > { > -- > 2.21.0 >
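to summarize how i read the who<0 / who==0 / who>0 convention of __malloc_atfork in the last patch, here is a small standalone sketch. the names (state_lock, state_atfork, locked_fork, child_can_lock) are invented for illustration, a c11 atomic_flag spinlock stands in for the libc-internal lock, and plain fork() stands in for _Fork(); it is not the actual fork.c logic and covers a single lock only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical singleton lock, standing in for one libc-internal lock. */
static atomic_flag state_lock = ATOMIC_FLAG_INIT;

static void state_lock_take(void)
{
	while (atomic_flag_test_and_set_explicit(&state_lock, memory_order_acquire))
		; /* spin until the flag was previously clear */
}

static void state_lock_drop(void)
{
	atomic_flag_clear_explicit(&state_lock, memory_order_release);
}

/* who<0: before fork; who==0: parent after fork; who>0: child. */
static void state_atfork(int who)
{
	assert(who >= -1 && who <= 1);
	if (who<0) state_lock_take();
	else if (!who) state_lock_drop();
	else atomic_flag_clear(&state_lock); /* child: reset, nobody to wake */
}

/* Take the lock, fork, then release (parent) or reset (child).
 * On failure ret<0, so !ret is 0 and the parent path unlocks. */
static pid_t locked_fork(void)
{
	state_atfork(-1);
	pid_t ret = fork();
	state_atfork(!ret);
	return ret;
}

/* Fork via locked_fork and let the child take and drop the lock.
 * Returns 1 if the child exited normally with status 0. If the lock
 * had not been reset in the child, the child would spin forever. */
static int child_can_lock(void)
{
	pid_t pid = locked_fork();
	if (pid<0) return 0;
	if (!pid) {
		state_lock_take();
		state_lock_drop();
		_exit(0);
	}
	int status;
	if (waitpid(pid, &status, 0) != pid) return 0;
	return WIFEXITED(status) && WEXITSTATUS(status)==0;
}
```

because the lock is held across the fork, the state it protects is consistent in the child's copy of memory, and since the child has only one thread it can simply clear the lock instead of doing a real unlock.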