mailing list of musl libc
* value of thread-specific data immediately after initialization
@ 2018-09-13 20:39 Benjamin Peterson
  2018-09-17 20:54 ` Rich Felker
From: Benjamin Peterson @ 2018-09-13 20:39 UTC (permalink / raw)
  To: musl

POSIX says that after the creation of a thread-specific key, its associated value should be NULL in all threads. musl may not uphold this requirement if a particular key is deleted and later reused.

Here's an example of the bug:

#include <pthread.h>
#include <stdio.h>

int main() {
	pthread_key_t k;
	pthread_key_create(&k, NULL);
	printf("%p\n", pthread_getspecific(k));
	pthread_setspecific(k, &k);
	pthread_key_delete(k);
	pthread_key_create(&k, NULL);
	printf("%p\n", pthread_getspecific(k));
	pthread_key_delete(k);
	return 0;
}

Per POSIX, I would expect this testcase to output two NULLs. However, musl prints the address of the old data even after the second initialization:

$ musl-gcc -o test test.c
$ ./test 
0
0x7fff2ba3d004



* Re: value of thread-specific data immediately after initialization
  2018-09-13 20:39 value of thread-specific data immediately after initialization Benjamin Peterson
@ 2018-09-17 20:54 ` Rich Felker
  2018-09-18  3:18   ` Rich Felker
From: Rich Felker @ 2018-09-17 20:54 UTC (permalink / raw)
  To: musl

On Thu, Sep 13, 2018 at 01:39:38PM -0700, Benjamin Peterson wrote:
> POSIX says that after the creation of a thread-specific key, its
> associated value should be NULL in all threads. musl may not uphold
> this requirement if a particular key is deleted and later reused.
> 
> Here's an example of the bug:
> 
> #include <pthread.h>
> #include <stdio.h>
> 
> int main() {
> 	pthread_key_t k;
> 	pthread_key_create(&k, NULL);
> 	printf("%p\n", pthread_getspecific(k));
> 	pthread_setspecific(k, &k);
> 	pthread_key_delete(k);
> 	pthread_key_create(&k, NULL);
> 	printf("%p\n", pthread_getspecific(k));
> 	pthread_key_delete(k);
> 	return 0;
> }
> 
> Per POSIX, I would expect this testcase to output two NULLs.
> However, musl prints the address of the old data even after the
> second initialization:
> 
> $ musl-gcc -o test test.c
> $ ./test 
> 0
> 0x7fff2ba3d004

I'm aware of this, and was aware of it when I wrote the code. My view
was that it's a programming error to delete a key while there are
still values associated with it, especially when the intended usage
has dtors (which obviously couldn't be run) linked to the keys, with
the implication that the data stored may have nontrivial destruction
requirements. However, this is not in agreement with POSIX, which
says:

    "The thread-specific data values associated with key need not be
    NULL at the time pthread_key_delete() is called."

I think this should be fixed, but I'm not sure the whole thing is
sufficiently specified. For example what is supposed to happen if a
call to pthread_key_delete is made concurrently with the exit of
another thread?

I think pthread_key_create and delete need to take an rwlock for write
on the keys (dtors) array, and __pthread_tsd_run_dtors needs to hold a
read lock while walking the list, releasing and reacquiring it
whenever it finds a dtor it needs to call. This at least makes it
thread-safe. The current code is not; if a key is deleted while data
remains, the loop in __pthread_tsd_run_dtors has a TOCTOU race where
it could find (self->tsd[i] && keys[i]) true, then crash when calling
keys[i] if keys[i] has become a null pointer.
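
In code, the pattern I have in mind is roughly the following
standalone sketch (keys[] and the per-thread tsd array here are just
stand-ins for the corresponding musl internals, and the dtor-iteration
limit is omitted; the actual change would live in pthread_key_create.c):

#include <pthread.h>

#define KEYS_MAX 128

static pthread_rwlock_t key_lock = PTHREAD_RWLOCK_INITIALIZER;
static void (*keys[KEYS_MAX])(void *); /* dtor slots; create/delete take the write lock */

/* Walk one thread's tsd slots at exit, dropping the read lock around
 * each dtor call so a concurrent pthread_key_delete never has to wait
 * on arbitrary user code. */
static void run_dtors_sketch(void **tsd)
{
	pthread_rwlock_rdlock(&key_lock);
	for (int i = 0; i < KEYS_MAX; i++) {
		void *val = tsd[i];
		void (*dtor)(void *) = keys[i];
		tsd[i] = 0;
		if (val && dtor) {
			pthread_rwlock_unlock(&key_lock);
			dtor(val); /* user code runs unlocked */
			pthread_rwlock_rdlock(&key_lock);
		}
	}
	pthread_rwlock_unlock(&key_lock);
}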

For clearing values at delete time, I don't want to use sequence
numbers and lazy deletion. They waste a lot of memory, aren't robust
unless they're very large (at least 64-bit, preferably larger), and
slow down read access to tsd values. Instead I think we can move
pthread_key_delete to a separate file (so it doesn't pull in bulky
dependencies if not linked) and have it call __synccall to zero the
value in all threads. This is very expensive, but deletion is not
something you expect to have happen often if at all.
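
Ignoring the locking described above, and reusing keys[] from the
sketch earlier plus the internal __synccall/__pthread_self interfaces,
that delete path would look very roughly like this (a sketch, not a
drop-in implementation):

static void zero_deleted_key(void *p)
{
	/* __synccall arranges for this to run in every live thread */
	pthread_key_t k = *(pthread_key_t *)p;
	__pthread_self()->tsd[k] = 0;
}

int pthread_key_delete_sketch(pthread_key_t k)
{
	keys[k] = 0;                      /* no further dtor runs for this key */
	__synccall(zero_deleted_key, &k); /* broadcast the clear to all threads */
	return 0;
}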

If we maintained a linked list of all live threads we could just walk
that instead, but doing that would impose heavy synchronization costs
on reasonable operations rather than just on unreasonable ones, I
think.

One mitigation of the cost would be not to reuse keys until they're
exhausted. That is, on deletion, set the dtor value to a sentinel
rather than clearing it. Then, pthread_key_create would only allocate
new key values that haven't been recycled until they all run out. When
they run out, __synccall would be used to clear all the stale values
at once, and all the sentinel dtor pointers could be cleared for
reuse.

Rich



* Re: value of thread-specific data immediately after initialization
  2018-09-17 20:54 ` Rich Felker
@ 2018-09-18  3:18   ` Rich Felker
From: Rich Felker @ 2018-09-18  3:18 UTC (permalink / raw)
  To: musl


On Mon, Sep 17, 2018 at 04:54:56PM -0400, Rich Felker wrote:
> On Thu, Sep 13, 2018 at 01:39:38PM -0700, Benjamin Peterson wrote:
> > POSIX says that after the creation of a thread-specific key, its
> > associated value should be NULL in all threads. musl may not uphold
> > this requirement if a particular key is deleted and later reused.
> > 
> > Here's an example of the bug:
> > 
> > #include <pthread.h>
> > #include <stdio.h>
> > 
> > int main() {
> > 	pthread_key_t k;
> > 	pthread_key_create(&k, NULL);
> > 	printf("%p\n", pthread_getspecific(k));
> > 	pthread_setspecific(k, &k);
> > 	pthread_key_delete(k);
> > 	pthread_key_create(&k, NULL);
> > 	printf("%p\n", pthread_getspecific(k));
> > 	pthread_key_delete(k);
> > 	return 0;
> > }
> > 
> > Per POSIX, I would expect this testcase to output two NULLs.
> > However, musl prints the address of the old data even after the
> > second initialization:
> > 
> > $ musl-gcc -o test test.c
> > $ ./test 
> > 0
> > 0x7fff2ba3d004
> 
> I'm aware of this, and was aware of it when I wrote the code. My view
> was that it's a programming error to delete a key while there are
> still values associated with it, especially when the intended usage
> has dtors (which obviously couldn't be run) linked to the keys, with
> the implication that the data stored may have nontrivial destruction
> requirements. However, this is not in agreement with POSIX, which
> says:
> 
>     "The thread-specific data values associated with key need not be
>     NULL at the time pthread_key_delete() is called."
> 
> I think this should be fixed, but I'm not sure the whole thing is
> sufficiently specified. For example what is supposed to happen if a
> call to pthread_key_delete is made concurrently with the exit of
> another thread?
> 
> I think pthread_key_create and delete need to take an rwlock for write
> on the keys (dtors) array, and __pthread_tsd_run_dtors needs to hold a
> read lock while walking the list, releasing and reacquiring it
> whenever it finds a dtor it needs to call. This at least makes it
> thread-safe. The current code is not; if a key is deleted while data
> remains, the loop in __pthread_tsd_run_dtors has a TOCTOU race where
> it could find (self->tsd[i] && keys[i]) true, then crash when calling
> keys[i] if keys[i] has become a null pointer.
> 
> For clearing values at delete time, I don't want to use sequence
> numbers and lazy deletion. They waste a lot of memory, aren't robust
> unless they're very large (at least 64-bit, preferably larger), and
> slow down read access to tsd values. Instead I think we can move
> pthread_key_delete to a separate file (so it doesn't pull in bulky
> dependencies if not linked) and have it call __synccall to zero the
> value in all threads. This is very expensive, but deletion is not
> something you expect to have happen often if at all.
> 
> If we maintained a linked list of all live threads we could just walk
> that instead, but doing that would impose heavy synchronization costs
> on reasonable operations rather than just on unreasonable ones, I
> think.
> 
> One mitigation of the cost would be not to reuse keys until they're
> exhausted. That is, on deletion, set the dtor value to a sentinel
> rather than clearing it. Then, pthread_key_create would only allocate
> new key values that haven't been recycled until they all run out. When
> they run out, __synccall would be used to clear all the stale values
> at once, and all the sentinel dtor pointers could be cleared for
> reuse.

I have a patch I'm testing that implements this idea, including the
mitigation in the last paragraph. It seems to work fine; I've attached
it in case you or anyone else wants to take a look before I'm ready to
commit/push.

Note that your test program is insufficient to demonstrate that it's
working; it assumes the same key will be handed out again, which now
won't happen until all keys have been exhausted. The best way I know
to test it is to loop allocating keys until you get an error, set a
value for all of them, then delete and recreate just one and confirm
that its value is null.
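
Something along these lines, using only the public API
(PTHREAD_KEYS_MAX from <limits.h> just bounds the array; the assert is
the actual check):

#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stdio.h>

int main(void)
{
	pthread_key_t k[PTHREAD_KEYS_MAX];
	size_t n = 0;

	/* allocate keys until the implementation runs out */
	while (n < PTHREAD_KEYS_MAX && !pthread_key_create(&k[n], NULL))
		n++;

	/* give every allocated key a non-null value */
	for (size_t i = 0; i < n; i++)
		pthread_setspecific(k[i], &k[i]);

	/* delete and recreate just one key; it must read back as null */
	pthread_key_delete(k[0]);
	pthread_key_create(&k[0], NULL);
	printf("%p\n", pthread_getspecific(k[0]));
	assert(pthread_getspecific(k[0]) == NULL);
	return 0;
}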

Rich

[-- Attachment #2: key_delete.diff --]
[-- Type: text/plain, Size: 4094 bytes --]

diff --git a/src/internal/pthread_impl.h b/src/internal/pthread_impl.h
index 052a547..26e6e1d 100644
--- a/src/internal/pthread_impl.h
+++ b/src/internal/pthread_impl.h
@@ -148,6 +148,9 @@ hidden void __do_cleanup_push(struct __ptcb *);
 hidden void __do_cleanup_pop(struct __ptcb *);
 hidden void __pthread_tsd_run_dtors();
 
+hidden void __pthread_key_delete_synccall(void (*)(void *), void *);
+hidden int __pthread_key_delete_impl(pthread_key_t);
+
 extern hidden volatile int __block_new_threads;
 extern hidden volatile size_t __pthread_tsd_size;
 extern hidden void *__pthread_tsd_main[];
diff --git a/src/thread/pthread_key_create.c b/src/thread/pthread_key_create.c
index a78e507..b6ec8c5 100644
--- a/src/thread/pthread_key_create.c
+++ b/src/thread/pthread_key_create.c
@@ -5,52 +5,100 @@ void *__pthread_tsd_main[PTHREAD_KEYS_MAX] = { 0 };
 
 static void (*volatile keys[PTHREAD_KEYS_MAX])(void *);
 
+static pthread_rwlock_t key_lock = PTHREAD_RWLOCK_INITIALIZER;
+
+static pthread_key_t next_key;
+
 static void nodtor(void *dummy)
 {
 }
 
+static void dirty(void *dummy)
+{
+}
+
+static void clean_dirty_tsd(void *dummy)
+{
+	pthread_t self = __pthread_self();
+	pthread_key_t i;
+	for (i=0; i<PTHREAD_KEYS_MAX; i++) {
+		if (keys[i] == dirty && self->tsd[i])
+			self->tsd[i] = 0;
+	}
+}
+
 int __pthread_key_create(pthread_key_t *k, void (*dtor)(void *))
 {
-	unsigned i = (uintptr_t)&k / 16 % PTHREAD_KEYS_MAX;
-	unsigned j = i;
+	pthread_key_t j = next_key;
 	pthread_t self = __pthread_self();
+	int found_dirty = 0;
 
 	/* This can only happen in the main thread before
 	 * pthread_create has been called. */
 	if (!self->tsd) self->tsd = __pthread_tsd_main;
 
 	if (!dtor) dtor = nodtor;
+
+	pthread_rwlock_wrlock(&key_lock);
 	do {
-		if (!a_cas_p(keys+j, 0, (void *)dtor)) {
-			*k = j;
+		if (!keys[j]) {
+			keys[next_key = *k = j] = dtor;
+			pthread_rwlock_unlock(&key_lock);
 			return 0;
+		} else if (keys[j] == dirty) {
+			found_dirty = 1;
+		}
+	} while ((j=(j+1)%PTHREAD_KEYS_MAX) != next_key);
+
+	if (!found_dirty) {
+		pthread_rwlock_unlock(&key_lock);
+		return EAGAIN;
+	}
+
+	__pthread_key_delete_synccall(clean_dirty_tsd, 0);
+
+	for (j=0; j<PTHREAD_KEYS_MAX; j++) {
+		if (keys[j] == dirty) {
+			if (dtor) {
+				keys[next_key = *k = j] = dtor;
+				dtor = 0;
+			} else {
+				keys[j] = 0;
+			}
 		}
-	} while ((j=(j+1)%PTHREAD_KEYS_MAX) != i);
-	return EAGAIN;
+	}
+
+	pthread_rwlock_unlock(&key_lock);
+	return 0;
 }
 
-int __pthread_key_delete(pthread_key_t k)
+int __pthread_key_delete_impl(pthread_key_t k)
 {
-	keys[k] = 0;
+	pthread_rwlock_wrlock(&key_lock);
+	keys[k] = dirty;
+	pthread_rwlock_unlock(&key_lock);
 	return 0;
 }
 
 void __pthread_tsd_run_dtors()
 {
 	pthread_t self = __pthread_self();
-	int i, j, not_finished = self->tsd_used;
-	for (j=0; not_finished && j<PTHREAD_DESTRUCTOR_ITERATIONS; j++) {
-		not_finished = 0;
+	int i, j;
+	pthread_rwlock_rdlock(&key_lock);
+	for (j=0; self->tsd_used && j<PTHREAD_DESTRUCTOR_ITERATIONS; j++) {
+		self->tsd_used = 0;
 		for (i=0; i<PTHREAD_KEYS_MAX; i++) {
-			if (self->tsd[i] && keys[i]) {
-				void *tmp = self->tsd[i];
-				self->tsd[i] = 0;
-				keys[i](tmp);
-				not_finished = 1;
+			void *val = self->tsd[i];
+			void (*dtor)(void *) = keys[i];
+			self->tsd[i] = 0;
+			if (val && dtor && dtor != nodtor && dtor != dirty) {
+				pthread_rwlock_unlock(&key_lock);
+				dtor(val);
+				pthread_rwlock_rdlock(&key_lock);
 			}
 		}
 	}
+	pthread_rwlock_unlock(&key_lock);
 }
 
-weak_alias(__pthread_key_delete, pthread_key_delete);
 weak_alias(__pthread_key_create, pthread_key_create);
diff --git a/src/thread/pthread_key_delete.c b/src/thread/pthread_key_delete.c
index e69de29..012fe2d 100644
--- a/src/thread/pthread_key_delete.c
+++ b/src/thread/pthread_key_delete.c
@@ -0,0 +1,14 @@
+#include "pthread_impl.h"
+#include "libc.h"
+
+void __pthread_key_delete_synccall(void (*f)(void *), void *p)
+{
+	__synccall(f, p);
+}
+
+int __pthread_key_delete(pthread_key_t k)
+{
+	return __pthread_key_delete_impl(k);
+}
+
+weak_alias(__pthread_key_delete, pthread_key_delete);

