On Mon, Sep 17, 2018 at 04:54:56PM -0400, Rich Felker wrote: > On Thu, Sep 13, 2018 at 01:39:38PM -0700, Benjamin Peterson wrote: > > POSIX says that after the creation of a thread-specific key, its > > associated value should be NULL in all threads. musl may not uphold > > this requirement if a particular key is deleted and later resued. > > > > Here's an example of the bug: > > > > #include > > #include > > > > int main() { > > pthread_key_t k; > > pthread_key_create(&k, NULL); > > printf("%p\n", pthread_getspecific(k)); > > pthread_setspecific(k, &k); > > pthread_key_delete(k); > > pthread_key_create(&k, NULL); > > printf("%p\n", pthread_getspecific(k)); > > pthread_key_delete(k); > > return 0; > > } > > > > Per POSIX, I would expect this testcase to output two NULLs. > > However, musl prints the address of the old data even after the > > second initialization: > > > > $ musl-gcc -o test test.c > > $ ./test > > 0 > > 0x7fff2ba3d004 > > I'm aware of this, and was aware of it when I wrote the code. My view > was that it's a programming error to delete a key while there are > still values associated with it, especially when the intended usage > has dtors (which obviously couldn't be run) linked to the keys, with > the implication that the data stored may have nontrivial destruction > requirements. However, this is not in agreement with POSIX, which > says: > > "The thread-specific data values associated with key need not be > NULL at the time pthread_key_delete() is called." > > I think this should be fixed, but I'm not sure the whole thing is > sufficiently specified. For example what is supposed to happen if a > call to pthread_key_delete is made concurrently with the exit of > another thread? > > I think pthread_key_create and delete need to take an rwlock for write > on the keys (dtors) array, and __pthread_tsd_run_dtors needs to hold a > read lock while walking the list, releasing and reacquiring it > whenever it finds a dtor it needs to call. This at least makes it > thread-safe. The current code is not; if a key is deleted while data > remains, the loop in __pthread_tsd_run_dtors has a TOCTOU race where > it could find (self->tsd[i] && keys[i]) true, then crash when calling > keys[i] if keys[i] has become a null pointer. > > For clearing values at delete time, I don't want to use sequence > numbers and lazy deletion. They waste a lot of memory, aren't robust > unless they're very large (at least 64-bit, preferably larger), and > slow down read access to tsd values. Instead I think we can move > pthread_key_delete to a separate file (so it doesn't pull in bulky > dependencies if not linked) and have it call __synccall to zero the > value in all threads. This is very expensive, but deletion is not > something you expect to have happen often if at all. > > If we maintained a linked list of all live threads we could just walk > that instead, but doing that would impose heavy synchronization costs > on reasonable operations rather than just on unreasonable ones, I > think. > > One mitigation of the cost would be not to reuse keys until they're > exhausted. That is, on deletion, set the dtor value to a sentinel > rather than clearing it. Then, pthread_key_create would only allocate > new key values that haven't been recycled until they all run out. When > they run out, __synccall would be used to clear all the stale values > at once, and all the sentinel dtor pointers could be cleared for > reuse. I have a patch I'm testing that implements this idea, including the mitigation in the last paragraph. It seems to work fine; I've attached it in case you or anyone else wants to take a look before I'm ready to commit/push. Note that your test program is insufficient to demonstrate that it's working; it assumes the same key will be handed out again, which now won't happen until all keys have been exhausted. The best way I know to test is loop allocating keys until you get an error, setting a value for all of them, then delete and recreate just one and confirm that its value is null. Rich