Rich,
thanks a lot for looking into the code.

Am Dienstag, den 12.08.2014, 12:01 -0400 schrieb Rich Felker:
> As far as I can tell, the only thing that's saving you from sending
> futex wakes after free is that you're just using spinlocks.

No, I don't think so. These protect critical sections at the beginning
of the cnd_t calls. The cnd_*wait calls hold the mutex at that time
anyhow, so even if these would be implemented with mutexes (an extra
one per cnd_t to protect the critical section) this wouldn't cause
late wakes, I think.

> This is an
> extremely expensive solution: While contention is rare, as soon as you
> do hit contention, if there are many threads they all pile on and
> start spinning, and the time to obtain a lock (and cpu time/energy
> spent waiting) grows extremely high. And of course it becomes infinite
> if you have any threads of differing priorities and the low-priority
> thread has the lock...

I think you dramatize a bit :)

First you overlooked that all the cnd_*wait calls are done while in
addition the mutex in question is held. So no two waiters will fight
for the critical section at the same time.

In addition this protects against the signal and broadcast calls. I
don't think that a thundering herd of "wakers" (and not waiters) is a
very common scenario.

Second, the idea here is that inside these critical sections there is
only a constant amount of actions that is to be performed, and in
particular there are no waits or other system calls. E.g the malloc
call is done outside, the data is prepared, and then only linked
inside the critical section.

I checked the assembler and this produces only some assembler lines
inside the critical sections. They may only last at most some memory
cycles. The worst case scenario is that the data that is used is not
cached. So such a "regular" critical section last only a very small
fraction of a scheduling interval.

It is very unlikely that a thread that reaches the critical section is
unscheduled *during* that critical section. If it is unscheduled, you
are right, the wait can be long. But that event is very unlikely, so
the average time inside the critical section is still short, with a
probability distribution that is a bit skewed because of the
outliers.

(And then there is no concept of different scheduling priorities for C
threads, all of them are equal.)

Jens

-- 
:: INRIA Nancy Grand Est ::: AlGorille ::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536   ::
:: :::::::::::::::::::::: gsm France : +33 651400183   ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::