Hello,

On Friday, 08.08.2014, at 16:48 -0400, Rich Felker wrote:
> On Fri, Aug 08, 2014 at 03:14:06PM -0400, Rich Felker wrote:
> > I think I may have a solution you'll like:
> >
> > We can perform the release of the lock via a compare-and-swap rather
> > than a simple swap. In this way, we can know before releasing the lock
> > whether it's going to require a wake or not:
> >
> > - If waiters was zero and the cas from owned/uncontended to zero
> >   succeeds, no futex wake operation is needed.
> >
> > - If waiters was nonzero, or if the cas fails (thereby instead
> >   requiring a cas from owned/contended to zero), we can do the
> >   following:
> >
> >   Don't use a userspace CAS to release; this would allow the lock to be
> >   acquired by another thread, released, destroyed, and freed before the
> >   futex wake is performed. Instead, use FUTEX_WAKE_OP to atomically
> >   perform the atomic assignment and futex wake.
>
> FUTEX_WAKE_OP is highly under-documented, and I'm worried it might be
> unsupported on some archs (since the atomics for it have to be
> implemented on a per-arch basis in the kernel) but of course we can
> just fall back on archs where it's not supported yet.
>
> Anyway, the behavior seems to be:
>
> - Futex acquisition for uaddr1 and uaddr2 both happen prior to the
>   atomic operation, and this holds locks that seem to prevent new
>   waiters on the futex(es). This should preclude any risk of waking a
>   new waiter that arrives after the atomic operation, as desired.
>
> - Both uaddr1 and uaddr2 are hashed, with no check for equality. This
>   is a fairly costly, wasteful operation, but could be fixed on the
>   kernel side. At present I suspect they don't care because
>   FUTEX_WAKE_OP is considered unnecessary, but if I raise it on the
>   glibc bug tracker thread for issue 13690 as a solution to the
>   problem, I think there would be a lot more interest in optimizing
>   this kernel path.
>
> - After the atomic operation is performed, a wake is always performed
>   on uaddr1 (based on the previous acquisition); this fact is omitted
>   from all the documentation, but it's obviously intentional since
>   otherwise the uaddr1 argument would not be used for anything but
>   wasting time. The wake on uaddr2 is conditional on a comparison.
>
> - No allocation is required anywhere in the operation, so we don't
>   have to worry about lost actions on OOM. For plain FUTEX_WAKE this
>   would not have been an issue (if acquiring the futex required memory,
>   then failure for FUTEX_WAKE to acquire it would mean there was no
>   FUTEX_WAIT taking place anyway), but for FUTEX_WAKE_OP, failure
>   would omit the atomic operation, which must take place even if there
>   are no current FUTEX_WAIT waiters (e.g. if the FUTEX_WAIT was
>   interrupted by a signal handler).
>
> Based on the above, I think it's safe to move forward with using
> FUTEX_WAKE_OP. It seems optimal to me to use uaddr1==uaddr2 and a
> comparison that always yields false, so that the wake only goes to
> uaddr1. This will allow the kernel to optimize out double-hashing in
> the future by checking for uaddr1==uaddr2, and already optimizes out
> the double-iteration of the hash bucket for waking purposes.
>
> Any further thoughts on the matter? I think we should finish the
> private futex support task before starting on this, so that we don't
> do new work that's going to conflict with a pending patch.

This looks promising, but I don't yet know enough about these less
common futex operations to comment more on it.

Generally I think that the control structures should be as tight as
possible and give provable properties in the mathematical sense. The
interaction between user- and kernelland should be minimal, and we
shouldn't provoke reactions of the kernel that concern threads (or
even processes) that are not really targeted.

Jens

PS: I will be a bit less available in the next few days.
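For reference, the release path proposed in the quoted message could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from musl or from the thread: the helper name unlock_and_wake is hypothetical, the lock word is assumed to be nonzero while held and 0 when free, and the "always false" comparison is approximated by comparing the old value against 0, which is false only as long as that invariant holds. The fallback branch is likewise an assumption about what falling back on unsupported archs might look like.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper, not taken from musl: release a lock word and wake
 * at most one waiter in a single kernel operation.
 * Assumption: *lock is nonzero while held and 0 when free. */
static long unlock_and_wake(int *lock)
{
    assert(lock != NULL);

    /* Operation:  *uaddr2 = 0 (FUTEX_OP_SET with oparg 0).
     * Comparison: old value == 0, which is false whenever the lock was
     * actually held, so the conditional wake on uaddr2 is skipped. */
    int op = FUTEX_OP(FUTEX_OP_SET, 0, FUTEX_OP_CMP_EQ, 0);

    /* futex(uaddr1, FUTEX_WAKE_OP, nr_wake1, nr_wake2, uaddr2, op)
     * with uaddr1 == uaddr2, as suggested in the quoted message.
     * nr_wake2 travels in the slot normally used for the timeout. */
    long r = syscall(SYS_futex, lock, FUTEX_WAKE_OP, 1L, 0L, lock, op);
    if (r >= 0)
        return r; /* number of waiters woken */

    /* Fallback (assumption): plain store followed by FUTEX_WAKE.
     * This reopens the window between the store and the wake that
     * FUTEX_WAKE_OP is meant to close. */
    __atomic_store_n(lock, 0, __ATOMIC_SEQ_CST);
    return syscall(SYS_futex, lock, FUTEX_WAKE, 1L, NULL, NULL, 0);
}
```

A caller would take this path only when waiters may exist; the uncontended case stays a userspace CAS from owned/uncontended to zero, as described in the quoted message.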
-- 
:: INRIA Nancy Grand Est ::: AlGorille ::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536   ::
:: :::::::::::::::::::::: gsm France : +33 651400183   ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::