From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/5788 Path: news.gmane.org!not-for-mail From: Rich Felker Newsgroups: gmane.linux.lib.musl.general Subject: Re: Explaining cond var destroy [Re: [musl] C threads, v3.0] Date: Fri, 8 Aug 2014 16:48:55 -0400 Message-ID: <20140808204855.GQ1674@brightrain.aerifal.cx> References: <1407358510.24324.328.camel@eris.loria.fr> <20140806220453.GD1674@brightrain.aerifal.cx> <1407365015.24324.334.camel@eris.loria.fr> <20140806231539.GE1674@brightrain.aerifal.cx> <1407397851.24324.354.camel@eris.loria.fr> <20140807161342.GH1674@brightrain.aerifal.cx> <1407430024.24324.387.camel@eris.loria.fr> <20140807172514.GI1674@brightrain.aerifal.cx> <1407489627.24324.419.camel@eris.loria.fr> <20140808191405.GP1674@brightrain.aerifal.cx> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1407530960 31252 80.91.229.3 (8 Aug 2014 20:49:20 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Fri, 8 Aug 2014 20:49:20 +0000 (UTC) To: musl@lists.openwall.com Original-X-From: musl-return-5793-gllmg-musl=m.gmane.org@lists.openwall.com Fri Aug 08 22:49:10 2014 Return-path: Envelope-to: gllmg-musl@plane.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1XFr64-0004nW-0Y for gllmg-musl@plane.gmane.org; Fri, 08 Aug 2014 22:49:08 +0200 Original-Received: (qmail 21651 invoked by uid 550); 8 Aug 2014 20:49:07 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: Original-Received: (qmail 21643 invoked from network); 8 Aug 2014 20:49:07 -0000 Content-Disposition: inline In-Reply-To: <20140808191405.GP1674@brightrain.aerifal.cx> User-Agent: Mutt/1.5.21 (2010-09-15) Original-Sender: Rich Felker Xref: news.gmane.org gmane.linux.lib.musl.general:5788 Archived-At: On Fri, Aug 08, 2014 at 03:14:06PM -0400, Rich Felker wrote: > I think I may have a solution you'll like: > > We can perform the release of the lock via a compare-and-swap rather > than a simple swap. In this way, we can know before releasing the lock > whether it's going to require a wake or not: > > - If waiters was zero and the cas from owned/uncontended to zero > succeeds, no futex wake operation is needed. > > - If waiters was nonzero, or if the cas fails (thereby instead > requiring a cas from owned/contended to zero), we can do the > following: > > Don't use a userspace CAS to release; this would allow the lock to be > acquired by another thread, released, destroyed, and freed before the > futex wake is performed. Instead, use FUTEX_WAKE_OP to atomically > perform the atomic assignment and futex wake. FUTEX_WAKE_OP is highly under-documented, and i'm worried it might be unsupported on some archs (since the atomics for it have to be implemented on a per-arch basis in the kernel) but of course we can just fallback on archs where it's not supported yet. Anyway, the behavior seems to be: - Futex acquisition for uaddr1 and uaddr2 both happen prior to the atomic operation, and this hold locks that seem to prevent new waiters on the futex(es). This should preclude any risk of waking a new waiter that arrives after the atomic operation, as desired. - Both uaddr1 and uaddr2 are hashed, with no check for equality. This is a fairly costly wasteful operation, but could be fixed on the kernel side. At present I suspect they don't care because FUTEX_WAKE_OP is considered unnecessary, but if I raise it on the glibc bug tracker thread for issue 13690 as a solution to the problem, I think there would be a lot more interest in optimizing this kernel path. - After the atomic operation is performed, a wake is always performed on uaddr1 (based on the previous acquisition); this fact is omitted from all the documentation, but it's obviously intentional since otherwise the uaddr1 argument would not be used for anything but wasting time. The wake on uaddr2 is conditional on a comparison. - No allocation is required anywhere in the operation, so we don't have to worry about lost actions on OOM. For plain FUTEX_WAKE this would not have been an issue (if acquirin the futex required memory, then failure for FUTEX_WAKE to acquire it would mean there was no FUTEX_WAIT taking place anyway), but for FUTEX_WAKE_OP, failure would omit the atomic operation, which must take place even if there are no current FUTEX_WAIT waiters (e.g. if the FUTEX_WAIT was interrupted by a signal handler). Based on the above, I think it's safe to move forward with using FUTEX_WAKE_OP. It seems optimal to me to use uaddr1==uaddr2 and a comparison that always yields false, so that the wake only goes to uaddr1. This will allow the kernel to optimize out double-hashing in the future by checking for uaddr1==uaddr2, and already optimizes out the double-iteration of the hash bucket for waking purposes. Any further thoughts on the matter? I think we should finish the private futex support task before starting on this, so that we don't do new work that's going to conflict with a pending patch. Rich