From: Jens Gustedt <jens.gustedt@inria.fr>
To: musl@lists.openwall.com
Subject: Re: Explaining cond var destroy [Re: [musl] C threads, v3.0]
Date: Sat, 09 Aug 2014 08:47:34 +0200
Message-ID: <1407566854.4988.231.camel@eris.loria.fr>
In-Reply-To: <20140808204855.GQ1674@brightrain.aerifal.cx>
Hello,
On Friday, 08.08.2014, at 16:48 -0400, Rich Felker wrote:
> On Fri, Aug 08, 2014 at 03:14:06PM -0400, Rich Felker wrote:
> > I think I may have a solution you'll like:
> >
> > We can perform the release of the lock via a compare-and-swap rather
> > than a simple swap. In this way, we can know before releasing the lock
> > whether it's going to require a wake or not:
> >
> > - If waiters was zero and the cas from owned/uncontended to zero
> > succeeds, no futex wake operation is needed.
> >
> > - If waiters was nonzero, or if the cas fails (thereby instead
> > requiring a cas from owned/contended to zero), we can do the
> > following:
> >
> > Don't use a userspace CAS to release; this would allow the lock to be
> > acquired by another thread, released, destroyed, and freed before the
> > futex wake is performed. Instead, use FUTEX_WAKE_OP to atomically
> > perform the atomic assignment and futex wake.
>
> FUTEX_WAKE_OP is highly under-documented, and I'm worried it might be
> unsupported on some archs (since the atomics for it have to be
> implemented on a per-arch basis in the kernel) but of course we can
> just fall back on archs where it's not supported yet.
>
> Anyway, the behavior seems to be:
>
> - Futex acquisition for uaddr1 and uaddr2 both happen prior to the
> atomic operation, and this holds locks that seem to prevent new
> waiters on the futex(es). This should preclude any risk of waking a
> new waiter that arrives after the atomic operation, as desired.
>
> - Both uaddr1 and uaddr2 are hashed, with no check for equality. This
> is a fairly costly, wasteful operation, but could be fixed on the
> kernel side. At present I suspect they don't care because
> FUTEX_WAKE_OP is considered unnecessary, but if I raise it on the
> glibc bug tracker thread for issue 13690 as a solution to the
> problem, I think there would be a lot more interest in optimizing
> this kernel path.
>
> - After the atomic operation is performed, a wake is always performed
> on uaddr1 (based on the previous acquisition); this fact is omitted
> from all the documentation, but it's obviously intentional since
> otherwise the uaddr1 argument would not be used for anything but
> wasting time. The wake on uaddr2 is conditional on a comparison.
>
> - No allocation is required anywhere in the operation, so we don't
> have to worry about lost actions on OOM. For plain FUTEX_WAKE this
> would not have been an issue (if acquiring the futex required memory,
> then failure for FUTEX_WAKE to acquire it would mean there was no
> FUTEX_WAIT taking place anyway), but for FUTEX_WAKE_OP, failure
> would omit the atomic operation, which must take place even if there
> are no current FUTEX_WAIT waiters (e.g. if the FUTEX_WAIT was
> interrupted by a signal handler).
>
> Based on the above, I think it's safe to move forward with using
> FUTEX_WAKE_OP. It seems optimal to me to use uaddr1==uaddr2 and a
> comparison that always yields false, so that the wake only goes to
> uaddr1. This will allow the kernel to optimize out double-hashing in
> the future by checking for uaddr1==uaddr2, and already optimizes out
> the double-iteration of the hash bucket for waking purposes.
>
> Any further thoughts on the matter? I think we should finish the
> private futex support task before starting on this, so that we don't
> do new work that's going to conflict with a pending patch.
This looks promising, but I don't yet know enough about these less
common futex operations to comment further on it.
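To make the quoted description of FUTEX_WAKE_OP concrete, here is a small userspace model of what futex(2) documents for the operation: the old value at uaddr2 is read, the encoded operation is applied to it, a wake is issued on uaddr1, and a second wake on uaddr2 happens only if the comparison on the old value holds. This is a sketch for illustration, not kernel code and not part of the original message. The names encode_op, sext12 and model_wake_op are hypothetical, the FUTEX_OP_OPARG_SHIFT flag is ignored, and nothing here is atomic.

```c
#include <assert.h>
#include <stdint.h>

/* Operation and comparison codes, numbered as in <linux/futex.h>. */
enum { OP_SET = 0, OP_ADD = 1, OP_OR = 2, OP_ANDN = 3, OP_XOR = 4 };
enum { CMP_EQ = 0, CMP_NE = 1, CMP_LT = 2, CMP_LE = 3, CMP_GT = 4, CMP_GE = 5 };

/* Pack the four fields the same way the FUTEX_OP() macro does. */
static uint32_t encode_op(int op, int oparg, int cmp, int cmparg)
{
    return ((uint32_t)(op & 0xf) << 28) | ((uint32_t)(cmp & 0xf) << 24)
         | ((uint32_t)(oparg & 0xfff) << 12) | (uint32_t)(cmparg & 0xfff);
}

/* Sign-extend a 12-bit field; oparg and cmparg are signed 12-bit values. */
static int sext12(uint32_t v)
{
    v &= 0xfff;
    return (v & 0x800) ? (int)v - 0x1000 : (int)v;
}

/* Model only: apply the encoded operation to *uaddr2 and return nonzero
   if the conditional second wake (on uaddr2) would be issued. The wake
   on uaddr1 is unconditional and therefore not represented here. */
static int model_wake_op(int *uaddr2, uint32_t val3)
{
    int op = (val3 >> 28) & 0x7;     /* shift flag (bit 3) ignored */
    int cmp = (val3 >> 24) & 0xf;
    int oparg = sext12(val3 >> 12);
    int cmparg = sext12(val3);
    int oldval = *uaddr2;

    switch (op) {
    case OP_SET:  *uaddr2 = oparg;           break;
    case OP_ADD:  *uaddr2 = oldval + oparg;  break;
    case OP_OR:   *uaddr2 = oldval | oparg;  break;
    case OP_ANDN: *uaddr2 = oldval & ~oparg; break;
    case OP_XOR:  *uaddr2 = oldval ^ oparg;  break;
    }
    switch (cmp) {
    case CMP_EQ: return oldval == cmparg;
    case CMP_NE: return oldval != cmparg;
    case CMP_LT: return oldval <  cmparg;
    case CMP_LE: return oldval <= cmparg;
    case CMP_GT: return oldval >  cmparg;
    case CMP_GE: return oldval >= cmparg;
    default:     return 0;
    }
}
```

With OP_SET 0 and CMP_EQ 0 applied to a held lock (any nonzero value), the model stores zero and reports that the second wake is skipped, which corresponds to the "comparison that always yields false" suggested in the quoted text.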
Generally I think that the control structures should be as tight as
possible and give provable properties in the mathematical sense. The
interaction between user- and kernelland should be minimal, and we
shouldn't provoke reactions from the kernel that concern threads (or
even processes) that are not really targeted.
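For illustration, the release path proposed in the quoted text might look roughly like the following on Linux. This is a minimal sketch under several assumptions, not musl's code: the state values UNLOCKED/OWNED/CONTENDED, the separate waiters counter, and the function name sketch_unlock are hypothetical, shared (non-private) futexes are used, and atomic_int is assumed to have the same representation as int. The fast path never enters the kernel; the contended path asks the kernel to store zero and wake one waiter in a single FUTEX_WAKE_OP call with uaddr1 == uaddr2.

```c
#define _GNU_SOURCE
#include <assert.h>       /* static_assert */
#include <linux/futex.h>  /* FUTEX_WAKE, FUTEX_WAKE_OP, FUTEX_OP */
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical lock states for this sketch, not musl's actual encoding. */
enum { UNLOCKED = 0, OWNED = 1, CONTENDED = 2 };

static_assert(sizeof(atomic_int) == sizeof(int),
              "futex word must have the size of int");

/* Release a held lock. Returns 0 if the userspace fast path sufficed,
   1 if the kernel was asked to clear the lock and wake one waiter. */
static int sketch_unlock(atomic_int *lock, atomic_int *waiters)
{
    int expected = OWNED;
    if (atomic_load(waiters) == 0 &&
        atomic_compare_exchange_strong(lock, &expected, UNLOCKED))
        return 0;  /* uncontended: no futex wake is needed */

    /* Contended: the kernel performs "*lock = 0" and the wake together.
       The comparison "old value == 0" is false for a held lock, so only
       the unconditional wake on uaddr1 takes effect. */
    long r = syscall(SYS_futex, (int *)lock, FUTEX_WAKE_OP, 1, 0L,
                     (int *)lock,
                     FUTEX_OP(FUTEX_OP_SET, 0, FUTEX_OP_CMP_EQ, 0));
    if (r == -1) {
        /* Fallback if FUTEX_WAKE_OP is unavailable. Note that this
           reopens the window between the store and the wake. */
        atomic_store(lock, UNLOCKED);
        syscall(SYS_futex, (int *)lock, FUTEX_WAKE, 1, NULL, NULL, 0);
    }
    return 1;
}
```

The fallback branch is only there so the sketch degrades gracefully; it does not have the destroy-safety property that motivates the FUTEX_WAKE_OP path.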
Jens
PS: I will be a bit less available in the coming days.
--
:: INRIA Nancy Grand Est ::: AlGorille ::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536 ::
:: :::::::::::::::::::::: gsm France : +33 651400183 ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::