mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Alexander Monakov <amonakov@ispras.ru>
To: musl@lists.openwall.com
Subject: semaphore redesign
Date: Sat, 28 Feb 2015 02:21:22 +0300 (MSK)	[thread overview]
Message-ID: <alpine.LNX.2.11.1502280111220.10769@monopod.intra.ispras.ru> (raw)
In-Reply-To: <alpine.LNX.2.00.1409012059290.1514@monopod.intra.ispras.ru>

Hello,

As new cancellation has landed (except that __timedwait fails to propagate
ECANCELED in addition to ETIMEDOUT and EINTR), I'm happy to post semaphore
redesign.  We discussed this implementation with Rich on IRC once, and
presently I'm not aware of any issues (finally!), but still testing and
performance comparison need to be done.

This implementation aims to solve the problem with sem_getvalue that started
this thread, while also making fast (uncontended) paths leaner.  There were
some explanations scattered in the old thread; I'll try to provide a summary.

Conceptually, a semaphore has a non-negative value and when the value is zero,
a non-negative waiter count.  The new implementation stores the value minus
number of waiters in val[0], and in val[1] it stores the "wake count": the
number of waiters that were discounted from val[0] in sem_post; val[1] is
futex-woken in sem_post, and futex-waited on in sem_wait.

A thread that entered sem_wait and became a waiter by decrementing
non-positive val[0] may leave either by consuming a post (decrementing a
positive val[1]), or, if error condition such as timeout or cancellation has
been propagated from futex wait, by discounting itself as a waiter by
incrementing a negative val[0].  Conversely, if after encountering an error a
waiter observes non-negative val[0], it means that another thread already
discounted it from waiters when doing a sem_post, so the waiter proceeds to
consume the post and suppress the error.

Since any leaving waiter must either decrement positive val[1] or increment
negative val[0], accessing val[1] non-atomically with val[0] in sem_post does
not lead to potential use of destroyed semaphore.

At Rich's request, I'm posting this as plain C source code rather than a diff.
The implementation of sem_getvalue stays the same.

Alexander

#include <semaphore.h>
#include "pthread_impl.h"

int sem_post(sem_t *sem)
{
	int val;
	do {
		val = sem->__val[0];
		if (val == SEM_VALUE_MAX) {
			errno = EOVERFLOW;
			return -1;
		}
	} while (val != a_cas(sem->__val, val, val+1));
	if (val < 0) {
		int priv = sem->__val[2];
		a_inc(sem->__val+1);
		__wake(sem->__val+1, 1, priv);
	}
	return 0;
}

int sem_trywait(sem_t *sem)
{
	int val;
	do {
		val = sem->__val[0];
		if (val <= 0) {
			errno = EAGAIN;
			return -1;
		}
	} while (val != a_cas(sem->__val, val, val-1));
	return 0;
}

static void dummy(void *arg)
{
}

int sem_timedwait(sem_t *restrict sem, const struct timespec *restrict at)
{
	pthread_testcancel();

	// Do not spin if already contended (val0 is negative)
	for (int spins = 100; spins && sem->__val[0] == 0; spins--)
		a_spin();

	if (a_fetch_add(sem->__val, -1) > 0)
		return 0;

	int priv = sem->__val[2], e = 0, ee, cs;
	pthread_setcancelstate(PTHREAD_CANCEL_MASKED, &cs);

	do {
		ee = __timedwait(sem->__val+1, 0, CLOCK_REALTIME, at, dummy, 0, priv);
		int wak = sem->__val[1];
		if (wak > 0 && wak == a_cas(sem->__val+1, wak, wak-1))
			goto done;
	} while (!ee || ee == EINTR);

	// Upon returning from wait with an error, either discount ourselves as a waiter
	// by incrementing negative val0, and propagate error, or consume a racing post
	// if val0 is non-negative, and suppress error.
	for (;;) {  
		int val = sem->__val[0];
		if (val < 0 && val == a_cas(sem->__val, val, val+1))
			break;
		int wak = sem->__val[1];
		if (wak > 0 && wak == a_cas(sem->__val+1, wak, wak-1))
			goto done;
		a_spin();
	}
	e = ee;
done:
	pthread_setcancelstate(cs, 0);

	if (!e) return 0;

	if (e == ECANCELED) {
		pthread_testcancel();
		pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, 0);
	}

	errno = e;
	return -1;
}


  reply	other threads:[~2015-02-27 23:21 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-27  2:33 sem_getvalue conformance considerations Rich Felker
2014-08-27  7:05 ` Jens Gustedt
2014-08-27  7:43   ` Rich Felker
2014-08-27 10:43     ` Alexander Monakov
2014-08-27 13:32       ` Alexander Monakov
2014-08-27 19:06         ` Alexander Monakov
2014-08-27 21:06           ` Alexander Monakov
2014-08-28 20:47             ` Alexander Monakov
2014-08-29 22:51               ` Alexander Monakov
2014-08-30  5:12                 ` Rich Felker
2014-09-01 17:50                 ` Alexander Monakov
2015-02-27 23:21                   ` Alexander Monakov [this message]
2015-02-28 15:42                     ` semaphore redesign Rich Felker
2015-03-01 18:54                       ` Alexander Monakov
2015-03-01 17:30                     ` Szabolcs Nagy
2015-03-01 17:50                       ` Szabolcs Nagy
2015-03-02 22:40                         ` Alexander Monakov
2015-03-02 22:45                           ` Rich Felker
2015-03-01 18:24                       ` Alexander Monakov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LNX.2.11.1502280111220.10769@monopod.intra.ispras.ru \
    --to=amonakov@ispras.ru \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).