mailing list of musl libc
From: Jens Gustedt <jens.gustedt@inria.fr>
To: musl@lists.openwall.com
Subject: Re: Multi-threaded performance progress
Date: Tue, 26 Aug 2014 18:35:19 +0200	[thread overview]
Message-ID: <1409070919.8054.47.camel@eris.loria.fr> (raw)
In-Reply-To: <1409036654.4835.14.camel@eris.loria.fr>


[-- Attachment #1.1: Type: text/plain, Size: 2241 bytes --]

On Tuesday, 26.08.2014, at 09:04 +0200, Jens Gustedt wrote:
> On Monday, 25.08.2014, at 23:43 -0400, Rich Felker wrote:
> > This release cycle looks like it's going to be huge for multi-threaded
> > performance issues. So far the cumulative improvement on my main
> > development system, as measured by the cond_bench.c by Timo Teräs, is
> > from ~250k signals in 2 seconds up to ~3.7M signals in 2 seconds.
> > That's comparable to what glibc gets on similar hardware with a cond
> > var implementation that's much less correct. The improvements are a
> > result of adding private futex support, redesigning the cond var
> > implementation, and improvements to the spin-before-futex-wait
> > behavior.
> 
> Very impressive!

I reviewed the new pthread_cond code closely and found it to be really
rock solid.

I have some minor suggestions that might improve things further (or
not). They make the code a bit longer, but each attempts a small gain
here and there:

 - Tighten the lock on _c_lock so that the critical section
   contains only what is strictly necessary.

 - Have all updates to the list of waiters done by the signaling
   or broadcasting thread. This work is serialized by the lock on the
   cv anyway, so let the main work be done by a thread that already
   holds the lock and is scheduled.

 - In the broadcast case, work on the head and tail of the list
   first. These are the only nodes whose removal changes the _c_head
   and _c_tail entries of the cv.

 - Try to reduce the number of futex calls. Threads that are leaving
   don't have to regain the lock when there is known contention with a
   signaler, now that the signaler is doing the main work in that
   case.

   Also, wake the signaling thread at the end only when it is known
   to be inside a futex call.

There are perhaps other possibilities, such as doing some spinning in
"lock" before going into __wait.

Jens


-- 
:: INRIA Nancy Grand Est ::: AlGorille ::: ICube/ICPS :::
:: ::::::::::::::: office Strasbourg : +33 368854536   ::
:: :::::::::::::::::::::: gsm France : +33 651400183   ::
:: ::::::::::::::: gsm international : +49 15737185122 ::
:: http://icube-icps.unistra.fr/index.php/Jens_Gustedt ::

[-- Attachment #1.2: pthread_cond_timedwait.patch --]
[-- Type: text/x-patch, Size: 7675 bytes --]

diff --git a/src/thread/pthread_cond_timedwait.c b/src/thread/pthread_cond_timedwait.c
index 2d192b0..136fa6a 100644
--- a/src/thread/pthread_cond_timedwait.c
+++ b/src/thread/pthread_cond_timedwait.c
@@ -42,12 +42,32 @@ static inline void lock(volatile int *l)
 	}
 }
 
+/* Avoid taking the lock if we know it isn't necessary. */
+static inline int lockRace(volatile int *l, int*volatile* notifier)
+{
+	int ret = 1;
+	if (!*notifier && (ret = a_cas(l, 0, 1))) {
+		a_cas(l, 1, 2);
+		do __wait(l, 0, 2, 1);
+		while (!*notifier && (ret = a_cas(l, 0, 2)));
+	}
+	return ret;
+}
+
 static inline void unlock(volatile int *l)
 {
 	if (a_swap(l, 0)==2)
 		__wake(l, 1, 1);
 }
 
+static inline void unlockRace(volatile int *l, int known)
+{
+	int found = a_swap(l, 0);
+	known += (found == 2);
+	if (known > 0)
+		__wake(l, known, 1);
+}
+
 static inline void unlock_requeue(volatile int *l, volatile int *r, int w)
 {
 	a_store(l, 0);
@@ -56,6 +76,53 @@ static inline void unlock_requeue(volatile int *l, volatile int *r, int w)
 		|| __syscall(SYS_futex, l, FUTEX_REQUEUE, 0, 1, r);
 }
 
+/* Splice the node out of the current list of c and notify a signaling
+   thread with whom there was contention. */
+static inline void leave(struct waiter* node) {
+	/* Access to cv object is valid because this waiter was not
+	 * yet signaled and a new signal/broadcast cannot return
+	 * after seeing a LEAVING waiter without getting notified
+	 * via the futex notify below. */
+	pthread_cond_t *c = node->cond;
+	int locked = lockRace(&c->_c_lock, &node->notify);
+	/* node->notify will only be changed while node is
+	 * still in the list.*/
+	int * ref = node->notify;
+
+	if (!ref) {
+		if (node->prev) node->prev->next = node->next;
+		else if (c->_c_head == node) c->_c_head = node->next;
+		if (node->next) node->next->prev = node->prev;
+		else if (c->_c_tail == node) c->_c_tail = node->prev;
+
+		unlock(&c->_c_lock);
+	} else {
+		/* A race occurred with a signaling or broadcasting thread. The call
+		 * to unlockRace, there, ensures that sufficiently many waiters on _c_lock
+		 * are woken up. */
+		if (!locked) unlock(&c->_c_lock);
+
+		/* There will be at most one signaling or broadcasting thread waiting on ref[0].
+		 * Make sure that we don't waste a futex wake, if that thread isn't yet in futex wait. */
+		if (a_fetch_add(&ref[0], -1)==1 && ref[1])
+			__wake(&ref[0], 1, 1);
+	}
+
+	node->mutex_ret = pthread_mutex_lock(node->mutex);
+}
+
+static inline void enqueue(pthread_cond_t * c, struct waiter* node) {
+	lock(&c->_c_lock);
+
+	struct waiter* ohead = c->_c_head;
+	node->next = ohead;
+	if (ohead) ohead->prev = node;
+	else c->_c_tail = node;
+	c->_c_head = node;
+
+	unlock(&c->_c_lock);
+}
+
 enum {
 	WAITING,
 	SIGNALED,
@@ -78,33 +145,14 @@ static void unwait(void *arg)
 	int oldstate = a_cas(&node->state, WAITING, LEAVING);
 
 	if (oldstate == WAITING) {
-		/* Access to cv object is valid because this waiter was not
-		 * yet signaled and a new signal/broadcast cannot return
-		 * after seeing a LEAVING waiter without getting notified
-		 * via the futex notify below. */
-
-		pthread_cond_t *c = node->cond;
-		lock(&c->_c_lock);
-		
-		if (c->_c_head == node) c->_c_head = node->next;
-		else if (node->prev) node->prev->next = node->next;
-		if (c->_c_tail == node) c->_c_tail = node->prev;
-		else if (node->next) node->next->prev = node->prev;
-		
-		unlock(&c->_c_lock);
-
-		if (node->notify) {
-			if (a_fetch_add(node->notify, -1)==1)
-				__wake(node->notify, 1, 1);
-		}
-	} else {
-		/* Lock barrier first to control wake order. */
-		lock(&node->barrier);
+		leave(node);
+		return;
 	}
 
-	node->mutex_ret = pthread_mutex_lock(node->mutex);
+	/* Lock barrier first to control wake order. */
+	lock(&node->barrier);
 
-	if (oldstate == WAITING) return;
+	node->mutex_ret = pthread_mutex_lock(node->mutex);
 
 	if (!node->next) a_inc(&node->mutex->_m_waiters);
 
@@ -121,7 +169,7 @@ static void unwait(void *arg)
 
 int pthread_cond_timedwait(pthread_cond_t *restrict c, pthread_mutex_t *restrict m, const struct timespec *restrict ts)
 {
-	struct waiter node = { .cond = c, .mutex = m };
+	struct waiter node = { .cond = c, .mutex = m, .state = WAITING, .barrier = 2 };
 	int e, seq, *fut, clock = c->_c_clock;
 
 	if ((m->_m_type&15) && (m->_m_lock&INT_MAX) != __pthread_self()->tid)
@@ -138,17 +186,10 @@ int pthread_cond_timedwait(pthread_cond_t *restrict c, pthread_mutex_t *restrict
 		seq = c->_c_seq;
 		a_inc(&c->_c_waiters);
 	} else {
-		lock(&c->_c_lock);
-
-		seq = node.barrier = 2;
+		seq = node.barrier;
 		fut = &node.barrier;
-		node.state = WAITING;
-		node.next = c->_c_head;
-		c->_c_head = &node;
-		if (!c->_c_tail) c->_c_tail = &node;
-		else node.next->prev = &node;
 
-		unlock(&c->_c_lock);
+		enqueue(c, &node);
 	}
 
 	pthread_mutex_unlock(m);
@@ -162,35 +203,85 @@ int pthread_cond_timedwait(pthread_cond_t *restrict c, pthread_mutex_t *restrict
 	return node.mutex_ret ? node.mutex_ret : e;
 }
 
+static inline int cond_signal (struct waiter * p, int* ref)
+{
+	int ret = a_cas(&p->state, WAITING, SIGNALED);
+	if (ret != WAITING) {
+		ref[0]++;
+		p->notify = ref;
+		if (p->prev) p->prev->next = p->next;
+		if (p->next) p->next->prev = p->prev;
+		p->next = 0;
+		p->prev = 0;
+	}
+	return ret;
+}
+
 int __private_cond_signal(pthread_cond_t *c, int n)
 {
-	struct waiter *p, *first=0;
-	int ref = 0, cur;
+	struct waiter *p, *prev, *first=0;
+	int ref[2] = { 0 }, cur;
 
-	lock(&c->_c_lock);
-	for (p=c->_c_tail; n && p; p=p->prev) {
-		if (a_cas(&p->state, WAITING, SIGNALED) != WAITING) {
-			ref++;
-			p->notify = &ref;
+	if (n == 1) {
+		lock(&c->_c_lock);
+		for (p=c->_c_tail; p; p=prev) {
+			prev = p->prev;
+			if (!cond_signal(p, ref)) {
+				first=p;
+				p=prev;
+				first->prev = 0;
+				break;
+			}
+		}
+		/* Split the list, leaving any remainder on the cv. */
+		if (p) {
+			p->next = 0;
 		} else {
-			n--;
-			if (!first) first=p;
+			c->_c_head = 0;
 		}
-	}
-	/* Split the list, leaving any remainder on the cv. */
-	if (p) {
-		if (p->next) p->next->prev = 0;
-		p->next = 0;
+		c->_c_tail = p;
+		unlockRace(&c->_c_lock, ref[0]);
 	} else {
-		c->_c_head = 0;
+		lock(&c->_c_lock);
+		struct waiter * head = c->_c_head;
+		if (head) {
+			/* Signal head and tail first to reduce possible
+			 * races for the cv to the beginning of the
+			 * processing. */
+			int headrace = cond_signal(head, ref);
+			struct waiter * tail = c->_c_tail;
+			p=tail->prev;
+			if (tail != head) {
+				if (!cond_signal(tail, ref)) first=tail;
+				else while (p != head) {
+					prev = p->prev;
+					if (!cond_signal(p, ref)) {
+						first=p;
+						p=prev;
+						break;
+					}
+					p=prev;
+				}
+			}
+			if (!first && !headrace) first = head;
+			c->_c_head = 0;
+			c->_c_tail = 0;
+			/* Now process the inner part of the list. */
+			if (p) {
+				while (p != head) {
+					prev = p->prev;
+					cond_signal(p, ref);
+					p=prev;
+				}
+			}
+		}
+		unlockRace(&c->_c_lock, ref[0]);
 	}
-	c->_c_tail = p;
-	unlock(&c->_c_lock);
 
 	/* Wait for any waiters in the LEAVING state to remove
 	 * themselves from the list before returning or allowing
 	 * signaled threads to proceed. */
-	while ((cur = ref)) __wait(&ref, 0, cur, 1);
+	while ((cur = ref[0])) __wait(&ref[0], &ref[1], cur, 1);
 
 	/* Allow first signaled waiter, if any, to proceed. */
 	if (first) unlock(&first->barrier);

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

Thread overview: 15+ messages
2014-08-26  3:43 Rich Felker
2014-08-26  7:04 ` Jens Gustedt
2014-08-26 10:44   ` Szabolcs Nagy
2014-08-26 11:09     ` Jens Gustedt
2014-08-26 16:35   ` Jens Gustedt [this message]
2014-08-26 17:32     ` Rich Felker
2014-08-26 17:53     ` Rich Felker
2014-08-26 18:30       ` Jens Gustedt
2014-08-26 19:05         ` Rich Felker
2014-08-26 19:34           ` Jens Gustedt
2014-08-26 20:26             ` Rich Felker
2014-08-26 21:15               ` Jens Gustedt
2014-08-26 21:36                 ` Rich Felker
2014-08-27  9:53                   ` Jens Gustedt
2014-08-27 16:47                     ` Rich Felker
