From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: [PATCH] private futex support
Date: Fri, 8 Aug 2014 21:50:13 -0400
Message-ID: <20140809015013.GX1674@brightrain.aerifal.cx>
In-Reply-To: <20140808113857.1839babf@vostro>

On Fri, Aug 08, 2014 at 11:38:57AM +0300, Timo Teras wrote:
> > actually commit it. If anyone is interested in this feature, please
> > see if you can find some examples that demonstrate that it measurably
> > improves performance.
>
> And running my simple test case, which has two threads wake each other
> up using a condition variable, seems to yield a noticeable performance
> speedup from private futexes. See the end of the mail for the code.
>
> The low and high numbers from a few test runs on musl git
> 4fe57cad709fdfb377060, without and with the futex patch, are as
> follows:
>
> ~/privfutex $ time ~/oss/musl/lib/libc.so ./test
> count=2516417
> real 0m 2.00s
> user 0m 1.68s
> sys 0m 2.30s
>
> ~/privfutex $ time ~/oss/musl/lib/libc.so ./test
> count=2679381
> real 0m 2.00s
> user 0m 1.59s
> sys 0m 2.39s
>
> Private futexes:
>
> ~/privfutex $ time ~/oss/musl/lib/libc.so ./test
> count=3839470
> real 0m 2.00s
> user 0m 1.68s
> sys 0m 1.98s
>
> ~/privfutex $ time ~/oss/musl/lib/libc.so ./test
> count=5350852
> real 0m 2.00s
> user 0m 1.66s
> sys 0m 2.32s
>
> You can see clearly lowered sys time use, and up to doubled
> throughput of wait/wake operations.

I was able to match the relative difference (albeit at about 10% of
the total throughput you got for both versions) on my Atom. I also dug
up an old test of mine that shows some difference (1.9s vs 2.2s to
run). The original point of that test was to demonstrate that glibc's
non-process-shared condvars are 2-2.5x slower than its process-shared
ones (yes, the opposite of what you would expect; see glibc bug
13234). The code is attached.
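For illustration only, a ping-pong load of the sort described above
might look roughly like the following sketch. This is written under my
own assumptions and is not Timo's actual test: the pong() function, the
fixed two-second run, and the single shared counter are all invented.
The point is that essentially every iteration goes through a condvar
wait/wake, i.e. a futex syscall, which is where the private-futex path
matters.

/* Sketch only: a two-thread condvar ping-pong; not Timo's actual test. */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t c = PTHREAD_COND_INITIALIZER;
static int turn, done;
static unsigned long count;

/* Each thread waits for its turn, hands the turn back, and signals the
 * other thread, so the two threads keep waking each other up. */
static void *pong(void *arg)
{
	int me = (int)(long)arg;
	pthread_mutex_lock(&m);
	while (!done) {
		while (turn != me && !done) pthread_cond_wait(&c, &m);
		turn = !me;
		count++;
		pthread_cond_signal(&c);
	}
	pthread_mutex_unlock(&m);
	return 0;
}

int main()
{
	pthread_t t0, t1;
	pthread_create(&t0, 0, pong, (void *)0L);
	pthread_create(&t1, 0, pong, (void *)1L);
	sleep(2);                       /* let the ping-pong run for ~2s */
	pthread_mutex_lock(&m);
	done = 1;
	pthread_cond_broadcast(&c);
	pthread_mutex_unlock(&m);
	pthread_join(t0, 0);
	pthread_join(t1, 0);
	printf("count=%lu\n", count);
	return 0;
}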
> So I suspect your test case was not measuring the right thing.
> Private futexes speed up only specific loads, and this type of
> pthread_cond_t usage is probably the pattern that benefits most.
>
> Please reconsider adding this after addressing the deficiencies
> noted at the beginning.

Yes, I think you've succeeded in establishing that private futex
support is useful. So now I just need to check for more stupid
mistakes, get it into a form that's ready to commit, and do some
testing between now and the next release. We should do at least one
test with private futexes hard-wired to fail (or just find an old
kernel to test on) to make sure the fallback code is working, too;
see the sketch after the attached test for the general shape of such
a fallback.

Rich

/* attachment: cvb2.c */
#include <pthread.h>
#include <stdio.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t c = PTHREAD_COND_INITIALIZER;
volatile int p;
int left[5], avail[5], wakes;

/* Worker: consume work units for slot i, waiting on the condvar until
 * the main thread marks the slot available; count every wakeup. */
void *tf(void *arg)
{
	int i = (long)arg;
	pthread_mutex_lock(&m);
	while (left[i]) {
		while (!avail[i]) pthread_cond_wait(&c, &m), wakes++;
		left[i]--;
		avail[i]--;
	}
	pthread_mutex_unlock(&m);
	return 0;
}

int main()
{
	pthread_t td[5];
	int i, total;
	pthread_mutexattr_t ma;
	pthread_mutexattr_init(&ma);
	pthread_mutexattr_settype(&ma, PTHREAD_MUTEX_ERRORCHECK);
	pthread_condattr_t ca;
	pthread_condattr_init(&ca);
	pthread_condattr_setpshared(&ca, PTHREAD_PROCESS_SHARED);
	/* Uncomment to use a process-shared condvar and an error-checking
	 * mutex instead of the statically initialized defaults: */
	//pthread_cond_init(&c, &ca);
	//pthread_mutex_init(&m, &ma);

	for (i=0; i<5; i++) left[i] = 100000;
	for (i=0; i<5; i++) pthread_create(td+i, 0, tf, (void *)(long)i);

	/* Main thread: repeatedly hand one unit to every worker and
	 * broadcast, until all the work is consumed. */
	pthread_mutex_lock(&m);
	for (;;) {
		for (total=i=0; i<5; i++) total += left[i];
		if (!total) break;
		for (i=0; i<5; i++) avail[i] = 1;
		pthread_cond_broadcast(&c);
		pthread_mutex_unlock(&m);
		pthread_mutex_lock(&m);
	}
	pthread_mutex_unlock(&m);

	for (i=0; i<5; i++) pthread_join(td[i], 0);
	printf("%d\n", wakes);
	return 0;
}
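As for testing the fallback: the code below is only a sketch of the
general pattern, written under my own assumptions; it is not musl's
actual implementation, and futex_wait(), futex_wake(), and the
futex_private variable are hypothetical names. The idea is that the
first private-flagged call the kernel rejects with ENOSYS clears the
flag and retries the operation as a plain shared futex. Hard-wiring
private futexes to fail, as suggested above, amounts to forcing that
ENOSYS branch.

/* Sketch only: generic private-futex fallback; not musl's actual code. */
#include <errno.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Cleared the first time the kernel rejects a private futex op. */
static int futex_private = FUTEX_PRIVATE_FLAG;

static int futex_wait(volatile int *addr, int val)
{
	int r = syscall(SYS_futex, addr, FUTEX_WAIT | futex_private, val, 0, 0, 0);
	if (r < 0 && errno == ENOSYS && futex_private) {
		/* Kernel without private futex support: retry shared. */
		futex_private = 0;
		r = syscall(SYS_futex, addr, FUTEX_WAIT, val, 0, 0, 0);
	}
	return r;
}

static int futex_wake(volatile int *addr, int n)
{
	int r = syscall(SYS_futex, addr, FUTEX_WAKE | futex_private, n, 0, 0, 0);
	if (r < 0 && errno == ENOSYS && futex_private) {
		futex_private = 0;
		r = syscall(SYS_futex, addr, FUTEX_WAKE, n, 0, 0, 0);
	}
	return r;
}

int main()
{
	volatile int dummy = 0;
	futex_wake(&dummy, 1);  /* no waiters; just exercises the path */
	futex_wait(&dummy, 1);  /* returns at once: dummy != 1 (EAGAIN) */
	return 0;
}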