From: Russ Cox
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Date: Tue, 8 Dec 2009 15:52:51 -0800
Subject: Re: [9fans] etherigbe.c using _xinc?

it looks like you are comparing these two functions

void
loopxinc(void)
{
	uint i, x;

	for(i = 0; i < N; i++){
		_xinc(&x);
		_xdec(&x);
	}
}

void
looplock(void)
{
	uint i;
	static Lock l;

	for(i = 0; i < N; i++){
		lock(&l);
		unlock(&l);
	}
}

but the former does two operations per iteration (an xinc and an
xdec) while the latter does only one (a lock/unlock pair).  your
claim was that _xinc is slower than incref (== lock(), x++,
unlock()), but you are timing xinc+xdec against incref.

assuming xinc and xdec cost approximately the same (so i can just
halve the numbers for loopxinc), the fair comparison produces:

intel core i7 2.4ghz
	loop		 0 nsec/call
	loopxinc	10 nsec/call	// was 20
	looplock	11 nsec/call

intel 5000 1.6ghz
	loop		 0 nsec/call
	loopxinc	22 nsec/call	// was 44
	looplock	25 nsec/call

intel atom 330 1.6ghz (exception!)
	loop		 2 nsec/call
	loopxinc	 7 nsec/call	// was 14
	looplock	22 nsec/call

amd k10 2.0ghz
	loop		 2 nsec/call
	loopxinc	15 nsec/call	// was 30
	looplock	20 nsec/call

intel p4 xeon 3.0ghz
	loop		 1 nsec/call
	loopxinc	38 nsec/call	// was 76
	looplock	42 nsec/call

which looks like a much different story.

russ
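
p.s. for anyone following along, here is roughly what the incref
side expands to.  this is a minimal sketch, assuming the usual
kernel-style Ref struct; the declarations and return value are
illustrative, the lock/increment/unlock pattern is the point:

#include <u.h>
#include <libc.h>

typedef struct Ref Ref;
struct Ref
{
	Lock;			/* spin lock guarding ref */
	long	ref;
};

long
incref(Ref *r)
{
	long x;

	lock(r);		/* spin until we own the lock */
	x = ++r->ref;		/* ordinary increment, safe under the lock */
	unlock(r);		/* release; cheap compared with lock() */
	return x;
}

so every incref pays for taking and releasing a spin lock, while
_xinc does its increment in a single locked instruction.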
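
p.p.s. if you want to rerun this, here is a sketch of a harness
that produces output in the shape above.  time1 and the value of
N are placeholders i am making up for the sketch, and _xinc/_xdec
live in the kernel, so a user-space run needs the equivalent
assembly linked in:

#include <u.h>
#include <libc.h>

enum
{
	N = 10*1000*1000,	/* iterations per run; assumed value */
};

void	_xinc(long*);		/* atomic increment, from the kernel's l.s */
long	_xdec(long*);		/* atomic decrement */
void	loopxinc(void);		/* defined above */
void	looplock(void);		/* defined above */

void
loop(void)			/* empty loop: measures loop overhead */
{
	uint i;

	for(i = 0; i < N; i++)
		;
}

void
time1(char *name, void (*f)(void))
{
	vlong t0;

	t0 = nsec();		/* wall-clock nanoseconds */
	(*f)();
	print("%s\t%lld nsec/call\n", name, (nsec()-t0)/N);
}

void
main(void)
{
	time1("loop", loop);
	time1("loopxinc", loopxinc);
	time1("looplock", looplock);
	exits(nil);
}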