From: Russ Cox
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Date: Tue, 8 Dec 2009 15:52:51 -0800
Subject: Re: [9fans] etherigbe.c using _xinc?

it looks like you are comparing these two functions

void
loopxinc(void)
{
	uint i, x;

	for(i = 0; i < N; i++){
		_xinc(&x);
		_xdec(&x);
	}
}

void
looplock(void)
{
	uint i;
	static Lock l;

	for(i = 0; i < N; i++){
		lock(&l);
		unlock(&l);
	}
}

but the former does two operations per iteration (an xinc and an
xdec) while the latter does only one (a lock/unlock pair).  your
claim was that _xinc is slower than incref (== lock(), x++,
unlock()), but you are timing xinc+xdec against incref.

assuming xinc and xdec cost approximately the same (so i can just
halve the numbers for loopxinc), the fair comparison produces:

intel core i7 2.4ghz
	loop		 0 nsec/call
	loopxinc	10 nsec/call	// was 20
	looplock	11 nsec/call

intel 5000 1.6ghz
	loop		 0 nsec/call
	loopxinc	22 nsec/call	// was 44
	looplock	25 nsec/call

intel atom 330 1.6ghz (exception!)
	loop		 2 nsec/call
	loopxinc	 7 nsec/call	// was 14
	looplock	22 nsec/call

amd k10 2.0ghz
	loop		 2 nsec/call
	loopxinc	15 nsec/call	// was 30
	looplock	20 nsec/call

intel p4 xeon 3.0ghz
	loop		 1 nsec/call
	loopxinc	38 nsec/call	// was 76
	looplock	42 nsec/call

which looks like a much different story.

russ
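
p.s. for anyone following along, here is roughly what the incref
side expands to.  this is a minimal sketch, assuming the usual
kernel-style Ref struct; the declarations and return value are
illustrative, the lock/increment/unlock pattern is the point:

#include <u.h>
#include <libc.h>

typedef struct Ref Ref;
struct Ref
{
	Lock;			/* spin lock guarding ref */
	long	ref;
};

long
incref(Ref *r)
{
	long x;

	lock(r);		/* spin until we own the lock */
	x = ++r->ref;		/* ordinary increment, safe under the lock */
	unlock(r);		/* release; cheap compared with lock() */
	return x;
}

so every incref pays for taking and releasing a spin lock, while
_xinc does its increment in a single locked instruction.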
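
p.p.s. if you want to rerun this, here is a sketch of a harness
that produces output in the shape above.  time1 and the value of
N are placeholders i am making up for the sketch, and _xinc/_xdec
live in the kernel, so a user-space run needs the equivalent
assembly linked in:

#include <u.h>
#include <libc.h>

enum
{
	N = 10*1000*1000,	/* iterations per run; assumed value */
};

void	_xinc(long*);		/* atomic increment, from the kernel's l.s */
long	_xdec(long*);		/* atomic decrement */
void	loopxinc(void);		/* defined above */
void	looplock(void);		/* defined above */

void
loop(void)			/* empty loop: measures loop overhead */
{
	uint i;

	for(i = 0; i < N; i++)
		;
}

void
time1(char *name, void (*f)(void))
{
	vlong t0;

	t0 = nsec();		/* wall-clock nanoseconds */
	(*f)();
	print("%s\t%lld nsec/call\n", name, (nsec()-t0)/N);
}

void
main(void)
{
	time1("loop", loop);
	time1("loopxinc", loopxinc);
	time1("looplock", looplock);
	exits(nil);
}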