From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Tue, 8 Dec 2009 15:00:46 -0500
To: 9fans@9fans.net
Message-ID: <7a13edea51b085e17fc02c0e8d0b6a62@coraid.com>
In-Reply-To:
References: <5d1347dfd611729cd82ac5bc0ca79c92@coraid.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="upas-auurhpxvnkofctevbmjbzbrhaa"
Subject: Re: [9fans] etherigbe.c using _xinc?
Topicbox-Message-UUID: acb453cc-ead5-11e9-9d60-3106f5b1d025

This is a multi-part message in MIME format.
--upas-auurhpxvnkofctevbmjbzbrhaa
Content-Disposition: inline
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit

> do you have numbers to back up this claim?
>
> you are claiming that the locked XCHGL
> in tas (pc/l.s) called from lock (port/taslock.c)
> called from incref (port/chan.c) is "much faster"
> than the locked INCL in _xinc (pc/l.s).
> it seems to me that a locked memory bus
> is a locked memory bus.

yes, i do.  xinc on most modern intel is a real
loss.  and a moderate loss on amd.  my atom 330
is an exception.

intel core i7 2.4ghz
loop        0 nsec/call
loopxinc    20 nsec/call
looplock    11 nsec/call

intel 5000 1.6ghz
loop        0 nsec/call
loopxinc    44 nsec/call
looplock    25 nsec/call

intel atom 330 1.6ghz (exception!)
loop        2 nsec/call
loopxinc    14 nsec/call
looplock    22 nsec/call

amd k10 2.0ghz
loop        2 nsec/call
loopxinc    30 nsec/call
looplock    20 nsec/call

intel p4 xeon 3.0ghz
loop        1 nsec/call
loopxinc    76 nsec/call
looplock    42 nsec/call

- erik

--upas-auurhpxvnkofctevbmjbzbrhaa
Content-Disposition: attachment; filename=xinc.s
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit

TEXT _xinc(SB), 1, $0			/* void _xinc(long*); */
	MOVL	l+0(FP), AX
	LOCK;	INCL 0(AX)
	RET

TEXT _xdec(SB), 1, $0			/* long _xdec(long*); */
	MOVL	l+0(FP), BX
	XORL	AX, AX
	LOCK;	DECL 0(BX)
	JLT	_xdeclt
	JGT	_xdecgt
	RET
_xdecgt:
	INCL	AX
	RET
_xdeclt:
	DECL	AX
	RET

--upas-auurhpxvnkofctevbmjbzbrhaa
Content-Disposition: attachment; filename=timing.c
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit

#include <u.h>
#include <libc.h>

void	_xinc(uint*);
void	_xdec(uint*);

enum {
	N	= 1<<30,
};

void
loop(void)
{
	uint i;

	for(i = 0; i < N; i++)
		;
}

void
loopxinc(void)
{
	uint i, x;

	for(i = 0; i < N; i++){
		_xinc(&x);
		_xdec(&x);
	}
}

void
looplock(void)
{
	uint i;
	static Lock l;

	for(i = 0; i < N; i++){
		lock(&l);
		unlock(&l);
	}
}

void
timing(char *s, void (*f)(void))
{
	uvlong t[2];

	t[0] = nsec();
	f();
	t[1] = nsec();
	fprint(2, "%s\t%llud nsec/call\n", s, (t[1] - t[0])/(uvlong)N);
}

void
main(void)
{
	nsec();
	timing("loop", loop);
	timing("loopxinc", loopxinc);
	timing("looplock", looplock);
	exits("");
}

--upas-auurhpxvnkofctevbmjbzbrhaa--
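
For readers without a Plan 9 system, the comparison in timing.c can be approximated in portable C11. This is only a rough sketch, not the code that produced the numbers above: it assumes a POSIX system for clock_gettime, uses atomic_fetch_add/atomic_fetch_sub in place of _xinc/_xdec, and uses an atomic_flag test-and-set loop in place of lock/unlock. NITER, counter, spin, nsec_per_call and bench_all are names made up for this illustration, and NITER is much smaller than the 1<<30 in timing.c. On x86 a compiler will typically emit a LOCK-prefixed add for the fetch-add and an XCHG for the test-and-set, but that depends on the compiler and flags, so results may not match the figures in the message.

```c
#define _POSIX_C_SOURCE 199309L	/* for clock_gettime; assumes a POSIX system */
#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

enum { NITER = 1 << 20 };	/* hypothetical; timing.c uses 1<<30 */

static atomic_long counter;			/* target of the inc/dec pair */
static atomic_flag spin = ATOMIC_FLAG_INIT;	/* test-and-set lock word */

/* Baseline: empty loop. The volatile index keeps it from being optimized away. */
static void loop(void)
{
	for (volatile unsigned i = 0; i < NITER; i++)
		;
}

/* Stand-in for _xinc/_xdec: one atomic increment and one atomic decrement. */
static void loopxinc(void)
{
	for (unsigned i = 0; i < NITER; i++) {
		atomic_fetch_add(&counter, 1);
		atomic_fetch_sub(&counter, 1);
	}
}

/* Stand-in for lock/unlock: test-and-set to acquire, plain release store to drop. */
static void looplock(void)
{
	for (unsigned i = 0; i < NITER; i++) {
		while (atomic_flag_test_and_set_explicit(&spin, memory_order_acquire))
			;	/* uncontended here, so this never actually spins */
		atomic_flag_clear_explicit(&spin, memory_order_release);
	}
}

/* Run f once and return the average wall-clock nanoseconds per iteration. */
static double nsec_per_call(void (*f)(void))
{
	struct timespec t0, t1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	f();
	clock_gettime(CLOCK_MONOTONIC, &t1);
	return ((t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec)) / NITER;
}

/* Print one line per loop, in the same order as timing.c. */
static void bench_all(void)
{
	fprintf(stderr, "loop\t%.1f nsec/call\n", nsec_per_call(loop));
	fprintf(stderr, "loopxinc\t%.1f nsec/call\n", nsec_per_call(loopxinc));
	fprintf(stderr, "looplock\t%.1f nsec/call\n", nsec_per_call(looplock));
}
```

Calling bench_all() from main prints the three figures to standard error. As in the original, everything runs on a single thread with no contention, so this measures only the cost of the atomic instructions themselves, not how either approach behaves under contention.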