From mboxrd@z Thu Jan 1 00:00:00 1970
Message-Id: <30A0D4B5-1AAB-4D95-9B9F-FD09CB796E6D@bitblocks.com>
From: Bakul Shah
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
In-Reply-To:
Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (iPhone Mail 7E18)
Date: Sat, 7 May 2011 12:33:50 -0700
References:
Subject: Re: [9fans] _xinc vs ainc
Topicbox-Message-UUID: e0c9cc22-ead6-11e9-9d60-3106f5b1d025

On May 7, 2011, at 6:05 AM, erik quanstrom wrote:

> i'm confused by the recent change to the thread library.
> the old code was simply a locked incl.  the new code
> does a locked compare-exchange /within a loop/ until it's seen that
> nobody else has updated the value at the same time, thus
> ensuring that the value has indeed been updated.
>
> since the expensive operation is the MESI(F) negotiation
> behind the scenes to get exclusive access to the cacheline,
> i don't understand what the motivation is for replacing _xinc
> with ainc, since ainc can loop on an expensive lock instruction.
>
> that is, i think the old version was wait free, and the new version
> is not.
>
> can someone explain what i'm missing here?
> thanks!
>
> - erik
>
> ----
>
> TEXT _xinc(SB), $0		/* void _xinc(long *); */
> 	MOVL	l+0(FP), AX
> 	LOCK
> 	INCL	0(AX)
> 	RET
>
> ----
>
> TEXT ainc(SB), $0		/* long ainc(long *); */
> 	MOVL	addr+0(FP), BX
> ainclp:
> 	MOVL	(BX), AX
> 	MOVL	AX, CX
> 	INCL	CX
> 	LOCK
> 	BYTE $0x0F; BYTE $0xB1; BYTE $0x0B	/* CMPXCHGL CX, (BX) */
> 	JNZ	ainclp
> 	MOVL	CX, AX
> 	RET

Just guessing: maybe the new code allows more concurrency? If the value is not in the processor cache, will the old code block other processors for much longer? The new code forces caching with the first read, so maybe there is a higher likelihood that the cmpxchg will finish quickly. I haven't studied x86 cache behavior, so this guess could be completely wrong. Suggest asking on comp.arch, where people like Andy Glew can give you a definitive answer.
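
(A minimal sketch, for illustration only: the same two shapes written in portable C with the GCC/Clang __atomic builtins rather than the Plan 9 assembler.  The names xinc_like and ainc_like are invented here; this is not the libthread code.  The first finishes in a single locked add per caller, so it is wait-free; the second may have to retry its compare-and-swap, so it is only lock-free.)

----

	#include <stdio.h>

	/* wait-free: one locked add per caller; returns nothing,
	 * mirroring void _xinc(long *) */
	static void
	xinc_like(long *p)
	{
		__atomic_fetch_add(p, 1, __ATOMIC_SEQ_CST);
	}

	/* lock-free: read, increment a local copy, then try to publish
	 * it with compare-and-swap; retry if another processor changed
	 * *p in the meantime.  returns the incremented value, mirroring
	 * long ainc(long *) */
	static long
	ainc_like(long *p)
	{
		long old, new;

		do {
			old = __atomic_load_n(p, __ATOMIC_SEQ_CST);
			new = old + 1;
		} while (!__atomic_compare_exchange_n(p, &old, new, 0,
		    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST));
		return new;
	}

	int
	main(void)
	{
		long n = 0;

		xinc_like(&n);			/* n == 1 */
		printf("%ld\n", ainc_like(&n));	/* prints 2 */
		return 0;
	}

----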