From mboxrd@z Thu Jan 1 00:00:00 1970
From: erik quanstrom
Date: Sat, 7 May 2011 18:47:54 -0400
To: 9fans@9fans.net
Message-ID:
In-Reply-To: <30A0D4B5-1AAB-4D95-9B9F-FD09CB796E6D@bitblocks.com>
References: <30A0D4B5-1AAB-4D95-9B9F-FD09CB796E6D@bitblocks.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Subject: Re: [9fans] _xinc vs ainc
Topicbox-Message-UUID: e0d8a242-ead6-11e9-9d60-3106f5b1d025

> Just guessing. Maybe the new code allows more concurrency? If the
> value is not in the processor cache, will the old code block other
> processors for much longer? The new code forces caching with the first
> read so maybe there is a high likelihood cmpxchg will finish faster. I
> haven't studied x86 cache behavior so this guess could be completely
> wrong. Suggest asking on comp.arch where people like Andy Glew can
> give you a definitive answer.

according to intel, this is a myth. search for "myth" in this page.

http://software.intel.com/en-us/articles/implementing-scalable-atomic-locks-for-multi-core-intel-em64t-and-ia32-architectures/

and this stands to reason, since both techniques revolve around a
LOCK'd instruction, thus invoking the x86 architectural MESI(f)
protocol.

the difference, and my main point, is that the loop in ainc means
that it is not a wait-free algorithm: a processor that keeps losing
the cmpxchg race has no bound on how many times it retries. this is
not only suboptimal, but could also lead to incorrect behavior.

- erik
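
to make the wait-free point concrete, here is a minimal sketch in C11. it is not the plan 9 source for _xinc or ainc: the names inc_waitfree, inc_casloop and selfcheck are made up for illustration, and it assumes a compiler that provides <stdatomic.h>. the first function is a single atomic read-modify-write, so it finishes in a bounded number of steps however many processors contend. the second retries a compare-and-swap until it wins, so some processor always makes progress but any one caller can, in principle, be made to retry indefinitely.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Wait-free increment: one atomic read-modify-write.
 * Hypothetical name; on x86, compilers typically emit a
 * LOCK-prefixed add/xadd for this.  Returns the new value.
 */
long
inc_waitfree(atomic_long *p)
{
	return atomic_fetch_add(p, 1) + 1;
}

/*
 * Lock-free but not wait-free: loop until the compare-and-swap
 * succeeds.  Hypothetical name.  Returns the new value.
 */
long
inc_casloop(atomic_long *p)
{
	long old;

	old = atomic_load(p);
	/* on failure, old is refreshed with the current value of *p */
	while(!atomic_compare_exchange_weak(p, &old, old + 1))
		;
	return old + 1;
}

/* single-threaded check that both variants return the incremented value */
void
selfcheck(void)
{
	atomic_long v = 0;

	assert(inc_waitfree(&v) == 1);
	assert(inc_casloop(&v) == 2);
}
```

both versions still go through a LOCK'd instruction, so the cache-coherence traffic is of the same kind; the sketch only shows where the unbounded retry comes from.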