From mboxrd@z Thu Jan 1 00:00:00 1970 MIME-Version: 1.0 In-Reply-To: References: Date: Mon, 19 May 2014 17:01:37 -0400 Message-ID: From: "Devon H. O'Dell" To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [9fans] waitfree Topicbox-Message-UUID: e8dbad0c-ead8-11e9-9d60-3106f5b1d025 So you seem to be worried that N processors in a tight loop of LOCK XADD could have a single processor. This isn't a problem because locked instructions have total order. Section 8.2.3.8: "The memory-ordering model ensures that all processors agree on a single execution order of all locked instructions, including those that are larger than 8 bytes or are not naturally aligned." 2014-05-19 16:21 GMT-04:00 erik quanstrom : > > On Mon May 19 15:51:27 EDT 2014, devon.odell@gmail.com wrote: > > The LOCK prefix is effectively a tiny, tiny mutex, but for all intents > > and purposes, this is wait-free. The LOCK prefix forces N processors > > to synchronize on bus and cache operations and this is how there is a > > guarantee of an atomic read or an atomic write. For instructions like > > cmpxchg and xadd where reads and writes are implied, the bus is locked > > for the duration of the instruction. > > > > Wait-freedom provides a guarantee on an upper-bound for completion. > > The pipeline will not be reordered around a LOCK instruction, so your > > instruction will cause a pipeline stall. But you are guaranteed to be > > bounded on the time to complete the instruction, and the instruction > > decomposed into =CE=BCops will not be preempted as the =CE=BCops are ex= ecuted. > > there is no bus. What you mean is that the processor might not LOCK#, but it can. LOCK will signal LOCK# on the bus if it thinks it has to. (And if whatever is being ainc()'ed is ainc()'ed frequently, or whatever is being ainc()'ed is just never in cache for whatever reason, that will happen.) > what LOCK really does is invoke part of the MSEI protocol. the state > diagrams i've seen do not specifiy how this is arbitrated if there are > = 1 > processor trying to gain exclusive access to the same cacheline. > > > Wait-freedom is defined by every operation having a bound on the > > number of steps required before the operation completes. In this case, > > you are bound by the number of =CE=BCops of XADDL + latency to memory. = This > > is a finite number, so this is wait-freedom. > > i'm worried about the bound on the number of MSEI rounds. i don't see > where the memory coherency protocol states that if there are n processors > a cacheline will be acquired in at most f(n) rounds. In Section 8.2.3.8 of the manual: "8.2.3.8 Locked Instructions Have a Total Order The memory-ordering model ensures that all processors agree on a single execution order of all locked instructions, including those that are larger than 8 bytes or are not naturally aligned." Thus, LOCK-prefixed instructions will never cause processor starvation, which seems to be your worry. They're wait-free. --dho > - erik >