From: erik quanstrom
Date: Thu, 19 Dec 2013 10:19:42 -0500
To: 9fans@9fans.net
Subject: Re: [9fans] 9front pegs CPU on VMware

for those without much mwait experience, mwait is a kernel-only primitive
(as per the instructions) that pauses the processor until a change has been
made in some range of memory.  the size of the range is determined by probing
the h/w, but think cacheline.  so the discussion of locking is kernel-specific
as well.

> > On 17 Dec 2013, at 12:00, cinap_lenrek@felloff.net wrote:
> >
> > that's a surprising result.  by dog pile lock you mean the runq spinlock, no?
> >
>
> I guess it depends on the HW, but I don't find that so surprising.  You are looping
> sending messages to the coherency fabric, which gets congested as a result.
> I have seen that happen.

i assume you mean that there is contention on the cacheline holding the runq
lock?  i don't think there's classical congestion, as i believe cachelines not
involved in the mwait would experience no hold-up.

> You should back off, but sleeping for a fixed time is not a good solution either.
> Mwait is a perfect solution in this case, there is some latency, but you are in a bad
> place anyway and with mwait, performance does not degrade too much.

mwait() does improve things, and one would expect the latency to always be
better than spinning*.  but as it turns out, the current scheduler is pretty
hopeless in its locking anyway.  simply grabbing the lock with lock rather
than canlock makes more sense to me.

also, using ticket locks (see the 9atom nix kernel) will provide automatic
backoff within the lock.  ticket locks are a poor solution as they're not
really scalable, but they will scale to 24 cpus much better than tas locks.

mcs locks or some other queueing-style lock is clearly the long-term solution,
but as charles points out, one would really prefer to figure out a way to fit
them to the lock api.  i have some test code, but testing queueing locks in
user space is ... interesting.  i need a new approach.

- erik

* have you done tests on this?
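
for concreteness, a rough sketch of the monitor/mwait shape described at the
top of this mail, written with the x86 _mm_monitor/_mm_mwait intrinsics rather
than plan 9 kernel code.  mwait is a ring-0 instruction, so this only makes
sense inside a kernel, and the function name waitchange is invented for the
example:

#include <pmmintrin.h>

/* wait until *p no longer holds the value old.  the monitor is armed on the
 * cacheline containing p, so a store to that line (or an interrupt) ends the
 * mwait and we re-check. */
static void
waitchange(volatile unsigned int *p, unsigned int old)
{
	while(*p == old){
		_mm_monitor((const void *)p, 0, 0);	/* arm the monitor on p's line */
		if(*p != old)
			break;				/* a store landed before the monitor was armed */
		_mm_mwait(0, 0);			/* pause until the monitored line is written */
	}
}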
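
a minimal ticket lock in c11 atomics, to show where the automatic backoff
comes from: a waiter pays one atomic add to take a ticket and then only reads
the owner word, so the lock's cacheline is invalidated once per release rather
than hammered by repeated test-and-set.  this is a sketch of the general
technique, not the 9atom/nix code; a zero-initialized Ticketlock is unlocked.

#include <stdatomic.h>

typedef struct {
	atomic_uint	next;	/* next ticket to hand out */
	atomic_uint	owner;	/* ticket currently being served */
} Ticketlock;

static void
tlock(Ticketlock *l)
{
	unsigned int t;

	t = atomic_fetch_add(&l->next, 1);	/* take a ticket */
	while(atomic_load(&l->owner) != t)
		;				/* read-only spin; could cpu-relax here */
}

static void
tunlock(Ticketlock *l)
{
	atomic_fetch_add(&l->owner, 1);		/* serve the next waiter, in fifo order */
}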
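
and a sketch of an mcs lock, again in c11 atomics, to show both why queueing
locks scale (each waiter spins only on its own node) and why they don't drop
straight into the existing lock api: acquire and release need a per-acquirer
node threaded through.

#include <stdatomic.h>
#include <stddef.h>

typedef struct MCSnode MCSnode;
struct MCSnode {
	_Atomic(MCSnode*)	next;
	atomic_int		locked;
};

typedef struct {
	_Atomic(MCSnode*)	tail;	/* last waiter in the queue; NULL when free */
} MCSlock;

static void
mcslock(MCSlock *l, MCSnode *n)
{
	MCSnode *prev;

	atomic_store(&n->next, NULL);
	atomic_store(&n->locked, 1);
	prev = atomic_exchange(&l->tail, n);	/* append ourselves to the queue */
	if(prev != NULL){
		atomic_store(&prev->next, n);	/* link in behind the previous waiter */
		while(atomic_load(&n->locked))
			;			/* spin on our own node only */
	}
}

static void
mcsunlock(MCSlock *l, MCSnode *n)
{
	MCSnode *succ, *expect;

	succ = atomic_load(&n->next);
	if(succ == NULL){
		expect = n;
		if(atomic_compare_exchange_strong(&l->tail, &expect, NULL))
			return;			/* no one waiting; lock is now free */
		while((succ = atomic_load(&n->next)) == NULL)
			;			/* a successor is still linking in */
	}
	atomic_store(&succ->locked, 0);		/* hand the lock to the successor */
}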