From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Refactoring atomics as llsc?
Date: Wed, 20 May 2015 02:36:32 -0400
Message-ID: <20150520063631.GT17573@brightrain.aerifal.cx>
References: <20150520051108.GA28347@brightrain.aerifal.cx> <20150520083323.2340cd1b@vostro>
In-Reply-To: <20150520083323.2340cd1b@vostro>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Wed, May 20, 2015 at 08:33:23AM +0300, Timo Teras wrote:
> On Wed, 20 May 2015 01:11:08 -0400
> Rich Felker wrote:
>
> > Of course the big outlier is x86, which is not llsc based but has
> > actual atomic primitives at the instruction level.
> > If we defined the
> > sc() primitive to take 3 args instead of 2 (address, old value from
> > ll, new value to conditionally store; most archs would ignore the old
> > value argument) then we could model x86 with ll being a plain load and
> > sc being cmpxchg to allow any new custom primitives to work using
> > cmpxchg. Then we would just continue providing custom versions of all
> > the old a_* ops (a_cas, a_fetch_add, a_inc, a_dec, a_and, a_or,
> > a_swap) to take advantage of the x86 instructions. These versions
> > could probably be shared by all x86 variants (i386, x86_64, x32) since
> > they're operating on 32-bit values and the asm should be the same.
>
> I wonder if calling that kind of emulation ll()/sc() would be
> misleading. load-linked store-conditional has stronger guarantees. sc
> will fail if the cache-line was invalidated in-between, thread was
> pre-empted etc.
>
> Using cmpxchg can be used to emulate it only when the user is aware of
> ABA problem (some other thread may have changed the value behind us
> multiple times). Such emulation is of course ok for a_fetch_add, etc.
> But one needs to be more careful if using pointers (and trying to make
> sure the same pointer was not first removed and later re-added).
>
> And if you want to optimize the above mentioned cases, one really needs
> to know if it's true ll+sc, or write the synchronization differently.
> In these cases the algorithm is often implemented twice with the
> different available atomics.

And yes, an alternative would be not to provide fake ll/sc for archs
without it but instead to have the existing generic cas-based
implementations be used when ll/sc is not available. Then we'd have 2
generic implementations of everything instead of just one, but it
would probably be cleaner.

Rich
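
For concreteness, here is a rough sketch of the 3-argument sc() idea
from the quoted proposal. It is only an illustration: it uses C11
<stdatomic.h> in place of the per-arch asm, and the names ll, sc and
fetch_add are placeholders, not existing musl interfaces.

```c
#include <stdatomic.h>

/* Hypothetical names for illustration; not musl's internal API. */

/* On a cas-only arch such as x86, "ll" is modeled as a plain load. */
static inline int ll(_Atomic int *p)
{
	return atomic_load(p);
}

/* 3-argument "sc": address, old value seen by ll, new value to store.
 * Modeled as compare-and-swap. Returns nonzero on success.
 * A real ll/sc arch would ignore 'old'. */
static inline int sc(_Atomic int *p, int old, int new_val)
{
	return atomic_compare_exchange_strong(p, &old, new_val);
}

/* A generic op written once in terms of ll/sc. */
static inline int fetch_add(_Atomic int *p, int v)
{
	int old;
	do old = ll(p);
	/* unsigned arithmetic avoids signed-overflow UB */
	while (!sc(p, old, (int)((unsigned)old + (unsigned)v)));
	return old;
}
```

With this model, fetch_add behaves the same whether sc is a true
store-conditional or a cmpxchg. The difference Timo points out shows up
elsewhere: if the value goes A -> B -> A between ll and sc, this
emulated sc still succeeds, whereas a true sc would fail. That is
harmless for arithmetic like the above but not for pointer-based
structures.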