From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/8245 Path: news.gmane.org!not-for-mail From: Jens Gustedt Newsgroups: gmane.linux.lib.musl.general Subject: Re: New optimized normal-type mutex? Date: Thu, 30 Jul 2015 12:00:59 +0200 Message-ID: <1438250459.10742.16.camel@inria.fr> References: <20150521234402.GA25373@brightrain.aerifal.cx> <8c49d81e.dNq.dMV.21.hNiSfA@mailjet.com> <1438207875.10742.3.camel@inria.fr> <20150729233054.GZ16376@brightrain.aerifal.cx> <1438213760.10742.5.camel@inria.fr> <20150730001014.GA16376@brightrain.aerifal.cx> <1438243654.10742.9.camel@inria.fr> <1438247427.10742.13.camel@inria.fr> Reply-To: musl@lists.openwall.com NNTP-Posting-Host: plane.gmane.org Mime-Version: 1.0 Content-Type: multipart/signed; micalg="pgp-sha1"; protocol="application/pgp-signature"; boundary="=-0fDCxaMZAhwCEtQpr/uC" X-Trace: ger.gmane.org 1438250497 23049 80.91.229.3 (30 Jul 2015 10:01:37 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Thu, 30 Jul 2015 10:01:37 +0000 (UTC) To: musl@lists.openwall.com Original-X-From: musl-return-8258-gllmg-musl=m.gmane.org@lists.openwall.com Thu Jul 30 12:01:33 2015 Return-path: Envelope-to: gllmg-musl@m.gmane.org Original-Received: from mother.openwall.net ([195.42.179.200]) by plane.gmane.org with smtp (Exim 4.69) (envelope-from ) id 1ZKkeW-0001du-LS for gllmg-musl@m.gmane.org; Thu, 30 Jul 2015 12:01:28 +0200 Original-Received: (qmail 19464 invoked by uid 550); 30 Jul 2015 10:01:27 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: Original-Received: (qmail 18417 invoked from network); 30 Jul 2015 10:01:26 -0000 X-IronPort-AV: E=Sophos;i="5.15,576,1432591200"; d="scan'";a="172118679" In-Reply-To: Face: iVBORw0KGgoAAAANSUhEUgAAADAAAAAwCAYAAABXAvmHAAAABHNCSVQICAgIfAhkiAAAEb9JREFUaN7VmXuQ3lV5xz/n/O7v7333fffdW3azSTY3Qi6wGy4aEc0qWlGhWcAq7WiDrWNpO61paS29EpzWaevYpvfL2JK0VRBbBxQtg0U2BRGKhIAI4ZKwSTbZ+3u//G7nnP7BtGM7UyVAnfb56/xznuf7nXnOnOf5fuH/eYjXI8mDn/5k8ch90xNj+eKka8xkI1MTGmY6Rk/XpD5w091fPvnf7xz8sQ/sLTQ6k54SY2mW1mQY3DVX8Kdv/PvPnfyBE7hn797bxBPP3NBv28RAqgyzJqNgJIs5q7bhHW+94Yd+5/fuBrjj5l/cs/LUs/vNiVMTo5aNJQX9lks3iaivHqlt/9D7xs778Y/Uf6AE/uWa6x7wT52ZTDyLKEkoOHmW4oSaSDFxQmq7tHPuDJ5TUkv10iAGgSLvediZotf16HS6KG1RuXDT/g99/vO3vtLa8vUgELuCVAja0mVJKRaylGbgYnwP27bImYRcozkmKrWS53vYjosfhNSUIjXQUoq2K0ksRdZqTZ5Lbfs/Dl/66E/uSebmb8hF8URab48JS+IPDVJtto9uuHj8wKW//6lD/1OSznxtLEPRv2kdPH2MSBpkqlDGkLoOIgGkIWcEiTYsk1JSDpGUFAAcBzfRCKXxV+ql7/XWnn7uxcm3vOGS2rrR1UcLV+6pC4B//tjPjGdHjhwttFOEcMmMAgFeLkezWsOTkmigfyYZGTzYN7HtwFt u+rU6wJnprxQf/JvP7beffH5fBIwGIZqEutJUlCKSNoFjYbQiihOkFmih0FJgSUGIpGxstE6RwiLQmsyzcXZesO8N1773YOHKPfUv/uxP726ePjsVNVoT9WZjMq02kRpEKay9+UPXjQmAL1991UuDZ+fGQulSVQrpuqg4pWkkupSjVGshtGZJpzQ9DysfQqZIkgzR7 jAc+qAFrm2xTIRJJKnr0okTfCnQKLRlo9MUicGWL7dcnCnytoOTKTxpERpBv8moomm7AZFjEddb9EiHdhJhPJdYanQKUZrivmH7Pnlm+itFuVgdi7WkpSSJdFlqRwRugLAkQakXJQ06cAhcB0tnRI0mstOllKYEvk8qJJGj8IygID16hwZBGIzIsIXCRREq6DGSsu9RkBYBBldKbCFwbMnA2mH8wKIrbQrGY1WkKbQ6DFkWOQEicMks8JTEQRJI0LOLJbvxwosTMlMk0kYbjTCGci4gzrr0Sod0YQFHWiSuRTHMkbRaCCxMZlh2FLGQLEjBmHFIhUBrEI0OlgYnyNN0LFKliYxGeT69rouqtOjP5+lLEtxugosgOjNHj7ZRGmwjaDoGX9skKEQQIFptkAIjIWckCQIrE9ija8aOznk5wiwh9lw8aeMlMamUeEKis4wOFiaFRtpEYxBJSoQA4ZM4ksgSPG4E9Siibjt02xUaUQsnzFFfSZlPM4TRlPp7yQWCs/PzlMs53HrMRLGPt/f2UFYRbkOzbGm6JkPgYjAgbVo6I/Ntup0unuPgWAK0RaLVy//AfVe864lgcWkicz1yloNvoJMmpFLTsiCJNIFt0VERHdcnGyxPW5598BmlJo6tVPc9v1RB2CHdqMWOYpkLCwHnhyHVTHNcpTxUrVCp1rAtyYZ1w0i3SLFk89A3/g2UZH3O4ubL30RxsU5ldhkpDFJIbGFRE4rYSLQQRHGEJQQeFm7gYHaev98CuO6NF62q1hqTKktJlKarNW0JDZUgHA9dys/Y q4emnQ3Df3nxj7z7+t2f+pO/Ysv5U0dm525uKYty3wCetPjRdYNcjaaQabq1Oq1MkVOKkc1reGFhiWIuj7C6JGlKyZNsNor3n7eFStLlkcoKz61fvX/79u3T9cWliTBWvrQEHa0QQmIZgdIK27FQmaKVxIRbN+77z5/4sT/+1PjJEycmosWVmXJvP3YpZM3aNYyuWz dTuHLPf5lPrnr3VXtPnDp10HNDlIqROEwO9fG+VosXspQ7lyto3+fE8ln2rlpDfniQP332WRCCd75tB43jZ7jUDRlspTiugxAeR+I2960s0hBy3503fmT69FcfONian5uQliDIF+m2YyKlUUIRxRF968eOfuC+r+4851Hill//+PiRx79ztNrs0mm0KZaLZEnGWwsB725WOFRv8HAmSDodSo7D3oF+bMfirxbOYkKHQDhc5lusznxaaO6fO8O2fIE9vf1M0+Go0jM73nT55EcvGa8duedr+3S1si8v7VLnzDxRpkjCHGJ4aHr1utVT7/mzP6/b5wJ++v57izf9wi8dVNrHcT3yxQK27RJ3Y6zM0LI1b+0ZoFldIvFsxvsH6LMELQOpkST1GBVq3NIAvvC54ztPcTwRLOs2b+sf4u2RS5JFY1++5yt3XbXnvZPX3/n5W4FbH/nLA+ueuOeBG/zefG3ynW87+N3D3jkRuP3226fiVEwkOiV0bNIsJep28QOfqOAjl2DEElw1WCZKFGUtkK6h7fj4DpRKfURJjeGwSE/UJcJga4d6p0lHCoo64w1BkfsqtYnf/LVPHgSuAdh1476TwMsD3t9/9jUMc9pMSiGxJRgV0+22Wa7XqTfqHH7pON8u9/NE6BB5PnZPD42BkMe14IG4gVXIsVSvkijFU2fPIrOUraPrsD3Jlt4SnTQlsi2qtqLSarG0VJnaObHztu8HyTqX9vn6Aw/cMTw0TJpkeH6AhaIncCn3FkmFzWNzc3y7U+ObjTqPVJs8Vq1h1oScabdoV xWxyQg9l57BATZZDpuCgH5PMrVxM3G3g513WfNDW3liYRGsPN1ua2LHtq0zM6dOPfmqCDx15227/+ILd58EaNa717/00smp3bsvo7fcS3VpGTcAKV2q1QZJ3MVIC88JsFJN2c+xbrDM+aHPlqE+ji0vY7kWvaFhaaXJzs3nYaoNNq/tJ1/tsiwtnpEOtz/+HHNLMXG UkmYp7XarVKlWD53TQtO89+7iN//stoPWqdkpPx/Wur3F6WccZ+KFdnuskXNZaCVkiWJ++Qy+59EbuHi6y1t2XkL27Ak2CJssS6lHMYPCYDLDP/dJvj1fQ0qb+WoMqk3BzrFlYAidc5hZXAQnh9DQ6TbRQhF3U7Is4eZf+YWJj9308Se/5z7w3fH8s8fGnNOnp1ZJie52S+VOd2oIeJObo2Vs5i2JWZWn1e/TrjQZK/WSiyPCE2eImh1S12PJaEJtYbsCN81oL7eROPQVodVI6ZiA1MvxQhyhOxGWlcOkKUHg4aQOaaoQ0hDFXf7pi3dPAk+e00r56HuurubnzpZsY5NKiYMhwUagcY1CCINB0pACIQXLoaBHWdTbXbTwWHEhHwQE9SbVgRJ/Mz/Hi4sr7H3fOIuVJvc+OEsYFnG8AKETGs0WljBYtk2j3nkZmRDoqMPqdSNHHztyZOc5rZTpQHlfIh2MpdDSgAZbZ/hGgzRIaSEF9AiNLyWylRFbNh0bfKEZsCxkK2JxZJi/Pv0CJyornLd5DQ8++hxHv7NIEFg0W00ajRqNRh3XtciUot5okBlFZgzSssm04fjxExN/+kcHxs+JwOWHDh06u3bkQNV1a7ZSIDWehEQoUgEaQGtMlpHlLXp6XPLdjFyqwBMsAvfR5D4Z0fFKbNqwgWajgnQHWF6K8V0H35EIrQFodWKSzGAJQc63sS2JylKkZaO1ZnTNqplXpUo07727+NV/vPuAv3D2hrWpRtYaZJbEwmBlINf00144S35kgMZsxEpgc0zEfL3Spu 1LklihlU2u6NBtNZDaohXHIDKadYWRLsZkZFhgNCaLcB2XzECWalAZhR6v9vzx472v+BF/dxSu3FPf9cY3z0TNLgO+YFNfHzozJHFE4CvW1CPGBvtYabQ5YQuON+aI7YCqUTSX2hgTUeop025IstSiv1wiUQu4fkiz2SBTHaSQ9AQOaZqBk6MTdzFGYAnB0FAvWjD9 fVWJ7xUjI+XJ54/NMNeVLM3XMEZjWRYGzf2dWUp5kJZNmlmE+RDdaXHpznFOzc6xOL8AKIqlArVaTKdTQSvD4sIifb29KOOgshaOJal0UzSgjcQYyNKU4zMzXHTR+F2vSRdKOm3CnIvnO4QFn77ePL7vErVaqLhN4OUYcGHL+j7iVoeBoX5mT5+mtjzHeRtLDA2UWFqYx3c8Ot2MocE875y8iMCHLElIogytDVJKjFIYLUiiCKVihkeGD379Xx869JoICGHj2C4qU2ASFqsLNFptvMBj9ZoSlzgen7hiFz919QQDoWB+9gwqU4SFXk6eXGHN6kH6e21Wls7SX3Zot5t889HvUKlFL8ssJkOgAIXv2+gswqiYjRvHjv7qr+zb94qEre8Vnu/juRlKCTJlgQhBxdRaTVb1lNh7yRb41tOUR99G72Afza6h0e7gezZWOMzRY7NIZVHoyVOtxli2hSUEaarJlwLqOqOTaoTMYVmagXKOQnHk6O/+7u9MTl5xZf01D3NrR0dK7W58pe/naHS7SB1z7fg6fm58I2/OuRSencH2HIIffjNzLRhfv5ZGo85itU6aSRr1Jq6laEcJjcgwWA7YMOoS5B2WKx38oEAUJURRhko1nm9P/8EffvrK7wf+FbfQjT/zUwdzoTvTjVv4UlK0XHZs3UTZd1mbaqyiCwKc02d4v51ycW2Zy9Zvw+AQ5DxW9RcoFwRDZZ9VA2UAdu26iD3vupQLto5SrXSwpIPjKsbWr+aDH/zg/lcC/pzU6Y98eO/4iReem67WOyU/6KFdX2S kELLRU3y0d5Cg1aEZBqSxQJdtprXL5546guXm6QtCwmKOTrvC2NoSq8tw8fYSDx9t8tATVSqNOo16lzdech7Ndrz/a4e/9fqr05+57dCTW7edtz/fE2JIySyHlxqKr8+3OFXooykND8ocf1vwuD1WHO1G5AKPUi5EYVGpLeA4OcqhZtVASKtjkLZFpjM6nZjNG0dnN p63fepcwL8qf+BdV+x+YGmxNimRtNOUJI3Z1APX7NjBPxx5ntjLQRLTTBKGSz4jwz3UajWGhgap1uqcPFXh0ovX88zTz1Mo9tJNQdiF6d273zh1y29/un6ueM7ZH7hw24apfGDPJN02ri0phXlUaZi/e3qGSpziOgKjI8qhS73eQKuUIBAsLS3i+x5DQ0W6zRUyY1hpiNrAwND+PXve+6rAv2qH5rdv+dXxr933tbtWlqpj/cOjkAkWaosUHUW5UGBmYZF1AwM0WjUKvX2g25TLJZ56dh7Pztg5vmFm8vIL9udHLrtr6trrXxXw12wxTd9/b/ETt95y15lTC5Pbtu2gUVkgaTeItSJKDI4Fo6t66cZtbNuiUu8QlvpnVq0auuGzd3758P8Jl3L6/nuLf3Tg05OWEJOFfH4iiuLJOMlYXmnSrDURaIxJEFJzwYXb9x+640u38v8hPvazP717+5atZtvmrWbN6vVm9dBqc83VP7znf6OW/Xok+a2b941XG9HY7KmTE81me/Khww+P5XMFkszQiFIKPQEQTwF3/58zui+98IK9KA4W8gVOn10kF4Tk8z7SdWl2I6JWA0fE9A+VqdXq0x/+iQ/f8PM3/cbJ14vAa7JZr3rXFR+bn188aFk2veUCxXIPQeiDkdQbbXKuZP3qIju3rcVKm7z9su2TTxy+a+Yzn3z/La8XAevVXrzu2j3jBnlXrVonVSmtbpdms40AwlyeTGeEtk2r00HYLhdsXc/OrR4XbVaU/Mbkb930nqme0rrowUefffIHTuC6a/fsPfbM C3d0WolfKhTp6clRyhcg04xtGCHt1Dh/rExRamYWGyxXa7xn1yYuvkAwP7fClg0Zhx+qr/rM7d+Yesc73nHD5OW7Zh478tRzPxACH/iRa245duzkAdsJfMuy8aXm/FUFlpeWWOm2WZidI9WapaUlFIaTlQ4jfR6/vO9D2NE32LTeYnZOUmnA6PAohx9+qnTy1NnrP/ Cj15cu23XZI4/826Px/xqB37z547sPH374oOvnKTsOb+1zuCoImZARM8ZiLlKkXcWOcoFRJ+Do/CwFO4e2DEe+9Ti+ZXA8ny/cU+X4bMKRZxZIVEiqXI6/+NKuUzPHb7z40ovnX3zx+Ctuq38HyuqWG7Tu+A0AAAAASUVORK5CYII= X-Mailer: Evolution 3.12.9-1+b1 Xref: news.gmane.org gmane.linux.lib.musl.general:8245 Archived-At: --=-0fDCxaMZAhwCEtQpr/uC Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Am Donnerstag, den 30.07.2015, 12:36 +0300 schrieb Alexander Monakov: > On Thu, 30 Jul 2015, Jens Gustedt wrote: >=20 > > Am Donnerstag, den 30.07.2015, 10:07 +0200 schrieb Jens Gustedt: > > > Am Mittwoch, den 29.07.2015, 20:10 -0400 schrieb Rich Felker: > > > > On Thu, Jul 30, 2015 at 01:49:20AM +0200, Jens Gustedt wrote: > > > > > Hm, could you be more specific about where this hurts? > > > > >=20 > > > > > In the code I have there is > > > > >=20 > > > > > for (;val & lockbit;) { > > > > > __syscall(SYS_futex, loc, FUTEX_WAIT, val, 0); > > > > > val =3D atomic_load_explicit(loc, memory_order_consume)= ; > > > > > } > > > > >=20 > > > > > so this should be robust against spurious wakeups, no? > > > >=20 > > > > The problem is that futex_wait returns immediately with EAGAIN if > > > > *loc!=3Dval, which happens very often if *loc is incremented or > > > > otherwise changed on each arriving waiter. > > >=20 > > > Yes, sure, it may change. Whether or not this is often may depend, I > > > don't think we can easily make a quantitative statement, here. > > >=20 > > > In the case of atomics the critical section is extremely short, and > > > the count, if it varies so much, should have a bit stabilized during > > > the spinlock phase before coming to the futex part. That futex part i= s > > > really only a last resort for the rare case that the thread that is > > > holding the lock has been descheduled in the middle. > > >=20 > > > My current test case is having X threads hammer on one single > > > location, X being up to some hundred. On my 2x2 hyperthreaded CPU for > > > a reasonable number of threads (X =3D 16 or 32) I have an overall > > > performance improvement of 30%, say, when using my version of the loc= k > > > instead of the original musl one. The point of inversion where the > > > original musl lock is better is at about 200 threads. > > >=20 > > > I'll see how I can get hold on occurrence statistics of the different > > > phases without being too intrusive (which would change the > > > scheduling). > >=20 > > So I tested briefly varying the number of threads from 2 up to 2048. > >=20 > > Out of the loop iterations on the slow path, less than 0.1 % try to go > > into futex wait, and out of these about 20 % come back with EGAIN. >=20 > That sounds like your testcase simulates a load where you'd be better off= with > a spinlock in the first place, no? Hm, this is not a "testcase" in the sense that this is the real code that I'd like to use for the generic atomic lock-full stuff. My test is just using this atomic lock-full thing, with a lot of threads that use the same head of a "lock-free" FIFO implementation. There the inner part in the critical section is just memcpy of some bytes. For reasonable uses of atomics this should be about 16 to 32 bytes that are copied. So this is really a use case that I consider important, and that I would like to see implemented with similar performance. (I didn't yet think of making this into a fullfledged mutex, implementing timed versions certainly needs some thinking.) > Have you tried simulating a load that does some non-trivial work between > lock/unlock, making a spinlock a poor fit? No. But I am not sure that there is such a case :) With this idea that the counter doesn't change once the thread is inside the lock-acquisition loop, there is much less noise on the lock value. This has two benefits. First the accesses in the loop are mainly reads, to see if there has been a change, no writes. So the bus pressure should be reduced. And second, because there are less writes in total, other threads that are inside the same loop perceive less perturbation, and the futex as a good chance to succeed. Jens --=20 :: INRIA Nancy Grand Est ::: Camus ::::::: ICube/ICPS ::: :: ::::::::::::::: office Strasbourg : +33 368854536 :: :: :::::::::::::::::::::: gsm France : +33 651400183 :: :: ::::::::::::::: gsm international : +49 15737185122 :: :: http://icube-icps.unistra.fr/index.php/Jens_Gustedt :: --=-0fDCxaMZAhwCEtQpr/uC Content-Type: application/pgp-signature; name="signature.asc" Content-Description: This is a digitally signed message part Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEABECAAYFAlW59dsACgkQD9PoadrVN+LyYgCgm4b8/V+Mp9VWi0pRtbwMbazz aiIAn1DJBtamXngmKAHTOjxZRvCl/qQ8 =aFy7 -----END PGP SIGNATURE----- --=-0fDCxaMZAhwCEtQpr/uC--