From: Markus Wichmann
Newsgroups: gmane.linux.lib.musl.general
Subject: atomic.h cleanup
Date: Sun, 10 Jan 2016 13:21:39 +0100
Message-ID: <20160110122139.GF2016@debian>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com
Hi all,

The development roadmap on the musl wiki lists the ominous point "atomic.h cleanup" for 1.2.0. I assume you mean a sort of simplification and unification. I noticed that the RISC archs use rather liberal amounts of inline assembly for the atomic operations. And I have always been taught that as soon as you start copying code, you are probably doing it wrong.

So, first thing I'd do: add a new file, let's call it atomic_debruijn.h. It contains an implementation of a_ctz() and a_ctz_64() based on a De Bruijn sequence. That way, all the architectures currently implementing a_ctz() in this manner can just include that file, and a lot of duplicate code goes out the window.

Second thing: we can reduce the inline assembly footprint and the amount of duplicate code by adding a new file, let's call it atomic_llsc.h, that implements a_cas(), a_cas_p(), a_swap(), a_fetch_add(), a_inc(), a_dec(), a_and() and a_or() in terms of new functions that each arch would have to define, namely:

static inline void a_presync(void) - execute any barrier needed before attempting an atomic operation, like "dmb ish" for ARM, or "sync" for PPC.

static inline void a_postsync(void) - execute any barrier needed afterwards, like "isync" for PPC, or, again, "dmb ish" for ARM.

static inline int a_ll(volatile int *) - perform an LL on the given pointer and return the value there. This would be "lwarx" for PPC, or "ldrex" for ARM.
static inline int a_sc(volatile int *, int) - perform an SC on the given pointer with the given value. Return zero iff that failed.

static inline void* a_ll_p(void*) - same as a_ll(), but with machine words instead of int, if that's a difference.

static inline int a_sc_p(void*, void*) - same as a_sc(), but with machine words.

With these functions we can implement e.g. CAS as:

	static inline int a_cas(volatile int *p, int t, int s)
	{
		int v;
		a_presync();	/* barrier before the LL/SC loop */
		do {
			v = a_ll(p);
			if (v != t)
				break;	/* value differs: no store attempted */
		} while (!a_sc(p, s));	/* retry if the reservation was lost */
		a_postsync();	/* barrier after the LL/SC loop */
		return v;
	}

Add some #ifdefs to only activate the pointer variations if they're needed (i.e. if we're on 64 bits) and Bob's your uncle.

The only hardship would be in implementing a_sc(), but that can be solved by using a feature often referenced but rarely seen in the wild: ASM goto.

How that works is that, if the arch's SC instruction returns success or failure in a flag and the CPU can jump on that flag (unlike, say, microblaze, which can only jump on comparisons), then you encode the jump in the assembly snippet but let the compiler handle the targets for you. Since in all cases we want to jump on failure, that's what the assembly should do. So, for instance, for PowerPC:

	static inline int a_sc(volatile int *p, int x)
	{
		/* %l2 is the label operand: labels are numbered after the two inputs */
		__asm__ goto ("stwcx. %0, 0, %1\n\tbne- %l2"
			: /* no outputs */
			: "r"(x), "r"(p)
			: "cc", "memory"
			: fail);
		return 1;
	fail:
		return 0;
	}

I have already looked at what the compiler generates for such a design, but I never ran it, for lack of hardware. Anyway, this code makes it possible for the compiler to redirect the conditional jump on failure to the top of the loop in a_cas(). Since the return value isn't used otherwise, the values 1 and 0 never appear in the generated assembly.

What do you say to this design?

Ciao,
Markus
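
To make the first point a bit more concrete, here is a rough sketch of what atomic_debruijn.h could contain. The exact signatures (uint32_t and uint64_t arguments) are my assumption, as is the choice of building the 64-bit variant out of two 32-bit lookups; the multiplier 0x076be629 is one 32-bit De Bruijn sequence and the table is derived from it. As usual for ctz, the result for an argument of 0 is not meaningful.

```c
#include <stdint.h>

/* Sketch of a shared atomic_debruijn.h; signatures are assumed, not settled. */

/* Count trailing zeros of a nonzero 32-bit value. */
static inline int a_ctz(uint32_t x)
{
	/* debruijn32[i] is the bit position k for which
	 * ((1u << k) * 0x076be629u) >> 27 == i */
	static const char debruijn32[32] = {
		0, 1, 23, 2, 29, 24, 19, 3, 30, 27, 25, 11, 20, 8, 4, 13,
		31, 22, 28, 18, 26, 10, 7, 12, 21, 17, 9, 6, 16, 5, 15, 14
	};
	/* x & -x isolates the lowest set bit; multiplying by the De Bruijn
	 * constant moves a unique 5-bit pattern into the top bits. */
	uint32_t lowest = x & (0u - x);
	return debruijn32[(uint32_t)(lowest * 0x076be629u) >> 27];
}

/* Count trailing zeros of a nonzero 64-bit value using two 32-bit halves. */
static inline int a_ctz_64(uint64_t x)
{
	uint32_t lo = (uint32_t)x;
	if (lo)
		return a_ctz(lo);
	return 32 + a_ctz((uint32_t)(x >> 32));
}
```

An arch that has a native instruction for this would simply not include the file and keep its own a_ctz().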
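
And for the second point, a sketch of how the rest of atomic_llsc.h could be layered on a_ll() and a_sc(). Since I have no LL/SC hardware to run this on, the four primitives below are hypothetical stand-ins that emulate LL/SC with the GCC/Clang __atomic builtins (a_ll_seen is a made-up helper variable, and a compare-and-swap is not a true reservation, so it does not catch ABA). A real arch would replace them with inline assembly; only the generic part at the bottom is what the shared header would hold.

```c
/* --- hypothetical per-arch part: emulation for testing only --- */
static _Thread_local int a_ll_seen;	/* value observed by the last a_ll() */

static inline void a_presync(void)  { __atomic_thread_fence(__ATOMIC_SEQ_CST); }
static inline void a_postsync(void) { __atomic_thread_fence(__ATOMIC_SEQ_CST); }

static inline int a_ll(volatile int *p)
{
	a_ll_seen = __atomic_load_n(p, __ATOMIC_RELAXED);
	return a_ll_seen;
}

/* Returns zero iff the store did not happen. */
static inline int a_sc(volatile int *p, int v)
{
	int expected = a_ll_seen;
	return __atomic_compare_exchange_n(p, &expected, v, 0,
		__ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

/* --- generic part: what atomic_llsc.h itself could provide --- */
static inline int a_cas(volatile int *p, int t, int s)
{
	int v;
	a_presync();
	do v = a_ll(p);
	while (v == t && !a_sc(p, s));
	a_postsync();
	return v;
}

static inline int a_swap(volatile int *p, int v)
{
	int old;
	a_presync();
	do old = a_ll(p);
	while (!a_sc(p, v));
	a_postsync();
	return old;
}

static inline int a_fetch_add(volatile int *p, int v)
{
	int old;
	a_presync();
	do old = a_ll(p);
	while (!a_sc(p, (int)((unsigned)old + (unsigned)v)));
	a_postsync();
	return old;
}

static inline void a_and(volatile int *p, int v)
{
	int old;
	a_presync();
	do old = a_ll(p);
	while (!a_sc(p, old & v));
	a_postsync();
}

static inline void a_or(volatile int *p, int v)
{
	int old;
	a_presync();
	do old = a_ll(p);
	while (!a_sc(p, old | v));
	a_postsync();
}

static inline void a_inc(volatile int *p) { a_fetch_add(p, 1); }
static inline void a_dec(volatile int *p) { a_fetch_add(p, -1); }
```

Every operation has the same shape: barrier, LL, compute the new value, SC, retry on failure, barrier. That is the duplication the shared header would remove from the per-arch files.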