From: Alexander Monakov
To: musl@lists.openwall.com
Subject: Re: [PATCH] a_ctz_32: Instead of a subtraction by 31, use an xor
Date: Thu, 12 Dec 2019 18:10:31 +0300 (MSK)
In-Reply-To: <20191108023955.4402-1-rosenp@gmail.com>

On Thu, 7 Nov 2019, Rosen Penev wrote:

> This reduces produced assembly from a mov and sub to a single xor.
>
> Tested with godbolt: https://godbolt.org/z/Qz-Qr5
>
> Signed-off-by: Rosen Penev
> ---
>  src/internal/atomic.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/internal/atomic.h b/src/internal/atomic.h
> index f938879b..196b3fb1 100644
> --- a/src/internal/atomic.h
> +++ b/src/internal/atomic.h
> @@ -256,7 +256,7 @@ static inline void a_crash()
>  static inline int a_ctz_32(uint32_t x)
>  {
>  #ifdef a_clz_32
> -	return 31-a_clz_32(x&-x);
> +	return a_clz_32(x&-x)^31;

(since there was no other response...)

While this is certainly a nice improvement when considered in isolation,
looking at how the function is used in the rest of the library (in ffs,
ffsl, qsort) reveals that the returned value feeds into further arithmetic.
For example, ffs/ffsl compute a_ctz_l(i)+1, and the +1 can be folded together
with 31-... (31-y+1 simplifies to 32-y), but not with the xor.

Thus the original is preferable.

Alexander
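
To make the folding argument concrete, here is a minimal sketch in C. It is not musl's code: clz32 is a hypothetical portable stand-in for a_clz_32, and the ffs_* helpers are illustrative names that only mimic the a_ctz_l(i)+1 pattern mentioned above. It shows that both forms return the same value for nonzero input, and that only the subtraction form lets the caller's +1 merge into a single constant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a_clz_32 (not musl's implementation).
 * Counts leading zero bits; only valid for x != 0. */
static int clz32(uint32_t x)
{
	int n = 0;
	while (!(x & 0x80000000u)) {
		x <<= 1;
		n++;
	}
	return n;
}

/* Original form: subtraction from 31. Requires x != 0. */
static int ctz32_sub(uint32_t x) { return 31 - clz32(x & -x); }

/* Patched form: xor with 31. Equal to the above because clz32
 * returns a value in 0..31, and 31 - y == y ^ 31 on that range. */
static int ctz32_xor(uint32_t x) { return clz32(x & -x) ^ 31; }

/* ffs-style caller on top of the subtraction form: (31 - y) + 1. */
static int ffs_sub(uint32_t x) { return x ? ctz32_sub(x) + 1 : 0; }

/* The same computation after constant folding: 32 - y. */
static int ffs_folded(uint32_t x) { return x ? 32 - clz32(x & -x) : 0; }

/* ffs-style caller on top of the xor form: (y ^ 31) + 1.
 * The +1 cannot be merged into the xor constant. */
static int ffs_xor(uint32_t x) { return x ? ctz32_xor(x) + 1 : 0; }

/* Checks that all variants agree for every lowest-set-bit position. */
static void check_variants(void)
{
	for (int i = 0; i < 32; i++) {
		uint32_t x = 0xFFFFFFFFu << i; /* lowest set bit is bit i */
		assert(ctz32_sub(x) == i);
		assert(ctz32_xor(x) == i);
		assert(ffs_sub(x) == i + 1);
		assert(ffs_folded(x) == i + 1);
		assert(ffs_xor(x) == i + 1);
	}
}
```

Calling check_variants() from a test program confirms the variants are interchangeable in value; the difference is only in how many operations remain once the caller's arithmetic is taken into account.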