Date: Wed, 15 Jan 2020 11:35:59 -0500
From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH 1/2] Add Thumb2 support to ARM assembler memcpy
Message-ID: <20200115163559.GI30412@brightrain.aerifal.cx>
In-Reply-To: <20190913184432.29753-1-armccurdy@gmail.com>

On Fri, Sep 13, 2019 at 11:44:31AM -0700, Andre McCurdy wrote:
> For Thumb2 compatibility, replace two instances of a single
> instruction "orr with a variable shift" with the two instruction
> equivalent. Neither of the replacements are in a performance critical
> loop.
> ---
>  src/string/arm/memcpy.c    |  2 +-
>  src/string/arm/memcpy_le.S | 17 ++++++++++-------
>  2 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/src/string/arm/memcpy.c b/src/string/arm/memcpy.c
> index f703c9bd..041614f4 100644
> --- a/src/string/arm/memcpy.c
> +++ b/src/string/arm/memcpy.c
> @@ -1,3 +1,3 @@
> -#if __ARMEB__ || __thumb__
> +#if __ARMEB__
>  #include "../memcpy.c"
>  #endif
> diff --git a/src/string/arm/memcpy_le.S b/src/string/arm/memcpy_le.S
> index 9cfbcb2a..64bc5f9e 100644
> --- a/src/string/arm/memcpy_le.S
> +++ b/src/string/arm/memcpy_le.S
> @@ -1,4 +1,4 @@
> -#if !__ARMEB__ && !__thumb__
> +#if !__ARMEB__
>
>  /*
>   * Copyright (C) 2008 The Android Open Source Project
> @@ -40,8 +40,9 @@
>   * This file has been modified from the original for use in musl libc.
>   * The main changes are: addition of .type memcpy,%function to make the
>   * code safely callable from thumb mode, adjusting the return
> - * instructions to be compatible with pre-thumb ARM cpus, and removal
> - * of prefetch code that is not compatible with older cpus.
> + * instructions to be compatible with pre-thumb ARM cpus, removal of
> + * prefetch code that is not compatible with older cpus and support for
> + * building as thumb 2.
>   */
>
>  .syntax unified
> @@ -241,8 +242,9 @@ non_congruent:
>  beq 2f
>  ldr r5, [r1], #4
>  sub r2, r2, #4
> - orr r4, r3, r5, lsl lr
> - mov r3, r5, lsr r12
> + mov r4, r5, lsl lr
> + orr r4, r4, r3
> + mov r3, r5, lsr r12
>  str r4, [r0], #4
>  cmp r2, #4
>  bhs 1b

This is outside of loops and not a hot path,

> @@ -348,8 +350,9 @@ less_than_thirtytwo:
>
>  1: ldr r5, [r1], #4
>  sub r2, r2, #4
> - orr r4, r3, r5, lsl lr
> - mov r3, r5, lsr r12
> + mov r4, r5, lsl lr
> + orr r4, r4, r3
> + mov r3, r5, lsr r12
>  str r4, [r0], #4
>  cmp r2, #4
>  bhs 1b

This one is in a loop, but perhaps not terribly critical to performance.
We could keep the old version under #if !__thumb__, but I doubt it
matters, and it looks like hardly anyone is using pre-thumb2 ARM
anymore anyway; a show-stopping bug in other v6 code went uncaught
for over a year.

One cosmetic fix I'd like to make when applying this is keeping the
old gratuitously-ugly formatting, just so the actual change isn't
obscured by the formatting-only change on an adjacent line. I can
handle that though.

Rich
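
The substitution in the two quoted hunks can be checked with a small C model. This is only a sketch, not code from musl: the helper names (lsl_reg, orr_shifted, mov_then_orr, forms_agree) are hypothetical, and the model assumes the usual ARM rule for a register-specified LSL, where only the low byte of the shift register is used and a shift of 32 or more produces 0. It also assumes, as in the patch, that r4 is a pure destination and that neither instruction sets flags.

```c
#include <stdint.h>

/* Model of an ARM register-specified LSL (assumption: only the low
 * byte of the shift register counts, and amounts >= 32 yield 0). */
static uint32_t lsl_reg(uint32_t value, uint32_t shift_reg)
{
    uint32_t amount = shift_reg & 0xffu;
    return amount >= 32 ? 0 : value << amount;
}

/* Original single instruction: orr r4, r3, r5, lsl lr */
static uint32_t orr_shifted(uint32_t r3, uint32_t r5, uint32_t lr)
{
    return r3 | lsl_reg(r5, lr);
}

/* Replacement pair: mov r4, r5, lsl lr ; orr r4, r4, r3 */
static uint32_t mov_then_orr(uint32_t r3, uint32_t r5, uint32_t lr)
{
    uint32_t r4 = lsl_reg(r5, lr); /* mov r4, r5, lsl lr */
    r4 |= r3;                      /* orr r4, r4, r3 */
    return r4;
}

/* Returns 1 if both forms agree for every shift amount 0..255. */
static int forms_agree(uint32_t r3, uint32_t r5)
{
    for (uint32_t lr = 0; lr < 256; lr++)
        if (orr_shifted(r3, r5, lr) != mov_then_orr(r3, r5, lr))
            return 0;
    return 1;
}
```

Calling forms_agree() with a few register values exercises every shift amount the low byte of lr can hold. The two forms can only differ in instruction count, which is why the discussion above is about whether the extra instruction sits on a hot path.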