From: Andre McCurdy
Date: Wed, 15 Jan 2020 10:49:03 -0800
To: musl@lists.openwall.com
Reply-To: musl@lists.openwall.com
References: <20190913184432.29753-1-armccurdy@gmail.com> <20200115163559.GI30412@brightrain.aerifal.cx>
In-Reply-To: <20200115163559.GI30412@brightrain.aerifal.cx>
Subject: Re: [musl] [PATCH 1/2] Add Thumb2 support to ARM assembler memcpy

On Wed, Jan 15, 2020 at 8:36 AM Rich Felker wrote:
> On Fri, Sep 13, 2019 at 11:44:31AM -0700, Andre McCurdy wrote:
> > For Thumb2 compatibility, replace two instances of a single
> > instruction "orr with a variable shift" with the two instruction
> > equivalent. Neither of the replacements are in a performance critical
> > loop.
> > ---
> >  src/string/arm/memcpy.c    |  2 +-
> >  src/string/arm/memcpy_le.S | 17 ++++++++++-------
> >  2 files changed, 11 insertions(+), 8 deletions(-)
> >
> > diff --git a/src/string/arm/memcpy.c b/src/string/arm/memcpy.c
> > index f703c9bd..041614f4 100644
> > --- a/src/string/arm/memcpy.c
> > +++ b/src/string/arm/memcpy.c
> > @@ -1,3 +1,3 @@
> > -#if __ARMEB__ || __thumb__
> > +#if __ARMEB__
> >  #include "../memcpy.c"
> >  #endif
> > diff --git a/src/string/arm/memcpy_le.S b/src/string/arm/memcpy_le.S
> > index 9cfbcb2a..64bc5f9e 100644
> > --- a/src/string/arm/memcpy_le.S
> > +++ b/src/string/arm/memcpy_le.S
> > @@ -1,4 +1,4 @@
> > -#if !__ARMEB__ && !__thumb__
> > +#if !__ARMEB__
> >
> >  /*
> >   * Copyright (C) 2008 The Android Open Source Project
> > @@ -40,8 +40,9 @@
> >   * This file has been modified from the original for use in musl libc.
> >   * The main changes are: addition of .type memcpy,%function to make the
> >   * code safely callable from thumb mode, adjusting the return
> > - * instructions to be compatible with pre-thumb ARM cpus, and removal
> > - * of prefetch code that is not compatible with older cpus.
> > + * instructions to be compatible with pre-thumb ARM cpus, removal of
> > + * prefetch code that is not compatible with older cpus and support for
> > + * building as thumb 2.
> >   */
> >
> >  .syntax unified
> > @@ -241,8 +242,9 @@ non_congruent:
> >      beq 2f
> >      ldr r5, [r1], #4
> >      sub r2, r2, #4
> > -    orr r4, r3, r5, lsl lr
> > -    mov r3, r5, lsr r12
> > +    mov r4, r5, lsl lr
> > +    orr r4, r4, r3
> > +    mov r3, r5, lsr r12
> >      str r4, [r0], #4
> >      cmp r2, #4
> >      bhs 1b
>
> This is outside of loops and not a hot path,
>
> > @@ -348,8 +350,9 @@ less_than_thirtytwo:
> >
> >  1:  ldr r5, [r1], #4
> >      sub r2, r2, #4
> > -    orr r4, r3, r5, lsl lr
> > -    mov r3, r5, lsr r12
> > +    mov r4, r5, lsl lr
> > +    orr r4, r4, r3
> > +    mov r3, r5, lsr r12
> >      str r4, [r0], #4
> >      cmp r2, #4
> >      bhs 1b
>
> This one is in a loop, but perhaps not terribly critical to
> performance.

Yes, it's in a loop, but I can confirm it's not a performance critical one.

> We could keep old version with #if !__thumb__ but I doubt
> it matters, and it looks like hardly anyone is using pre-thumb2 ARM
> anymore anyway; a show-stopping bug went uncaught for over a year in
> other things for v6.

I was meaning to ask about that after seeing your recent commit in
master. My primary target is pre-thumb2 armv6 and I hadn't noticed any
problems...

> One cosmetic fix I'd like to make when applying this is keeping the
> old gratuitously-ugly formatting just so the actual change isn't
> obscured by the formatting-only change on an adjacent line. I can
> handle that though.
>
> Rich
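
For readers following the thread, here is a minimal C sketch of why the two-instruction sequence in the patch computes the same value as the original single orr with a register-controlled shift. It is an illustration only, not code from musl: the function names, the sample values and the choice of shift amounts (8, 16 and 24 bits) are assumptions made for the example, and the parameters are merely named after the registers in the quoted hunks.

```c
#include <assert.h>
#include <stdint.h>

/* ARM-only form from the removed lines: orr r4, r3, r5, lsl lr */
static uint32_t merge_fused(uint32_t r3, uint32_t r5, unsigned lr)
{
    return r3 | (r5 << lr);
}

/* Thumb2-compatible form from the added lines:
 *   mov r4, r5, lsl lr
 *   orr r4, r4, r3
 */
static uint32_t merge_split(uint32_t r3, uint32_t r5, unsigned lr)
{
    uint32_t r4 = r5 << lr; /* shift the newly loaded word first */
    r4 |= r3;               /* then OR in the bytes carried over */
    return r4;
}

/* Hypothetical self-check: both forms agree for shifts of 8, 16 and 24 bits.
 * The sample values below are arbitrary. */
static void check_merge_forms(void)
{
    const uint32_t carry = 0x000000AAu;
    const uint32_t word = 0x11223344u;

    for (unsigned lr = 8; lr <= 24; lr += 8)
        assert(merge_fused(carry, word, lr) == merge_split(carry, word, lr));
}
```

Calling check_merge_forms() from a test program exercises both forms. The split version costs one extra instruction per merged word, which is the trade-off discussed above for the loop in less_than_thirtytwo.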