From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on inbox.vuxu.org X-Spam-Level: X-Spam-Status: No, score=-2.9 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RCVD_IN_DNSWL_MED, RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL autolearn=ham autolearn_force=no version=3.4.2 Received: from mother.openwall.net (mother.openwall.net [195.42.179.200]) by inbox.vuxu.org (OpenSMTPD) with SMTP id 4b4f1836 for ; Wed, 15 Jan 2020 20:20:34 +0000 (UTC) Received: (qmail 12278 invoked by uid 550); 15 Jan 2020 20:20:33 -0000 Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm Precedence: bulk List-Post: List-Help: List-Unsubscribe: List-Subscribe: List-ID: Reply-To: musl@lists.openwall.com Received: (qmail 12260 invoked from network); 15 Jan 2020 20:20:32 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=xil/7VSz8OsoTOCBrJ01o/3GMTvh9Vioy8tkK2uGmrA=; b=ms9gPIDFaitJzLBP2t2duD/hmEVuGcZsk8QJGLJK1uD5RBYLMDNuilMv77h2T/4Why fXDHTjQUYBaDpqhXCttCUPUsNUu+CHQ8knQUaYLq982hQhykSDFlp0isKX7NjKXZebLF /hGSYnYwaIFFNSu22Rtq6RjbVbYtz6/GR3g/QIVvGEJnvHBAMlr+Ulzytt8OtUHxwiqw DAw4eg+sDlzAoAwDaUCVkDF5lBXGEmu6NRmRlwkd6CAvW0teVerxk+2ORLNkMUFf3tlH vzRo4AZghSL4FAhcxJ2O9CGffd2SRfiEIZmdxiJrIF7MVdnN2upZuRPLJqaDJgaAAmNY U12g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=xil/7VSz8OsoTOCBrJ01o/3GMTvh9Vioy8tkK2uGmrA=; b=Yfr2QTbLCsoBOQes9rvurxs6MOmL4Q5bQakEtuITJ9pJvVkKpK4JIj/qKUetefQI4S /+0Njrnlg5bqAM4UZGBF5m1wkLK5HjVOhyd6wBdvZIJ+UVA2ryFHyEW3zf/mqaq9AcKJ yZyY09q4pDQIx69L+tqkGUYu7civBchR3lU3fDRRHpMFftW3sUhCnurJXZV2F/vT+Hwg yGDj1Q/H7tK8q8st9G7FSlFX7fsZ41MnxXK75rJkNnnsEHj2htAtpkWsLTRDcRg0FyRn V2QR4+uYnqZOhZ5+Lu77m4u+9jQrcoC6zOxJKEcSykDQNMpYiq5cLX8K8WWvqmAWePi9 Iadg== X-Gm-Message-State: APjAAAW0biT17NSssz2mEdgVbFE6sgwQ47qlNXWAbu46cR6jdsOOy/A8 Yep2P/4heewasZ258j+RIrohOGlde9ZDz4WfpINoAyZN X-Google-Smtp-Source: APXvYqzrnIS8r4/oHnIvzpw1NlTZ8cn1EmuKyvpuouztduMtSWzMClgsu/rpK2K1RyqZDvQHbaAtYYfNU150iLK9ZCg= X-Received: by 2002:a05:6102:a10:: with SMTP id t16mr1969888vsa.130.1579119620278; Wed, 15 Jan 2020 12:20:20 -0800 (PST) MIME-Version: 1.0 References: <20190913184432.29753-1-armccurdy@gmail.com> <20200115163559.GI30412@brightrain.aerifal.cx> <20200115192403.GK30412@brightrain.aerifal.cx> In-Reply-To: <20200115192403.GK30412@brightrain.aerifal.cx> From: Andre McCurdy Date: Wed, 15 Jan 2020 12:20:08 -0800 Message-ID: To: musl@lists.openwall.com Content-Type: text/plain; charset="UTF-8" Subject: Re: [musl] [PATCH 1/2] Add Thumb2 support to ARM assembler memcpy On Wed, Jan 15, 2020 at 11:24 AM Rich Felker wrote: > On Wed, Jan 15, 2020 at 10:49:03AM -0800, Andre McCurdy wrote: > > On Wed, Jan 15, 2020 at 8:36 AM Rich Felker wrote: > > > On Fri, Sep 13, 2019 at 11:44:31AM -0700, Andre McCurdy wrote: > > > > For Thumb2 compatibility, replace two instances of a single > > > > instruction "orr with a variable shift" with the two instruction > > > > equivalent. Neither of the replacements are in a performance critical > > > > loop. > > > > --- > > > > src/string/arm/memcpy.c | 2 +- > > > > src/string/arm/memcpy_le.S | 17 ++++++++++------- > > > > 2 files changed, 11 insertions(+), 8 deletions(-) > > > > > > > > diff --git a/src/string/arm/memcpy.c b/src/string/arm/memcpy.c > > > > index f703c9bd..041614f4 100644 > > > > --- a/src/string/arm/memcpy.c > > > > +++ b/src/string/arm/memcpy.c > > > > @@ -1,3 +1,3 @@ > > > > -#if __ARMEB__ || __thumb__ > > > > +#if __ARMEB__ > > > > #include "../memcpy.c" > > > > #endif > > > > diff --git a/src/string/arm/memcpy_le.S b/src/string/arm/memcpy_le.S > > > > index 9cfbcb2a..64bc5f9e 100644 > > > > --- a/src/string/arm/memcpy_le.S > > > > +++ b/src/string/arm/memcpy_le.S > > > > @@ -1,4 +1,4 @@ > > > > -#if !__ARMEB__ && !__thumb__ > > > > +#if !__ARMEB__ > > > > > > > > /* > > > > * Copyright (C) 2008 The Android Open Source Project > > > > @@ -40,8 +40,9 @@ > > > > * This file has been modified from the original for use in musl libc. > > > > * The main changes are: addition of .type memcpy,%function to make the > > > > * code safely callable from thumb mode, adjusting the return > > > > - * instructions to be compatible with pre-thumb ARM cpus, and removal > > > > - * of prefetch code that is not compatible with older cpus. > > > > + * instructions to be compatible with pre-thumb ARM cpus, removal of > > > > + * prefetch code that is not compatible with older cpus and support for > > > > + * building as thumb 2. > > > > */ > > > > > > > > .syntax unified > > > > @@ -241,8 +242,9 @@ non_congruent: > > > > beq 2f > > > > ldr r5, [r1], #4 > > > > sub r2, r2, #4 > > > > - orr r4, r3, r5, lsl lr > > > > - mov r3, r5, lsr r12 > > > > + mov r4, r5, lsl lr > > > > + orr r4, r4, r3 > > > > + mov r3, r5, lsr r12 > > > > str r4, [r0], #4 > > > > cmp r2, #4 > > > > bhs 1b > > > > > > This is outside of loops and not a hot path, > > > > > > > @@ -348,8 +350,9 @@ less_than_thirtytwo: > > > > > > > > 1: ldr r5, [r1], #4 > > > > sub r2, r2, #4 > > > > - orr r4, r3, r5, lsl lr > > > > - mov r3, r5, lsr r12 > > > > + mov r4, r5, lsl lr > > > > + orr r4, r4, r3 > > > > + mov r3, r5, lsr r12 > > > > str r4, [r0], #4 > > > > cmp r2, #4 > > > > bhs 1b > > > > > > This one is in a loop, but perhaps not terribly critical to > > > performance. > > > > Yes, it's in a loop, but I can confirm it's not a performance critical one. > > Thanks. > > > > We could keep old version with #if !__thumb__ but I doubt > > > it matters, and it looks like hardly anyone is using pre-thumb2 ARM > > > anymore anyway; a show-stopping bug went uncaught for over a year in > > > other things for v6. > > > > I was meaning to ask about that after seeing your recent commit in > > master. My primary target is pre-thumb2 armv6 and I hadn't noticed any > > problems... > > I wonder if there was some magical mechanism by which the anticipated > crash failed to trigger. It certainly triggered in the other affected > arch, sh2, though. If you happen to look at it and find out what was > going on, let us know on the list. For ARM, testing libc.auxv is guarded by a test on __hwcap, so if __hwcap is also being initialised after __set_thread_area() then __set_thread_area() will never access libc.auxv ?