References: <20200121185215.5958-1-armccurdy@gmail.com> <20200625215041.GT6430@brightrain.aerifal.cx>
In-Reply-To: <20200625215041.GT6430@brightrain.aerifal.cx>
From: Andre McCurdy
Date: Thu, 25 Jun 2020 15:11:05 -0700
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH v2] Add big-endian support to ARM assembler memcpy

On Thu, Jun 25, 2020 at 3:06 PM Rich Felker wrote:
>
> On Tue, Jan 21, 2020 at 10:52:15AM -0800, Andre McCurdy wrote:
> > Allow the existing ARM assembler memcpy implementation to be used for
> > both big and little endian targets.
> > ---
> >
> > Exactly the same changes as before but rebased to account for
> > whitespace changes in the preceding patch to add Thumb2 support.
> >
> >  COPYRIGHT                                |   2 +-
> >  src/string/arm/{memcpy_le.S => memcpy.S} | 101 ++++++++++++++++++++++-
> >  src/string/arm/memcpy.c                  |   3 -
> >  3 files changed, 98 insertions(+), 8 deletions(-)
> >  rename src/string/arm/{memcpy_le.S => memcpy.S} (82%)
> >  delete mode 100644 src/string/arm/memcpy.c
> >
> > diff --git a/COPYRIGHT b/COPYRIGHT
> > index e6472371..d3edc2a2 100644
> > --- a/COPYRIGHT
> > +++ b/COPYRIGHT
> > @@ -127,7 +127,7 @@ Copyright © 2017-2018 Arm Limited
> >  and labelled as such in comments in the individual source files. All
> >  have been licensed under extremely permissive terms.
> >
> > -The ARM memcpy code (src/string/arm/memcpy_el.S) is Copyright © 2008
> > +The ARM memcpy code (src/string/arm/memcpy.S) is Copyright © 2008
> >  The Android Open Source Project and is licensed under a two-clause BSD
> >  license. It was taken from Bionic libc, used on Android.
> >
> > diff --git a/src/string/arm/memcpy_le.S b/src/string/arm/memcpy.S
> > similarity index 82%
> > rename from src/string/arm/memcpy_le.S
> > rename to src/string/arm/memcpy.S
> > index 7b35d305..869e3448 100644
> > --- a/src/string/arm/memcpy_le.S
> > +++ b/src/string/arm/memcpy.S
> > @@ -1,5 +1,3 @@
> > -#if !__ARMEB__
> > -
> >  /*
> >   * Copyright (C) 2008 The Android Open Source Project
> >   * All rights reserved.
> > @@ -42,7 +40,7 @@
> >   * code safely callable from thumb mode, adjusting the return
> >   * instructions to be compatible with pre-thumb ARM cpus, removal of
> >   * prefetch code that is not compatible with older cpus and support for
> > - * building as thumb 2.
> > + * building as thumb 2 and big-endian.
> >   */
> >
> >  .syntax unified
> > @@ -227,24 +225,45 @@ non_congruent:
> >          * becomes aligned to 32 bits (r5 = nb of words to copy for alignment)
> >          */
> >         movs    r5, r5, lsl #31
> > +
> > +#if __ARMEB__
> > +       movmi   r3, r3, ror #24
> > +       strbmi  r3, [r0], #1
> > +       movcs   r3, r3, ror #24
> > +       strbcs  r3, [r0], #1
> > +       movcs   r3, r3, ror #24
> > +       strbcs  r3, [r0], #1
> > +#else
> >         strbmi  r3, [r0], #1
> >         movmi   r3, r3, lsr #8
> >         strbcs  r3, [r0], #1
> >         movcs   r3, r3, lsr #8
> >         strbcs  r3, [r0], #1
> >         movcs   r3, r3, lsr #8
> > +#endif
> >
> >         cmp     r2, #4
> >         blo     partial_word_tail
> >
> > +#if __ARMEB__
> > +       mov     r3, r3, lsr r12
> > +       mov     r3, r3, lsl r12
> > +#endif
> > +
> >         /* Align destination to 32 bytes (cache line boundary) */
> >  1:     tst     r0, #0x1c
> >         beq     2f
> >         ldr     r5, [r1], #4
> >         sub     r2, r2, #4
> > +#if __ARMEB__
> > +       mov     r4, r5, lsr lr
> > +       orr     r4, r4, r3
> > +       mov     r3, r5, lsl r12
> > +#else
> >         mov     r4, r5, lsl lr
> >         orr     r4, r4, r3
> >         mov     r3, r5, lsr r12
> > +#endif
>
> Am I missing something or are both cases identical here? That would
> either indicate this is gratuitous or there's a bug here and they were
> intended not to be the same.

Difference here and below is lsr (logical shift right) -vs- lsl
(logical shift left).

> > [...]
> > @@ -350,9 +429,15 @@ less_than_thirtytwo:
> >
> >  1:     ldr     r5, [r1], #4
> >         sub     r2, r2, #4
> > +#if __ARMEB__
> > +       mov     r4, r5, lsr lr
> > +       orr     r4, r4, r3
> > +       mov     r3, r5, lsl r12
> > +#else
> >         mov     r4, r5, lsl lr
> >         orr     r4, r4, r3
> >         mov     r3, r5, lsr r12
> > +#endif
>
> And again here.
>
> Rich
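
To illustrate the lsr/lsl swap discussed above, here is a minimal C sketch (not taken from the patch) of how one destination word can be assembled from two consecutive source words when the source is misaligned by 1 to 3 bytes. All helper names (load_le32, merge_be, check_merge, ...) are hypothetical, and the mapping of r12 and lr to 8*off and 32 - 8*off is an assumption made for illustration. Words are built from bytes explicitly, so the result does not depend on the host's byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: model a 32-bit load/store in a fixed byte order. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static uint32_t load_be32(const unsigned char *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
           (uint32_t)p[2] << 8 | (uint32_t)p[3];
}

static void store_le32(unsigned char *p, uint32_t w)
{
    p[0] = (unsigned char)w;
    p[1] = (unsigned char)(w >> 8);
    p[2] = (unsigned char)(w >> 16);
    p[3] = (unsigned char)(w >> 24);
}

static void store_be32(unsigned char *p, uint32_t w)
{
    p[0] = (unsigned char)(w >> 24);
    p[1] = (unsigned char)(w >> 16);
    p[2] = (unsigned char)(w >> 8);
    p[3] = (unsigned char)w;
}

/* off = source misalignment in bytes, must be 1..3 (assumed mapping:
 * r12 = 8*off, lr = 32 - 8*off). */
static uint32_t merge_le(uint32_t prev, uint32_t next, unsigned off)
{
    unsigned r12 = 8 * off, lr = 32 - r12;
    /* Little-endian: leftover bytes of prev sit in the high bits. */
    return (prev >> r12) | (next << lr);
}

static uint32_t merge_be(uint32_t prev, uint32_t next, unsigned off)
{
    unsigned r12 = 8 * off, lr = 32 - r12;
    /* Big-endian: leftover bytes of prev sit in the low bits. */
    return (prev << r12) | (next >> lr);
}

/* Both variants must reproduce the 4 bytes starting at src + off. */
static void check_merge(void)
{
    const unsigned char src[8] = {0x10, 0x11, 0x12, 0x13,
                                  0x14, 0x15, 0x16, 0x17};
    unsigned char out[4];

    for (unsigned off = 1; off <= 3; off++) {
        store_le32(out, merge_le(load_le32(src), load_le32(src + 4), off));
        assert(memcmp(out, src + off, 4) == 0);

        store_be32(out, merge_be(load_be32(src), load_be32(src + 4), off));
        assert(memcmp(out, src + off, 4) == 0);
    }
}
```

Calling check_merge() exercises both variants for every misalignment. In this model, using the little-endian shift directions on big-endian words (or the reverse) yields the bytes in the wrong order, which is consistent with the two #if branches in the patch differing only in lsr versus lsl.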