Date: Thu, 25 Jun 2020 18:24:05 -0400
From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH v2] Add big-endian support to ARM assembler memcpy

On Thu, Jun 25, 2020 at 03:11:05PM -0700, Andre McCurdy wrote:
> On Thu, Jun 25, 2020 at 3:06 PM Rich Felker wrote:
> >
> > On Tue, Jan 21, 2020 at 10:52:15AM -0800, Andre McCurdy wrote:
> > > Allow the existing ARM assembler memcpy implementation to be used for
> > > both big and little endian targets.
> > > ---
> > >
> > > Exactly the same changes as before but rebased to account for
> > > whitespace changes in the preceding patch to add Thumb2 support.
> > >
> > >  COPYRIGHT                                |   2 +-
> > >  src/string/arm/{memcpy_le.S => memcpy.S} | 101 ++++++++++++++++++++++-
> > >  src/string/arm/memcpy.c                  |   3 -
> > >  3 files changed, 98 insertions(+), 8 deletions(-)
> > >  rename src/string/arm/{memcpy_le.S => memcpy.S} (82%)
> > >  delete mode 100644 src/string/arm/memcpy.c
> > >
> > > diff --git a/COPYRIGHT b/COPYRIGHT
> > > index e6472371..d3edc2a2 100644
> > > --- a/COPYRIGHT
> > > +++ b/COPYRIGHT
> > > @@ -127,7 +127,7 @@ Copyright © 2017-2018 Arm Limited
> > >  and labelled as such in comments in the individual source files. All
> > >  have been licensed under extremely permissive terms.
> > >
> > > -The ARM memcpy code (src/string/arm/memcpy_el.S) is Copyright © 2008
> > > +The ARM memcpy code (src/string/arm/memcpy.S) is Copyright © 2008
> > >  The Android Open Source Project and is licensed under a two-clause BSD
> > >  license. It was taken from Bionic libc, used on Android.
> > >
> > > diff --git a/src/string/arm/memcpy_le.S b/src/string/arm/memcpy.S
> > > similarity index 82%
> > > rename from src/string/arm/memcpy_le.S
> > > rename to src/string/arm/memcpy.S
> > > index 7b35d305..869e3448 100644
> > > --- a/src/string/arm/memcpy_le.S
> > > +++ b/src/string/arm/memcpy.S
> > > @@ -1,5 +1,3 @@
> > > -#if !__ARMEB__
> > > -
> > >  /*
> > >   * Copyright (C) 2008 The Android Open Source Project
> > >   * All rights reserved.
> > > @@ -42,7 +40,7 @@
> > >   * code safely callable from thumb mode, adjusting the return
> > >   * instructions to be compatible with pre-thumb ARM cpus, removal of
> > >   * prefetch code that is not compatible with older cpus and support for
> > > - * building as thumb 2.
> > > + * building as thumb 2 and big-endian.
> > >   */
> > >
> > >  .syntax unified
> > > @@ -227,24 +225,45 @@ non_congruent:
> > >   * becomes aligned to 32 bits (r5 = nb of words to copy for alignment)
> > >   */
> > >  	movs	r5, r5, lsl #31
> > > +
> > > +#if __ARMEB__
> > > +	movmi	r3, r3, ror #24
> > > +	strbmi	r3, [r0], #1
> > > +	movcs	r3, r3, ror #24
> > > +	strbcs	r3, [r0], #1
> > > +	movcs	r3, r3, ror #24
> > > +	strbcs	r3, [r0], #1
> > > +#else
> > >  	strbmi	r3, [r0], #1
> > >  	movmi	r3, r3, lsr #8
> > >  	strbcs	r3, [r0], #1
> > >  	movcs	r3, r3, lsr #8
> > >  	strbcs	r3, [r0], #1
> > >  	movcs	r3, r3, lsr #8
> > > +#endif
> > >
> > >  	cmp	r2, #4
> > >  	blo	partial_word_tail
> > >
> > > +#if __ARMEB__
> > > +	mov	r3, r3, lsr r12
> > > +	mov	r3, r3, lsl r12
> > > +#endif
> > > +
> > >  	/* Align destination to 32 bytes (cache line boundary) */
> > > 1:	tst	r0, #0x1c
> > >  	beq	2f
> > >  	ldr	r5, [r1], #4
> > >  	sub	r2, r2, #4
> > > +#if __ARMEB__
> > > +	mov	r4, r5, lsr lr
> > > +	orr	r4, r4, r3
> > > +	mov	r3, r5, lsl r12
> > > +#else
> > >  	mov	r4, r5, lsl lr
> > >  	orr	r4, r4, r3
> > >  	mov	r3, r5, lsr r12
> > > +#endif
> >
> > Am I missing something or are both cases identical here? That would
> > either indicate this is gratuitous or there's a bug here and they were
> > intended not to be the same.
>
> Difference here and below is lsr (logical shift right) -vs- lsl
> (logical shift left).

Thanks!

Rich
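
For reference, the idiom in question is the classic shift queue for
copying between differently aligned buffers: each aligned word load
contributes some of its bytes to the current output word and carries
the rest forward to the next one, so the combine step needs one shift
for the fresh word and the complementary shift for the carried
remainder, and changing endianness swaps the direction of both shifts.
Below is a rough C rendering of the little-endian arm. It is a sketch
only, not musl code: the function and parameter names are invented
here, and the variable names merely mirror the roles of registers r3,
r5, r12 and lr in the assembly above, where r12 + lr == 32.

#include <stddef.h>
#include <stdint.h>

/* Sketch of the little-endian shift queue: dst is word-aligned and the
 * source byte stream begins byte_offset (1..3) bytes into the aligned
 * word at src. Like the assembly, it reads one word ahead of what it
 * has written out. byte_offset == 0 is excluded (shift by 32 is
 * undefined in C; the assembly takes the congruent path instead). */
static void copy_shifted_words_le(uint32_t *dst, const uint32_t *src,
                                  unsigned byte_offset, size_t nwords)
{
	unsigned r12 = 8 * byte_offset;  /* bits each new word adds here  */
	unsigned lr  = 32 - r12;         /* bits carried to the next word */
	uint32_t r3  = *src++ >> r12;    /* prime the carry with the      */
	                                 /* first word's usable bytes     */
	while (nwords--) {
		uint32_t r5 = *src++;        /* aligned load (ldr)        */
		*dst++ = (r5 << lr) | r3;    /* mov ... lsl lr; orr       */
		r3 = r5 >> r12;              /* mov ... lsr r12 (carry)   */
	}
}

On big-endian the loop body becomes *dst++ = (r5 >> lr) | r3 with
r3 = r5 << r12, i.e. exactly the lsr/lsl swap the patch makes.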