Date: Thu, 25 Jun 2020 17:15:42 -0400
From: Rich Felker
To: musl@lists.openwall.com
Message-ID: <20200625211536.GS6430@brightrain.aerifal.cx>
References: <20200624204243.GL6430@brightrain.aerifal.cx>
 <20200625081504.GE2048759@port70.net>
 <20200625153936.GP6430@brightrain.aerifal.cx>
 <20200625173125.GF2048759@port70.net>
 <20200625205024.GR6430@brightrain.aerifal.cx>
In-Reply-To: <20200625205024.GR6430@brightrain.aerifal.cx>
Subject: Re: [musl] Release prep for 1.2.1, and afterwards

On Thu, Jun 25, 2020 at 04:50:24PM -0400, Rich Felker wrote:
> > > > but it would be nice if we could get the aarch64
> > > > memcpy patch in (the c implementation is really
> > > > slow and i've seen ppl compare aarch64 vs x86
> > > > server performance with some benchmark on alpine..)
> > >
> > > OK, I'll look again.
> >
> > thanks.
> >
> > (there are more aarch64 string functions in the
> > optimized-routines github repo but i think they
> > are not as important as memcpy/memmove/memset)
>
> I found the code. Can you comment on performance and whether memset is
> needed?
> (The C memset should be rather good already, moreso than
> memcpy.) Are the assumptions (v8-a, unaligned access) documented in
> memcpy.S valid for all presently supportable aarch64?

A couple comments for merging if we do, that aren't hard requirements
but preferences:

- I'd like to expand out the macros from ../asmdefs.h since that won't
  be available and they just hide things (I guess they're attractive
  for Apple/macho users or something but not relevant to musl) and
  since the symbol name lines need to be changed anyway to public
  name. "Local var name" macros are ok to leave; changing them would
  be too error-prone and they make the code more readable anyway.

- I'd prefer not to have memmove logic in memcpy since it makes it
  larger and implies that misuse of memcpy when you mean memmove is
  supported usage. I'd be happy with an approach like x86 though,
  defining an __memcpy_fwd alias and having memmove tail call to that
  unless len>128 and reverse is needed, or just leaving memmove.c.

Rich
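
For illustration, the memmove dispatch described in the second item
could look roughly like the portable C sketch below. This is not the
aarch64 asm and not musl's code. The names copy_fwd (standing in for
__memcpy_fwd) and sketch_memmove are hypothetical, and the bounce
buffer only models the assumption that the forward path reads all of a
copy of at most 128 bytes before writing any of it, which is what would
make the len>128 test sufficient.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the __memcpy_fwd entry point.
 * Assumption: for n <= 128 the real code loads everything before
 * storing anything, so any overlap is harmless. That is modeled here
 * with a temporary buffer. Larger copies go strictly forward. */
static void *copy_fwd(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;
	if (n <= 128) {
		unsigned char tmp[128];
		for (size_t i = 0; i < n; i++) tmp[i] = s[i];
		for (size_t i = 0; i < n; i++) d[i] = tmp[i];
		return dest;
	}
	for (size_t i = 0; i < n; i++) d[i] = s[i];
	return dest;
}

/* Hypothetical memmove: only large copies whose destination starts
 * inside [src, src+n) take the reverse path; everything else tail
 * calls the forward copy. */
static void *sketch_memmove(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;
	/* Unsigned wraparound makes this false whenever d is below s. */
	if (n > 128 && (uintptr_t)d - (uintptr_t)s < n) {
		while (n--) d[n] = s[n];
		return dest;
	}
	return copy_fwd(dest, src, n);
}
```

With this split, memcpy itself carries no overlap handling, and memmove
only pays for the direction check before jumping to the shared forward
code.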