From: Denys Vlasenko
To: Rich Felker, Denys Vlasenko
Cc: musl@lists.openwall.com
Reply-To: musl@lists.openwall.com
Subject: Re: [musl] [PATCH] x86/memset: avoid performing final store twice
Date: Mon, 12 Oct 2020 14:18:54 +0200
Message-ID: <08fd37cf-971c-2e6f-1b45-1442566b3416@redhat.com>
In-Reply-To: <20201011002514.GF17637@brightrain.aerifal.cx>
References: <20201003223209.10307-1-vda.linux@googlemail.com> <20201011002514.GF17637@brightrain.aerifal.cx>

On 10/11/20 2:25 AM, Rich Felker wrote:
> On Sun, Oct 04, 2020 at 12:32:09AM +0200, Denys Vlasenko wrote:
>> From: Denys Vlasenko
>>
>> For not very short NBYTES case:
>>
>> To handle the tail alignment, the code performs a potentially
>> misaligned word store to fill the final 8 bytes of the buffer.
>> This is done even if the buffer's end is aligned.
>>
>> Eventually code fills the rest of the buffer, which is a multiple
>> of 8 bytes now, with NBYTES / 8 aligned word stores.
>>
>> However, this means that if NBYTES *was* divisible by 8,
>> we store last word too, again.
>>
>> This patch decrements byte count before dividing it by 8,
>> making one less store in "NBYTES is divisible by 8" case,
>> and not changing anything in all other cases.
>> ...
>> --- a/src/string/x86_64/memset.s
>> +++ b/src/string/x86_64/memset.s
>> @@ -53,7 +53,7 @@ memset:
>>  2:	test $15,%edi
>>  	mov %rdi,%r8
>>  	mov %rax,-8(%rdi,%rdx)
>> -	mov %rdx,%rcx
>> +	lea -1(%rdx),%rcx
>>  	jnz 2f
>>
>>  1:	shr $3,%rcx
>> --
>> 2.25.0
>
> Does this have measurably better performance on a system you've tested
> it on?

I did not test performance; I predict the difference will hardly be detectable.
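
The store-count arithmetic from the patch description can be checked with a small C model. This is only a sketch, not musl code: the names loop_stores and fills_whole_buffer are made up for illustration, and it assumes the destination is already 16-byte aligned, so the head-alignment branch (the "jnz 2f" path) is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { WORD = 8, MAX_N = 512 };

/* Hypothetical helper: number of 8-byte stores issued by the word-store
   loop for an n-byte fill, assuming a 16-byte-aligned destination. */
static size_t loop_stores(size_t n, int patched)
{
    /* patched:   lea -1(%rdx),%rcx
       unpatched: mov %rdx,%rcx */
    size_t count = patched ? n - 1 : n;
    return count >> 3;                      /* shr $3,%rcx */
}

/* Hypothetical helper: replay the large-size path on a scratch buffer.
   One tail store covers the final 8 bytes, then loop_stores() word
   stores run from the start. Returns 1 if all n bytes were written. */
static int fills_whole_buffer(size_t n, int patched)
{
    unsigned char buf[MAX_N];
    size_t words, i;

    assert(n >= WORD && n <= MAX_N);
    memset(buf, 0, sizeof buf);

    memset(buf + n - WORD, 1, WORD);        /* mov %rax,-8(%rdi,%rdx) */

    words = loop_stores(n, patched);
    for (i = 0; i < words; i++)
        memset(buf + i * WORD, 1, WORD);    /* one word store per iteration */

    for (i = 0; i < n; i++)
        if (!buf[i])
            return 0;                       /* a byte was left unwritten */
    return 1;
}
```

In this model, loop_stores(128, 0) and loop_stores(128, 1) differ by one store, while for a length such as 130 both variants give the same count, and fills_whole_buffer() confirms that the tail store still covers whatever the shorter loop leaves out.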