From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/3978
Path: news.gmane.org!not-for-mail
From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Optimized C memset [v2]
Date: Tue, 27 Aug 2013 12:22:06 -0400
Message-ID: <20130827162205.GU20515@brightrain.aerifal.cx>
References: <20130827083020.GA4503@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="2oS5YaxWCcQjTEyO"
X-Trace: ger.gmane.org 1377620536 31763 80.91.229.3 (27 Aug 2013 16:22:16 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Tue, 27 Aug 2013 16:22:16 +0000 (UTC)
To: musl@lists.openwall.com
Original-X-From: musl-return-3982-gllmg-musl=m.gmane.org@lists.openwall.com Tue Aug 27 18:22:19 2013
Return-path:
Envelope-to: gllmg-musl@plane.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
 by plane.gmane.org with smtp (Exim 4.69)
 (envelope-from )
 id 1VEM27-0000Ah-1D
 for gllmg-musl@plane.gmane.org; Tue, 27 Aug 2013 18:22:19 +0200
Original-Received: (qmail 25986 invoked by uid 550); 27 Aug 2013 16:22:18 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
Original-Received: (qmail 25978 invoked from network); 27 Aug 2013 16:22:18 -0000
Content-Disposition: inline
In-Reply-To: <20130827083020.GA4503@brightrain.aerifal.cx>
User-Agent: Mutt/1.5.21 (2010-09-15)
Xref: news.gmane.org gmane.linux.lib.musl.general:3978
Archived-At:

--2oS5YaxWCcQjTEyO
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Here's version 2 (filename version 6, in honor of glibc ;) of the
memset code. I fixed a bug in the logic for coverage of the tail (the
part past what's covered by the loop) for some values of n and
alignments, and cleaned up the __GNUC__ usage a bit to use less
#ifdeffery.
The remaining test at the top for the __GNUC__ version is ugly, I
admit, and should possibly just be removed and replaced by a configure
check to add -D__may_alias__= to the CFLAGS if the compiler defines
__GNUC__ but does not recognize __attribute__((__may_alias__)) --
opinions on this?

Rich

--2oS5YaxWCcQjTEyO
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="memset6.c"

#include <string.h>
#include <stdint.h>

#if defined(__GNUC__) && 100*__GNUC__+__GNUC_MINOR__ < 302
#define __may_alias__
#endif

void *memset(void *dest, int c, size_t n)
{
	unsigned char *s = dest;
	size_t k;

	/* Fill the head and tail with byte stores; sizes up to 8
	 * are finished here. */
	if (!n) return dest;
	s[0] = s[n-1] = c;
	if (n <= 2) return dest;
	s[1] = s[n-2] = c;
	s[2] = s[n-3] = c;
	if (n <= 6) return dest;
	s[3] = s[n-4] = c;
	if (n <= 8) return dest;

	/* Advance s to a 4-byte boundary and round n down to a
	 * multiple of 4; the byte stores above already cover
	 * whatever is skipped at either end. */
	k = -(uintptr_t)s & 3;
	s += k;
	n -= k;
	n &= -4;

#ifdef __GNUC__
	typedef uint32_t __attribute__((__may_alias__)) u32;
	typedef uint64_t __attribute__((__may_alias__)) u64;

	/* Replicate the fill byte into all four bytes of a word. */
	u32 c32 = ((u32)-1)/255 * (unsigned char)c;

	*(u32 *)(s+0) = c32;
	*(u32 *)(s+n-4) = c32;
	if (n <= 8) return dest;
	*(u32 *)(s+4) = c32;
	*(u32 *)(s+8) = c32;
	*(u32 *)(s+n-12) = c32;
	*(u32 *)(s+n-8) = c32;
	if (n <= 24) return dest;
	*(u32 *)(s+12) = c32;
	*(u32 *)(s+16) = c32;
	*(u32 *)(s+20) = c32;
	*(u32 *)(s+24) = c32;
	*(u32 *)(s+n-28) = c32;
	*(u32 *)(s+n-24) = c32;
	*(u32 *)(s+n-20) = c32;
	*(u32 *)(s+n-16) = c32;

	/* Skip the 24 or 28 head bytes already written so that s
	 * is 8-byte aligned for the 64-bit stores. */
	k = 24 + ((uintptr_t)s & 4);
	s += k;
	n -= k;

	/* The loop leaves at most 28 bytes, all of which the tail
	 * stores above have already written. */
	u64 c64 = c32 | ((u64)c32 << 32);
	for (; n >= 32; n-=32, s+=32) {
		*(u64 *)(s+0) = c64;
		*(u64 *)(s+8) = c64;
		*(u64 *)(s+16) = c64;
		*(u64 *)(s+24) = c64;
	}
#else
	/* Portable fallback: plain byte loop. */
	for (; n; n--, s++) *s = c;
#endif

	return dest;
}

--2oS5YaxWCcQjTEyO--
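
Since the fix in this version concerns tail coverage for particular
combinations of n and alignment, a brute-force check over small sizes
and starting offsets is one way such a bug might be caught. The
following is only a rough sketch under stated assumptions, not part of
the posted patch: the names fill_fn, check_fill, broken_fill and
broadcast_byte, the buffer sizes, and the fill values are all
hypothetical. It fills a guarded buffer through a function pointer and
verifies that exactly the requested bytes changed. broadcast_byte just
restates the byte-replication expression from the attachment so it can
be checked on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: signature shared by memset and any candidate replacement. */
typedef void *(*fill_fn)(void *, int, size_t);

/* Hypothetical test limits, chosen to cover the unrolled head/tail
 * stores and a few iterations of the 32-byte loop. */
enum { MAX_N = 300, MAX_OFF = 16, GUARD = 16 };

/* Same expression as in the attachment: 0xFFFFFFFF / 255 == 0x01010101,
 * so the product copies the byte into all four byte positions. */
uint32_t broadcast_byte(int c)
{
	return ((uint32_t)-1) / 255 * (unsigned char)c;
}

/* Returns the number of (offset, length) pairs for which fill()
 * wrote outside the range, missed a byte, or returned the wrong pointer. */
int check_fill(fill_fn fill)
{
	static unsigned char buf[GUARD + MAX_OFF + MAX_N + GUARD];
	int failures = 0;

	assert(fill != NULL);
	for (size_t off = 0; off < MAX_OFF; off++) {
		for (size_t n = 0; n <= MAX_N; n++) {
			size_t start = GUARD + off;
			int bad = 0;

			/* Reset with a plain loop so the check does not
			 * depend on the function under test. */
			for (size_t i = 0; i < sizeof buf; i++) buf[i] = 0xAA;

			if (fill(buf + start, 0x5C, n) != buf + start) bad = 1;

			for (size_t i = 0; i < sizeof buf; i++) {
				int inside = i >= start && i < start + n;
				if (buf[i] != (inside ? 0x5C : 0xAA)) bad = 1;
			}
			failures += bad;
		}
	}
	return failures;
}

/* Hypothetical, deliberately wrong fill: drops the last byte for n > 8.
 * Included only to show that the harness notices a tail-coverage bug. */
void *broken_fill(void *dest, int c, size_t n)
{
	unsigned char *s = dest;
	size_t limit = n > 8 ? n - 1 : n;

	for (size_t i = 0; i < limit; i++) s[i] = (unsigned char)c;
	return dest;
}
```

Calling check_fill(memset) from a small main() should return 0 for a
correct implementation, while check_fill(broken_fill) returns a nonzero
count. The offsets 0 through 15 exercise every 4-byte and 8-byte
alignment case that the attached code distinguishes.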