From: Georg Sauthoff
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: memchr() performance
Date: Mon, 19 Sep 2016 15:29:53 +0200
Message-ID: <20160919132953.GA8375@dell12.lru.li>
References: <20160918185422.GA2577@dell12.lru.li> <20160918204036.GZ1280@port70.net>
In-Reply-To: <20160918204036.GZ1280@port70.net>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Sun, Sep 18, 2016 at 10:40:36PM +0200, Szabolcs Nagy wrote:

Hello,

> * Georg Sauthoff [2016-09-18 20:54:22 +0200]:
[..]
> > On recent Intel CPUs it is even slower than a naive implementation:
> >
> > https://gms.tf/stdfind-and-memchr-optimizations.html#measurements
> > https://gms.tf/sparc-and-ppc-find-benchmark-results.html
> >
> > Of course, on x86, other implementations that use SIMD instructions
> > perform even better.

> yes simd is expected to be faster.

> but that needs asm which is expensive to maintain (there is no
> portable simd language extension for c and there is the aliasing
> issue: the reinterpret_cast in your code is formally ub).

You mean because the vector-word pointer returned by reinterpret_cast is
used to access vector words in memory that was passed via a char pointer,
and this is not covered by the ISO C++ strict aliasing rules? Yes.

Sure, UB means that anything can happen, but this case should be OK with
GCC if the function is compiled in isolation in its own translation unit.
I mean, there isn't much possibility for reordering due to the
application of the strict aliasing rules that would yield a different
result. There are no aliased write accesses.

Btw, the current musl memchr() implementation has similar aliased
accesses: there, unsigned characters are aliased via a size_t pointer.

Best regards
Georg
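
For illustration, here is a minimal sketch of a word-at-a-time scan of the
kind discussed above. It is not musl's actual memchr() and not the code from
the linked articles; the name sketch_memchr and the macros are hypothetical.
Instead of reading the buffer through a size_t pointer, it loads each word
with memcpy(), which is one way to get the same access pattern without
relying on how the compiler treats the aliased read. Whether a given
compiler turns the memcpy() into a single load is an assumption to check.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper macros for this sketch only. */
#define ONES  ((size_t)-1 / UCHAR_MAX)        /* 0x0101...01 */
#define HIGHS (ONES * (UCHAR_MAX / 2 + 1))    /* 0x8080...80 */
/* Non-zero if and only if some byte of x is zero. */
#define HASZERO(x) (((x) - ONES) & ~(x) & HIGHS)

/* Sketch of a word-at-a-time memchr; illustrative, not a library function. */
static void *sketch_memchr(const void *src, int c, size_t n)
{
    const unsigned char *s = src;
    unsigned char uc = (unsigned char)c;
    size_t k = ONES * uc;            /* the byte c replicated in every lane */

    /* Skip whole words that cannot contain c. */
    while (n >= sizeof(size_t)) {
        size_t w;
        memcpy(&w, s, sizeof w);     /* defined behaviour: no aliased read */
        if (HASZERO(w ^ k))
            break;                   /* some byte in this word equals c */
        s += sizeof w;
        n -= sizeof w;
    }

    /* Locate the exact byte in the matching word or in the tail. */
    for (; n; s++, n--)
        if (*s == uc)
            return (void *)s;
    return NULL;
}
```

The word loop only decides whether a word can be skipped; the byte loop at
the end finds the exact position, so the result does not depend on byte
order.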