From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/10498
Path: news.gmane.org!.POSTED!not-for-mail
From: Szabolcs Nagy <nsz@port70.net>
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: memchr() performance
Date: Sun, 18 Sep 2016 22:40:36 +0200
Message-ID: <20160918204036.GZ1280@port70.net>
References: <20160918185422.GA2577@dell12.lru.li>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: blaine.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: blaine.gmane.org 1474231260 25605 195.159.176.226 (18 Sep 2016 20:41:00 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Sun, 18 Sep 2016 20:41:00 +0000 (UTC)
User-Agent: Mutt/1.6.0 (2016-04-01)
Cc: musl@lists.openwall.com
To: Georg Sauthoff <mail@georg.so>
Original-X-From: musl-return-10512-gllmg-musl=m.gmane.org@lists.openwall.com Sun Sep 18 22:40:54 2016
Return-path: <musl-return-10512-gllmg-musl=m.gmane.org@lists.openwall.com>
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
	by blaine.gmane.org with smtp (Exim 4.84_2)
	(envelope-from <musl-return-10512-gllmg-musl=m.gmane.org@lists.openwall.com>)
	id 1blitN-0005kq-HV
	for gllmg-musl@m.gmane.org; Sun, 18 Sep 2016 22:40:49 +0200
Original-Received: (qmail 22004 invoked by uid 550); 18 Sep 2016 20:40:48 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Original-Received: (qmail 21979 invoked from network); 18 Sep 2016 20:40:48 -0000
Mail-Followup-To: Georg Sauthoff <mail@georg.so>, musl@lists.openwall.com
Content-Disposition: inline
In-Reply-To: <20160918185422.GA2577@dell12.lru.li>
Xref: news.gmane.org gmane.linux.lib.musl.general:10498
Archived-At: <http://permalink.gmane.org/gmane.linux.lib.musl.general/10498>

* Georg Sauthoff <mail@georg.so> [2016-09-18 20:54:22 +0200]:
> 
> In general, musl's memchr() implementation doesn't perform better than a
> simple unrolled loop (as used in libstdc++ std::find()) - and that is
> consistent over different CPU generations and architectures.
> 

memchr in musl was never updated (unchanged for >5 years), so it
probably should be. last time this came up the position was:

"In the particular case of strlen, the naive unrolled strlen with no
OOB access is actually optimal on most or all 32-bit archs, better
than what we have now. I suspect the same is true for strchr and other
related functions."
http://www.openwall.com/lists/musl/2016/01/05/5

but we did not have benchmark numbers at the time. note that
this benchmark does not measure the cost of the extra branch
prediction slots that the unrolled version uses.
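
for reference, a naive unrolled loop of the kind discussed here could
look roughly like the sketch below. it is an illustration only, not the
code in musl or libstdc++, and the name memchr_unrolled is made up.
every compare in the unrolled body is a separate conditional branch,
which is where the extra branch prediction state comes from.

```c
#include <stddef.h>

/* Hypothetical name: byte-at-a-time search, unrolled by four.
   Only bytes inside [s, s+n) are read, so there is no OOB access. */
void *memchr_unrolled(const void *s, int c, size_t n)
{
	const unsigned char *p = s;
	unsigned char ch = (unsigned char)c;

	/* Unrolled body: four independent compares per iteration. */
	for (; n >= 4; p += 4, n -= 4) {
		if (p[0] == ch) return (void *)p;
		if (p[1] == ch) return (void *)(p + 1);
		if (p[2] == ch) return (void *)(p + 2);
		if (p[3] == ch) return (void *)(p + 3);
	}
	/* Tail: the remaining 0..3 bytes. */
	for (; n; p++, n--)
		if (*p == ch) return (void *)p;
	return NULL;
}
```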

> On recent Intel CPUs it is even slower than a naive implementation:
> 
> https://gms.tf/stdfind-and-memchr-optimizations.html#measurements
> https://gms.tf/sparc-and-ppc-find-benchmark-results.html
> 
> Of course, on x86, other implementations that use SIMD instructions
> perform even better.
> 

yes simd is expected to be faster.

but that needs asm, which is expensive to maintain (there is no
portable simd language extension for c, and there is the aliasing
issue: the reinterpret_cast in your code is formally ub).
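
to illustrate the aliasing point: one way to read several bytes at once
in plain c without a pointer cast is to copy them into a word with
memcpy. the sketch below assumes this approach. memchr_word and
has_zero_byte are made-up names, and it is not what musl does. the
word loop only decides whether the current 8 bytes contain a match; the
byte loop then finds the exact position.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/* Nonzero iff at least one byte of x is zero. */
static int has_zero_byte(uint64_t x)
{
	return ((x - ONES) & ~x & HIGHS) != 0;
}

/* Hypothetical name: word-at-a-time search without aliasing casts. */
void *memchr_word(const void *s, int c, size_t n)
{
	const unsigned char *p = s;
	unsigned char ch = (unsigned char)c;
	uint64_t k = ONES * ch;   /* ch replicated into every byte */
	uint64_t w;

	for (; n >= sizeof w; p += sizeof w, n -= sizeof w) {
		/* memcpy is a defined way to load the bytes,
		   unlike dereferencing (const uint64_t *)p. */
		memcpy(&w, p, sizeof w);
		if (has_zero_byte(w ^ k))
			break;    /* a match is somewhere in this word */
	}
	/* Locate the match, or scan the tail, byte by byte. */
	for (; n; p++, n--)
		if (*p == ch) return (void *)p;
	return NULL;
}
```

optimizing compilers usually turn a fixed-size memcpy like this into a
single load, but that is up to the compiler and is not guaranteed.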

> Best regards
> Georg