From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/10499
Path: news.gmane.org!.POSTED!not-for-mail
From: Rich Felker <dalias@libc.org>
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: memchr() performance
Date: Sun, 18 Sep 2016 16:40:30 -0400
Message-ID: <20160918204030.GC15995@brightrain.aerifal.cx>
References: <20160918185422.GA2577@dell12.lru.li>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: blaine.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: blaine.gmane.org 1474231262 26677 195.159.176.226 (18 Sep 2016 20:41:02 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Sun, 18 Sep 2016 20:41:02 +0000 (UTC)
User-Agent: Mutt/1.5.21 (2010-09-15)
To: musl@lists.openwall.com
Original-X-From: musl-return-10511-gllmg-musl=m.gmane.org@lists.openwall.com Sun Sep 18 22:40:58 2016
Return-path: <musl-return-10511-gllmg-musl=m.gmane.org@lists.openwall.com>
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
	by blaine.gmane.org with smtp (Exim 4.84_2)
	(envelope-from <musl-return-10511-gllmg-musl=m.gmane.org@lists.openwall.com>)
	id 1blitH-0005Ln-Tf
	for gllmg-musl@m.gmane.org; Sun, 18 Sep 2016 22:40:44 +0200
Original-Received: (qmail 21600 invoked by uid 550); 18 Sep 2016 20:40:43 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Original-Received: (qmail 21582 invoked from network); 18 Sep 2016 20:40:43 -0000
Content-Disposition: inline
In-Reply-To: <20160918185422.GA2577@dell12.lru.li>
Original-Sender: Rich Felker <dalias@aerifal.cx>
Xref: news.gmane.org gmane.linux.lib.musl.general:10499
Archived-At: <http://permalink.gmane.org/gmane.linux.lib.musl.general/10499>

On Sun, Sep 18, 2016 at 08:54:22PM +0200, Georg Sauthoff wrote:
> (please CC me as I am not subscribed to this ML)
> 
> Hello,
> 
> fyi, I've done some benchmarking of different memchr() and std::find()
> versions.
> 
> I also included the memchr() version from musl.
> 
> In general, musl's memchr() implementation doesn't perform better than a
> simple unrolled loop (as used in libstdc++ std::find()) - and that is
> consistent over different CPU generations and architectures.
> 
> On recent Intel CPUs it is even slower than a naive implementation:

Are you assuming vectorization of the naive version by the compiler? I
think it's reasonable to assume that on x86_64, but not on 32-bit x86,
since many users build for a baseline ISA that does not have vector
ops (i486 or i586).
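
To make the comparison concrete, here is a minimal sketch of the two
scalar shapes under discussion: a naive byte loop and a 4x manually
unrolled loop in the style described for std::find. The function names
are hypothetical, and this is not code taken from musl, libstdc++, or
the benchmark. Whether a compiler vectorizes either loop depends on
the compiler version, flags, and target baseline.

```c
#include <stddef.h>

/* Naive byte-at-a-time search. Hypothetical name, not a libc symbol. */
void *naive_memchr(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;

    for (; n; n--, p++)
        if (*p == ch) return (void *)p;
    return NULL;
}

/* Same search, manually unrolled four times (illustrative only). */
void *unrolled_memchr(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;

    /* Main loop: test four bytes per iteration. */
    for (; n >= 4; n -= 4, p += 4) {
        if (p[0] == ch) return (void *)p;
        if (p[1] == ch) return (void *)(p + 1);
        if (p[2] == ch) return (void *)(p + 2);
        if (p[3] == ch) return (void *)(p + 3);
    }
    /* Tail: at most three bytes remain. */
    for (; n; n--, p++)
        if (*p == ch) return (void *)p;
    return NULL;
}
```

Both functions follow the memchr contract: they return a pointer to
the first matching byte within the first n bytes, or NULL if there is
none.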

> https://gms.tf/stdfind-and-memchr-optimizations.html#measurements
> https://gms.tf/sparc-and-ppc-find-benchmark-results.html
> 
> Of course, on x86, other implementations that use SIMD instructions
> perform even better.

I'm aware that musl's memchr and, more generally, the related
functions (strchr, strlen, etc.) do not perform well, but it's not
clear to me what the right solution is, since the different approaches
vary A LOT in how they compare with each other depending on the exact
CPU model and compiler. Improving this situation is probably a big
project.
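
As one example of the kind of approach being compared, here is a
minimal sketch of a portable word-at-a-time search in C. It is an
illustration of the general technique under stated assumptions (8-bit
bytes, hypothetical names word_memchr, ONES, HIGHS, HASZERO), not a
copy of any particular libc's source, and it makes no claim about how
it ranks on a given CPU or compiler.

```c
#include <stddef.h>
#include <string.h>
#include <limits.h>

/* Assumes 8-bit bytes. All names here are illustrative. */
#define ONES       ((size_t)-1 / UCHAR_MAX)       /* 0x0101...01 */
#define HIGHS      (ONES * (UCHAR_MAX / 2 + 1))   /* 0x8080...80 */
/* Nonzero if and only if some byte of x is zero. */
#define HASZERO(x) (((x) - ONES) & ~(x) & HIGHS)

void *word_memchr(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;
    size_t k = ONES * ch;   /* ch replicated into every byte */

    /* Skip whole words that cannot contain ch. */
    while (n >= sizeof(size_t)) {
        size_t w;
        memcpy(&w, p, sizeof w);    /* safe for any alignment */
        if (HASZERO(w ^ k)) break;  /* some byte of w equals ch */
        p += sizeof w;
        n -= sizeof w;
    }
    /* Locate the exact byte, or scan the short tail. */
    for (; n; n--, p++)
        if (*p == ch) return (void *)p;
    return NULL;
}
```

Loading each word through memcpy keeps the sketch free of alignment
and aliasing assumptions, at the cost of relying on the compiler to
turn it into a single load.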

Rich