From: Georg Sauthoff
Newsgroups: gmane.linux.lib.musl.general
Subject: memchr() performance
Date: Sun, 18 Sep 2016 20:54:22 +0200
Message-ID: <20160918185422.GA2577@dell12.lru.li>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

(please CC me as I am not subscribed to this ML)

Hello,

FYI, I've benchmarked several memchr() and std::find() implementations,
including the memchr() version from musl.

In general, musl's memchr() implementation doesn't perform better than a
simple unrolled loop (as used in libstdc++'s std::find()), and that holds
consistently across different CPU generations and architectures.
On recent Intel CPUs it is even slower than a naive implementation:

https://gms.tf/stdfind-and-memchr-optimizations.html#measurements
https://gms.tf/sparc-and-ppc-find-benchmark-results.html

Of course, on x86, other implementations that use SIMD instructions
perform even better.

Best regards
Georg
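
PS: For readers unsure what "naive" and "simple unrolled loop" refer to
above, here is a minimal sketch of both. The function names memchr_naive()
and memchr_unrolled() are hypothetical and chosen for illustration only.
The unrolled variant is written in the spirit of the 4x-unrolled loop in
libstdc++'s std::find(); it is not copied from libstdc++ or from the
benchmark code, and the unroll factor of 4 is an assumption.

```c
#include <stddef.h>

/* Naive variant: compare one byte per loop iteration.
 * Hypothetical name, for illustration only. */
void *memchr_naive(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;

    for (; n; --n, ++p)
        if (*p == ch)
            return (void *)p;
    return NULL;
}

/* Unrolled variant: four comparisons per loop iteration, then a
 * byte-wise tail for the remaining 0..3 bytes.
 * Hypothetical name; the unroll factor of 4 is an assumption. */
void *memchr_unrolled(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char ch = (unsigned char)c;

    for (; n >= 4; n -= 4, p += 4) {
        if (p[0] == ch) return (void *)p;
        if (p[1] == ch) return (void *)(p + 1);
        if (p[2] == ch) return (void *)(p + 2);
        if (p[3] == ch) return (void *)(p + 3);
    }
    /* Tail: fewer than four bytes left. */
    for (; n; --n, ++p)
        if (*p == ch)
            return (void *)p;
    return NULL;
}
```

Both functions are meant to return the same pointer as the standard
memchr() for the same arguments, so either can be dropped into a benchmark
harness in its place. How they compare in speed depends on the compiler,
its flags, and the CPU; the measurements linked above are the reference
for that.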