mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Petr Skocik <pskocik@gmail.com>
To: musl@lists.openwall.com
Subject: [musl] faster memcmp
Date: Wed, 22 Jan 2020 18:12:28 +0100	[thread overview]
Message-ID: <3016485f-0790-4634-ae6c-d9a05cdb79aa@gmail.com> (raw)

[-- Attachment #1: Type: text/plain, Size: 675 bytes --]

Hi,

I've noticed musl's implementation of memcmp was about way over times
slower than glibc's, which made memcpy-based comparisons of e.g.,
definitely contiguous structs actually quite a bit slower than applying
memcmp on them.

I've put together a memcmp that compares memory word by word if the
operands are aligned or identically misaligned (I figure the comparisons
of differently aligned objects is rare in real code).

It's still pretty compact, both in C (~40 LOC) and assembly (126 B @
x86_64 gcc -Os) and much closer to gcc's performance (less than twice
slower for comparison of 128B objects vs musl's 13 times).

If you like it, take it.

Regards,
Petr Skocik




[-- Attachment #2: my_memcmp.c --]
[-- Type: text/x-csrc, Size: 1598 bytes --]

#include <stdint.h>
#include <stddef.h>
#include <string.h>

__attribute((optimize("s"))) //a bit faster than -O2/-O3
int my_memcmp(const void *Vl, const void *Vr, size_t N)
{
	typedef size_t uword_tp;
	const unsigned char *l=Vl, *r=Vr;
	size_t ByteWiseCnt;
	size_t aloff = (uintptr_t)l%_Alignof(uword_tp);

	if ( aloff != (uintptr_t)r%_Alignof(uword_tp) ) {
		/*differently aligned, finish with bytewise cmp and let it fallthru to return 0
		 unless the return happens in the bytewise loop*/

		ByteWiseCnt = N;

		//the goto helps keep the output assembly small
		do_ByteWiseCmp: //run for all, at the beginning if equally misaligned, and possibly at the end
		N-=ByteWiseCnt; //<= disables the rest if run for all or at the end; adjusts N if run at the beginning 
		for (; 0!=ByteWiseCnt; ByteWiseCnt--, l++, r++)
			if (*l != *r) return *l-*r;

	}else if(aloff!=0) {
		/*do_ByteWiseCmp until aligned*/
		ByteWiseCnt= sizeof(uword_tp)-aloff < N ? sizeof(uword_tp)-aloff : N;
		goto do_ByteWiseCmp;
	}

	size_t nw = N/sizeof(uword_tp); N %= sizeof(uword_tp);
	for (; nw != 0; nw--,l+=sizeof(uword_tp),r+=sizeof(uword_tp)){
		size_t l0, r0; memcpy(&l0, l, sizeof(uword_tp)); memcpy(&r0, r, sizeof(uword_tp));
		if (l0 == r0) continue;
		else{
		   	if (1) break; //smaller assembly
			else { ByteWiseCnt = sizeof(uword_tp); goto do_ByteWiseCmp; } //larger code
		}
	}

	//if (nw!=0), then ByteWiseCnt will return within the next sizeof(size_t) bytes and
	//nw needs to be >=sizeof(size_t), otherwise it needs to be N
	if (0==(ByteWiseCnt=nw*sizeof(size_t)+N))
		return 0;

	goto do_ByteWiseCmp;
}

                 reply	other threads:[~2020-01-22 19:37 UTC|newest]

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3016485f-0790-4634-ae6c-d9a05cdb79aa@gmail.com \
    --to=pskocik@gmail.com \
    --cc=musl@lists.openwall.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).