mailing list of musl libc
 help / color / mirror / code / Atom feed
From: Rich Felker <dalias@libc.org>
To: James Tirta Halim <tirtajames45@gmail.com>
Cc: musl@lists.openwall.com
Subject: Re: [musl] [PATCH] add memcmpeq: memcmp that returns length of first mismatch
Date: Tue, 27 Feb 2024 09:49:27 -0500	[thread overview]
Message-ID: <20240227144926.GK4163@brightrain.aerifal.cx> (raw)
In-Reply-To: <20240227140756.216904-1-tirtajames45@gmail.com>

On Tue, Feb 27, 2024 at 09:07:56PM +0700, James Tirta Halim wrote:
> (unaligned access must be supported for mempcmpeq if using word access)
> 
> mempcmpeq returns the length of the first mismatched byte. If S1 and S2
> are equal, n is returned.
> 
> The function is meant to be used internally in strstr and memmem:
> https://inbox.vuxu.org/musl/20181108193451.GD5150@brightrain.aerifal.cx/
> 
> glibc bench-memcmpeq timings (Core i3-1115G4):
> mempcmpeq memcmp_musl memcmp __memcmpeq_evex __memcmpeq_avx2 __memcmpeq_sse2
> Average:
> 62.3978 269.813 16.0426 14.9072 14.1125 20.6527
> Total:
> 30637.3 132478 7876.92 7319.43 6929.23 10140.5
> 
> Passes glibc test-memcmpeq.
> 
> Return value in testing and benchmarking is changed to suit the
> behavior of memcmpeq.

I don't understand this proposal. The only "memcmpeq" I can find
anywhere else is __memcmpeq in glibc, whose contract is very
different: it's a relaxed version of memcmp that's only usable for
testing equality and that does not give an order relationship.

As noted in the above-linked message, a portable C version of this
function is not going to help strstr or memmem because there's no
reason to expect the needle to be aligned modulo any wordsize in the
haystack. For it to help, you would need a version that's capable of
doing misaligned accesses, contingent on arch support for that.
Without that, you're just adding the cost of function call overhead
and making it slower.

> ---
>  src/string/mempcmpeq.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>  create mode 100644 src/string/mempcmpeq.c
> 
> diff --git a/src/string/mempcmpeq.c b/src/string/mempcmpeq.c
> new file mode 100644
> index 00000000..bc17bc58
> --- /dev/null
> +++ b/src/string/mempcmpeq.c
> @@ -0,0 +1,19 @@
> +#include <stddef.h>
> +
> +size_t
> +mempcmpeq(const void *s1,
> +          const void *s2,
> +          size_t n)
> +{
> +	const size_t length = n;
> +#ifdef __GNUC__
> +	typedef size_t __attribute__((__may_alias__)) word;
> +	const unsigned char *p1 = (const unsigned char *)s1;
> +	const unsigned char *p2 = (const unsigned char *)s2;
> +	for (; n >= sizeof(word) && *(word *)p1 == *(word *)p2; p1+=sizeof(word), p2+=sizeof(word), n-=sizeof(word));
> +#endif
> +	for (; n; --n)
> +		if (*p1++ != *p2++)
> +			return (size_t)((p1 - 1 - (const unsigned char *)s1));
> +	return length;
> +}
> -- 
> 2.44.0

This code does not do what you claim it does and does not seem to
match your test results. As written, it should return immediately,
because the entire body except

> +	const size_t length = n;
> +	return length;

is dead code.

Rich

  reply	other threads:[~2024-02-27 14:49 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-27 14:07 James Tirta Halim
2024-02-27 14:49 ` Rich Felker [this message]
2024-02-27 14:51   ` Rich Felker
2024-02-28  4:40     ` Markus Wichmann
2024-02-28 23:14       ` Thorsten Glaser
2024-02-29  0:02         ` Pedro Falcato
2024-02-29  0:10           ` Thorsten Glaser
2024-02-29  0:57             ` Robert Clausecker
2024-02-29  1:13               ` Pedro Falcato
2024-02-29  4:19         ` Markus Wichmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240227144926.GK4163@brightrain.aerifal.cx \
    --to=dalias@libc.org \
    --cc=musl@lists.openwall.com \
    --cc=tirtajames45@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).