Hi,

I've noticed that musl's implementation of memcmp is many times slower than glibc's, which makes memcmp-based comparisons of, e.g., definitely contiguous structs quite a bit slower than they need to be.

I've put together a memcmp that compares memory word by word if the operands are aligned or identically misaligned (I figure that comparing differently aligned objects is rare in real code). It's still pretty compact, both in C (~40 LOC) and in assembly (126 B on x86_64 with gcc -Os), and much closer to glibc's performance: less than twice as slow for comparisons of 128 B objects, versus 13 times as slow for musl's current memcmp.

If you like it, take it.

Regards,
Petr Skocik
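
For illustration only, here is a minimal sketch of the word-by-word idea described above. It is not the submitted implementation: the name memcmp_wordwise is hypothetical, size_t is assumed to be a reasonable stand-in for the machine word, and the fixed-size memcpy calls are just one way to load a word without breaking aliasing rules (compilers usually turn them into plain loads).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name; a sketch of a word-wise memcmp, not musl's code. */
int memcmp_wordwise(const void *vl, const void *vr, size_t n)
{
    const unsigned char *l = vl, *r = vr;
    const size_t w = sizeof(size_t);

    /* Take the word loop only when both pointers are aligned or
     * identically misaligned; otherwise use the byte loop alone. */
    if ((uintptr_t)l % w == (uintptr_t)r % w) {
        /* Step byte by byte until both pointers are word-aligned. */
        for (; n && (uintptr_t)l % w; n--, l++, r++)
            if (*l != *r) return *l - *r;

        /* Compare one word at a time while whole words remain. */
        for (; n >= w; n -= w, l += w, r += w) {
            size_t a, b;
            memcpy(&a, l, w);
            memcpy(&b, r, w);
            if (a != b) break; /* the byte loop below finds the byte */
        }
    }

    /* Tail bytes, the differing word, or the unaligned fallback. */
    for (; n; n--, l++, r++)
        if (*l != *r) return *l - *r;
    return 0;
}
```

Dropping back to the byte loop on a word mismatch keeps the result independent of endianness, at the cost of rescanning at most one word.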