mailing list of musl libc
* [musl] [PATCH 0/1] riscv64: Add RVV optimized memset implementation
@ 2025-09-25 13:15 Pincheng Wang
  2025-09-25 13:15 ` [musl] [PATCH 1/1] riscv64: optimize memset implementation with vector extension Pincheng Wang
  0 siblings, 1 reply; 6+ messages in thread
From: Pincheng Wang @ 2025-09-25 13:15 UTC
  To: musl; +Cc: pincheng.plct

Hi all,

This patch introduces a RISC-V Vector (RVV) optimized implementation of
memset.

Key points:
- Use RVV instructions to fill memory in bulk, with a small-size
  head-tail fast path to reduce vsetvli overhead.
- Fall back to a scalar head-tail implementation (like the generic C
  implementation) when RVV is not available.
- Reduce both instruction count and code size: memset.o shrinks by about
  16.5% compared to the generic C build.
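
As an illustration of the head-tail idea mentioned above, here is a
minimal C sketch. It is not the code from the patch (which is assembly);
it only mirrors the overlapping head/tail store pattern used by the
generic C memset. The name headtail_memset is hypothetical, and the byte
loop at the end merely stands in for the bulk fill that the patch would
do with RVV instructions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, not the patch's implementation.
 * Small sizes are covered entirely by overlapping stores at the
 * head and the tail, so no loop (and, in a vector version, no
 * vsetvli) is needed for n <= 8. */
static void *headtail_memset(void *dest, int c, size_t n)
{
    unsigned char *s = dest;
    unsigned char b = (unsigned char)c;

    if (n == 0) return dest;
    s[0] = b;
    s[n - 1] = b;
    if (n <= 2) return dest;

    /* For 3 <= n <= 6 these stores overlap and cover every byte. */
    s[1] = b;
    s[2] = b;
    s[n - 2] = b;
    s[n - 3] = b;
    if (n <= 6) return dest;

    s[3] = b;
    s[n - 4] = b;
    if (n <= 8) return dest;

    /* Placeholder for the bulk fill: bytes 0..3 and n-4..n-1 are
     * already written, so only the middle remains. */
    for (size_t i = 4; i < n - 4; i++)
        s[i] = b;

    return dest;
}
```

The early returns keep the small-size path branch-light, at the cost of
writing some bytes twice.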

Performance results on RVV-capable hardware show clear improvements:
- On Spacemit X60: up to ~3.1x faster (256B), with consistent gains
  across medium and large sizes.
- On XuanTie C908: up to ~2.1x faster (128B), with modest gains for
  larger sizes.

For very small sizes (<8 bytes), there can be regressions compared to
the generic C version. A more aggressive fast path could remove these
regressions, but at the cost of added code complexity. Feedback on this
trade-off is welcome.

The implementation was tested under QEMU with RVV enabled and on real
hardware. Functional behavior matches the generic memset, with no
changes to the public interface.

Thanks,
Pincheng Wang

Pincheng Wang (1):
  riscv64: optimize memset implementation with vector extension

 src/string/riscv64/memset.S | 101 ++++++++++++++++++++++++++++++++++++
 1 file changed, 101 insertions(+)
 create mode 100644 src/string/riscv64/memset.S

-- 
2.39.5



end of thread, other threads:[~2025-09-26 11:22 UTC | newest]

Thread overview: 6+ messages
2025-09-25 13:15 [musl] [PATCH 0/1] riscv64: Add RVV optimized memset implementation Pincheng Wang
2025-09-25 13:15 ` [musl] [PATCH 1/1] riscv64: optimize memset implementation with vector extension Pincheng Wang
2025-09-25 15:30   ` Yao Zi
2025-09-26  0:31     ` Pincheng Wang
2025-09-26  3:37       ` Markus Wichmann
2025-09-26 11:21         ` Pincheng Wang

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/