* [musl] [PATCH 0/1] riscv: Add support for Zacas in atomic operations
@ 2025-09-18 16:47 Pincheng Wang
  2025-09-18 16:47 ` [musl] [PATCH 1/1] riscv: add Zacas extension support for atomic CAS Pincheng Wang
  2025-10-16 15:37 ` [musl] [PATCH 0/1] riscv: Add support for Zacas in atomic operations Pincheng Wang
  0 siblings, 2 replies; 5+ messages in thread
From: Pincheng Wang @ 2025-09-18 16:47 UTC
  To: musl; +Cc: pincheng.plct

Hi all,

This patch adds support for the RISC-V Zacas (Atomic Compare-and-Swap)
extension in musl's atomic operations for both riscv64 and riscv32.

Currently, musl implements a_cas using a
Load-Reserved/Store-Conditional (lr/sc) loop that:
- Requires at least four instructions (lr+bne+sc+bnez) per CAS
  operation,
- Contains a retry loop under contention,
- Incurs branch penalties that may cause pipeline stalls.

Zacas introduces amocas.w.aqrl/amocas.d.aqrl instructions that perform
CAS atomically in a single instruction, eliminating retry loops and
conditional branches.

Due to hardware limitations, we evaluated this change under QEMU using
both mcycle and minstret counters. The results show clear benefits:

Metric                                          lr/sc    Zacas   Improvement
Instr. per CAS (50k ops average)                15.04     8.36   -44.4%
Instr. per op (single-thread)                   23.61    14.25   -39.6%
Instr. per op (multi-thread, high contention)  528.24   251.14   -52.5%

In addition, libc.a size is reduced by ~1.2% due to removal of loop
code.

The patch automatically falls back to the lr/sc implementation on
systems where Zacas is not available, preserving full backward
compatibility.

This work provides a measurable reduction in instruction count,
execution cycles and binary size, improving scalability of
synchronization primitives under load.

Thanks for reviewing!

Best regards,
Pincheng Wang

Pincheng Wang (1):
  riscv: add Zacas extension support for atomic CAS

 arch/riscv32/atomic_arch.h | 17 +++++++++++++++++
 arch/riscv64/atomic_arch.h | 30 ++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+)

-- 
2.39.5
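For readers less familiar with the contract that a_cas() has to keep, here is a minimal portable sketch of it. It assumes C11 <stdatomic.h> instead of musl's inline assembly, and the names model_cas and model_fetch_inc are hypothetical helpers, not musl functions. It only illustrates the semantics (store s if *p equals t, return the value found either way); it says nothing about the instructions a RISC-V compiler would emit.

```c
#include <stdatomic.h>

/* Hypothetical model of the a_cas() contract, not musl code:
 * store s to *p only if *p == t, and return the value that was
 * found in *p in either case. */
static int model_cas(_Atomic int *p, int t, int s)
{
	int expected = t;
	/* On failure, expected is overwritten with the current *p.
	 * On success it still holds t, which is also the old value. */
	atomic_compare_exchange_strong(p, &expected, s);
	return expected;
}

/* Typical caller pattern: retry until the update is applied on top
 * of an unchanged value. This retry belongs to the caller's
 * algorithm; the lr/sc loop described in the cover letter is a
 * separate loop inside the CAS primitive itself. */
static int model_fetch_inc(_Atomic int *p)
{
	int old;
	do {
		old = atomic_load(p);
	} while (model_cas(p, old, old + 1) != old);
	return old;
}
```

A caller treats the CAS as successful exactly when the returned value equals the expected value t, which is why both the lr/sc version and the amocas version return old.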
* [musl] [PATCH 1/1] riscv: add Zacas extension support for atomic CAS
  2025-09-18 16:47 [musl] [PATCH 0/1] riscv: Add support for Zacas in atomic operations Pincheng Wang
@ 2025-09-18 16:47 ` Pincheng Wang
  2025-10-21  0:30   ` Szabolcs Nagy
  2025-10-16 15:37 ` [musl] [PATCH 0/1] riscv: Add support for Zacas in atomic operations Pincheng Wang
  1 sibling, 1 reply; 5+ messages in thread
From: Pincheng Wang @ 2025-09-18 16:47 UTC
  To: musl; +Cc: pincheng.plct

Add compile-time detection for RISC-V Zacas extension and use
amocas.w.aqrl/amocas.d.aqrl instructions when available.

When __riscv_zacas is defined, a_cas() and a_cas_p() use single amocas
instructions instead of lr/sc loops. Falls back to existing lr/sc
implementation when Zacas is not available.

Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn>
---
 arch/riscv32/atomic_arch.h | 17 +++++++++++++++++
 arch/riscv64/atomic_arch.h | 30 ++++++++++++++++++++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/arch/riscv32/atomic_arch.h b/arch/riscv32/atomic_arch.h
index 4d418f63..64ef05b7 100644
--- a/arch/riscv32/atomic_arch.h
+++ b/arch/riscv32/atomic_arch.h
@@ -3,6 +3,21 @@ static inline void a_barrier()
 {
 	__asm__ __volatile__ ("fence rw,rw" : : : "memory");
 }
+#ifdef __riscv_zacas
+
+#define a_cas a_cas
+static inline int a_cas(volatile int *p, int t, int s)
+{
+	int old = t;
+	__asm__ __volatile__ (
+		"amocas.w.aqrl %0, %2, %1"
+		: "+r"(old), "+A"(*(volatile int *)p)
+		: "r"(s)
+		: "memory");
+	return old;
+}
+
+#else /* Fallback to lr/sc when Zacas is not available */
 
 #define a_cas a_cas
 static inline int a_cas(volatile int *p, int t, int s)
@@ -19,3 +34,5 @@ static inline int a_cas(volatile int *p, int t, int s)
 		: "memory");
 	return old;
 }
+
+#endif /* __riscv_zacas */
\ No newline at end of file
diff --git a/arch/riscv64/atomic_arch.h b/arch/riscv64/atomic_arch.h
index 0c382588..9681505e 100644
--- a/arch/riscv64/atomic_arch.h
+++ b/arch/riscv64/atomic_arch.h
@@ -4,6 +4,34 @@ static inline void a_barrier()
 	__asm__ __volatile__ ("fence rw,rw" : : : "memory");
 }
 
+#ifdef __riscv_zacas
+
+#define a_cas a_cas
+static inline int a_cas(volatile int *p, int t, int s)
+{
+	int old = t;
+	__asm__ __volatile__ (
+		"amocas.w.aqrl %0, %2, %1"
+		: "+r"(old), "+A"(*(volatile int *)p)
+		: "r"(s)
+		: "memory");
+	return old;
+}
+
+#define a_cas_p a_cas_p
+static inline void *a_cas_p(volatile void *p, void *t, void *s)
+{
+	void *old = t;
+	__asm__ __volatile__ (
+		"amocas.d.aqrl %0, %2, %1"
+		: "+r"(old), "+A"(*(void *volatile *)p)
+		: "r"(s)
+		: "memory");
+	return old;
+}
+
+#else /* Fallback to lr/sc when Zacas is not available */
+
 #define a_cas a_cas
 static inline int a_cas(volatile int *p, int t, int s)
 {
@@ -36,3 +64,5 @@ static inline void *a_cas_p(volatile void *p, void *t, void *s)
 		: "memory");
 	return old;
 }
+
+#endif /* __riscv_zacas */
-- 
2.39.5
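Because the selection happens purely at compile time, a quick way to see which branch a given set of CFLAGS would pick is to test the same predefined macro outside musl. The helper below is a hypothetical sketch (cas_backend is not part of the patch); it relies only on the compiler-defined macros __riscv and __riscv_zacas.

```c
/* Hypothetical helper, not part of the patch: reports which a_cas
 * implementation the #ifdef in atomic_arch.h would select for the
 * flags this translation unit is compiled with. */
static const char *cas_backend(void)
{
#if defined(__riscv) && defined(__riscv_zacas)
	/* Compiler was told the target has Zacas. */
	return "amocas";
#elif defined(__riscv)
	/* RISC-V target without Zacas in -march: lr/sc fallback. */
	return "lr/sc";
#else
	/* Not a RISC-V build at all; neither file applies. */
	return "not riscv";
#endif
}
```

Note that this is a build-time property only: a libc built with Zacas enabled contains no lr/sc path, so "fallback" here means rebuilding without the extension, not runtime detection.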
* Re: [musl] [PATCH 1/1] riscv: add Zacas extension support for atomic CAS
  2025-09-18 16:47 ` [musl] [PATCH 1/1] riscv: add Zacas extension support for atomic CAS Pincheng Wang
@ 2025-10-21  0:30   ` Szabolcs Nagy
  2025-10-21  3:05     ` Pincheng Wang
  0 siblings, 1 reply; 5+ messages in thread
From: Szabolcs Nagy @ 2025-10-21  0:30 UTC
  To: Pincheng Wang; +Cc: musl

* Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn> [2025-09-19 00:47:20 +0800]:
> Add compile-time detection for RISC-V Zacas extension and use
> amocas.w.aqrl/amocas.d.aqrl instructions when available.
>
> When __riscv_zacas is defined, a_cas() and a_cas_p() use single amocas
> instructions instead of lr/sc loops. Falls back to existing lr/sc
> implementation when Zacas is not available.

is this a supported extension? are there users?
(implemented on existing cpus with released toolchain versions)

what cflags enable the extension? (how to test)

i can't review if the instructions have the right semantics,
but the code looks ok, with some comments below.

>
> Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn>
> ---
>  arch/riscv32/atomic_arch.h | 17 +++++++++++++++++
>  arch/riscv64/atomic_arch.h | 30 ++++++++++++++++++++++++++++++
>  2 files changed, 47 insertions(+)
>
> diff --git a/arch/riscv32/atomic_arch.h b/arch/riscv32/atomic_arch.h
> index 4d418f63..64ef05b7 100644
> --- a/arch/riscv32/atomic_arch.h
> +++ b/arch/riscv32/atomic_arch.h
>  }
> +#ifdef __riscv_zacas

newline before #ifdef

> +#else /* Fallback to lr/sc when Zacas is not available */
...
> +#endif /* __riscv_zacas */
> \ No newline at end of file

newline after endif

i think ifdef comments are not needed in such a simple file.

> +++ b/arch/riscv64/atomic_arch.h
> @@ -4,6 +4,34 @@ static inline void a_barrier()
>  	__asm__ __volatile__ ("fence rw,rw" : : : "memory");
>  }
>
> +#ifdef __riscv_zacas
> +
> +#define a_cas a_cas
> +static inline int a_cas(volatile int *p, int t, int s)
> +{
> +	int old = t;
> +	__asm__ __volatile__ (
> +		"amocas.w.aqrl %0, %2, %1"
> +		: "+r"(old), "+A"(*(volatile int *)p)
> +		: "r"(s)
> +		: "memory");

existing cas does not use +A constraint (check git log
why and ensure this is ok).

the ptr cast should not be needed. (same for rv32)
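As an illustration of the cast remark only, here is one possible shape of the 32-bit CAS with the redundant cast dropped. This is a sketch under assumptions, not a proposed revision: sketch_a_cas is a hypothetical name, the "+A" constraint is kept exactly as in the patch (whether it is acceptable is the open question raised above and is not settled here), and the non-Zacas branch uses the GCC/Clang __atomic_compare_exchange_n builtin purely so the sketch builds on any target. That builtin is a stand-in, not the lr/sc loop musl actually uses.

```c
/* Sketch only; sketch_a_cas is a hypothetical name, not musl's a_cas. */
static inline int sketch_a_cas(volatile int *p, int t, int s)
{
#if defined(__riscv) && defined(__riscv_zacas)
	int old = t;
	/* p already has type volatile int *, so *p needs no cast.
	 * The "+A" constraint is unchanged from the patch under review. */
	__asm__ __volatile__ (
		"amocas.w.aqrl %0, %2, %1"
		: "+r"(old), "+A"(*p)
		: "r"(s)
		: "memory");
	return old;
#else
	/* Stand-in for targets without Zacas so the sketch compiles
	 * anywhere. On failure the builtin writes the current *p into t;
	 * on success t is unchanged and equals the old value. */
	__atomic_compare_exchange_n(p, &t, s, 0,
		__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return t;
#endif
}
```

The same simplification does not carry over directly to a_cas_p(): there p is a volatile void *, which cannot be dereferenced without first converting it to a pointer-to-pointer type.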
* Re: [musl] [PATCH 1/1] riscv: add Zacas extension support for atomic CAS
  2025-10-21  0:30   ` Szabolcs Nagy
@ 2025-10-21  3:05     ` Pincheng Wang
  0 siblings, 0 replies; 5+ messages in thread
From: Pincheng Wang @ 2025-10-21  3:05 UTC
  To: musl, nsz

On 2025/10/21 08:30, Szabolcs Nagy wrote:
> * Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn> [2025-09-19 00:47:20 +0800]:
>> Add compile-time detection for RISC-V Zacas extension and use
>> amocas.w.aqrl/amocas.d.aqrl instructions when available.
>>
>> When __riscv_zacas is defined, a_cas() and a_cas_p() use single amocas
>> instructions instead of lr/sc loops. Falls back to existing lr/sc
>> implementation when Zacas is not available.
>
> is this a supported extension? are there users?
> (implemented on existing cpus with released toolchain versions)

The Zacas extension was ratified in November 2023. CPUs such as the
XuanTie C930 already support this extension [1]. For toolchain support,
GCC added Zacas extension support in commit 11c2453 ("RISC-V: Add basic
support for the Zacas extension") on Jul 30, 2024.

Moreover, the RVA23 profile document [2] lists Zacas as a development
option and states that it "is intended to become mandatory in the
future RVA profile", suggesting broader adoption in the near future,
particularly in high-performance computing domains such as PCs and
servers.

[1] https://www.xrvm.com/product/xuantie/C930
[2] https://docs.riscv.org/reference/profiles/rva23/_attachments/rva23-profile.pdf

> what cflags enable the extension? (how to test)

To enable this extension, set CFLAGS="-march=rv{32,64}gc_zacas". In my
development environment, I'm using riscv64-unknown-linux-gnu-gcc
(version 15.1.0, commit g1b306039ac4) with the following configure
command:

`CC=riscv64-unknown-linux-gnu-gcc CFLAGS="-march=rv64gc_zacas" ./configure --prefix=/home/wpcwzy/sysroot-rv64`

> i can't review if the instructions have the right semantics,
> but the code looks ok, with some comments below.
>
>>
>> Signed-off-by: Pincheng Wang <pincheng.plct@isrc.iscas.ac.cn>
>> ---
>>  arch/riscv32/atomic_arch.h | 17 +++++++++++++++++
>>  arch/riscv64/atomic_arch.h | 30 ++++++++++++++++++++++++++++++
>>  2 files changed, 47 insertions(+)
>>
>> diff --git a/arch/riscv32/atomic_arch.h b/arch/riscv32/atomic_arch.h
>> index 4d418f63..64ef05b7 100644
>> --- a/arch/riscv32/atomic_arch.h
>> +++ b/arch/riscv32/atomic_arch.h
>>  }
>> +#ifdef __riscv_zacas
>
> newline before #ifdef
>
>> +#else /* Fallback to lr/sc when Zacas is not available */
> ...
>> +#endif /* __riscv_zacas */
>> \ No newline at end of file
>
> newline after endif
>
> i think ifdef comments are not needed in such a simple file.
>

Thank you for the formatting suggestions. I'll address these in the
next revision.

>> +++ b/arch/riscv64/atomic_arch.h
>> @@ -4,6 +4,34 @@ static inline void a_barrier()
>>  	__asm__ __volatile__ ("fence rw,rw" : : : "memory");
>>  }
>>
>> +#ifdef __riscv_zacas
>> +
>> +#define a_cas a_cas
>> +static inline int a_cas(volatile int *p, int t, int s)
>> +{
>> +	int old = t;
>> +	__asm__ __volatile__ (
>> +		"amocas.w.aqrl %0, %2, %1"
>> +		: "+r"(old), "+A"(*(volatile int *)p)
>> +		: "r"(s)
>> +		: "memory");
>
> existing cas does not use +A constraint (check git log
> why and ensure this is ok).
>
> the ptr cast should not be needed. (same for rv32)

Thanks, I will review the git history and adjust the constraint and
cast in the next patch revision.

Best regards,
Pincheng Wang
* Re: [musl] [PATCH 0/1] riscv: Add support for Zacas in atomic operations
  2025-09-18 16:47 [musl] [PATCH 0/1] riscv: Add support for Zacas in atomic operations Pincheng Wang
  2025-09-18 16:47 ` [musl] [PATCH 1/1] riscv: add Zacas extension support for atomic CAS Pincheng Wang
@ 2025-10-16 15:37 ` Pincheng Wang
  1 sibling, 0 replies; 5+ messages in thread
From: Pincheng Wang @ 2025-10-16 15:37 UTC
  To: musl

On 2025/9/19 00:47, Pincheng Wang wrote:
> Hi all,
>
> This patch adds support for the RISC-V Zacas (Atomic Compare-and-Swap)
> extension in musl's atomic operations for both riscv64 and riscv32.
>
> Currently, musl implements a_cas using a
> Load-Reserved/Store-Conditional (lr/sc) loop that:
> - Requires at least four instructions (lr+bne+sc+bnez) per CAS
>   operation,
> - Contains a retry loop under contention,
> - Incurs branch penalties that may cause pipeline stalls.
>
> Zacas introduces amocas.w.aqrl/amocas.d.aqrl instructions that perform
> CAS atomically in a single instruction, eliminating retry loops and
> conditional branches.
>
> Due to hardware limitations, we evaluated this change under QEMU using
> both mcycle and minstret counters. The results show clear benefits:
>
> Metric                                          lr/sc    Zacas   Improvement
> Instr. per CAS (50k ops average)                15.04     8.36   -44.4%
> Instr. per op (single-thread)                   23.61    14.25   -39.6%
> Instr. per op (multi-thread, high contention)  528.24   251.14   -52.5%
>
> In addition, libc.a size is reduced by ~1.2% due to removal of loop
> code.
>
> The patch automatically falls back to the lr/sc implementation on
> systems where Zacas is not available, preserving full backward
> compatibility.
>
> This work provides a measurable reduction in instruction count,
> execution cycles and binary size, improving scalability of
> synchronization primitives under load.
>
> Thanks for reviewing!
>
> Best regards,
> Pincheng Wang
>
>
> Pincheng Wang (1):
>   riscv: add Zacas extension support for atomic CAS
>
>  arch/riscv32/atomic_arch.h | 17 +++++++++++++++++
>  arch/riscv64/atomic_arch.h | 30 ++++++++++++++++++++++++++++++
>  2 files changed, 47 insertions(+)
>

Hi all,

Friendly ping regarding my earlier patch on enabling the RISC-V Zacas
(amocas.{w,d}) path for a_cas()/a_cas_p().

Best regards,
Pincheng Wang
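The Improvement column in the quoted table reads as the relative change in instruction count, (Zacas - lr/sc) / lr/sc. A small check of that reading, assuming this is how the percentages were derived (the thread does not say so explicitly); rel_change_pct is a hypothetical helper and the inputs are the three rows copied from the table:

```c
/* Relative change in percent; a negative result means the second
 * configuration executed fewer instructions than the first. */
static double rel_change_pct(double before, double after)
{
	return (after - before) / before * 100.0;
}

/* Rows from the table, as (lr/sc, Zacas) pairs:
 *   rel_change_pct(15.04,   8.36)   is about -44.4
 *   rel_change_pct(23.61,  14.25)   is about -39.6
 *   rel_change_pct(528.24, 251.14)  is about -52.5
 */
```

All three rows agree with the posted percentages to one decimal place under that reading.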