From: Pincheng Wang
To: musl@lists.openwall.com
Cc: pincheng.plct@isrc.iscas.ac.cn
Date: Wed, 19 Nov 2025 13:40:58 +0800
Message-Id: <20251119054059.514848-1-pincheng.plct@isrc.iscas.ac.cn>
Subject: [musl] [PATCH v2 0/1] riscv64: add optimized string functions

Hi all,

This patch series supersedes my previous RVV-optimized memset patch
series and extends support to memmove and memcpy. It provides optimized
versions of these string functions for RISC-V with the vector extension,
dispatching at runtime based on CPU capability (RVV support).

Changes from v1:
- Change the function pointers' attribute from hidden to static.
- Change the conditional compilation condition from __riscv to
  __riscv && __riscv_xlen==64.

Implementation details:
- The generic functions in mem{set,cpy,move}.c are renamed internally to
  __mem{set,cpy,move}_scalar via macro, preserving the generic C
  implementation as the scalar fallback.
- mem{set,cpy,move}_vector.S provide the optimized
  __mem{set,cpy,move}_vector symbols.
- string_dispatch.c exports the public mem{set,cpy,move}() symbols,
  which dispatch via function pointers.
- __init_riscv_string_optimizations is called in __libc_start_main to
  initialize the function pointers based on AT_HWCAP.
- The vector implementation uses m8 register grouping for bulk fills.

Performance:

Function      Size    Improvement
memset        16B       0.06%
memset        64B      49.22%
memset        256B    127.81%
memset        1KB      58.12%
memset        4KB      47.95%
memset        64KB      2.56%
memcpy        16B       0.02%
memcpy        64B      35.94%
memcpy        256B    205.10%
memcpy        1KB     126.01%
memcpy        4KB     107.71%
memcpy        64KB     36.15%
memmove_bwd   16B      -0.67%
memmove_bwd   64B      47.03%
memmove_bwd   256B    207.32%
memmove_bwd   1KB     125.33%
memmove_bwd   4KB     106.72%
memmove_bwd   64KB     41.46%

Benchmarks were run on a Spacemit X60 CPU. Functional behavior matches
the generic functions.
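
For readers who want a picture of the dispatch scheme described under
"Implementation details", here is a minimal sketch. It is not the code
from the patch. All names ending in _demo are hypothetical, the "vector"
routine is a plain byte loop standing in for the RVV assembly, and the
capability word is passed in as a parameter instead of being read from
the auxiliary vector. The hwcap bit is an assumption based on the Linux
convention of reporting single-letter RISC-V extensions in AT_HWCAP as
bit (letter - 'A').

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: Linux on RISC-V encodes single-letter ISA extensions in
 * AT_HWCAP as bit (letter - 'A'), so the V extension would be bit 21. */
#define HWCAP_RVV_DEMO (1UL << ('V' - 'A'))

typedef void *(*memset_fn)(void *, int, size_t);

/* Hypothetical stand-in for the generic C fallback (__memset_scalar). */
static void *memset_scalar_demo(void *dest, int c, size_t n)
{
	return memset(dest, c, n);
}

/* Hypothetical stand-in for the RVV routine (__memset_vector).
 * A byte loop is used here so the sketch runs on any machine. */
static void *memset_vector_demo(void *dest, int c, size_t n)
{
	unsigned char *p = dest;
	while (n--) *p++ = (unsigned char)c;
	return dest;
}

/* Static pointer that defaults to the scalar path, so calls made before
 * initialization still work. */
static memset_fn memset_ptr = memset_scalar_demo;

/* Counterpart of the init hook: pick an implementation once, based on
 * the capability word supplied by the caller. */
void init_string_dispatch_demo(unsigned long hwcap)
{
	memset_ptr = (hwcap & HWCAP_RVV_DEMO) ? memset_vector_demo
	                                      : memset_scalar_demo;
}

/* Counterpart of the exported symbol: forward through the pointer. */
void *memset_dispatch_demo(void *dest, int c, size_t n)
{
	return memset_ptr(dest, c, n);
}

/* Usage: both paths must fill the buffer with identical bytes. */
void check_paths_agree_demo(void)
{
	unsigned char a[64], b[64];

	init_string_dispatch_demo(0);
	memset_dispatch_demo(a, 0x5a, sizeof a);

	init_string_dispatch_demo(HWCAP_RVV_DEMO);
	memset_dispatch_demo(b, 0x5a, sizeof b);

	assert(memcmp(a, b, sizeof a) == 0);
}
```

Defaulting the pointer to the scalar routine is a design choice in this
sketch: any call that happens before the init hook runs still lands on a
valid implementation.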

Thanks,
Pincheng Wang

Pincheng Wang (1):
  riscv64: add optimized memset, memcpy and memmove

 src/env/__libc_start_main.c          |  3 ++
 src/internal/libc.h                  |  3 ++
 src/string/riscv64/memcpy.c          |  4 +++
 src/string/riscv64/memcpy_vector.S   | 28 +++++++++++++++
 src/string/riscv64/memmove.c         |  4 +++
 src/string/riscv64/memmove_vector.S  | 52 ++++++++++++++++++++++++++++
 src/string/riscv64/memset.c          |  4 +++
 src/string/riscv64/memset_vector.S   | 29 ++++++++++++++++
 src/string/riscv64/string_dispatch.c | 52 ++++++++++++++++++++++++++++
 9 files changed, 179 insertions(+)
 create mode 100644 src/string/riscv64/memcpy.c
 create mode 100644 src/string/riscv64/memcpy_vector.S
 create mode 100644 src/string/riscv64/memmove.c
 create mode 100644 src/string/riscv64/memmove_vector.S
 create mode 100644 src/string/riscv64/memset.c
 create mode 100644 src/string/riscv64/memset_vector.S
 create mode 100644 src/string/riscv64/string_dispatch.c

--
2.39.5