From: Pincheng Wang
To: musl@lists.openwall.com
Cc: pincheng.plct@isrc.iscas.ac.cn
Date: Thu, 25 Sep 2025 21:15:56 +0800
Message-Id: <20250925131557.8907-1-pincheng.plct@isrc.iscas.ac.cn>
Subject: [musl] [PATCH 0/1] riscv64: Add RVV optimized memset implementation

Hi all,

This patch introduces a RISC-V Vector (RVV) optimized implementation of
memset.

Key points:
- Use RVV instructions to fill memory in bulk, with a small-size
  head-tail fast path to reduce vsetvli overhead.
- Fall back to a scalar head-tail implementation (like the generic C
  implementation) when RVV is not available.
- Reduce both instruction count and code size: memset.o shrinks by
  about 16.5% compared to the generic C build.

Performance results on RVV-capable hardware show clear improvements:
- On Spacemit X60: up to ~3.1x faster (256B), with consistent gains
  across medium and large sizes.
- On XuanTie C908: up to ~2.1x faster (128B), with modest gains for
  larger sizes.

For very small sizes (< 8 bytes), there can be regressions compared to
the generic C version. A more aggressive fast path could remove these
regressions, but at the cost of added code complexity. Feedback on this
trade-off is welcome.

The implementation was tested under QEMU with RVV enabled and on real
hardware. Functional behavior matches the generic memset, with no
changes to the public interface.

Thanks,
Pincheng Wang

Pincheng Wang (1):
  riscv64: optimize memset implementation with vector extension

 src/string/riscv64/memset.S | 101 ++++++++++++++++++++++++++++++++++++
 1 file changed, 101 insertions(+)
 create mode 100644 src/string/riscv64/memset.S

-- 
2.39.5
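
For readers unfamiliar with the "head-tail" idea in the key points, here
is a rough portable C sketch. It is not the code from the patch: the
small-size stores follow the overlapping head/tail pattern used by the
generic C memset, and the bulk loop only models how a strip-mined vector
loop consumes the buffer in chunks. The names memset_sketch,
check_memset_sketch and SKETCH_VLMAX are made up for illustration, and
the chunk width of 16 is an assumption, not a property of any hardware.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical chunk width standing in for the hardware vector length. */
#define SKETCH_VLMAX 16

/* Illustrative only: a scalar model of a head-tail fast path followed by
 * a strip-mined bulk fill. Not the implementation from the patch. */
static void *memset_sketch(void *dest, int c, size_t n)
{
    unsigned char *s = dest;
    unsigned char b = (unsigned char)c;

    /* Head-tail fast path: write from both ends with overlapping stores,
     * so sizes up to 8 bytes finish after a few branches and no loop. */
    if (n == 0) return dest;
    s[0] = b;
    s[n - 1] = b;
    if (n <= 2) return dest;
    s[1] = b;
    s[2] = b;
    s[n - 2] = b;
    s[n - 3] = b;
    if (n <= 6) return dest;
    s[3] = b;
    s[n - 4] = b;
    if (n <= 8) return dest;

    /* Bulk path: each iteration picks a chunk length vl <= SKETCH_VLMAX,
     * which is the role vsetvli plays in a real vector loop. The inner
     * byte loop stands in for a single vector store of vl bytes. */
    while (n > 0) {
        size_t vl = n < SKETCH_VLMAX ? n : SKETCH_VLMAX;
        for (size_t i = 0; i < vl; i++)
            s[i] = b;
        s += vl;
        n -= vl;
    }
    return dest;
}

/* Compare the sketch against the libc memset for one size, with guard
 * bytes on both sides to catch out-of-bounds writes. Returns 1 on match. */
static int check_memset_sketch(size_t n, int c)
{
    unsigned char got[96], want[96];

    if (n > sizeof got - 2) return 0;
    memset(got, 0xAA, sizeof got);
    memset(want, 0xAA, sizeof want);

    void *r = memset_sketch(got + 1, c, n);
    memset(want + 1, c, n);

    return r == (void *)(got + 1) && memcmp(got, want, sizeof got) == 0;
}
```

Calling check_memset_sketch(n, 0x5a) for n from 0 to 64 exercises every
fast-path branch as well as the chunked loop. The sketch also shows the
trade-off raised above: each extra size class handled before the loop
adds branches and stores, which is the added code complexity a more
aggressive fast path would bring.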