From: enh <enh@google.com>
To: musl@lists.openwall.com
Subject: Re: Re: Re: [musl] [PATCH]Implementation of strlen function in riscv64 architecture
Date: Wed, 19 Apr 2023 15:39:06 -0700 [thread overview]
Message-ID: <CAJgzZooxj+5AKPg1O=kFW6pNQmY5sHuosG3LonOEyOd9aQBK7A@mail.gmail.com> (raw)
In-Reply-To: <484084b3.20c61.18798649e40.Coremail.zhangfei@nj.iscas.ac.cn>
[-- Attachment #1: Type: text/plain, Size: 8728 bytes --]
On Wed, Apr 19, 2023 at 12:22 AM 张飞 <zhangfei@nj.iscas.ac.cn> wrote:
> I did replace the C strlen code with a slower one except when
> musl is built for "#ifdef __riscv_vector" isa extension.So I referred
> to the C strlen code and implemented it with the basic instruction
> set, and the performance of both is basically the same.
>
> The reason for implementing two versions is to hope that the memset
> implemented
> using the basic instruction set can be applicable to all RISCV
> architecture CPUs,
> and the vector version can accelerate the hardware supporting vector
> expansion.
> When the compiler adds vector extensions through --with-arch=rv64gcv,
> __riscv_vector will also open by default.Similar macro definitions are
> common in
> riscv, such as setjmp/riscv64/setjmp.S in musl, which includes
> __riscv_float_abi_soft macro definitions.
>
> At present, the riscv vector extension instruction set is in a frozen
> state, and
> the instruction set is stable. In other open source libraries, such as
> openssl
> and openCV, riscv vector optimization is available.
is that actually checked in to openssl? the linux kernel patches to
save/restore vector state still haven't been merged to linux-next afaik,
and there's still no hwcaps support for V either. or are they using
`__riscv_vector` too, and not detecting V at runtime? (the kernel's own use
of V and Zb* seems to be based on an internal-only hwcap mechanism for now.)
> We know that the assembly generated
> by the compiler is often not the most efficient, and the automatic
> vectorization
> scenarios are limited, so we need to optimize the function by manual
> vectorization.
> For riscv, compiler automatic vectorization is still in its infancy.
>
have you tried sifive's autovectorization patches? do they help for this
code?
> I conducted tests on different data volumes and compared the performance
> of memset
> functions implemented in C language, basic instruction set, and vector
> instruction
> set.The test case is test_strlen.c
>
> Performance comparison between C language implementation and assembly
> implementation was
> tested on Sifive chips(RISC-V SiFive U74 Dual Core 64 Bit RV64GC ISA Chip
> Platform).
>
> The test results are as follows.Due to the consistent algorithm between
> the two, there
> is basically no difference in performance.
>
>
> --------------------------------------------------------------------------------
> length(byte) C language implementation(s) Basic instruction
> implementation(s)
>
> --------------------------------------------------------------------------------
> 2 0.00000528 0.000005441
> 4 0.00000544 0.000005437
> 8 0.00000464 0.00000496
> 16 0.00000544 0.00000512
> 32 0.0000064 0.00000592
> 64 0.000007994 0.000007841
> 128 0.000012 0.000012
> 256 0.000020321 0.000020481
> 512 0.000037282 0.000037762
> 1024 0.000069924 0.000070244
> 2048 0.000135046 0.000135528
> 4096 0.000264491 0.000264816
> 8192 0.000524342 0.000525631
> 16384 0.001069965 0.001047742
> 32768 0.002180252 0.002142207
> 65536 0.005921251 0.005883868
> 131072 0.012508934 0.012392895
> 262144 0.02503915 0.024896995
> 524288 0.049879091 0.049821832
> 1048576 0.09973658 0.099969603
>
> --------------------------------------------------------------------------------
>
> Due to the lack of a chip that supports vector extension, I conducted a
> performance
> comparison test of strlen using C language and vector implementation on
> the Spike
> simulator, which has certain reference value. It can be clearly seen that
> vector
> implementation is more efficient than C language implementation, with an
> average
> performance improvement of over 800%.
>
>
> --------------------------------------------------------------------------------
> length(byte) C language implementation(s) Vector instruction
> implementation(s)
>
> --------------------------------------------------------------------------------
> 2 0.000003639 0.000003339
> 4 0.000004239 0.000003339
> 8 0.000003639 0.000003339
> 16 0.000004339 0.000003339
> 32 0.000005739 0.000003339
> 64 0.000008539 0.000003339
> 128 0.000014139 0.000004039
> 256 0.000025339 0.000004739
> 512 0.000047739 0.000006139
> 1024 0.000092539 0.000008939
> 2048 0.000182139 0.000014539
> 4096 0.000361339 0.000025739
> 8192 0.000719739 0.000048139
> 16384 0.001436539 0.000092939
> 32768 0.002870139 0.000182539
> 65536 0.005737339 0.000361739
> 131072 0.011471739 0.000720139
> 262144 0.022940539 0.001436939
> 524288 0.045878139 0.002870539
> 1048576 0.091753339 0.005737739
>
> --------------------------------------------------------------------------------
>
> So I hope to pass __riscv_vector, which enables hardware that does not
> support vector
> extension to execute the basic instruction set implementation of strlen,
> has the same
> performance as the C language implementation. For support vector extended
> hardware,
> strlen implemented by vector instruction set is executed to achieve
> acceleration effect.
>
> Fei Zhang
>
> > -----原始邮件-----
> > 发件人: "Szabolcs Nagy" <nsz@port70.net>
> > 发送时间: 2023-04-11 20:48:22 (星期二)
> > 收件人: "张飞" <zhangfei@nj.iscas.ac.cn>
> > 抄送: musl@lists.openwall.com
> > 主题: Re: Re: [musl] [PATCH]Implementation of strlen function in
> riscv64 architecture
> >
> > * 张飞 <zhangfei@nj.iscas.ac.cn> [2023-04-10 13:59:22 +0800]:
> > > I have made modifications to the assembly implementation of the
> riscv64 strlen function, mainly
> > > focusing on address alignment processing to avoid the problem of
> data crossing
> > > pages during vector instruction memory access.
> > >
> > > I think the assembly implementation of strlen is necessary. In
> glibc,
> >
> > if the c definition is not correct then you have to explain why.
> > if it's very slow then please tell us so.
> >
> > > X86_64, aarch64, alpha, and others all have assembly
> implementations of this function,
> > > while for riscv64, it is blank.
> > > I have also analyzed the test sets of Spec2006 and Spec2017, and
> the strlen function is also a hot topic.
> >
> > an asm implementation has significant maintenance cost so you should
> > provide some benchmark data or other evidence/reasoning for us to
> > decide if it's worth the cost.
> >
> > it seems you replaced the c strlen code with a slower one except when
> > musl is built for "#ifdef __riscv_vector" isa extension. what cpus
> > does this affect? are linux distros expected to use this as baseline?
> > do different riscv cpus have similar simd performance properties? who
> > will tweak the asm if not?
> >
> > in principle what you did can be done by the compiler auto vectorizer
> > so maybe contributing to the compiler is more useful.
> >
> > note that glibc has cpu specific implementations that it can select
> > at runtime, but musl uses one generic implementation for all cpus.
> </zhangfei@nj.iscas.ac.cn></zhangfei@nj.iscas.ac.cn></nsz@port70.net>
[-- Attachment #2: Type: text/html, Size: 11231 bytes --]
prev parent reply other threads:[~2023-04-19 22:39 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-22 6:21 张飞
2023-03-22 6:27 ` A. Wilcox
2023-03-22 12:15 ` Rich Felker
2023-04-11 12:57 ` Szabolcs Nagy
2023-04-10 5:59 ` 张飞
2023-04-11 12:48 ` Szabolcs Nagy
2023-04-19 7:22 ` 张飞
2023-04-19 22:39 ` enh [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAJgzZooxj+5AKPg1O=kFW6pNQmY5sHuosG3LonOEyOd9aQBK7A@mail.gmail.com' \
--to=enh@google.com \
--cc=musl@lists.openwall.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
Code repositories for project(s) associated with this public inbox
https://git.vuxu.org/mirror/musl/
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).