On Fri, Oct 20, 2023 at 8:02 PM Damian McGuckin <damianm@esi.com.au> wrote:

>
> Which modern CPUs have a penalty for double precision floating point
> arithmetic on scalars compared to single precision once the operands are
> in registers, i.e. ignoring memory fetch issues?
>
> I have Agner Fog's excellent document for X86-64 which basically says that
> 32 bit and 64 bit operations for scalars take the same amount of time.
>
> I am looking for the same type of information for ARM and RISC-V. I found
> the data for 32-bit in the online documentation. But nothing about 64 bit.
>
> I cannot find anything on this topic on RISC-V or POWER10.
>
> Maybe I am not searching on the right terms.
>
> Note that I am after the raw performance, not, say, the relative
> performance of the MUSL sin() routine compared with the MUSL sinf().
>

Have you looked at the scheduler descriptions for ARM, RISC-V and POWER in
GCC or LLVM?

David
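
If the scheduler descriptions leave any doubt, the numbers can also be
checked on the hardware itself. What follows is only a minimal sketch, not
anything taken from GCC or LLVM: it times a dependent chain of scalar
multiply-adds in float and in double, so it measures latency rather than
throughput. All function names and the FP_BENCH_MAIN macro are made up for
illustration, and it assumes a POSIX system with clock_gettime(). The
result depends on compiler flags (the compiler may or may not fuse the
multiply and add), so treat it as a rough cross-check only.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() */
#endif
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Dependent chain in single precision. Each iteration needs the previous
 * result, so the loop time is governed by instruction latency. Starting
 * from 1.0, x * 0.5 + 0.5 stays exactly 1.0, so nothing overflows. */
static float chain_f32(long n) {
    volatile float seed = 1.0f;   /* volatile read blocks constant folding */
    float x = seed;
    for (long i = 0; i < n; i++)
        x = x * 0.5f + 0.5f;
    return x;
}

/* Same chain in double precision. */
static double chain_f64(long n) {
    volatile double seed = 1.0;
    double x = seed;
    for (long i = 0; i < n; i++)
        x = x * 0.5 + 0.5;
    return x;
}

/* Monotonic wall-clock time in seconds. */
static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Nanoseconds per loop iteration for the float chain. */
static double ns_per_iter_f32(long n) {
    assert(n > 0);
    double t0 = now_sec();
    volatile float sink = chain_f32(n);   /* volatile store keeps the loop */
    double t1 = now_sec();
    (void)sink;
    return (t1 - t0) * 1e9 / (double)n;
}

/* Nanoseconds per loop iteration for the double chain. */
static double ns_per_iter_f64(long n) {
    assert(n > 0);
    double t0 = now_sec();
    volatile double sink = chain_f64(n);
    double t1 = now_sec();
    (void)sink;
    return (t1 - t0) * 1e9 / (double)n;
}

#ifdef FP_BENCH_MAIN
int main(void) {
    const long n = 200000000L;
    printf("float : %.3f ns/iter\n", ns_per_iter_f32(n));
    printf("double: %.3f ns/iter\n", ns_per_iter_f64(n));
    return 0;
}
#endif
```

Built with something like "cc -O2 -DFP_BENCH_MAIN bench.c", similar ns/iter
figures for the two chains would suggest no scalar latency penalty for
double on that core; a clear gap would be worth comparing against the
latencies listed in the scheduler model.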