On Fri, Oct 20, 2023 at 8:02 PM Damian McGuckin wrote:
>
> What modern CPUs have a penalty for double precision floating point
> arithmetic on scalars compared to single precision once they are in a
> register, i.e. ignoring memory fetch issues?
>
> I have Agner Fog's excellent document for X86-64 which basically says that
> 32-bit and 64-bit operations for scalars take the same amount of time.
>
> I am looking for the same type of information for ARM and RISC-V. I found
> the data for 32-bit in the online documentation, but nothing about 64-bit.
>
> I cannot find anything on this topic on RISC-V or POWER10.
>
> Maybe I am not searching on the right terms.
>
> Note that I am after the raw performance, not, say, the relative
> performance of the MUSL sin() routine compared with the MUSL sinf().

Have you looked at the scheduler descriptions for ARM, RISC-V and POWER in
GCC or LLVM?

David