From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <musl-return-21184-ml=inbox.vuxu.org@lists.openwall.com>
Received: from second.openwall.net (second.openwall.net [193.110.157.125])
	by inbox.vuxu.org (Postfix) with SMTP id 2B1B12465E
	for <ml@inbox.vuxu.org>; Wed, 24 Jul 2024 02:13:26 +0200 (CEST)
Received: (qmail 27824 invoked by uid 550); 24 Jul 2024 00:13:20 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Reply-To: musl@lists.openwall.com
Received: (qmail 27788 invoked from network); 24 Jul 2024 00:13:19 -0000
Date: Tue, 23 Jul 2024 20:13:12 -0400
From: Rich Felker <dalias@libc.org>
To: Alex =?utf-8?B?UsO4bm5l?= Petersen <alex@alexrp.com>
Cc: musl@lists.openwall.com
Message-ID: <20240724001312.GB10433@brightrain.aerifal.cx>
References: <20240629020434.488975-1-alex@alexrp.com>
 <20240723212241.GV3766212@port70.net>
 <CAH9TF6OXdGZAQ5qsLceAvJzOxdGiFd5Q3XK5yRJJFaWsaw9vTg@mail.gmail.com>
 <20240723225853.GV10433@brightrain.aerifal.cx>
 <CAH9TF6PEusxj6f6UGD=wdzaWdikwJvrsAJdzgGPpCorstC=MkA@mail.gmail.com>
 <20240723232211.GX10433@brightrain.aerifal.cx>
 <CAH9TF6OCDyj_kVmN2JV1a-Ypnu+cy_=9rzXU=ty=AmuH+zx6LQ@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <CAH9TF6OCDyj_kVmN2JV1a-Ypnu+cy_=9rzXU=ty=AmuH+zx6LQ@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Subject: Re: [musl] [PATCH] riscv: Fix setjmp assembly when compiling for
 ilp32f/lp64f.

On Wed, Jul 24, 2024 at 02:09:01AM +0200, Alex Rønne Petersen wrote:
> On Wed, Jul 24, 2024 at 1:22 AM Rich Felker <dalias@libc.org> wrote:
> >
> > On Wed, Jul 24, 2024 at 01:12:33AM +0200, Alex Rønne Petersen wrote:
> > > On Wed, Jul 24, 2024 at 12:58 AM Rich Felker <dalias@libc.org> wrote:
> > > >
> > > > On Wed, Jul 24, 2024 at 12:47:14AM +0200, Alex Rønne Petersen wrote:
> > > > > On Tue, Jul 23, 2024 at 11:22 PM Szabolcs Nagy <nsz@port70.net> wrote:
> > > > > >
> > > > > > * Alex Rønne Petersen <alex@alexrp.com> [2024-06-29 04:04:34 +0200]:
> > > > > > > To keep things simple, I just changed the instruction mnemonics appropriately,
> > > > > > > rather than adding complexity by changing the buffer size/offsets based on ABI.
> > > > > > >
> > > > > > > Signed-off-by: Alex Rønne Petersen <alex@alexrp.com>
> > > > > >
> > > > > > fwiw this looks good to me.
> > > > > >
> > > > > > the only weirdness is that the math code uses __riscv_flen
> > > > > > and this code __riscv_float_abi*. i don't know if there
> > > > > > is semantic difference.
> > > > >
> > > > > `__riscv_flen` tells you the width of the FP registers on the target
> > > > > CPU. This is semantically distinct from `__riscv_float_abi`. For
> > > > > example, while it would probably be a bit silly, there's no particular
> > > > > > reason why I couldn't target the LP64F ABI on an RV64IMAFDC machine.
> > > > > In that case, no code needs to concern itself with the upper bits of
> > > > > the FP registers.
> > > > >
> > > > > I took a quick peek at some of the `__riscv_flen` checks in musl. They
> > > > > look ok. They're checking the capabilities of the machine for the
> > > > > purposes of performing a computation; they're not making ABI
> > > > > decisions. In my silly example above, if I tell the compiler to do so
> > > > > with `-march=rv64...d`, it would theoretically be fine for the
> > > > > compiler to generate double-precision float instructions for
> > > > > computations as long as values are passed/returned according to LP64F
> > > > > rules.
> > > >
> > > > If you're building code for -sf or -sp ABI, but could run on a machine
> > > > with larger floating point register file, it's possible that the user
> > > > could have libc built not to use fp registers at all or only 32-bit
> > > > registers (respectively), but the calling application could be built
> > > > for and running on a machine with 64-bit registers. In this case we
> > > > need to understand what the ABI says. Are the 64-bit registers, if
> > > > present, call-saved in lower ABI levels where they don't participate
> > > > in the calling convention? If so, no #ifdef is sufficient and there
> > > > must be a runtime hwcap check here to determine which form of
> > > > save/restore to do, like on arm and powerpc.
> > >
> > > I don't think this is a scenario that the ABI considers to be
> > > supported. If you try to link code of different ABIs, you will get
> > > linker errors such as:
> > >
> > >     /tmp/ccz2Y86f.o: can't link soft-float modules with double-float modules
> > >
> > > or
> > >
> > >     /tmp/cc7rTh7R.o: can't link single-float modules with double-float modules
> >
> > Those are different ABIs. You can't link modules with mismatched ABI,
> > but you should be able to link modules that are both using -sf ABI (or
> > both using -sp ABI), where one is not using the fpu and the other is
> > using the full double fpu but only passing args in GPRs to conform
> > with the ABI. If that's not allowed, I would consider it a tooling
> > bug; there's no compatibility-constraint reason it can't be allowed.
> 
> Oh, of course. I misread your question, sorry.
> 
> Here's, I think, the relevant section of the calling convention:
> https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#floating-point-register-convention
> 
> This part is a bit awkward (to me, at least):
> 
>     Floating-point values in callee-saved registers are only preserved
>     across calls if they are no larger than the width of a floating-point
>     register in the targeted ABI. Therefore, these registers can always be
>     considered temporaries if targeting the base integer calling
>     convention.
> 
> I'm not really sure why they're talking about "values" there; I would
> think the register width (in the machine vs the ABI) is the only thing
> we're concerned about in this context. I'm assuming that what they
> mean is:
> 
>     Floating-point registers in the callee-saved set are only
>     preserved across calls if they are no larger than the width of a
>     floating-point register in the targeted ABI.

OK, perfect. That means we only need to decide what to save based on
the ABI, not dynamic hwcap or FPU capabilities.
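
For illustration only, here is a rough sketch of a purely compile-time
selection, assuming the compiler predefines __riscv_float_abi_double,
__riscv_float_abi_single and __riscv_float_abi_soft as GCC and Clang do
for RISC-V targets. The helper names fp_abi_save_bytes and
fp_store_mnemonic are hypothetical and are not taken from musl or from
the patch. Note that __riscv_flen is deliberately not consulted.

```c
/* Hypothetical helper, not musl code: number of bytes of each
 * callee-saved FP register (fs0-fs11) that the targeted ABI requires
 * setjmp to preserve. Decided entirely at compile time from the ABI
 * macros; the hardware register width (__riscv_flen) plays no part. */
static int fp_abi_save_bytes(void)
{
#if defined(__riscv_float_abi_double)
	return 8;	/* ilp32d/lp64d: full 64-bit registers */
#elif defined(__riscv_float_abi_single)
	return 4;	/* ilp32f/lp64f: low 32 bits only */
#else
	return 0;	/* soft-float ABI (or a non-RISC-V host): nothing */
#endif
}

/* Hypothetical helper: store mnemonic matching the width above.
 * An empty string means no FP registers are saved at all. */
static const char *fp_store_mnemonic(void)
{
	switch (fp_abi_save_bytes()) {
	case 8:
		return "fsd";
	case 4:
		return "fsw";
	default:
		return "";
	}
}
```

In the real assembly this would be an #if around the fsd/fsw (and
matching fld/flw) instructions. Per the patch description, only the
mnemonics change; the jmp_buf size and offsets stay the same.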

Rich