Date: Wed, 20 Sep 2023 15:16:19 +0200
From: Szabolcs Nagy
To: Jianmin Lv
Cc: musl@lists.openwall.com, Hongliang Wang
Message-ID: <20230920131619.GA1427497@port70.net>
In-Reply-To: <65857ef7-0ef8-d3c8-d6d8-ea577b99e793@loongson.cn>
Subject: Re: [musl] add loongarch64 port v7

* Jianmin Lv [2023-09-20 15:45:39 +0800]:
> Sorry to bother you, but I just want to know if there is any progress on
> this, because Alpine has been blocked on this patch for a long time.
> As Hongliang has explained the questions you mentioned one by one, if
> any question needs further discussion, please point it out, so that the
> patch can be handled further.

i think you should post a v8 patch to move this forward.
(not on github, but here)

> On 2023/8/15 4:17 PM, Hongliang Wang wrote:
> > On 2023/8/13 9:41 AM, Rich Felker wrote:
> > > On Sat, Aug 05, 2023 at 11:43:08AM -0400, Rich Felker wrote:
> > > > On Sat, Aug 05, 2023 at 02:18:35PM +0800, 翟小娟 wrote:
> > > > -+#define __BYTE_ORDER 1234
> > > > ++#define __BYTE_ORDER  __LITTLE_ENDIAN
> > >
> > > This is gratuitous, mismatches what is done on other archs, and is
> > > less safe.
> > >
> > The modification is based on the following review suggestion
> > (WANG Xuerui reviewed in
> > https://www.openwall.com/lists/musl/2022/10/12/1):
> >
> > `#define __BYTE_ORDER __LITTLE_ENDIAN` could be more consistent
> > with other arches.

please use 1234.

> > > > -+TYPEDEF unsigned nlink_t;
> > > > ++TYPEDEF unsigned int nlink_t;
> > >
> > > Gratuitous and in opposite direction of coding style used elsewhere in
> > > musl. There are a few other instances of this too.
> > >
> > Based on the following review question (WANG Xuerui
> > reviewed in https://www.openwall.com/lists/musl/2022/10/12/1):
> >
> > `unsigned int`? Same for other bare `unsigned` usages.
> >
> > I changed it to an explicit definition.

please use plain unsigned, not unsigned int.

> > > > -+      register uintptr_t tp __asm__("tp");
> > > > -+      __asm__ ("" : "=r" (tp) );
> > > > ++      uintptr_t tp;
> > > > ++      __asm__ __volatile__("move %0, $tp" : "=r"(tp));
> > > >  +      return tp;
> > >
> > > Not clear what the motivation is here. Is it working around a compiler
> > > bug?
> > > The original form was more optimal.
> >
> > The modification is based on the following review suggestion
> > (WANG Xuerui reviewed in
> > https://www.openwall.com/lists/musl/2022/10/12/1):
> >
> > While the current approach works, it's a bit fragile [1], and
> > the simple and plain riscv version works too:
> >
> > uintptr_t tp;
> > __asm__ ("move %0, $tp" : "=r"(tp));
> >
> > [1]: https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables

this looks ok to me.
(original code looks sketchy to me.)

> > > >  +#define CRTJMP(pc,sp) __asm__ __volatile__( \
> > > > -+      "move $sp,%1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" )
> > > > -+
> > > > -+#define GETFUNCSYM(fp, sym, got) __asm__ ( \
> > > > -+      ".hidden " #sym "\n" \
> > > > -+      ".align 8 \n" \
> > > > -+      "       la.local $t1, "#sym" \n" \
> > > > -+      "       move %0, $t1 \n" \
> > > > -+      : "=r"(*(fp)) : : "memory" )
> > > > ++      "move $sp, %1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" )
> > >
> > > Not clear why this was changed. It was never discussed afaik. It looks
> > > somewhat dubious removing GETFUNCSYM and using the maybe-unreliable
> > > default definition.
> > >
> > Based on the following review question
> > (WANG Xuerui reviewed
> > in https://www.openwall.com/lists/musl/2022/10/12/1):
> >
> > Does the generic version residing in ldso/dlstart.c not work?
> >
> > I found the code logic is consistent with the generic version, so I
> > removed the definition here in favor of the generic version.

i would change it back to the asm.
the generic version should work, but maybe we don't want to trust the
compiler here (there may be ways to compile the generic code that are
not compatible with an incompletely relocated libc.so). the asm is
safer.

> > > It also looks like v5->v6 rewrote sigsetjmp.
> > >
> > Based on the following review question
> > (WANG Xuerui reviewed
> > in https://www.openwall.com/lists/musl/2022/10/12/1):
> >
> > This is crazy complicated compared to the riscv port, why is the
> > juggling between a0/a1 and t5/t6 necessary?
> >
> > I optimized the code implementation.

this should be ok.

> > > v6->v7:
> > >
> > > sigcontext/mcontext_t change mostly makes sense, but the
> > > namespace-safe and full mcontext_t differ in their use of the aligned
> > > attribute and it's not clear that the attribute is needed (since the
> > > offset is naturally aligned and any instance of the object is produced
> > > by the kernel, not by userspace).
> > >
> > > > -+      long              __uc_pad;
> > >
> > > This change to ucontext_t actually makes the struct ABI-incompatible
> > > in the namespace-safe version without the alignment attribute. IIRC I
> > > explicitly requested the explicit padding field to avoid this kind of
> > > footgun. Removing it does not seem like a good idea.
> > >
> > Initially, we added __uc_pad to ensure uc_mcontext is 16-byte aligned.
> > Now we have added __attribute__((__aligned__(16))) to
> > uc_mcontext.__extcontext[], which also ensures uc_mcontext is 16-byte
> > aligned, so __uc_pad is no longer used.
> >
> > With __uc_pad removed, musl and the kernel agree on the struct layout;
> > keeping it may give a sense of inconsistency between kernel and musl.
> > Since LoongArch is not yet merged into musl, removing it causes no
> > compatibility issues.

please add back the padding field.
> > > In stat.h:
> > >
> > > > -+      unsigned long __pad;
> > > > ++      unsigned long __pad1;
> > >
> > > This is gratuitous and makes the definition gratuitously mismatch what
> > > will become the "generic" version of bits/stat.h. There is no contract
> > > for applications to be able to access these padding fields by name, so
> > > no motivation to make their names match glibc's names or the kernel's.
> > >
> > OK.

please fix it.

> > > In fenv.S:
> > >
> > > > ++#ifdef __clang__
> > > > ++#define FCSR $fcsr0
> > > > ++#else
> > > > ++#define FCSR $r0
> > > > ++#endif
> > >
> > > It's not clear to me what's going on here. Is there a clang
> > > incompatibility in the assembler language you're working around? Or
> > > what? If so that seems like a tooling bug that should be fixed.
> > >
> > The GNU assembler cannot correctly recognize $fcsr0, but the
> > LLVM IAS does not have this issue, so we distinguish the two.
> > The issue has been fixed in GNU assembler 2.41, but for compatibility
> > with GNU assembler 2.40 and below, $r0 needs to be kept.

it sounds like the correct asm is $fcsr0, so that's what should be used
on all assemblers, not just on clang as. only broken old gnu as should
use different syntax. for this a configure test is needed that adds a
CFLAG like -DBROKEN_LOONGARCH_FCSR_ASM when the probe fails, and that
macro should be used for the ifdef.

> > The linux kernel also has a similar distinction:
> >
> > commit 38bb46f94544c5385bc35aa2bfc776dcf53a7b5d
> > Author: WANG Xuerui
> > Date:   Thu Jun 29 20:58:43 2023 +0800
> >
> >     LoongArch: Prepare for assemblers with proper FCSR class support
> >
> >     The GNU assembler (as of 2.40) mis-treats FCSR operands as GPRs, but
> >     the LLVM IAS does not. Probe for this and refer to FCSRs as "$fcsrNN"
> >     if support is present.
> >
> >     Signed-off-by: WANG Xuerui
> >     Signed-off-by: Huacai Chen
> >
> > > Otherwise, everything looks as expected, I think. I'm okay with making
> > > any fixups for merging rather than throwing this back on your team for
> > > more revisions, but can you please follow up and clarify the above?

all issues look minor.
if you post a v8 it likely gets into a release faster.
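for reference, the fenv.S ifdef keyed on the configure probe might look
like the following sketch (BROKEN_LOONGARCH_FCSR_ASM is the hypothetical
macro name suggested above, not an existing flag; the probe would define
it when something like `movfcsr2gr $a0, $fcsr0` fails to assemble):

```c
/* sketch, not merged code: $fcsr0 is the correct operand spelling and
 * is used by default; $r0 is only a fallback for GNU as <= 2.40, which
 * mis-treats FCSR operands as GPRs. */
#ifdef BROKEN_LOONGARCH_FCSR_ASM
#define FCSR $r0
#else
#define FCSR $fcsr0
#endif
```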