Hi, I have fixed the issues listed below, and post 0001-add-loongarch64-port-v8.patch,as shown in the attachment. please review again, if there are anything that need to be modified, please point out, thank you. Hongliang Wang 在 2023/9/22 上午9:37, Hongliang Wang 写道: > Hi, > > Thank you for your review, I will modify it according to the review > suggestions and post v8 patch later. > > Hongliang Wang > > 在 2023/9/20 下午9:16, Szabolcs Nagy 写道: >> * Jianmin Lv [2023-09-20 15:45:39 +0800]: >>> Sorry to bother you, but I just want to know if there is any progress on >>> this, because Alpine is blocking by this patch for a long time. As >>> Hongliang >>> has explained questions you mentioned one by one, if any question >>> need to be >>> discussed, please point them out, so that the patch can be handled >>> further. >> >> i think you should post a v8 patch to move >> this forward. (not on github, but here) >> >>> On 2023/8/15 下午4:17, Hongliang Wang wrote: >>>> 在 2023/8/13 上午9:41, Rich Felker 写道: >>>>> On Sat, Aug 05, 2023 at 11:43:08AM -0400, Rich Felker wrote: >>>>>> On Sat, Aug 05, 2023 at 02:18:35PM +0800, 翟小娟 wrote: >>>>>> -+#define __BYTE_ORDER 1234 >>>>>> ++#define __BYTE_ORDER  __LITTLE_ENDIAN >>>>> >>>>> This is gratuitous, mismatches what is done on other archs, and is >>>>> less safe. >>>>> >>>> The modification is based on the following review suggestion( >>>> WANG Xuerui reviewed in >>>> https://www.openwall.com/lists/musl/2022/10/12/1): >>>> >>>> `#define __BYTE_ORDER __LITTLE_ENDIAN` could be more consistent >>>> with other arches. >> >> please use 1234. --- a/arch/loongarch64/bits/alltypes.h.in +++ b/arch/loongarch64/bits/alltypes.h.in @@ -2,7 +2,7 @@ #define _Int64 long #define _Reg long -#define __BYTE_ORDER __LITTLE_ENDIAN +#define __BYTE_ORDER 1234 #define __LONG_MAX 0x7fffffffffffffffL >> >>>>>> -+TYPEDEF unsigned nlink_t; >>>>>> ++TYPEDEF unsigned int nlink_t; >>>>> >>>>> Gratuitous and in opposite direction of coding style used elsewhere in >>>>> musl. There are a few other instances of this too. >>>>> >>>> Based on the following review question(WANG Xuerui >>>> reviewed in https://www.openwall.com/lists/musl/2022/10/12/1): >>>> >>>> `unsigned int`? Same for other bare `unsigned` usages. >>>> >>>> I fixed it to a explicit definition. >> >> please use plain unsigned, not unsigned int. --- a/arch/loongarch64/bits/alltypes.h.in +++ b/arch/loongarch64/bits/alltypes.h.in @@ -2,7 +2,7 @@ -TYPEDEF unsigned int nlink_t; +TYPEDEF unsigned nlink_t; TYPEDEF int blksize_t; >> >>>>>> -+      register uintptr_t tp __asm__("tp"); >>>>>> -+      __asm__ ("" : "=r" (tp) ); >>>>>> ++      uintptr_t tp; >>>>>> ++      __asm__ __volatile__("move %0, $tp" : "=r"(tp)); >>>>>>    +      return tp; >>>>> >>>>> Not clear what the motivation is here. Is it working around a compiler >>>>> bug? The original form was more optimal. >>>> >>>> The modification is based on the following review suggestion( >>>> WANG Xuerui reviewed in >>>> https://www.openwall.com/lists/musl/2022/10/12/1): >>>> >>>> While the current approach works, it's a bit fragile [1], and >>>> the simple and plain riscv version works too: >>>> >>>> uintptr_t tp; >>>> __asm__ ("move %0, $tp" : "=r"(tp)); >>>> >>>> [1]:https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html#Local-Register-Variables >>>> >> >> this looks ok to me. >> >> (original code looks sketchy to me.) >> >>>>>>    +#define CRTJMP(pc,sp) __asm__ __volatile__( \ >>>>>> -+      "move $sp,%1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" ) >>>>>> -+ >>>>>> -+#define GETFUNCSYM(fp, sym, got) __asm__ ( \ >>>>>> -+      ".hidden " #sym "\n" \ >>>>>> -+      ".align 8 \n" \ >>>>>> -+      "       la.local $t1, "#sym" \n" \ >>>>>> -+      "       move %0, $t1 \n" \ >>>>>> -+      : "=r"(*(fp)) : : "memory" ) >>>>>> ++      "move $sp, %1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" ) >>>>> >>>>> Not clear why this was changed. It was never discussed afaik. It looks >>>>> somewhat dubious removing GETFUNCSYM and using the maybe-unreliable >>>>> default definition. >>>>> >>>> Based on the following review question( >>>> WANG Xuerui reviewed >>>> in https://www.openwall.com/lists/musl/2022/10/12/1): >>>> >>>> Does the generic version residing in ldso/dlstart.c not work? >>>> >>>> I found the code logic is consistent with the generic version, So >>>> I removed the definition here and replaced it with generic version. >> >> i would change it back to the asm. >> >> generic version should work, but maybe we don't >> want to trust the compiler here (there may be >> ways to compile the generic code that is not >> compatible with incompletely relocated libc.so) >> the asm is safer. >> --- a/arch/loongarch64/reloc.h +++ b/arch/loongarch64/reloc.h @@ -18,3 +18,10 @@ #define CRTJMP(pc,sp) __asm__ __volatile__( \ "move $sp, %1 ; jr %0" : : "r"(pc), "r"(sp) : "memory" ) + +#define GETFUNCSYM(fp, sym, got) __asm__ ( \ + ".hidden " #sym "\n" \ + ".align 8 \n" \ + " la.local $t1, "#sym" \n" \ + " move %0, $t1 \n" \ + : "=r"(*(fp)) : : "memory" ) >>>>> It also looks like v5->v6 rewrote sigsetjmp. >>>>> >>>> Based on the following review question( >>>> WANG Xuerui reviewed >>>> in https://www.openwall.com/lists/musl/2022/10/12/1): >>>> >>>> This is crazy complicated compared to the riscv port, why is the >>>> juggling between a0/a1 and t5/t6 necessary? >>>> >>>> I optimized the code implementations. >> >> this should be ok. >> >>>>> v6->v7: >>>>> >>>>> sigcontext/mcontext_t change mostly makes sense, but the >>>>> namespace-safe and full mcontext_t differ in their use of the aligned >>>>> attribute and it's not clear that the attribute is needed (since the >>>>> offset is naturally aligned and any instance of the object is produced >>>>> by the kernel, not by userspace). >>>>> >>>>>> -+      long               __uc_pad; >>>>> >>>>> This change to ucontext_t actually makes the struct ABI-incompatible >>>>> in the namespace-safe version without the alignment attribute. IIRC I >>>>> explicitly requested the explicit padding field to avoid this kind of >>>>> footgun. Removing it does not seem like a good idea. >>>>> >>>> Initially, we add __uc_pad to ensure uc_mcontext is 16 byte alignment. >>>> Now, we added __attribute__((__aligned__(16))) to >>>> uc_mcontext.__extcontext[],this can ensure uc_mcontext is also 16 byte >>>> alignment. so __uc_pad is not used. >>>> >>>> Remove __uc_pad, from the point of struct layout, musl and kernel are >>>> consistent. otherwise, I think it may bring a sense of inconsistency >>>> between kernel and musl. Due to Loongarch is not merged into musl now, >>>> Remove it no compatibility issues. >> >> please add back the padding field. --- a/arch/loongarch64/bits/signal.h +++ b/arch/loongarch64/bits/signal.h @@ -40,6 +40,7 @@ typedef struct __ucontext struct __ucontext *uc_link; stack_t uc_stack; sigset_t uc_sigmask; + long __uc_pad; mcontext_t uc_mcontext; } ucontext_t; >> >>>>> In stat.h: >>>>> >>>>>> -+      unsigned long __pad; >>>>>> ++      unsigned long __pad1; >>>>> >>>>> This is gratuitous and makes the definition gratuitously mismatch what >>>>> will become the "generic" version of bits/stat.h. There is no contract >>>>> for applications to be able to access these padding fields by name, so >>>>> no motivation to make their names match glibc's names or the kernel's. >>>>> >>>> OK. >> >> please fix it. --- a/arch/loongarch64/bits/stat.h +++ b/arch/loongarch64/bits/stat.h @@ -6,7 +6,7 @@ struct stat { uid_t st_uid; gid_t st_gid; dev_t st_rdev; - unsigned long __pad1; + unsigned long __pad; off_t st_size; blksize_t st_blksize; int __pad2; >> >>>>> In fenv.S: >>>>> >>>>>> ++#ifdef __clang__ >>>>>> ++#define FCSR $fcsr0 >>>>>> ++#else >>>>>> ++#define FCSR $r0 >>>>>> ++#endif >>>>> >>>>> It's not clear to me what's going on here. Is there a clang >>>>> incompatibility in the assembler language you're working around? Or >>>>> what? If so that seems like a tooling bug that should be fixed. >>>>> >>>> The GNU assembler cannot correctly recognize $fcsr0, but the >>>> LLVM IAS does not have this issue, so make a distinction. >>>> This issue has been fixed in GNU assembler 2.41. but for compatible >>>> with GNU assembler 2.40 and below, $r0 need reserved. >> >> it sounds like the correct asm is $fcsr0, so that's >> what should be used on all assemblers, not just on >> clang as. >> >> only broken old gnu as should use different syntax. >> for this a configure test is needed that adds a >> CFLAG like -DBROKEN_LOONGARCH_FCSR_ASM when fails. >> and use that for the ifdef. >> --- a/configure +++ b/configure if test "$ARCH" = "loongarch64" ; then trycppif __loongarch_soft_float "$t" && SUBARCH=${SUBARCH}-sf +printf "checking whether compiler support FCSRs... " +echo "__asm__(\"movfcsr2gr \$t0,\$fcsr0\");" > "$tmpc" +if $CC -c -o /dev/null "$tmpc" >/dev/null 2>&1 ; then +printf "yes\n" +else +printf "no\n" +CFLAGS_AUTO="$CFLAGS_AUTO -DBROKEN_LOONGARCH_FCSR_ASM" +fi fi --- a/src/fenv/loongarch64/fenv.S +++ b/src/fenv/loongarch64/fenv.S @@ -1,9 +1,9 @@ #ifndef __loongarch_soft_float -#ifdef __clang__ -#define FCSR $fcsr0 -#else +#ifdef BROKEN_LOONGARCH_FCSR_ASM #define FCSR $r0 +#else +#define FCSR $fcsr0 #endif .global feclearexcept >>>> >>>> The linux kernel also has a similar distinction: >>>> >>>> commit 38bb46f94544c5385bc35aa2bfc776dcf53a7b5d >>>> Author: WANG Xuerui >>>> Date:   Thu Jun 29 20:58:43 2023 +0800 >>>> >>>>      LoongArch: Prepare for assemblers with proper FCSR class support >>>> >>>>      The GNU assembler (as of 2.40) mis-treats FCSR operands as >>>> GPRs, but >>>>      the LLVM IAS does not. Probe for this and refer to FCSRs >>>> as"$fcsrNN" >>>>      if support is present. >>>> >>>>      Signed-off-by: WANG Xuerui >>>>      Signed-off-by: Huacai Chen >>>> >>>>> Otherwise, everything looks as expected, I think. I'm okay with making >>>>> any fixups for merging rather than throwing this back on your team for >>>>> more revisions, but can you please follow up and clarify the above? >> >> all issues look minor. >> >> if you post a v8 it likely gets into a release faster. >>