From: Jaydeep Patil <Jaydeep.Patil@imgtec.com>
To: Rich Felker <dalias@libc.org>
Cc: "musl@lists.openwall.com" <musl@lists.openwall.com>
Subject: RE: [PATCH] Fix atomic_arch.h for MIPS32 R6
Date: Wed, 30 Mar 2016 09:45:59 +0000
Message-ID: <BD7773622145634B952E5B54ACA8E349AA24C18C@PUMAIL01.pu.imgtec.org>
In-Reply-To: <20160329133254.GM21636@brightrain.aerifal.cx>
>-----Original Message-----
>From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
>Sent: 29 March 2016 PM 07:03
>To: Jaydeep Patil
>Cc: musl@lists.openwall.com
>Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>
>On Tue, Mar 29, 2016 at 07:16:46AM +0000, Jaydeep Patil wrote:
>> >-----Original Message-----
>> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
>> >Sent: 29 March 2016 AM 09:41
>> >To: Jaydeep Patil
>> >Cc: musl@lists.openwall.com
>> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>> >
>> >On Tue, Mar 29, 2016 at 03:54:02AM +0000, Jaydeep Patil wrote:
>> >> >-----Original Message-----
>> >> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich
>> >> >Felker
>> >> >Sent: 28 March 2016 PM 06:35
>> >> >To: Jaydeep Patil
>> >> >Cc: musl@lists.openwall.com
>> >> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>> >> >
>> >> >On Mon, Mar 28, 2016 at 05:07:39AM +0000, Jaydeep Patil wrote:
>> >> >> >> >I was just saying it makes the code less cluttered to use
>> >> >> >> >them spuriously even though we don't need to:
>> >> >> >> >
>> >> >> >> > ".set push ; "
>> >> >> >> >#if __mips_isa_rev < 6
>> >> >> >> > ".set mips2 ; "
>> >> >> >> >#endif
>> >> >> >> > "ll %0, %1 ; .set pop"
>> >> >> >> >
>> >> >> >> >or similar.
>> >> >> >> >
>> >> >> >> >It's also not clear to me whether the "m" constraint is
>> >> >> >> >valid anymore for the R6 ll/sc instructions since they take
>> >> >> >> >a 9-bit offset now instead of a 16-bit offset.
>> >> >> >> >The compiler could generate an address expression whose
>> >> >> >> >offset part does not fit in 9 bits. In that case we may need
>> >> >> >> >to #if the whole function (or at least the __asm__
>> >> >> >> >statement) separately rather than just skipping the
>> >> >> >> >.set mips2....
>> >> >> >> >
>> >> >> >>
>> >> >> >> The "m" constraint is still valid here, as the offset will be 0 in
>> >> >> >> this case.
>> >> >> >
>> >> >> >How can you assume the offset will be 0? It's the compiler's
>> >> >> >choice what to use. For instance, a_cas(&foo->bar, t, s) is
>> >> >> >likely to have an offset equal to
>> >> >> >offsetof(__typeof__(foo),bar). AFAIK this happens in practice
>> >> >> >with small offsets in mutex structures, etc. so the bug may be
>> >> >> >unlikely to be hit, but I think it's still an
>> >> >> >incorrect-constraint bug.
>> >> >>
>> >> >> Compiler generates appropriate LL/SC based on the offset.
>> >> >> Compiler adds the offset to the base register if it does not fit 9bits.
>> >> >
>> >> >The compiler has no way of knowing that the operand will be used
>> >> >with ll with the 9-bit offset restriction; as far as it knows, it
>> >> >will be used in a normal context where a 16-bit offset is valid. I
>> >> >don't have a toolchain that will target r6, but you can try the
>> >> >following program which produces an offset of 4096 for loading p[1024]:
>> >> >
>> >> >unsigned ll1k(volatile unsigned *p) {
>> >> > unsigned val;
>> >> > __asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) :
>> >> >"memory" );
>> >> > return val;
>> >> >}
>> >> >
>> >> >I would expect this to produce errors at assembly time on r6.
>> >> >Rich
>> >>
>> >> This is what compiler has generated for above function:
>> >>
>> >> $ gcc -c -o main.o main.c -O3 -mips32r6 -mabi=32
>> >>
>> >> Objdump:
>> >>
>> >> 00000000 <ll1k>:
>> >> 0: 24821000 addiu v0,a0,4096
>> >> 4: 7c420036 ll v0,0(v0)
>> >> 8: d81f0000 jrc ra
>> >> c: 00000000 nop
>> >
>> >Can you try gcc -S instead of -c (still at -O3) to produce asm output
>> >without assembling it?
>>
>> Generated assembly:
>>
>> #APP
>> # 4 "test.c" 1
>> ll $2, 4096($4)
>> # 0 "" 2
>> #NO_APP
>> jrc $31
>>
>> Even if we set "noreorder" before LL, assembler generates addiu+ll:
>>
>> 00000000 <ll1k>:
>> 0: 24821000 addiu v0,a0,4096
>> 4: 7c420036 ll v0,0(v0)
>> 8: d81f0000 jrc ra
>> c: 00000000 nop
>
>I see. I suspected the assembler was doing it. "noat", not "noreorder", is the
>way to suppress things like this but I doubt even "noat" does it since a
>separate temp register ("at") is not needed in this case.
>
>If all assemblers that support R6 support this rewriting, then the ZC constraint
>in gcc is really just an optimization, not strictly necessary. We should probably
>check (1) whether clang's internal assembler can do the rewriting, and (2)
>whether clang supports the ZC constraint. I would prefer using ZC but I want
>to do whatever is more compatible; I don't think the codegen efficiency
>matters a lot either way.
>Rich
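The offset question in the quoted discussion comes down to simple range arithmetic. The following is a minimal sketch, not code from musl: fits_simm and elem_offset are hypothetical helper names, and treating the R6 ll/sc offset as a signed 9-bit field is an assumption (the thread only says "9-bit offset"). It shows why the byte offset of p[1024] is accepted by a 16-bit field but not by a 9-bit one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if off fits in a signed
 * two's-complement immediate that is `bits` wide, else 0.
 * Signedness of the R6 ll/sc offset is assumed here. */
static int fits_simm(long off, int bits)
{
	assert(bits > 0 && bits < 32);
	long lo = -(1L << (bits - 1));
	long hi = (1L << (bits - 1)) - 1;
	return off >= lo && off <= hi;
}

/* Hypothetical helper: byte offset of p[idx] for an array of
 * unsigned, as in the ll1k test function quoted above. */
static long elem_offset(size_t idx)
{
	return (long)(idx * sizeof(unsigned));
}
```

With a 4-byte unsigned, elem_offset(1024) is 4096, which lies inside the 16-bit range but outside the assumed 9-bit range of -256..255, so either the compiler (via a suitable constraint) or the assembler has to materialise the address in a register first.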
Clang's integrated assembler does not support this rewriting. However, clang does support the ZC constraint.
I have modified both atomic_arch.h and pthread_arch.h to reflect this.
Please refer to https://github.com/JaydeepIMG/musl-1/tree/fix_inline_asm_for_R6 for the patch (also listed below).
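As a rough illustration of what the constraint change amounts to, here is a minimal sketch under stated assumptions. It is not the patched musl code: struct sketch_obj, sketch_ll and sketch_load_lock are hypothetical names, the field is deliberately placed at a large offset to mimic the a_cas(&foo->bar, ...) case raised earlier, and the non-MIPS branch is a plain load that exists only so the sketch builds on other hosts. The MIPS branch assumes a compiler that accepts "ZC" and a target ISA where ll assembles without extra directives.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure: the lock sits well past the small offset
 * range, so its address is "base register + large offset". */
struct sketch_obj {
	char pad[4096];
	volatile int lock;
};

static inline int sketch_ll(volatile int *p)
{
	int v;
#if defined(__mips__) && defined(__GNUC__)
	/* "ZC" asks the compiler for a base+offset address that the
	 * ll/sc encoding can take, so nothing is left for the
	 * assembler to rewrite. */
	__asm__ __volatile__ ("ll %0, %1" : "=r"(v) : "ZC"(*p));
#else
	/* Non-MIPS fallback: an ordinary volatile load, present only
	 * so this sketch compiles and runs elsewhere. */
	v = *p;
#endif
	return v;
}

/* The compiler sees &o->lock and, under "ZC", is expected to
 * compute the address into a register if the offset is too large. */
static inline int sketch_load_lock(struct sketch_obj *o)
{
	assert(o != NULL);
	return sketch_ll(&o->lock);
}
```

Compiling something like this with -S for an R6 target and inspecting the operand of ll would show whether the compiler, rather than the assembler, is now the one splitting the address.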
I have also added R6 as a subarch.
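The order in which the configure hunk below appends suffixes can be sketched in C as follows. This is only an illustration of the string composition, assuming SUBARCH starts out empty; sketch_subarch and its parameters are hypothetical and do not exist in musl, where the real checks are preprocessor tests run by configure.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper mirroring the suffix order in the configure
 * hunk: "r6" first, then "el", then "-sf". Writes the suffix into
 * out and returns the length snprintf reports. */
static int sketch_subarch(char *out, size_t n, int isa_rev,
                          int little_endian, int soft_float)
{
	assert(out != NULL && n > 0);
	return snprintf(out, n, "%s%s%s",
	                isa_rev >= 6 ? "r6" : "",
	                little_endian ? "el" : "",
	                soft_float ? "-sf" : "");
}
```

For example, an R6 little-endian soft-float configuration would yield the suffix "r6el-sf", while a pre-R6 big-endian hard-float one yields an empty suffix.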
From 20054ee55643d9e81163ca58ac63cc38b5080969 Mon Sep 17 00:00:00 2001
From: Jaydeep Patil <jaydeep.patil@imgtec.com>
Date: Wed, 30 Mar 2016 10:37:30 +0100
Subject: [PATCH] [MIPS] Update inline asm for R6 and add R6 as subtarget
---
arch/mips/atomic_arch.h | 17 +++--------------
arch/mips/pthread_arch.h | 8 +-------
arch/mips64/atomic_arch.h | 12 +++++-------
arch/mips64/pthread_arch.h | 7 +------
configure | 2 ++
5 files changed, 12 insertions(+), 34 deletions(-)
diff --git a/arch/mips/atomic_arch.h b/arch/mips/atomic_arch.h
index ce2823b..4dbe4bb 100644
--- a/arch/mips/atomic_arch.h
+++ b/arch/mips/atomic_arch.h
@@ -3,10 +3,8 @@ static inline int a_ll(volatile int *p)
{
int v;
__asm__ __volatile__ (
- ".set push ; .set mips2\n\t"
"ll %0, %1"
- "\n\t.set pop"
- : "=r"(v) : "m"(*p));
+ : "=r"(v) : "ZC"(*p));
return v;
}
@@ -15,24 +13,15 @@ static inline int a_sc(volatile int *p, int v)
{
int r;
__asm__ __volatile__ (
- ".set push ; .set mips2\n\t"
"sc %0, %1"
- "\n\t.set pop"
- : "=r"(r), "=m"(*p) : "0"(v) : "memory");
+ : "=r"(r), "=ZC"(*p) : "0"(v) : "memory");
return r;
}
#define a_barrier a_barrier
static inline void a_barrier()
{
- /* mips2 sync, but using too many directives causes
- * gcc not to inline it, so encode with .long instead. */
- __asm__ __volatile__ (".long 0xf" : : : "memory");
-#if 0
- __asm__ __volatile__ (
- ".set push ; .set mips2 ; sync ; .set pop"
- : : : "memory");
-#endif
+ __asm__ __volatile__ ("sync" : : : "memory");
}
#define a_pre_llsc a_barrier
diff --git a/arch/mips/pthread_arch.h b/arch/mips/pthread_arch.h
index 8a49965..d8b6955 100644
--- a/arch/mips/pthread_arch.h
+++ b/arch/mips/pthread_arch.h
@@ -1,13 +1,7 @@
static inline struct pthread *__pthread_self()
{
-#ifdef __clang__
- char *tp;
- __asm__ __volatile__ (".word 0x7c03e83b ; move %0, $3" : "=r" (tp) : : "$3" );
-#else
register char *tp __asm__("$3");
- /* rdhwr $3,$29 */
- __asm__ __volatile__ (".word 0x7c03e83b" : "=r" (tp) );
-#endif
+ __asm__ __volatile__ ("rdhwr %0,$29" : "=r" (tp));
return (pthread_t)(tp - 0x7000 - sizeof(struct pthread));
}
diff --git a/arch/mips64/atomic_arch.h b/arch/mips64/atomic_arch.h
index b468fd9..ac92891 100644
--- a/arch/mips64/atomic_arch.h
+++ b/arch/mips64/atomic_arch.h
@@ -4,7 +4,7 @@ static inline int a_ll(volatile int *p)
int v;
__asm__ __volatile__ (
"ll %0, %1"
- : "=r"(v) : "m"(*p));
+ : "=r"(v) : "ZC"(*p));
return v;
}
@@ -14,7 +14,7 @@ static inline int a_sc(volatile int *p, int v)
int r;
__asm__ __volatile__ (
"sc %0, %1"
- : "=r"(r), "=m"(*p) : "0"(v) : "memory");
+ : "=r"(r), "=ZC"(*p) : "0"(v) : "memory");
return r;
}
@@ -24,7 +24,7 @@ static inline void *a_ll_p(volatile void *p)
void *v;
__asm__ __volatile__ (
"lld %0, %1"
- : "=r"(v) : "m"(*(void *volatile *)p));
+ : "=r"(v) : "ZC"(*(void *volatile *)p));
return v;
}
@@ -34,16 +34,14 @@ static inline int a_sc_p(volatile void *p, void *v)
long r;
__asm__ __volatile__ (
"scd %0, %1"
- : "=r"(r), "=m"(*(void *volatile *)p) : "0"(v) : "memory");
+ : "=r"(r), "=ZC"(*(void *volatile *)p) : "0"(v) : "memory");
return r;
}
#define a_barrier a_barrier
static inline void a_barrier()
{
- /* mips2 sync, but using too many directives causes
- * gcc not to inline it, so encode with .long instead. */
- __asm__ __volatile__ (".long 0xf" : : : "memory");
+ __asm__ __volatile__ ("sync" : : : "memory");
}
#define a_pre_llsc a_barrier
diff --git a/arch/mips64/pthread_arch.h b/arch/mips64/pthread_arch.h
index b42edbe..d8b6955 100644
--- a/arch/mips64/pthread_arch.h
+++ b/arch/mips64/pthread_arch.h
@@ -1,12 +1,7 @@
static inline struct pthread *__pthread_self()
{
-#ifdef __clang__
- char *tp;
- __asm__ __volatile__ (".word 0x7c03e83b ; move %0, $3" : "=r" (tp) : : "$3" );
-#else
register char *tp __asm__("$3");
- __asm__ __volatile__ (".word 0x7c03e83b" : "=r" (tp) );
-#endif
+ __asm__ __volatile__ ("rdhwr %0,$29" : "=r" (tp));
return (pthread_t)(tp - 0x7000 - sizeof(struct pthread));
}
diff --git a/configure b/configure
index 213a825..969671d 100755
--- a/configure
+++ b/configure
@@ -612,11 +612,13 @@ trycppif __AARCH64EB__ "$t" && SUBARCH=${SUBARCH}_be
fi
if test "$ARCH" = "mips" ; then
+trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6
trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el
trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf
fi
if test "$ARCH" = "mips64" ; then
+trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6
trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el
trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf
fi
--
2.1.4
Thanks,
Jaydeep