mailing list of musl libc
 help / color / mirror / code / Atom feed
* [PATCH] Fix atomic_arch.h for MIPS32 R6
@ 2016-03-21  6:03 Jaydeep Patil
  2016-03-21 17:37 ` dalias
  0 siblings, 1 reply; 18+ messages in thread
From: Jaydeep Patil @ 2016-03-21  6:03 UTC (permalink / raw)
  To: dalias; +Cc: musl, nsz

[-- Attachment #1: Type: text/plain, Size: 1447 bytes --]

Hi Rich,

The arch/mips/atomic_arch.h uses MIPS2 opcode for LL and SC instructions. Opcodes of these instructions differ on MIPSR6.

Refer to https://github.com/JaydeepIMG/musl-1/tree/fix_atomic_for_MIPS32_R6 for the patch.

From 63428cfc5dfa75d2771ba8205067c438942e1a60 Mon Sep 17 00:00:00 2001
From: Jaydeep Patil <jaydeep.patil@imgtec.com>
Date: Mon, 21 Mar 2016 06:00:39 +0000
Subject: [PATCH] Fix for opcodes of LL/SC instructions for MIPSR6

---
arch/mips/atomic_arch.h | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/mips/atomic_arch.h b/arch/mips/atomic_arch.h
index ce2823b..16b1542 100644
--- a/arch/mips/atomic_arch.h
+++ b/arch/mips/atomic_arch.h
@@ -3,9 +3,13 @@ static inline int a_ll(volatile int *p)
{
        int v;
        __asm__ __volatile__ (
+#if __mips_isa_rev < 6
                ".set push ; .set mips2\n\t"
+#endif
                "ll %0, %1"
+#if __mips_isa_rev < 6
                "\n\t.set pop"
+#endif
                : "=r"(v) : "m"(*p));
        return v;
}
@@ -15,9 +19,13 @@ static inline int a_sc(volatile int *p, int v)
{
        int r;
        __asm__ __volatile__ (
+#if __mips_isa_rev < 6
                ".set push ; .set mips2\n\t"
+#endif
                "sc %0, %1"
+#if __mips_isa_rev < 6
                "\n\t.set pop"
+#endif
                : "=r"(r), "=m"(*p) : "0"(v) : "memory");
        return r;
}
--
2.1.4

Regards,
Jaydeep


[-- Attachment #2: Type: text/html, Size: 6123 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-21  6:03 [PATCH] Fix atomic_arch.h for MIPS32 R6 Jaydeep Patil
@ 2016-03-21 17:37 ` dalias
  2016-03-22  4:58   ` Jaydeep Patil
  0 siblings, 1 reply; 18+ messages in thread
From: dalias @ 2016-03-21 17:37 UTC (permalink / raw)
  To: musl

On Mon, Mar 21, 2016 at 06:03:47AM +0000, Jaydeep Patil wrote:
> Hi Rich,
> 
> The arch/mips/atomic_arch.h uses MIPS2 opcode for LL and SC
> instructions. Opcodes of these instructions differ on MIPSR6.

Does this mean MIPSR6 is an incompatible ISA that can't run normal
MIPS binaries? If so that's a messy situation we need to find a way to
deal with; if the difference is just LLSC though then perhaps the
kernel's emulation handles it (albeit very slowly).

It would be helpful if you could provide a link to the documentation
of this issue (different opcodes).

> Refer to https://github.com/JaydeepIMG/musl-1/tree/fix_atomic_for_MIPS32_R6 for the patch.
> 
> >From 63428cfc5dfa75d2771ba8205067c438942e1a60 Mon Sep 17 00:00:00 2001
> From: Jaydeep Patil <jaydeep.patil@imgtec.com>
> Date: Mon, 21 Mar 2016 06:00:39 +0000
> Subject: [PATCH] Fix for opcodes of LL/SC instructions for MIPSR6
> 
> ---
> arch/mips/atomic_arch.h | 8 ++++++++
> 1 file changed, 8 insertions(+)
> 
> diff --git a/arch/mips/atomic_arch.h b/arch/mips/atomic_arch.h
> index ce2823b..16b1542 100644
> --- a/arch/mips/atomic_arch.h
> +++ b/arch/mips/atomic_arch.h
> @@ -3,9 +3,13 @@ static inline int a_ll(volatile int *p)
> {
>         int v;
>         __asm__ __volatile__ (
> +#if __mips_isa_rev < 6
>                 ".set push ; .set mips2\n\t"
> +#endif
>                 "ll %0, %1"
> +#if __mips_isa_rev < 6
>                 "\n\t.set pop"
> +#endif

I think just the .set mips2 could be inside #ifdef with the push/pop
always used; that produces less #ifdef clutter. But first we need to
figure out the above issues.

Rich


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-21 17:37 ` dalias
@ 2016-03-22  4:58   ` Jaydeep Patil
  2016-03-22 21:22     ` Rich Felker
  0 siblings, 1 reply; 18+ messages in thread
From: Jaydeep Patil @ 2016-03-22  4:58 UTC (permalink / raw)
  To: musl; +Cc: dalias, nsz

>-----Original Message-----
>From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of dalias@libc.org
>Sent: 21 March 2016 PM 11:08
>To: musl@lists.openwall.com
>Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>
>On Mon, Mar 21, 2016 at 06:03:47AM +0000, Jaydeep Patil wrote:
>> Hi Rich,
>>
>> The arch/mips/atomic_arch.h uses MIPS2 opcode for LL and SC
>> instructions. Opcodes of these instructions differ on MIPSR6.
>
>Does this mean MIPSR6 is an incompatible ISA that can't run normal MIPS
>binaries? If so that's a messy situation we need to find a way to deal with; if
>the difference is just LLSC though then perhaps the kernel's emulation
>handles it (albeit very slowly).
>
>
>It would be helpful if you could provide a link to the documentation of this
>issue (different opcodes).

Refer to https://imagination-technologies-cloudfront-assets.s3.amazonaws.com/documentation/MD00086-2B-MIPS32BIS-AFP-06.04.pdf (Page 209) for details.

>> Refer to https://github.com/JaydeepIMG/musl-
>1/tree/fix_atomic_for_MIPS32_R6 for the patch.
>>
>> >From 63428cfc5dfa75d2771ba8205067c438942e1a60 Mon Sep 17 00:00:00
>> >2001
>> From: Jaydeep Patil <jaydeep.patil@imgtec.com>
>> Date: Mon, 21 Mar 2016 06:00:39 +0000
>> Subject: [PATCH] Fix for opcodes of LL/SC instructions for MIPSR6
>>
>> ---
>> arch/mips/atomic_arch.h | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/arch/mips/atomic_arch.h b/arch/mips/atomic_arch.h index
>> ce2823b..16b1542 100644
>> --- a/arch/mips/atomic_arch.h
>> +++ b/arch/mips/atomic_arch.h
>> @@ -3,9 +3,13 @@ static inline int a_ll(volatile int *p) {
>>         int v;
>>         __asm__ __volatile__ (
>> +#if __mips_isa_rev < 6
>>                 ".set push ; .set mips2\n\t"
>> +#endif
>>                 "ll %0, %1"
>> +#if __mips_isa_rev < 6
>>                 "\n\t.set pop"
>> +#endif
>
>I think just the .set mips2 could be inside #ifdef with the push/pop always
>used; that produces less #ifdef clutter. But first we need to figure out the
>above issues.
>

We don't need push/pop if we are not doing mips2

>Rich


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-22  4:58   ` Jaydeep Patil
@ 2016-03-22 21:22     ` Rich Felker
  2016-03-23  6:37       ` Jaydeep Patil
  0 siblings, 1 reply; 18+ messages in thread
From: Rich Felker @ 2016-03-22 21:22 UTC (permalink / raw)
  To: musl

On Tue, Mar 22, 2016 at 04:58:51AM +0000, Jaydeep Patil wrote:
> >-----Original Message-----
> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of dalias@libc.org
> >Sent: 21 March 2016 PM 11:08
> >To: musl@lists.openwall.com
> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
> >
> >On Mon, Mar 21, 2016 at 06:03:47AM +0000, Jaydeep Patil wrote:
> >> Hi Rich,
> >>
> >> The arch/mips/atomic_arch.h uses MIPS2 opcode for LL and SC
> >> instructions. Opcodes of these instructions differ on MIPSR6.
> >
> >Does this mean MIPSR6 is an incompatible ISA that can't run normal MIPS
> >binaries? If so that's a messy situation we need to find a way to deal with; if
> >the difference is just LLSC though then perhaps the kernel's emulation
> >handles it (albeit very slowly).
> >
> >
> >It would be helpful if you could provide a link to the documentation of this
> >issue (different opcodes).
> 
> Refer to
> https://imagination-technologies-cloudfront-assets.s3.amazonaws.com/documentation/MD00086-2B-MIPS32BIS-AFP-06.04.pdf
> (Page 209) for details.

Page 454 contains the best info I could find, which seems to say that
MIPS R6 is essentially a MIPS-incompatible ISA (it can't reliably
execute pre-R6 code). Is this correct? If so that's really
unfortunate. Unfortunately there does not seem to be any info here
specific to LL/SC, and the page 209 you cited is really sparse.

If this is really the case then we probably need to consider whether
some kind of awful runtime-switching mechanism is needed for baseline
MIPS ISA levels, or whether users just have to consider R6 "incapable
of running pre-R6 binaries". But in the latter case we might want to
use a different dynamic linker name for R6.

> >> arch/mips/atomic_arch.h | 8 ++++++++
> >> 1 file changed, 8 insertions(+)
> >>
> >> diff --git a/arch/mips/atomic_arch.h b/arch/mips/atomic_arch.h index
> >> ce2823b..16b1542 100644
> >> --- a/arch/mips/atomic_arch.h
> >> +++ b/arch/mips/atomic_arch.h
> >> @@ -3,9 +3,13 @@ static inline int a_ll(volatile int *p) {
> >>         int v;
> >>         __asm__ __volatile__ (
> >> +#if __mips_isa_rev < 6
> >>                 ".set push ; .set mips2\n\t"
> >> +#endif
> >>                 "ll %0, %1"
> >> +#if __mips_isa_rev < 6
> >>                 "\n\t.set pop"
> >> +#endif
> >
> >I think just the .set mips2 could be inside #ifdef with the push/pop always
> >used; that produces less #ifdef clutter. But first we need to figure out the
> >above issues.
> 
> We don't need push/pop if we are not doing mips2

I was just saying it makes the code less cluttered to use them
spuriously even though we don't need to:

		".set push ; "
#if __mips_isa_rev < 6
		".set mips2 ; "
#endif
		"ll %0, %1 ; .set pop"

or similar.

It's also not clear to me whether the "m" constraint is valid anymore
for the R6 ll/sc instructions since they take a 9-bit offset now
instead of a 16-bit offset. The compiler could generate an address
expression whose offset part does not fit in 9 bits. In that case we
may need to #if the whole function (or at least the __asm__ statement)
separately rather than just skipping the .set mips2...

Rich


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-22 21:22     ` Rich Felker
@ 2016-03-23  6:37       ` Jaydeep Patil
  2016-03-23 15:03         ` Rich Felker
  0 siblings, 1 reply; 18+ messages in thread
From: Jaydeep Patil @ 2016-03-23  6:37 UTC (permalink / raw)
  To: musl

>-----Original Message-----
>From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
>Sent: 23 March 2016 AM 02:52
>To: musl@lists.openwall.com
>Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>
>On Tue, Mar 22, 2016 at 04:58:51AM +0000, Jaydeep Patil wrote:
>> >-----Original Message-----
>> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of
>> >dalias@libc.org
>> >Sent: 21 March 2016 PM 11:08
>> >To: musl@lists.openwall.com
>> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>> >
>> >On Mon, Mar 21, 2016 at 06:03:47AM +0000, Jaydeep Patil wrote:
>> >> Hi Rich,
>> >>
>> >> The arch/mips/atomic_arch.h uses MIPS2 opcode for LL and SC
>> >> instructions. Opcodes of these instructions differ on MIPSR6.
>> >
>> >Does this mean MIPSR6 is an incompatible ISA that can't run normal
>> >MIPS binaries? If so that's a messy situation we need to find a way
>> >to deal with; if the difference is just LLSC though then perhaps the
>> >kernel's emulation handles it (albeit very slowly).
>> >
>> >
>> >It would be helpful if you could provide a link to the documentation
>> >of this issue (different opcodes).
>>
>> Refer to
>> https://imagination-technologies-cloudfront-assets.s3.amazonaws.com/do
>> cumentation/MD00086-2B-MIPS32BIS-AFP-06.04.pdf
>> (Page 209) for details.
>
>Page 454 contains the best info I could find, which seems to say that MIPS R6
>is essentially a MIPS-incompatible ISA (it can't reliably execute pre-R6 code). Is
>this correct? If so that's really unfortunate. Unfortunately there does not

R6 is not binary compatible with pre-R6.

>seem to be any info here specific to LL/SC, and the page 209 you cited is really
>sparse.

It is page 219 which talks about R6 and pre-R6 LL opcodes. 

>If this is really the case then we probably need to consider whether some kind
>of awful runtime-switching mechanism is needed for baseline MIPS ISA levels,
>or whether users just have to consider R6 "incapable of running pre-R6
>binaries". But in the latter case we might want to use a different dynamic
>linker name for R6.
>
>> >> arch/mips/atomic_arch.h | 8 ++++++++
>> >> 1 file changed, 8 insertions(+)
>> >>
>> >> diff --git a/arch/mips/atomic_arch.h b/arch/mips/atomic_arch.h
>> >> index
>> >> ce2823b..16b1542 100644
>> >> --- a/arch/mips/atomic_arch.h
>> >> +++ b/arch/mips/atomic_arch.h
>> >> @@ -3,9 +3,13 @@ static inline int a_ll(volatile int *p) {
>> >>         int v;
>> >>         __asm__ __volatile__ (
>> >> +#if __mips_isa_rev < 6
>> >>                 ".set push ; .set mips2\n\t"
>> >> +#endif
>> >>                 "ll %0, %1"
>> >> +#if __mips_isa_rev < 6
>> >>                 "\n\t.set pop"
>> >> +#endif
>> >
>> >I think just the .set mips2 could be inside #ifdef with the push/pop
>> >always used; that produces less #ifdef clutter. But first we need to
>> >figure out the above issues.
>>
>> We don't need push/pop if we are not doing mips2
>
>I was just saying it makes the code less cluttered to use them spuriously even
>though we don't need to:
>
>		".set push ; "
>#if __mips_isa_rev < 6
>		".set mips2 ; "
>#endif
>		"ll %0, %1 ; .set pop"
>
>or similar.
>
>It's also not clear to me whether the "m" constraint is valid anymore for the R6
>ll/sc instructions since they take a 9-bit offset now instead of a 16-bit offset.
>The compiler could generate an address expression whose offset part does
>not fit in 9 bits. In that case we may need to #if the whole function (or at least
>the __asm__ statement) separately rather than just skipping the .set mips2...
>

The "m" constrain is still valid here, as the offset will be 0 in this case.

>Rich


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-23  6:37       ` Jaydeep Patil
@ 2016-03-23 15:03         ` Rich Felker
  2016-03-28  5:07           ` Jaydeep Patil
  0 siblings, 1 reply; 18+ messages in thread
From: Rich Felker @ 2016-03-23 15:03 UTC (permalink / raw)
  To: Jaydeep Patil; +Cc: musl

On Wed, Mar 23, 2016 at 06:37:41AM +0000, Jaydeep Patil wrote:
> >-----Original Message-----
> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
> >Sent: 23 March 2016 AM 02:52
> >To: musl@lists.openwall.com
> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
> >
> >On Tue, Mar 22, 2016 at 04:58:51AM +0000, Jaydeep Patil wrote:
> >> >-----Original Message-----
> >> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of
> >> >dalias@libc.org
> >> >Sent: 21 March 2016 PM 11:08
> >> >To: musl@lists.openwall.com
> >> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
> >> >
> >> >On Mon, Mar 21, 2016 at 06:03:47AM +0000, Jaydeep Patil wrote:
> >> >> Hi Rich,
> >> >>
> >> >> The arch/mips/atomic_arch.h uses MIPS2 opcode for LL and SC
> >> >> instructions. Opcodes of these instructions differ on MIPSR6.
> >> >
> >> >Does this mean MIPSR6 is an incompatible ISA that can't run normal
> >> >MIPS binaries? If so that's a messy situation we need to find a way
> >> >to deal with; if the difference is just LLSC though then perhaps the
> >> >kernel's emulation handles it (albeit very slowly).
> >> >
> >> >
> >> >It would be helpful if you could provide a link to the documentation
> >> >of this issue (different opcodes).
> >>
> >> Refer to
> >> https://imagination-technologies-cloudfront-assets.s3.amazonaws.com/do
> >> cumentation/MD00086-2B-MIPS32BIS-AFP-06.04.pdf
> >> (Page 209) for details.
> >
> >Page 454 contains the best info I could find, which seems to say that MIPS R6
> >is essentially a MIPS-incompatible ISA (it can't reliably execute pre-R6 code). Is
> >this correct? If so that's really unfortunate. Unfortunately there does not
> 
> R6 is not binary compatible with pre-R6.

If R6 doesn't even try to be compatible with pre-R6 then I think we
should treat it as a separate subarch and add "r6" in the dynamic
linker name, either before or after the optional "el" component. Is
there some general documentation of this incompatibility? How do GCC
and binutils handle it? Are the ELF files flagged incompatible
somehow?

> >I was just saying it makes the code less cluttered to use them spuriously even
> >though we don't need to:
> >
> >		".set push ; "
> >#if __mips_isa_rev < 6
> >		".set mips2 ; "
> >#endif
> >		"ll %0, %1 ; .set pop"
> >
> >or similar.
> >
> >It's also not clear to me whether the "m" constraint is valid anymore for the R6
> >ll/sc instructions since they take a 9-bit offset now instead of a 16-bit offset.
> >The compiler could generate an address expression whose offset part does
> >not fit in 9 bits. In that case we may need to #if the whole function (or at least
> >the __asm__ statement) separately rather than just skipping the .set mips2....
> >
> 
> The "m" constrain is still valid here, as the offset will be 0 in this case..

How can you assume the offset will be 0? It's the compiler's choice
what to use. For instance, a_cas(&foo->bar, t, s) is likely to have an
offset equal to offsetof(__typeof__(foo),bar). AFAIK this happens in
practice with small offsets in mutex structures, etc. so the bug may
be unlikely to be hit, but I think it's still an incorrect-constraint
bug.

Rich


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-23 15:03         ` Rich Felker
@ 2016-03-28  5:07           ` Jaydeep Patil
  2016-03-28 13:04             ` Rich Felker
  0 siblings, 1 reply; 18+ messages in thread
From: Jaydeep Patil @ 2016-03-28  5:07 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

>-----Original Message-----
>From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
>Sent: 23 March 2016 PM 08:33
>To: Jaydeep Patil
>Cc: musl@lists.openwall.com
>Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>
>On Wed, Mar 23, 2016 at 06:37:41AM +0000, Jaydeep Patil wrote:
>> >-----Original Message-----
>> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
>> >Sent: 23 March 2016 AM 02:52
>> >To: musl@lists.openwall.com
>> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>> >
>> >On Tue, Mar 22, 2016 at 04:58:51AM +0000, Jaydeep Patil wrote:
>> >> >-----Original Message-----
>> >> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of
>> >> >dalias@libc.org
>> >> >Sent: 21 March 2016 PM 11:08
>> >> >To: musl@lists.openwall.com
>> >> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>> >> >
>> >> >On Mon, Mar 21, 2016 at 06:03:47AM +0000, Jaydeep Patil wrote:
>> >> >> Hi Rich,
>> >> >>
>> >> >> The arch/mips/atomic_arch.h uses MIPS2 opcode for LL and SC
>> >> >> instructions. Opcodes of these instructions differ on MIPSR6.
>> >> >
>> >> >Does this mean MIPSR6 is an incompatible ISA that can't run normal
>> >> >MIPS binaries? If so that's a messy situation we need to find a
>> >> >way to deal with; if the difference is just LLSC though then
>> >> >perhaps the kernel's emulation handles it (albeit very slowly).
>> >> >
>> >> >
>> >> >It would be helpful if you could provide a link to the
>> >> >documentation of this issue (different opcodes).
>> >>
>> >> Refer to
>> >> https://imagination-technologies-cloudfront-assets.s3.amazonaws.com
>> >> /do cumentation/MD00086-2B-MIPS32BIS-AFP-06.04.pdf
>> >> (Page 209) for details.
>> >
>> >Page 454 contains the best info I could find, which seems to say that
>> >MIPS R6 is essentially a MIPS-incompatible ISA (it can't reliably
>> >execute pre-R6 code). Is this correct? If so that's really
>> >unfortunate. Unfortunately there does not
>>
>> R6 is not binary compatible with pre-R6.
>
>If R6 doesn't even try to be compatible with pre-R6 then I think we should
>treat it as a separate subarch and add "r6" in the dynamic linker name, either
>before or after the optional "el" component. Is there some general
>documentation of this incompatibility? How do GCC and binutils handle it? Are
>the ELF files flagged incompatible somehow?

There is no document which talks about this incompatibility particularly.
The e_flags member of the ELF header has two new flags (E_MIPS_ARCH_32R6 and E_MIPS_ARCH_64R6) added for R6.
We can choose to add "r6" in the dynamic linker's name.

>> >I was just saying it makes the code less cluttered to use them
>> >spuriously even though we don't need to:
>> >
>> >		".set push ; "
>> >#if __mips_isa_rev < 6
>> >		".set mips2 ; "
>> >#endif
>> >		"ll %0, %1 ; .set pop"
>> >
>> >or similar.
>> >
>> >It's also not clear to me whether the "m" constraint is valid anymore
>> >for the R6 ll/sc instructions since they take a 9-bit offset now instead of a
>16-bit offset.
>> >The compiler could generate an address expression whose offset part
>> >does not fit in 9 bits. In that case we may need to #if the whole
>> >function (or at least the __asm__ statement) separately rather than just
>skipping the .set mips2....
>> >
>>
>> The "m" constrain is still valid here, as the offset will be 0 in this case..
>
>How can you assume the offset will be 0? It's the compiler's choice what to
>use. For instance, a_cas(&foo->bar, t, s) is likely to have an offset equal to
>offsetof(__typeof__(foo),bar). AFAIK this happens in practice with small
>offsets in mutex structures, etc. so the bug may be unlikely to be hit, but I
>think it's still an incorrect-constraint bug.

Compiler generates appropriate LL/SC based on the offset. 
Compiler adds the offset to the base register if it does not fit 9bits.

>Rich



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-28  5:07           ` Jaydeep Patil
@ 2016-03-28 13:04             ` Rich Felker
  2016-03-29  2:19               ` Rich Felker
                                 ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Rich Felker @ 2016-03-28 13:04 UTC (permalink / raw)
  To: Jaydeep Patil; +Cc: musl

On Mon, Mar 28, 2016 at 05:07:39AM +0000, Jaydeep Patil wrote:
> >> >I was just saying it makes the code less cluttered to use them
> >> >spuriously even though we don't need to:
> >> >
> >> >		".set push ; "
> >> >#if __mips_isa_rev < 6
> >> >		".set mips2 ; "
> >> >#endif
> >> >		"ll %0, %1 ; .set pop"
> >> >
> >> >or similar.
> >> >
> >> >It's also not clear to me whether the "m" constraint is valid anymore
> >> >for the R6 ll/sc instructions since they take a 9-bit offset now instead of a
> >16-bit offset.
> >> >The compiler could generate an address expression whose offset part
> >> >does not fit in 9 bits. In that case we may need to #if the whole
> >> >function (or at least the __asm__ statement) separately rather than just
> >skipping the .set mips2....
> >> >
> >>
> >> The "m" constrain is still valid here, as the offset will be 0 in this case..
> >
> >How can you assume the offset will be 0? It's the compiler's choice what to
> >use. For instance, a_cas(&foo->bar, t, s) is likely to have an offset equal to
> >offsetof(__typeof__(foo),bar). AFAIK this happens in practice with small
> >offsets in mutex structures, etc. so the bug may be unlikely to be hit, but I
> >think it's still an incorrect-constraint bug.
> 
> Compiler generates appropriate LL/SC based on the offset. 
> Compiler adds the offset to the base register if it does not fit 9bits.

The compiler has no way of knowing that the operand will be used with
ll with the 9-bit offset restriction; as far as it knows, it will be
used in a normal context where a 16-bit offset is valid. I don't have
a toolchain that will target r6, but you can try the following program
which produces an offset of 4096 for loading p[1024]:

unsigned ll1k(volatile unsigned *p)
{
	unsigned val;
	__asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) : "memory" );
	return val;
}

I would expect this to produce errors at assembly time on r6.

Rich


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-28 13:04             ` Rich Felker
@ 2016-03-29  2:19               ` Rich Felker
  2016-03-29  3:54               ` Jaydeep Patil
  2016-03-29  3:55               ` Jaydeep Patil
  2 siblings, 0 replies; 18+ messages in thread
From: Rich Felker @ 2016-03-29  2:19 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 2473 bytes --]

On Mon, Mar 28, 2016 at 09:04:51AM -0400, Rich Felker wrote:
> On Mon, Mar 28, 2016 at 05:07:39AM +0000, Jaydeep Patil wrote:
> > >> >I was just saying it makes the code less cluttered to use them
> > >> >spuriously even though we don't need to:
> > >> >
> > >> >		".set push ; "
> > >> >#if __mips_isa_rev < 6
> > >> >		".set mips2 ; "
> > >> >#endif
> > >> >		"ll %0, %1 ; .set pop"
> > >> >
> > >> >or similar.
> > >> >
> > >> >It's also not clear to me whether the "m" constraint is valid anymore
> > >> >for the R6 ll/sc instructions since they take a 9-bit offset now instead of a
> > >16-bit offset.
> > >> >The compiler could generate an address expression whose offset part
> > >> >does not fit in 9 bits. In that case we may need to #if the whole
> > >> >function (or at least the __asm__ statement) separately rather than just
> > >skipping the .set mips2....
> > >> >
> > >>
> > >> The "m" constrain is still valid here, as the offset will be 0 in this case..
> > >
> > >How can you assume the offset will be 0? It's the compiler's choice what to
> > >use. For instance, a_cas(&foo->bar, t, s) is likely to have an offset equal to
> > >offsetof(__typeof__(foo),bar). AFAIK this happens in practice with small
> > >offsets in mutex structures, etc. so the bug may be unlikely to be hit, but I
> > >think it's still an incorrect-constraint bug.
> > 
> > Compiler generates appropriate LL/SC based on the offset. 
> > Compiler adds the offset to the base register if it does not fit 9bits.
> 
> The compiler has no way of knowing that the operand will be used with
> ll with the 9-bit offset restriction; as far as it knows, it will be
> used in a normal context where a 16-bit offset is valid. I don't have
> a toolchain that will target r6, but you can try the following program
> which produces an offset of 4096 for loading p[1024]:
> 
> unsigned ll1k(volatile unsigned *p)
> {
> 	unsigned val;
> 	__asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) : "memory" );
> 	return val;
> }
> 
> I would expect this to produce errors at assembly time on r6.

Indeed, the ZC constraint seems to have been added to address this;
see:

https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html

It's not available on old gcc versions (pre-mipsr6) so I think the
attached patch would be appropriate. Let me know if it works for you.

We still need to add r6 detection in configure and the dynamic linker
name in reloc.h, and fix the pthread_arch.h issue.

Rich

[-- Attachment #2: mipsr6-atomic.diff --]
[-- Type: text/plain, Size: 1279 bytes --]

diff --git a/arch/mips/atomic_arch.h b/arch/mips/atomic_arch.h
index ce2823b..ad534f2 100644
--- a/arch/mips/atomic_arch.h
+++ b/arch/mips/atomic_arch.h
@@ -2,11 +2,17 @@
 static inline int a_ll(volatile int *p)
 {
 	int v;
+#if __mips_isa_rev < 6
 	__asm__ __volatile__ (
 		".set push ; .set mips2\n\t"
 		"ll %0, %1"
 		"\n\t.set pop"
 		: "=r"(v) : "m"(*p));
+#else
+	__asm__ __volatile__ (
+		"ll %0, %1"
+		: "=r"(v) : "ZC"(*p));
+#endif
 	return v;
 }
 
@@ -14,24 +20,29 @@ static inline int a_ll(volatile int *p)
 static inline int a_sc(volatile int *p, int v)
 {
 	int r;
+#if __mips_isa_rev < 6
 	__asm__ __volatile__ (
 		".set push ; .set mips2\n\t"
 		"sc %0, %1"
 		"\n\t.set pop"
 		: "=r"(r), "=m"(*p) : "0"(v) : "memory");
+#else
+	__asm__ __volatile__ (
+		"sc %0, %1"
+		: "=r"(r), "=ZC"(*p) : "0"(v) : "memory");
+#endif
 	return r;
 }
 
 #define a_barrier a_barrier
 static inline void a_barrier()
 {
+#if __mips_isa_rev < 2
 	/* mips2 sync, but using too many directives causes
 	 * gcc not to inline it, so encode with .long instead. */
 	__asm__ __volatile__ (".long 0xf" : : : "memory");
-#if 0
-	__asm__ __volatile__ (
-		".set push ; .set mips2 ; sync ; .set pop"
-		: : : "memory");
+#else
+	__asm__ __volatile__ ("sync" : : : "memory");
 #endif
 }
 

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-28 13:04             ` Rich Felker
  2016-03-29  2:19               ` Rich Felker
@ 2016-03-29  3:54               ` Jaydeep Patil
  2016-03-29  4:10                 ` Rich Felker
  2016-03-29  3:55               ` Jaydeep Patil
  2 siblings, 1 reply; 18+ messages in thread
From: Jaydeep Patil @ 2016-03-29  3:54 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

>-----Original Message-----
>From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
>Sent: 28 March 2016 PM 06:35
>To: Jaydeep Patil
>Cc: musl@lists.openwall.com
>Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>
>On Mon, Mar 28, 2016 at 05:07:39AM +0000, Jaydeep Patil wrote:
>> >> >I was just saying it makes the code less cluttered to use them
>> >> >spuriously even though we don't need to:
>> >> >
>> >> >		".set push ; "
>> >> >#if __mips_isa_rev < 6
>> >> >		".set mips2 ; "
>> >> >#endif
>> >> >		"ll %0, %1 ; .set pop"
>> >> >
>> >> >or similar.
>> >> >
>> >> >It's also not clear to me whether the "m" constraint is valid
>> >> >anymore for the R6 ll/sc instructions since they take a 9-bit
>> >> >offset now instead of a
>> >16-bit offset.
>> >> >The compiler could generate an address expression whose offset
>> >> >part does not fit in 9 bits. In that case we may need to #if the
>> >> >whole function (or at least the __asm__ statement) separately
>> >> >rather than just
>> >skipping the .set mips2....
>> >> >
>> >>
>> >> The "m" constrain is still valid here, as the offset will be 0 in this case..
>> >
>> >How can you assume the offset will be 0? It's the compiler's choice
>> >what to use. For instance, a_cas(&foo->bar, t, s) is likely to have
>> >an offset equal to offsetof(__typeof__(foo),bar). AFAIK this happens
>> >in practice with small offsets in mutex structures, etc. so the bug
>> >may be unlikely to be hit, but I think it's still an incorrect-constraint bug.
>>
>> Compiler generates appropriate LL/SC based on the offset.
>> Compiler adds the offset to the base register if it does not fit 9bits.
>
>The compiler has no way of knowing that the operand will be used with ll with
>the 9-bit offset restriction; as far as it knows, it will be used in a normal
>context where a 16-bit offset is valid. I don't have a toolchain that will target
>r6, but you can try the following program which produces an offset of 4096 for
>loading p[1024]:
>
>unsigned ll1k(volatile unsigned *p)
>{
>	unsigned val;
>	__asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) :
>"memory" );
>	return val;
>}
>
>I would expect this to produce errors at assembly time on r6.
>Rich

This is what compiler has generated for above function:

$ gcc -c -o main.o main.c -O3 -mips32r6 -mabi=32

Objdump:

00000000 <ll1k>:
   0:   24821000        addiu   v0,a0,4096
   4:   7c420036        ll      v0,0(v0)
   8:   d81f0000        jrc     ra
   c:   00000000        nop

Regards,
Jaydeep



^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-28 13:04             ` Rich Felker
  2016-03-29  2:19               ` Rich Felker
  2016-03-29  3:54               ` Jaydeep Patil
@ 2016-03-29  3:55               ` Jaydeep Patil
  2 siblings, 0 replies; 18+ messages in thread
From: Jaydeep Patil @ 2016-03-29  3:55 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

>-----Original Message-----
>From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
>Sent: 28 March 2016 PM 06:35
>To: Jaydeep Patil
>Cc: musl@lists.openwall.com
>Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>
>On Mon, Mar 28, 2016 at 05:07:39AM +0000, Jaydeep Patil wrote:
>> >> >I was just saying it makes the code less cluttered to use them
>> >> >spuriously even though we don't need to:
>> >> >
>> >> >		".set push ; "
>> >> >#if __mips_isa_rev < 6
>> >> >		".set mips2 ; "
>> >> >#endif
>> >> >		"ll %0, %1 ; .set pop"
>> >> >
>> >> >or similar.
>> >> >
>> >> >It's also not clear to me whether the "m" constraint is valid
>> >> >anymore for the R6 ll/sc instructions since they take a 9-bit
>> >> >offset now instead of a
>> >16-bit offset.
>> >> >The compiler could generate an address expression whose offset
>> >> >part does not fit in 9 bits. In that case we may need to #if the
>> >> >whole function (or at least the __asm__ statement) separately
>> >> >rather than just
>> >skipping the .set mips2....
>> >> >
>> >>
>> >> The "m" constrain is still valid here, as the offset will be 0 in this case..
>> >
>> >How can you assume the offset will be 0? It's the compiler's choice
>> >what to use. For instance, a_cas(&foo->bar, t, s) is likely to have
>> >an offset equal to offsetof(__typeof__(foo),bar). AFAIK this happens
>> >in practice with small offsets in mutex structures, etc. so the bug
>> >may be unlikely to be hit, but I think it's still an incorrect-constraint bug.
>>
>> Compiler generates appropriate LL/SC based on the offset.
>> Compiler adds the offset to the base register if it does not fit 9bits.
>
>The compiler has no way of knowing that the operand will be used with ll with
>the 9-bit offset restriction; as far as it knows, it will be used in a normal
>context where a 16-bit offset is valid. I don't have a toolchain that will target
>r6, but you can try the following program which produces an offset of 4096 for
>loading p[1024]:
>
>unsigned ll1k(volatile unsigned *p)
>{
>	unsigned val;
>	__asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) :
>"memory" );
>	return val;
>}
>
>I would expect this to produce errors at assembly time on r6.
>
>Rich

With -O0 optimization level:

00000000 <ll1k>:
   0:   27bdffe8        addiu   sp,sp,-24
   4:   afbe0014        sw      s8,20(sp)
   8:   03a0f025        move    s8,sp
   c:   afc40018        sw      a0,24(s8)
  10:   8fc20018        lw      v0,24(s8)
  14:   24421000        addiu   v0,v0,4096
  18:   7c420036        ll      v0,0(v0)
  1c:   afc20008        sw      v0,8(s8)
  20:   8fc20008        lw      v0,8(s8)
  24:   03c0e825        move    sp,s8
  28:   8fbe0014        lw      s8,20(sp)
  2c:   27bd0018        addiu   sp,sp,24
  30:   d81f0000        jrc     ra




^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-29  3:54               ` Jaydeep Patil
@ 2016-03-29  4:10                 ` Rich Felker
  2016-03-29  7:16                   ` Jaydeep Patil
  0 siblings, 1 reply; 18+ messages in thread
From: Rich Felker @ 2016-03-29  4:10 UTC (permalink / raw)
  To: Jaydeep Patil; +Cc: musl

On Tue, Mar 29, 2016 at 03:54:02AM +0000, Jaydeep Patil wrote:
> >-----Original Message-----
> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
> >Sent: 28 March 2016 PM 06:35
> >To: Jaydeep Patil
> >Cc: musl@lists.openwall.com
> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
> >
> >On Mon, Mar 28, 2016 at 05:07:39AM +0000, Jaydeep Patil wrote:
> >> >> >I was just saying it makes the code less cluttered to use them
> >> >> >spuriously even though we don't need to:
> >> >> >
> >> >> >		".set push ; "
> >> >> >#if __mips_isa_rev < 6
> >> >> >		".set mips2 ; "
> >> >> >#endif
> >> >> >		"ll %0, %1 ; .set pop"
> >> >> >
> >> >> >or similar.
> >> >> >
> >> >> >It's also not clear to me whether the "m" constraint is valid
> >> >> >anymore for the R6 ll/sc instructions since they take a 9-bit
> >> >> >offset now instead of a
> >> >16-bit offset.
> >> >> >The compiler could generate an address expression whose offset
> >> >> >part does not fit in 9 bits. In that case we may need to #if the
> >> >> >whole function (or at least the __asm__ statement) separately
> >> >> >rather than just
> >> >skipping the .set mips2....
> >> >> >
> >> >>
> >> >> The "m" constrain is still valid here, as the offset will be 0 in this case..
> >> >
> >> >How can you assume the offset will be 0? It's the compiler's choice
> >> >what to use. For instance, a_cas(&foo->bar, t, s) is likely to have
> >> >an offset equal to offsetof(__typeof__(foo),bar). AFAIK this happens
> >> >in practice with small offsets in mutex structures, etc. so the bug
> >> >may be unlikely to be hit, but I think it's still an incorrect-constraint bug.
> >>
> >> Compiler generates appropriate LL/SC based on the offset.
> >> Compiler adds the offset to the base register if it does not fit 9bits.
> >
> >The compiler has no way of knowing that the operand will be used with ll with
> >the 9-bit offset restriction; as far as it knows, it will be used in a normal
> >context where a 16-bit offset is valid. I don't have a toolchain that will target
> >r6, but you can try the following program which produces an offset of 4096 for
> >loading p[1024]:
> >
> >unsigned ll1k(volatile unsigned *p)
> >{
> >	unsigned val;
> >	__asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) :
> >"memory" );
> >	return val;
> >}
> >
> >I would expect this to produce errors at assembly time on r6.
> >Rich
> 
> This is what compiler has generated for above function:
> 
> $ gcc -c -o main.o main.c -O3 -mips32r6 -mabi=32
> 
> Objdump:
> 
> 00000000 <ll1k>:
>    0:   24821000        addiu   v0,a0,4096
>    4:   7c420036        ll      v0,0(v0)
>    8:   d81f0000        jrc     ra
>    c:   00000000        nop

Can you try gcc -S instead of -c (still at -O3) to produce asm output
without assembling it?

Rich


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-29  4:10                 ` Rich Felker
@ 2016-03-29  7:16                   ` Jaydeep Patil
  2016-03-29 13:32                     ` Rich Felker
  0 siblings, 1 reply; 18+ messages in thread
From: Jaydeep Patil @ 2016-03-29  7:16 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

>-----Original Message-----
>From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
>Sent: 29 March 2016 AM 09:41
>To: Jaydeep Patil
>Cc: musl@lists.openwall.com
>Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>
>On Tue, Mar 29, 2016 at 03:54:02AM +0000, Jaydeep Patil wrote:
>> >-----Original Message-----
>> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
>> >Sent: 28 March 2016 PM 06:35
>> >To: Jaydeep Patil
>> >Cc: musl@lists.openwall.com
>> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>> >
>> >On Mon, Mar 28, 2016 at 05:07:39AM +0000, Jaydeep Patil wrote:
>> >> >> >I was just saying it makes the code less cluttered to use them
>> >> >> >spuriously even though we don't need to:
>> >> >> >
>> >> >> >		".set push ; "
>> >> >> >#if __mips_isa_rev < 6
>> >> >> >		".set mips2 ; "
>> >> >> >#endif
>> >> >> >		"ll %0, %1 ; .set pop"
>> >> >> >
>> >> >> >or similar.
>> >> >> >
>> >> >> >It's also not clear to me whether the "m" constraint is valid
>> >> >> >anymore for the R6 ll/sc instructions since they take a 9-bit
>> >> >> >offset now instead of a
>> >> >16-bit offset.
>> >> >> >The compiler could generate an address expression whose offset
>> >> >> >part does not fit in 9 bits. In that case we may need to #if
>> >> >> >the whole function (or at least the __asm__ statement)
>> >> >> >separately rather than just
>> >> >skipping the .set mips2....
>> >> >> >
>> >> >>
>> >> >> The "m" constrain is still valid here, as the offset will be 0 in this case..
>> >> >
>> >> >How can you assume the offset will be 0? It's the compiler's
>> >> >choice what to use. For instance, a_cas(&foo->bar, t, s) is likely
>> >> >to have an offset equal to offsetof(__typeof__(foo),bar). AFAIK
>> >> >this happens in practice with small offsets in mutex structures,
>> >> >etc. so the bug may be unlikely to be hit, but I think it's still an incorrect-
>constraint bug.
>> >>
>> >> Compiler generates appropriate LL/SC based on the offset.
>> >> Compiler adds the offset to the base register if it does not fit 9bits.
>> >
>> >The compiler has no way of knowing that the operand will be used with
>> >ll with the 9-bit offset restriction; as far as it knows, it will be
>> >used in a normal context where a 16-bit offset is valid. I don't have
>> >a toolchain that will target r6, but you can try the following
>> >program which produces an offset of 4096 for loading p[1024]:
>> >
>> >unsigned ll1k(volatile unsigned *p)
>> >{
>> >	unsigned val;
>> >	__asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) :
>> >"memory" );
>> >	return val;
>> >}
>> >
>> >I would expect this to produce errors at assembly time on r6.
>> >Rich
>>
>> This is what compiler has generated for above function:
>>
>> $ gcc -c -o main.o main.c -O3 -mips32r6 -mabi=32
>>
>> Objdump:
>>
>> 00000000 <ll1k>:
>>    0:   24821000        addiu   v0,a0,4096
>>    4:   7c420036        ll      v0,0(v0)
>>    8:   d81f0000        jrc     ra
>>    c:   00000000        nop
>
>Can you try gcc -S instead of -c (still at -O3) to produce asm output without
>assembling it?

Generated asssembly:

#APP
 # 4 "test.c" 1
        ll $2, 4096($4)
 # 0 "" 2
#NO_APP
        jrc     $31

Even if we set "noreorder" before LL, assembler generates addiu+ll:

00000000 <ll1k>:
   0:   24821000        addiu   v0,a0,4096
   4:   7c420036        ll      v0,0(v0)
   8:   d81f0000        jrc     ra
   c:   00000000        nop

>Rich


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-29  7:16                   ` Jaydeep Patil
@ 2016-03-29 13:32                     ` Rich Felker
  2016-03-30  9:45                       ` Jaydeep Patil
  0 siblings, 1 reply; 18+ messages in thread
From: Rich Felker @ 2016-03-29 13:32 UTC (permalink / raw)
  To: Jaydeep Patil; +Cc: musl

On Tue, Mar 29, 2016 at 07:16:46AM +0000, Jaydeep Patil wrote:
> >-----Original Message-----
> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
> >Sent: 29 March 2016 AM 09:41
> >To: Jaydeep Patil
> >Cc: musl@lists.openwall.com
> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
> >
> >On Tue, Mar 29, 2016 at 03:54:02AM +0000, Jaydeep Patil wrote:
> >> >-----Original Message-----
> >> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
> >> >Sent: 28 March 2016 PM 06:35
> >> >To: Jaydeep Patil
> >> >Cc: musl@lists.openwall.com
> >> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
> >> >
> >> >On Mon, Mar 28, 2016 at 05:07:39AM +0000, Jaydeep Patil wrote:
> >> >> >> >I was just saying it makes the code less cluttered to use them
> >> >> >> >spuriously even though we don't need to:
> >> >> >> >
> >> >> >> >		".set push ; "
> >> >> >> >#if __mips_isa_rev < 6
> >> >> >> >		".set mips2 ; "
> >> >> >> >#endif
> >> >> >> >		"ll %0, %1 ; .set pop"
> >> >> >> >
> >> >> >> >or similar.
> >> >> >> >
> >> >> >> >It's also not clear to me whether the "m" constraint is valid
> >> >> >> >anymore for the R6 ll/sc instructions since they take a 9-bit
> >> >> >> >offset now instead of a
> >> >> >16-bit offset.
> >> >> >> >The compiler could generate an address expression whose offset
> >> >> >> >part does not fit in 9 bits. In that case we may need to #if
> >> >> >> >the whole function (or at least the __asm__ statement)
> >> >> >> >separately rather than just
> >> >> >skipping the .set mips2....
> >> >> >> >
> >> >> >>
> >> >> >> The "m" constrain is still valid here, as the offset will be 0 in this case..
> >> >> >
> >> >> >How can you assume the offset will be 0? It's the compiler's
> >> >> >choice what to use. For instance, a_cas(&foo->bar, t, s) is likely
> >> >> >to have an offset equal to offsetof(__typeof__(foo),bar). AFAIK
> >> >> >this happens in practice with small offsets in mutex structures,
> >> >> >etc. so the bug may be unlikely to be hit, but I think it's still an incorrect-
> >constraint bug.
> >> >>
> >> >> Compiler generates appropriate LL/SC based on the offset.
> >> >> Compiler adds the offset to the base register if it does not fit 9bits.
> >> >
> >> >The compiler has no way of knowing that the operand will be used with
> >> >ll with the 9-bit offset restriction; as far as it knows, it will be
> >> >used in a normal context where a 16-bit offset is valid. I don't have
> >> >a toolchain that will target r6, but you can try the following
> >> >program which produces an offset of 4096 for loading p[1024]:
> >> >
> >> >unsigned ll1k(volatile unsigned *p)
> >> >{
> >> >	unsigned val;
> >> >	__asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) :
> >> >"memory" );
> >> >	return val;
> >> >}
> >> >
> >> >I would expect this to produce errors at assembly time on r6.
> >> >Rich
> >>
> >> This is what compiler has generated for above function:
> >>
> >> $ gcc -c -o main.o main.c -O3 -mips32r6 -mabi=32
> >>
> >> Objdump:
> >>
> >> 00000000 <ll1k>:
> >>    0:   24821000        addiu   v0,a0,4096
> >>    4:   7c420036        ll      v0,0(v0)
> >>    8:   d81f0000        jrc     ra
> >>    c:   00000000        nop
> >
> >Can you try gcc -S instead of -c (still at -O3) to produce asm output without
> >assembling it?
> 
> Generated asssembly:
> 
> #APP
>  # 4 "test.c" 1
>         ll $2, 4096($4)
>  # 0 "" 2
> #NO_APP
>         jrc     $31
> 
> Even if we set "noreorder" before LL, assembler generates addiu+ll:
> 
> 00000000 <ll1k>:
>    0:   24821000        addiu   v0,a0,4096
>    4:   7c420036        ll      v0,0(v0)
>    8:   d81f0000        jrc     ra
>    c:   00000000        nop

I see. I suspected the assembler was doing it. "noat", not
"noreorder", is the way to suppress things like this but I doubt even
"noat" does it since a separate temp register ("at") is not needed in
this case.

If all assembers that support R6 support this rewriting, then the ZC
constraint in gcc is really just an optimization, not strictly
necessary. We should probably check (1) whether clang's internal
assembler can do the rewriting, and (2) whether clang supports the ZC
constraint. I would prefer using ZC but I want to do whatever is more
compatible; I don't think the codegen efficiency matters a lot either
way.

Rich


^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-29 13:32                     ` Rich Felker
@ 2016-03-30  9:45                       ` Jaydeep Patil
  2016-03-30 14:29                         ` Rich Felker
  0 siblings, 1 reply; 18+ messages in thread
From: Jaydeep Patil @ 2016-03-30  9:45 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

>-----Original Message-----
>From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
>Sent: 29 March 2016 PM 07:03
>To: Jaydeep Patil
>Cc: musl@lists.openwall.com
>Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>
>On Tue, Mar 29, 2016 at 07:16:46AM +0000, Jaydeep Patil wrote:
>> >-----Original Message-----
>> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
>> >Sent: 29 March 2016 AM 09:41
>> >To: Jaydeep Patil
>> >Cc: musl@lists.openwall.com
>> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>> >
>> >On Tue, Mar 29, 2016 at 03:54:02AM +0000, Jaydeep Patil wrote:
>> >> >-----Original Message-----
>> >> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich
>> >> >Felker
>> >> >Sent: 28 March 2016 PM 06:35
>> >> >To: Jaydeep Patil
>> >> >Cc: musl@lists.openwall.com
>> >> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>> >> >
>> >> >On Mon, Mar 28, 2016 at 05:07:39AM +0000, Jaydeep Patil wrote:
>> >> >> >> >I was just saying it makes the code less cluttered to use
>> >> >> >> >them spuriously even though we don't need to:
>> >> >> >> >
>> >> >> >> >		".set push ; "
>> >> >> >> >#if __mips_isa_rev < 6
>> >> >> >> >		".set mips2 ; "
>> >> >> >> >#endif
>> >> >> >> >		"ll %0, %1 ; .set pop"
>> >> >> >> >
>> >> >> >> >or similar.
>> >> >> >> >
>> >> >> >> >It's also not clear to me whether the "m" constraint is
>> >> >> >> >valid anymore for the R6 ll/sc instructions since they take
>> >> >> >> >a 9-bit offset now instead of a
>> >> >> >16-bit offset.
>> >> >> >> >The compiler could generate an address expression whose
>> >> >> >> >offset part does not fit in 9 bits. In that case we may need
>> >> >> >> >to #if the whole function (or at least the __asm__
>> >> >> >> >statement) separately rather than just
>> >> >> >skipping the .set mips2....
>> >> >> >> >
>> >> >> >>
>> >> >> >> The "m" constrain is still valid here, as the offset will be 0 in this
>case..
>> >> >> >
>> >> >> >How can you assume the offset will be 0? It's the compiler's
>> >> >> >choice what to use. For instance, a_cas(&foo->bar, t, s) is
>> >> >> >likely to have an offset equal to
>> >> >> >offsetof(__typeof__(foo),bar). AFAIK this happens in practice
>> >> >> >with small offsets in mutex structures, etc. so the bug may be
>> >> >> >unlikely to be hit, but I think it's still an incorrect-
>> >constraint bug.
>> >> >>
>> >> >> Compiler generates appropriate LL/SC based on the offset.
>> >> >> Compiler adds the offset to the base register if it does not fit 9bits.
>> >> >
>> >> >The compiler has no way of knowing that the operand will be used
>> >> >with ll with the 9-bit offset restriction; as far as it knows, it
>> >> >will be used in a normal context where a 16-bit offset is valid. I
>> >> >don't have a toolchain that will target r6, but you can try the
>> >> >following program which produces an offset of 4096 for loading p[1024]:
>> >> >
>> >> >unsigned ll1k(volatile unsigned *p) {
>> >> >	unsigned val;
>> >> >	__asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) :
>> >> >"memory" );
>> >> >	return val;
>> >> >}
>> >> >
>> >> >I would expect this to produce errors at assembly time on r6.
>> >> >Rich
>> >>
>> >> This is what compiler has generated for above function:
>> >>
>> >> $ gcc -c -o main.o main.c -O3 -mips32r6 -mabi=32
>> >>
>> >> Objdump:
>> >>
>> >> 00000000 <ll1k>:
>> >>    0:   24821000        addiu   v0,a0,4096
>> >>    4:   7c420036        ll      v0,0(v0)
>> >>    8:   d81f0000        jrc     ra
>> >>    c:   00000000        nop
>> >
>> >Can you try gcc -S instead of -c (still at -O3) to produce asm output
>> >without assembling it?
>>
>> Generated asssembly:
>>
>> #APP
>>  # 4 "test.c" 1
>>         ll $2, 4096($4)
>>  # 0 "" 2
>> #NO_APP
>>         jrc     $31
>>
>> Even if we set "noreorder" before LL, assembler generates addiu+ll:
>>
>> 00000000 <ll1k>:
>>    0:   24821000        addiu   v0,a0,4096
>>    4:   7c420036        ll      v0,0(v0)
>>    8:   d81f0000        jrc     ra
>>    c:   00000000        nop
>
>I see. I suspected the assembler was doing it. "noat", not "noreorder", is the
>way to suppress things like this but I doubt even "noat" does it since a
>separate temp register ("at") is not needed in this case.
>
>If all assembers that support R6 support this rewriting, then the ZC constraint
>in gcc is really just an optimization, not strictly necessary. We should probably
>check (1) whether clang's internal assembler can do the rewriting, and (2)
>whether clang supports the ZC constraint. I would prefer using ZC but I want
>to do whatever is more compatible; I don't think the codegen efficiency
>matters a lot either way.
>Rich

Clang's integrated assembler does not support this rewriting. However ZC is supported.
I have modified both atomic_arch.h and pthread_arch.h to reflect this. 
Please refer to https://github.com/JaydeepIMG/musl-1/tree/fix_inline_asm_for_R6 for the patch (also listed below).
I have also added R6 as subarch.



From 20054ee55643d9e81163ca58ac63cc38b5080969 Mon Sep 17 00:00:00 2001
From: Jaydeep Patil <jaydeep.patil@imgtec.com>
Date: Wed, 30 Mar 2016 10:37:30 +0100
Subject: [PATCH] [MIPS] Update inline asm for R6 and add R6 as subtarget

---
 arch/mips/atomic_arch.h    | 17 +++--------------
 arch/mips/pthread_arch.h   |  8 +-------
 arch/mips64/atomic_arch.h  | 12 +++++-------
 arch/mips64/pthread_arch.h |  7 +------
 configure                  |  2 ++
 5 files changed, 12 insertions(+), 34 deletions(-)

diff --git a/arch/mips/atomic_arch.h b/arch/mips/atomic_arch.h
index ce2823b..4dbe4bb 100644
--- a/arch/mips/atomic_arch.h
+++ b/arch/mips/atomic_arch.h
@@ -3,10 +3,8 @@ static inline int a_ll(volatile int *p)
 {
 	int v;
 	__asm__ __volatile__ (
-		".set push ; .set mips2\n\t"
 		"ll %0, %1"
-		"\n\t.set pop"
-		: "=r"(v) : "m"(*p));
+		: "=r"(v) : "ZC"(*p));
 	return v;
 }
 
@@ -15,24 +13,15 @@ static inline int a_sc(volatile int *p, int v)
 {
 	int r;
 	__asm__ __volatile__ (
-		".set push ; .set mips2\n\t"
 		"sc %0, %1"
-		"\n\t.set pop"
-		: "=r"(r), "=m"(*p) : "0"(v) : "memory");
+		: "=r"(r), "=ZC"(*p) : "0"(v) : "memory");
 	return r;
 }
 
 #define a_barrier a_barrier
 static inline void a_barrier()
 {
-	/* mips2 sync, but using too many directives causes
-	 * gcc not to inline it, so encode with .long instead. */
-	__asm__ __volatile__ (".long 0xf" : : : "memory");
-#if 0
-	__asm__ __volatile__ (
-		".set push ; .set mips2 ; sync ; .set pop"
-		: : : "memory");
-#endif
+	__asm__ __volatile__ ("sync" : : : "memory");
 }
 
 #define a_pre_llsc a_barrier
diff --git a/arch/mips/pthread_arch.h b/arch/mips/pthread_arch.h
index 8a49965..d8b6955 100644
--- a/arch/mips/pthread_arch.h
+++ b/arch/mips/pthread_arch.h
@@ -1,13 +1,7 @@
 static inline struct pthread *__pthread_self()
 {
-#ifdef __clang__
-	char *tp;
-	__asm__ __volatile__ (".word 0x7c03e83b ; move %0, $3" : "=r" (tp) : : "$3" );
-#else
 	register char *tp __asm__("$3");
-	/* rdhwr $3,$29 */
-	__asm__ __volatile__ (".word 0x7c03e83b" : "=r" (tp) );
-#endif
+	__asm__ __volatile__ ("rdhwr %0,$29" : "=r" (tp));
 	return (pthread_t)(tp - 0x7000 - sizeof(struct pthread));
 }
 
diff --git a/arch/mips64/atomic_arch.h b/arch/mips64/atomic_arch.h
index b468fd9..ac92891 100644
--- a/arch/mips64/atomic_arch.h
+++ b/arch/mips64/atomic_arch.h
@@ -4,7 +4,7 @@ static inline int a_ll(volatile int *p)
 	int v;
 	__asm__ __volatile__ (
 		"ll %0, %1"
-		: "=r"(v) : "m"(*p));
+		: "=r"(v) : "ZC"(*p));
 	return v;
 }
 
@@ -14,7 +14,7 @@ static inline int a_sc(volatile int *p, int v)
 	int r;
 	__asm__ __volatile__ (
 		"sc %0, %1"
-		: "=r"(r), "=m"(*p) : "0"(v) : "memory");
+		: "=r"(r), "=ZC"(*p) : "0"(v) : "memory");
 	return r;
 }
 
@@ -24,7 +24,7 @@ static inline void *a_ll_p(volatile void *p)
 	void *v;
 	__asm__ __volatile__ (
 		"lld %0, %1"
-		: "=r"(v) : "m"(*(void *volatile *)p));
+		: "=r"(v) : "ZC"(*(void *volatile *)p));
 	return v;
 }
 
@@ -34,16 +34,14 @@ static inline int a_sc_p(volatile void *p, void *v)
 	long r;
 	__asm__ __volatile__ (
 		"scd %0, %1"
-		: "=r"(r), "=m"(*(void *volatile *)p) : "0"(v) : "memory");
+		: "=r"(r), "=ZC"(*(void *volatile *)p) : "0"(v) : "memory");
 	return r;
 }
 
 #define a_barrier a_barrier
 static inline void a_barrier()
 {
-	/* mips2 sync, but using too many directives causes
-	 * gcc not to inline it, so encode with .long instead. */
-	__asm__ __volatile__ (".long 0xf" : : : "memory");
+	__asm__ __volatile__ ("sync" : : : "memory");
 }
 
 #define a_pre_llsc a_barrier
diff --git a/arch/mips64/pthread_arch.h b/arch/mips64/pthread_arch.h
index b42edbe..d8b6955 100644
--- a/arch/mips64/pthread_arch.h
+++ b/arch/mips64/pthread_arch.h
@@ -1,12 +1,7 @@
 static inline struct pthread *__pthread_self()
 {
-#ifdef __clang__
-	char *tp;
-	__asm__ __volatile__ (".word 0x7c03e83b ; move %0, $3" : "=r" (tp) : : "$3" );
-#else
 	register char *tp __asm__("$3");
-	__asm__ __volatile__ (".word 0x7c03e83b" : "=r" (tp) );
-#endif
+	__asm__ __volatile__ ("rdhwr %0,$29" : "=r" (tp));
 	return (pthread_t)(tp - 0x7000 - sizeof(struct pthread));
 }
 
diff --git a/configure b/configure
index 213a825..969671d 100755
--- a/configure
+++ b/configure
@@ -612,11 +612,13 @@ trycppif __AARCH64EB__ "$t" && SUBARCH=${SUBARCH}_be
 fi
 
 if test "$ARCH" = "mips" ; then
+trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6
 trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el
 trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf
 fi
 
 if test "$ARCH" = "mips64" ; then
+trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6
 trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el
 trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf
 fi
-- 
2.1.4


Thanks,
Jaydeep



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-30  9:45                       ` Jaydeep Patil
@ 2016-03-30 14:29                         ` Rich Felker
  2016-03-30 15:28                           ` Rich Felker
  0 siblings, 1 reply; 18+ messages in thread
From: Rich Felker @ 2016-03-30 14:29 UTC (permalink / raw)
  To: Jaydeep Patil; +Cc: musl

On Wed, Mar 30, 2016 at 09:45:59AM +0000, Jaydeep Patil wrote:
> >> >> >The compiler has no way of knowing that the operand will be used
> >> >> >with ll with the 9-bit offset restriction; as far as it knows, it
> >> >> >will be used in a normal context where a 16-bit offset is valid. I
> >> >> >don't have a toolchain that will target r6, but you can try the
> >> >> >following program which produces an offset of 4096 for loading p[1024]:
> >> >> >
> >> >> >unsigned ll1k(volatile unsigned *p) {
> >> >> >	unsigned val;
> >> >> >	__asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) :
> >> >> >"memory" );
> >> >> >	return val;
> >> >> >}
> >> >> >
> >> >> >I would expect this to produce errors at assembly time on r6.
> >> >> >Rich
> >> >>
> >> >> This is what compiler has generated for above function:
> >> >>
> >> >> $ gcc -c -o main.o main.c -O3 -mips32r6 -mabi=32
> >> >>
> >> >> Objdump:
> >> >>
> >> >> 00000000 <ll1k>:
> >> >>    0:   24821000        addiu   v0,a0,4096
> >> >>    4:   7c420036        ll      v0,0(v0)
> >> >>    8:   d81f0000        jrc     ra
> >> >>    c:   00000000        nop
> >> >
> >> >Can you try gcc -S instead of -c (still at -O3) to produce asm output
> >> >without assembling it?
> >>
> >> Generated asssembly:
> >>
> >> #APP
> >>  # 4 "test.c" 1
> >>         ll $2, 4096($4)
> >>  # 0 "" 2
> >> #NO_APP
> >>         jrc     $31
> >>
> >> Even if we set "noreorder" before LL, assembler generates addiu+ll:
> >>
> >> 00000000 <ll1k>:
> >>    0:   24821000        addiu   v0,a0,4096
> >>    4:   7c420036        ll      v0,0(v0)
> >>    8:   d81f0000        jrc     ra
> >>    c:   00000000        nop
> >
> >I see. I suspected the assembler was doing it. "noat", not "noreorder", is the
> >way to suppress things like this but I doubt even "noat" does it since a
> >separate temp register ("at") is not needed in this case.
> >
> >If all assembers that support R6 support this rewriting, then the ZC constraint
> >in gcc is really just an optimization, not strictly necessary. We should probably
> >check (1) whether clang's internal assembler can do the rewriting, and (2)
> >whether clang supports the ZC constraint. I would prefer using ZC but I want
> >to do whatever is more compatible; I don't think the codegen efficiency
> >matters a lot either way.
> >Rich
> 
> Clang's integrated assembler does not support this rewriting. However ZC is supported.
> I have modified both atomic_arch.h and pthread_arch.h to reflect this. 
> Please refer to https://github.com/JaydeepIMG/musl-1/tree/fix_inline_asm_for_R6 for the patch (also listed below).
> I have also added R6 as subarch.
> 
> 
> 
> >From 20054ee55643d9e81163ca58ac63cc38b5080969 Mon Sep 17 00:00:00 2001
> From: Jaydeep Patil <jaydeep.patil@imgtec.com>
> Date: Wed, 30 Mar 2016 10:37:30 +0100
> Subject: [PATCH] [MIPS] Update inline asm for R6 and add R6 as subtarget
> 
> ---
>  arch/mips/atomic_arch.h    | 17 +++--------------
>  arch/mips/pthread_arch.h   |  8 +-------
>  arch/mips64/atomic_arch.h  | 12 +++++-------
>  arch/mips64/pthread_arch.h |  7 +------
>  configure                  |  2 ++
>  5 files changed, 12 insertions(+), 34 deletions(-)
> 
> diff --git a/arch/mips/atomic_arch.h b/arch/mips/atomic_arch.h
> index ce2823b..4dbe4bb 100644
> --- a/arch/mips/atomic_arch.h
> +++ b/arch/mips/atomic_arch.h
> @@ -3,10 +3,8 @@ static inline int a_ll(volatile int *p)
>  {
>  	int v;
>  	__asm__ __volatile__ (
> -		".set push ; .set mips2\n\t"
>  		"ll %0, %1"
> -		"\n\t.set pop"
> -		: "=r"(v) : "m"(*p));
> +		: "=r"(v) : "ZC"(*p));
>  	return v;
>  }

Did you see the proposed patch I sent? Your version does not work
because it removes support for pre-mips2. It also removed support for
gcc versions prior to 4.9, since "ZC" was not added until r6 support
was added (or maybe later). My version of the patch addresses both of
these issues by conditioning on __mips_isa_rev>=6.

>  #define a_pre_llsc a_barrier
> diff --git a/arch/mips/pthread_arch.h b/arch/mips/pthread_arch.h
> index 8a49965..d8b6955 100644
> --- a/arch/mips/pthread_arch.h
> +++ b/arch/mips/pthread_arch.h
> @@ -1,13 +1,7 @@
>  static inline struct pthread *__pthread_self()
>  {
> -#ifdef __clang__
> -	char *tp;
> -	__asm__ __volatile__ (".word 0x7c03e83b ; move %0, $3" : "=r" (tp) : : "$3" );
> -#else
>  	register char *tp __asm__("$3");
> -	/* rdhwr $3,$29 */
> -	__asm__ __volatile__ (".word 0x7c03e83b" : "=r" (tp) );
> -#endif
> +	__asm__ __volatile__ ("rdhwr %0,$29" : "=r" (tp));
>  	return (pthread_t)(tp - 0x7000 - sizeof(struct pthread));
>  }

This also needs to either be conditioned on __mips_isa_rev>=2 or
similar, or else the .set push/pop stuff needs to be used for
__mips_isa_rev<6.

>  #define a_barrier a_barrier
>  static inline void a_barrier()
>  {
> -	/* mips2 sync, but using too many directives causes
> -	 * gcc not to inline it, so encode with .long instead. */
> -	__asm__ __volatile__ (".long 0xf" : : : "memory");
> +	__asm__ __volatile__ ("sync" : : : "memory");
>  }

I think this should be okay, since all mips64 ISA levels have this
insn as far as I know.

> diff --git a/configure b/configure
> index 213a825..969671d 100755
> --- a/configure
> +++ b/configure
> @@ -612,11 +612,13 @@ trycppif __AARCH64EB__ "$t" && SUBARCH=${SUBARCH}_be
>  fi
>  
>  if test "$ARCH" = "mips" ; then
> +trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6
>  trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el
>  trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf
>  fi
>  
>  if test "$ARCH" = "mips64" ; then
> +trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6
>  trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el
>  trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf
>  fi

This looks ok, but a similar condition in arch/mips*/reloc.h is also
needed.

Since I've done most of the thinking about the above issues already
and have a patch for some of them, let me prepare a complete patch and
send it to the list for you to check and make sure it meets your needs
for r6. I should be able to prepare it very quickly. Then we can look
at applying the same changes to the n32 port and reviewing it.

Rich


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-30 14:29                         ` Rich Felker
@ 2016-03-30 15:28                           ` Rich Felker
  2016-03-31  5:20                             ` Jaydeep Patil
  0 siblings, 1 reply; 18+ messages in thread
From: Rich Felker @ 2016-03-30 15:28 UTC (permalink / raw)
  To: Jaydeep Patil; +Cc: musl

[-- Attachment #1: Type: text/plain, Size: 784 bytes --]

On Wed, Mar 30, 2016 at 10:29:26AM -0400, Rich Felker wrote:
> Since I've done most of the thinking about the above issues already
> and have a patch for some of them, let me prepare a complete patch and
> send it to the list for you to check and make sure it meets your needs
> for r6. I should be able to prepare it very quickly. Then we can look
> at applying the same changes to the n32 port and reviewing it.

Can you see if the attached patch works for you? It not only adds r6
support but improves support for non-baseline (i.e. > mips1) ISA
levels by optimizing out the unnecessary .set's (which hurt gcc's
inlining, because gcc is dumb about them) and lifts the $3 register
restriction on rdhwr for ISA levels where the instructions are known
to be available natively.

Rich

[-- Attachment #2: mipsr6.diff --]
[-- Type: text/plain, Size: 5927 bytes --]

diff --git a/arch/mips/atomic_arch.h b/arch/mips/atomic_arch.h
index ce2823b..1248d17 100644
--- a/arch/mips/atomic_arch.h
+++ b/arch/mips/atomic_arch.h
@@ -1,12 +1,24 @@
+#if __mips_isa_rev < 6
+#define LLSC_M "m"
+#else
+#define LLSC_M "ZC"
+#endif
+
 #define a_ll a_ll
 static inline int a_ll(volatile int *p)
 {
 	int v;
+#if __mips < 2
 	__asm__ __volatile__ (
 		".set push ; .set mips2\n\t"
 		"ll %0, %1"
 		"\n\t.set pop"
 		: "=r"(v) : "m"(*p));
+#else
+	__asm__ __volatile__ (
+		"ll %0, %1"
+		: "=r"(v) : LLSC_M(*p));
+#endif
 	return v;
 }
 
@@ -14,26 +26,33 @@ static inline int a_ll(volatile int *p)
 static inline int a_sc(volatile int *p, int v)
 {
 	int r;
+#if __mips < 2
 	__asm__ __volatile__ (
 		".set push ; .set mips2\n\t"
 		"sc %0, %1"
 		"\n\t.set pop"
 		: "=r"(r), "=m"(*p) : "0"(v) : "memory");
+#else
+	__asm__ __volatile__ (
+		"sc %0, %1"
+		: "=r"(r), "="LLSC_M(*p) : "0"(v) : "memory");
+#endif
 	return r;
 }
 
 #define a_barrier a_barrier
 static inline void a_barrier()
 {
+#if __mips < 2
 	/* mips2 sync, but using too many directives causes
 	 * gcc not to inline it, so encode with .long instead. */
 	__asm__ __volatile__ (".long 0xf" : : : "memory");
-#if 0
-	__asm__ __volatile__ (
-		".set push ; .set mips2 ; sync ; .set pop"
-		: : : "memory");
+#else
+	__asm__ __volatile__ ("sync" : : : "memory");
 #endif
 }
 
 #define a_pre_llsc a_barrier
 #define a_post_llsc a_barrier
+
+#undef LLSC_M
diff --git a/arch/mips/pthread_arch.h b/arch/mips/pthread_arch.h
index 8a49965..e581265 100644
--- a/arch/mips/pthread_arch.h
+++ b/arch/mips/pthread_arch.h
@@ -1,12 +1,11 @@
 static inline struct pthread *__pthread_self()
 {
-#ifdef __clang__
-	char *tp;
-	__asm__ __volatile__ (".word 0x7c03e83b ; move %0, $3" : "=r" (tp) : : "$3" );
-#else
+#if __mips_isa_rev < 2
 	register char *tp __asm__("$3");
-	/* rdhwr $3,$29 */
 	__asm__ __volatile__ (".word 0x7c03e83b" : "=r" (tp) );
+#else
+	char *tp;
+	__asm__ __volatile__ ("rdhwr %0, $29" : "=r" (tp) );
 #endif
 	return (pthread_t)(tp - 0x7000 - sizeof(struct pthread));
 }
diff --git a/arch/mips/reloc.h b/arch/mips/reloc.h
index 8c52df0..b3d59a4 100644
--- a/arch/mips/reloc.h
+++ b/arch/mips/reloc.h
@@ -1,5 +1,11 @@
 #include <endian.h>
 
+#if __mips_isa_rev >= 6
+#define ISA_SUFFIX "r6"
+#else
+#define ISA_SUFFIX ""
+#endif
+
 #if __BYTE_ORDER == __LITTLE_ENDIAN
 #define ENDIAN_SUFFIX "el"
 #else
@@ -12,7 +18,7 @@
 #define FP_SUFFIX ""
 #endif
 
-#define LDSO_ARCH "mips" ENDIAN_SUFFIX FP_SUFFIX
+#define LDSO_ARCH "mips" ISA_SUFFIX ENDIAN_SUFFIX FP_SUFFIX
 
 #define TPOFF_K (-0x7000)
 
diff --git a/arch/mips64/atomic_arch.h b/arch/mips64/atomic_arch.h
index b468fd9..d0f8b4a 100644
--- a/arch/mips64/atomic_arch.h
+++ b/arch/mips64/atomic_arch.h
@@ -1,10 +1,16 @@
+#if __mips_isa_rev < 6
+#define LLSC_M "m"
+#else
+#define LLSC_M "ZC"
+#endif
+
 #define a_ll a_ll
 static inline int a_ll(volatile int *p)
 {
 	int v;
 	__asm__ __volatile__ (
 		"ll %0, %1"
-		: "=r"(v) : "m"(*p));
+		: "=r"(v) : LLSC_M(*p));
 	return v;
 }
 
@@ -14,7 +20,7 @@ static inline int a_sc(volatile int *p, int v)
 	int r;
 	__asm__ __volatile__ (
 		"sc %0, %1"
-		: "=r"(r), "=m"(*p) : "0"(v) : "memory");
+		: "=r"(r), "="LLSC_M(*p) : "0"(v) : "memory");
 	return r;
 }
 
@@ -24,7 +30,7 @@ static inline void *a_ll_p(volatile void *p)
 	void *v;
 	__asm__ __volatile__ (
 		"lld %0, %1"
-		: "=r"(v) : "m"(*(void *volatile *)p));
+		: "=r"(v) : LLSC_M(*(void *volatile *)p));
 	return v;
 }
 
@@ -34,17 +40,17 @@ static inline int a_sc_p(volatile void *p, void *v)
 	long r;
 	__asm__ __volatile__ (
 		"scd %0, %1"
-		: "=r"(r), "=m"(*(void *volatile *)p) : "0"(v) : "memory");
+		: "=r"(r), "="LLSC_M(*(void *volatile *)p) : "0"(v) : "memory");
 	return r;
 }
 
 #define a_barrier a_barrier
 static inline void a_barrier()
 {
-	/* mips2 sync, but using too many directives causes
-	 * gcc not to inline it, so encode with .long instead. */
-	__asm__ __volatile__ (".long 0xf" : : : "memory");
+	__asm__ __volatile__ ("sync" : : : "memory");
 }
 
 #define a_pre_llsc a_barrier
 #define a_post_llsc a_barrier
+
+#undef LLSC_M
diff --git a/arch/mips64/pthread_arch.h b/arch/mips64/pthread_arch.h
index b42edbe..e581265 100644
--- a/arch/mips64/pthread_arch.h
+++ b/arch/mips64/pthread_arch.h
@@ -1,11 +1,11 @@
 static inline struct pthread *__pthread_self()
 {
-#ifdef __clang__
-	char *tp;
-	__asm__ __volatile__ (".word 0x7c03e83b ; move %0, $3" : "=r" (tp) : : "$3" );
-#else
+#if __mips_isa_rev < 2
 	register char *tp __asm__("$3");
 	__asm__ __volatile__ (".word 0x7c03e83b" : "=r" (tp) );
+#else
+	char *tp;
+	__asm__ __volatile__ ("rdhwr %0, $29" : "=r" (tp) );
 #endif
 	return (pthread_t)(tp - 0x7000 - sizeof(struct pthread));
 }
diff --git a/arch/mips64/reloc.h b/arch/mips64/reloc.h
index 5933147..bbd9bd9 100644
--- a/arch/mips64/reloc.h
+++ b/arch/mips64/reloc.h
@@ -4,6 +4,12 @@
 #define _GNU_SOURCE
 #include <endian.h>
 
+#if __mips_isa_rev >= 6
+#define ISA_SUFFIX "r6"
+#else
+#define ISA_SUFFIX ""
+#endif
+
 #if __BYTE_ORDER == __LITTLE_ENDIAN
 #define ENDIAN_SUFFIX "el"
 #else
@@ -16,7 +22,7 @@
 #define FP_SUFFIX ""
 #endif
 
-#define LDSO_ARCH "mips64" ENDIAN_SUFFIX FP_SUFFIX
+#define LDSO_ARCH "mips64" ISA_SUFFIX ENDIAN_SUFFIX FP_SUFFIX
 
 #define TPOFF_K (-0x7000)
 
diff --git a/configure b/configure
index 213a825..969671d 100755
--- a/configure
+++ b/configure
@@ -612,11 +612,13 @@ trycppif __AARCH64EB__ "$t" && SUBARCH=${SUBARCH}_be
 fi
 
 if test "$ARCH" = "mips" ; then
+trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6
 trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el
 trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf
 fi
 
 if test "$ARCH" = "mips64" ; then
+trycppif "__mips_isa_rev >= 6" "$t" && SUBARCH=${SUBARCH}r6
 trycppif "_MIPSEL || __MIPSEL || __MIPSEL__" "$t" && SUBARCH=${SUBARCH}el
 trycppif __mips_soft_float "$t" && SUBARCH=${SUBARCH}-sf
 fi

^ permalink raw reply	[flat|nested] 18+ messages in thread

* RE: [PATCH] Fix atomic_arch.h for MIPS32 R6
  2016-03-30 15:28                           ` Rich Felker
@ 2016-03-31  5:20                             ` Jaydeep Patil
  0 siblings, 0 replies; 18+ messages in thread
From: Jaydeep Patil @ 2016-03-31  5:20 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

>-----Original Message-----
>From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
>Sent: 30 March 2016 PM 08:58
>To: Jaydeep Patil
>Cc: musl@lists.openwall.com
>Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
>
>On Wed, Mar 30, 2016 at 10:29:26AM -0400, Rich Felker wrote:
>> Since I've done most of the thinking about the above issues already
>> and have a patch for some of them, let me prepare a complete patch and
>> send it to the list for you to check and make sure it meets your needs
>> for r6. I should be able to prepare it very quickly. Then we can look
>> at applying the same changes to the n32 port and reviewing it.
>
>Can you see if the attached patch works for you? It not only adds r6 support
>but improves support for non-baseline (i.e. > mips1) ISA levels by optimizing
>out the unnecessary .set's (which hurt gcc's inlining, because gcc is dumb
>about them) and lifts the $3 register restriction on rdhwr for ISA levels where
>the instructions are known to be available natively.

The patch works fine. I have tried it for mips32 r2-r6, mips64 r2-r6, micromips32 r2-r5.
I have tried it with both GCC and Clang 3.9.0 compilers.

>Rich

Thanks,
Jaydeep




^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2016-03-31  5:20 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-21  6:03 [PATCH] Fix atomic_arch.h for MIPS32 R6 Jaydeep Patil
2016-03-21 17:37 ` dalias
2016-03-22  4:58   ` Jaydeep Patil
2016-03-22 21:22     ` Rich Felker
2016-03-23  6:37       ` Jaydeep Patil
2016-03-23 15:03         ` Rich Felker
2016-03-28  5:07           ` Jaydeep Patil
2016-03-28 13:04             ` Rich Felker
2016-03-29  2:19               ` Rich Felker
2016-03-29  3:54               ` Jaydeep Patil
2016-03-29  4:10                 ` Rich Felker
2016-03-29  7:16                   ` Jaydeep Patil
2016-03-29 13:32                     ` Rich Felker
2016-03-30  9:45                       ` Jaydeep Patil
2016-03-30 14:29                         ` Rich Felker
2016-03-30 15:28                           ` Rich Felker
2016-03-31  5:20                             ` Jaydeep Patil
2016-03-29  3:55               ` Jaydeep Patil

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).