From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/9770
Path: news.gmane.org!not-for-mail
From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: [PATCH] Fix atomic_arch.h for MIPS32 R6
Date: Tue, 29 Mar 2016 09:32:54 -0400
Message-ID: <20160329133254.GM21636@brightrain.aerifal.cx>
References: <20160321173754.GC21636@brightrain.aerifal.cx>
 <20160322212211.GG21636@brightrain.aerifal.cx>
 <20160323150302.GK21636@brightrain.aerifal.cx>
 <20160328130451.GH21636@brightrain.aerifal.cx>
 <20160329041055.GL21636@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: ger.gmane.org 1459258411 19354 80.91.229.3 (29 Mar 2016 13:33:31 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Tue, 29 Mar 2016 13:33:31 +0000 (UTC)
Cc: "musl@lists.openwall.com"
To: Jaydeep Patil
Original-X-From: musl-return-9783-gllmg-musl=m.gmane.org@lists.openwall.com Tue Mar 29 15:33:29 2016
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
 by plane.gmane.org with smtp (Exim 4.69) (envelope-from )
 id 1aktlv-0005yN-02 for gllmg-musl@m.gmane.org; Tue, 29 Mar 2016 15:33:27 +0200
Original-Received: (qmail 27856 invoked by uid 550); 29 Mar 2016 13:33:12 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
Original-Received: (qmail 27838 invoked from network); 29 Mar 2016 13:33:11 -0000
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
Original-Sender: Rich Felker
Xref: news.gmane.org gmane.linux.lib.musl.general:9770

On Tue, Mar 29, 2016 at 07:16:46AM +0000, Jaydeep Patil wrote:
> >-----Original Message-----
> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
> >Sent: 29 March 2016 AM 09:41
> >To: Jaydeep Patil
> >Cc: musl@lists.openwall.com
> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
> >
> >On Tue, Mar 29, 2016 at 03:54:02AM +0000, Jaydeep Patil wrote:
> >> >-----Original Message-----
> >> >From: Rich Felker [mailto:dalias@aerifal.cx] On Behalf Of Rich Felker
> >> >Sent: 28 March 2016 PM 06:35
> >> >To: Jaydeep Patil
> >> >Cc: musl@lists.openwall.com
> >> >Subject: Re: [musl] [PATCH] Fix atomic_arch.h for MIPS32 R6
> >> >
> >> >On Mon, Mar 28, 2016 at 05:07:39AM +0000, Jaydeep Patil wrote:
> >> >> >> >I was just saying it makes the code less cluttered to use them
> >> >> >> >spuriously even though we don't need to:
> >> >> >> >
> >> >> >> >	".set push ; "
> >> >> >> >#if __mips_isa_rev < 6
> >> >> >> >	".set mips2 ; "
> >> >> >> >#endif
> >> >> >> >	"ll %0, %1 ; .set pop"
> >> >> >> >
> >> >> >> >or similar.
> >> >> >> >
> >> >> >> >It's also not clear to me whether the "m" constraint is valid
> >> >> >> >anymore for the R6 ll/sc instructions since they take a 9-bit
> >> >> >> >offset now instead of a 16-bit offset.
> >> >> >> >The compiler could generate an address expression whose offset
> >> >> >> >part does not fit in 9 bits. In that case we may need to #if
> >> >> >> >the whole function (or at least the __asm__ statement)
> >> >> >> >separately rather than just skipping the .set mips2....
> >> >> >> >
> >> >> >>
> >> >> >> The "m" constraint is still valid here, as the offset will be 0 in this case.
> >> >> >
> >> >> >How can you assume the offset will be 0? It's the compiler's
> >> >> >choice what to use. For instance, a_cas(&foo->bar, t, s) is likely
> >> >> >to have an offset equal to offsetof(__typeof__(foo),bar). AFAIK
> >> >> >this happens in practice with small offsets in mutex structures,
> >> >> >etc. so the bug may be unlikely to be hit, but I think it's still
> >> >> >an incorrect-constraint bug.
> >> >>
> >> >> Compiler generates appropriate LL/SC based on the offset.
> >> >> Compiler adds the offset to the base register if it does not fit in 9 bits.
> >> >
> >> >The compiler has no way of knowing that the operand will be used with
> >> >ll with the 9-bit offset restriction; as far as it knows, it will be
> >> >used in a normal context where a 16-bit offset is valid. I don't have
> >> >a toolchain that will target r6, but you can try the following
> >> >program which produces an offset of 4096 for loading p[1024]:
> >> >
> >> >unsigned ll1k(volatile unsigned *p)
> >> >{
> >> >	unsigned val;
> >> >	__asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "m"(p[1024]) : "memory" );
> >> >	return val;
> >> >}
> >> >
> >> >I would expect this to produce errors at assembly time on r6.
> >> >Rich
> >>
> >> This is what the compiler has generated for the above function:
> >>
> >> $ gcc -c -o main.o main.c -O3 -mips32r6 -mabi=32
> >>
> >> Objdump:
> >>
> >> 00000000 :
> >>    0: 24821000 addiu v0,a0,4096
> >>    4: 7c420036 ll v0,0(v0)
> >>    8: d81f0000 jrc ra
> >>    c: 00000000 nop
> >
> >Can you try gcc -S instead of -c (still at -O3) to produce asm output without
> >assembling it?
>
> Generated assembly:
>
> #APP
> # 4 "test.c" 1
> 	ll $2, 4096($4)
> # 0 "" 2
> #NO_APP
> 	jrc $31
>
> Even if we set "noreorder" before LL, the assembler generates addiu+ll:
>
> 00000000 :
>    0: 24821000 addiu v0,a0,4096
>    4: 7c420036 ll v0,0(v0)
>    8: d81f0000 jrc ra
>    c: 00000000 nop

I see. I suspected the assembler was doing it. "noat", not "noreorder",
is the way to suppress things like this, but I doubt even "noat" does
it since a separate temp register ("at") is not needed in this case.

If all assemblers that support R6 support this rewriting, then the ZC
constraint in gcc is really just an optimization, not strictly
necessary. We should probably check (1) whether clang's internal
assembler can do the rewriting, and (2) whether clang supports the ZC
constraint.
I would prefer using ZC but I want to do whatever is more
compatible; I don't think the codegen efficiency matters a lot either
way.

Rich
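
For reference, here is a minimal sketch of how the two variants discussed
in this thread might sit side by side. It is not musl's actual
atomic_arch.h: the names ll_sketch and ll1k_sketch are hypothetical, and
the non-MIPS branch is only an assumption added so the sketch builds on
any GCC or clang host. Whether clang accepts "ZC" is exactly the open
question above, so treat the R6 branch as GCC-only until that is checked.

```c
/* Hypothetical helper for illustration; not musl's actual atomic_arch.h. */
static inline unsigned ll_sketch(volatile unsigned *p)
{
	unsigned val;
#if defined(__mips__) && defined(__mips_isa_rev) && __mips_isa_rev >= 6
	/* R6: "ZC" asks GCC for a memory operand whose address ll/sc can
	 * encode directly, so no assembler rewriting is relied upon. */
	__asm__ __volatile__ ("ll %0, %1" : "=r"(val) : "ZC"(*p) : "memory");
#elif defined(__mips__)
	/* Pre-R6: ll takes a 16-bit offset, so a plain "m" operand is used. */
	__asm__ __volatile__ (
		".set push ; .set mips2 ; ll %0, %1 ; .set pop"
		: "=r"(val) : "m"(*p) : "memory");
#else
	/* Assumed non-MIPS fallback so the sketch builds anywhere:
	 * a plain relaxed load via the GCC/clang __atomic builtin. */
	val = __atomic_load_n(p, __ATOMIC_RELAXED);
#endif
	return val;
}

/* Same shape as ll1k from the thread: the operand is element 1024 of p,
 * i.e. 4096 bytes past p when unsigned is 4 bytes wide. */
unsigned ll1k_sketch(volatile unsigned *p)
{
	return ll_sketch(&p[1024]);
}
```

With "ZC" the compiler should emit the addiu itself rather than leaving
it to the assembler; with "m" on R6 the result depends on the assembler
doing the rewriting shown in the objdump output above.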