Date: Wed, 8 Dec 2021 08:37:13 -0500
From: Rich Felker
To: Stijn Tintel
Cc: musl@lists.openwall.com
Message-ID: <20211208133712.GT7074@brightrain.aerifal.cx>
References: <20211206234358.2174444-1-stijn@linux-ipv6.be> <87tufljlmv.fsf@oldenburg.str.redhat.com> <20211207005940.GK7074@brightrain.aerifal.cx>
Subject: Re: [musl] [PATCH] ppc64: check for AltiVec in setjmp/longjmp

On Wed, Dec 08, 2021 at 10:43:05AM +0200, Stijn Tintel wrote:
> On 7/12/2021 02:59, Rich Felker wrote:
> > On Tue, Dec 07, 2021 at 01:37:12AM +0100, Florian Weimer wrote:
> >> * Stijn Tintel:
> >>
> >>> diff --git a/src/setjmp/powerpc64/setjmp.s b/src/setjmp/powerpc64/setjmp.s
> >>> index 37683fda..32853693 100644
> >>> --- a/src/setjmp/powerpc64/setjmp.s
> >>> +++ b/src/setjmp/powerpc64/setjmp.s
> >>> @@ -69,7 +69,17 @@ __setjmp_toc:
> >>>  	stfd 30, 38*8(3)
> >>>  	stfd 31, 39*8(3)
> >>>
> >>> -	# 5) store vector registers v20-v31
> >>> +	# 5) store vector registers v20-v31 if hardware supports AltiVec
> >>> +	mflr 0
> >>> +	bl 1f
> >>> +	.hidden __hwcap
> >>> +	.long __hwcap-.
> >>> +1:	mflr 4
> >> This de-balances the return stack and probably has quite severe
> >> performance impact. The ISA manual says to use
> >>
> >> 	bcl 20,31,$+4
> >>
> >> and you'll have to store the __hwcap offset somewhere else.
> > To begin with, let's change the .s files to .S files and put the whole
> > branch logic inside #ifndef __ALTIVEC__ so that it does not impact
> > normal builds with an ISA level where Altivec can be assumed to be
> > present.
> >
> > I'm not sufficiently familiar with the PowerPC ISA to know how bcl
> > works, but if there's a less expensive solution along those lines
> > that's compatible with all ISA levels, by all means let's use it. The
> > same could be done for powerpc-sf (32-bit) and its SPE branches, too.
> >
> > Also the add and lwz can be used into lwzx (indexed load).
> >
> The code for ppc64 uses ld after add, not lwz. This is required to make
> it work on both big and little endian systems. We therefore cannot use
> lwzx, but have to use ldx.

OK, I don't understand why endianness would matter, but I do see a
problem here: ld expects to load a 64-bit value, but the value is only
32-bit (.long). Unless I'm missing something, we need to either make
it 64-bit (.llong, and with proper alignment) or use a sign-extending
32-bit load. The latter would assume a model where the whole program
(for static linking) or libc.so (for dynamic) fits in ±2GB. This is
clearly valid for dynamic but dubious for static (although maybe GCC
already assumes this with how it loads the GOT address and DSO-local
globals?).

Rich
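To make the width mismatch concrete, here is a rough C model of it. It
is only a sketch, not the setjmp code itself: struct slot, load64 and
load32_sext are hypothetical names made up for illustration, and
next_insn is just a placeholder word standing in for whatever the
assembler emits right after the .long.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the bytes produced by `.long __hwcap-.`
 * followed by the next 32-bit word in the text section. */
struct slot {
    int32_t  offset;     /* the 32-bit PC-relative offset */
    uint32_t next_insn;  /* placeholder for the following instruction */
};

/* Models a 64-bit load (ld/ldx): reads 8 bytes starting at the slot,
 * so it also picks up the 4 bytes that follow the .long. */
int64_t load64(const struct slot *s)
{
    int64_t v;
    memcpy(&v, s, sizeof v);
    return v;
}

/* Models a sign-extending 32-bit load (e.g. lwa/lwax on 64-bit
 * PowerPC): reads only the 4-byte offset and widens it to 64 bits. */
int64_t load32_sext(const struct slot *s)
{
    int32_t v;
    memcpy(&v, &s->offset, sizeof v);
    return (int64_t)v;
}

/* Small self-check: the 32-bit load recovers the offset, while the
 * 64-bit load does not, regardless of host byte order. */
void check_slot(void)
{
    struct slot s = { .offset = -0x1234, .next_insn = 0x7c8802a6u };
    assert(sizeof s == 8);
    assert(load32_sext(&s) == -0x1234);
    assert(load64(&s) != -0x1234);
}
```

On a little-endian host the 64-bit read puts the offset in the low half
and the following word in the high half; on big-endian it is the other
way around. Neither gives back the offset, which is why the slot has to
either be widened to 64 bits or be read with a 32-bit sign-extending
load.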