Date: Mon, 6 Dec 2021 19:59:41 -0500
From: Rich Felker
To: Florian Weimer
Cc: Stijn Tintel, musl@lists.openwall.com
Message-ID: <20211207005940.GK7074@brightrain.aerifal.cx>
References: <20211206234358.2174444-1-stijn@linux-ipv6.be> <87tufljlmv.fsf@oldenburg.str.redhat.com>
In-Reply-To: <87tufljlmv.fsf@oldenburg.str.redhat.com>
Subject: Re: [musl] [PATCH] ppc64: check for AltiVec in setjmp/longjmp

On Tue, Dec 07, 2021 at 01:37:12AM +0100, Florian Weimer wrote:
> * Stijn Tintel:
>
> > diff --git a/src/setjmp/powerpc64/setjmp.s b/src/setjmp/powerpc64/setjmp.s
> > index 37683fda..32853693 100644
> > --- a/src/setjmp/powerpc64/setjmp.s
> > +++ b/src/setjmp/powerpc64/setjmp.s
> > @@ -69,7 +69,17 @@ __setjmp_toc:
> >      stfd 30, 38*8(3)
> >      stfd 31, 39*8(3)
> >
> > -    # 5) store vector registers v20-v31
> > +    # 5) store vector registers v20-v31 if hardware supports AltiVec
> > +    mflr 0
> > +    bl 1f
> > +    .hidden __hwcap
> > +    .long __hwcap-.
> > +1:  mflr 4
>
> This de-balances the return stack and probably has quite severe
> performance impact.  The ISA manual says to use
>
>   bcl 20,31,$+4
>
> and you'll have to store the __hwcap offset somewhere else.

To begin with, let's change the .s files to .S files and put the whole
branch logic inside #ifndef __ALTIVEC__ so that it does not impact
normal builds with an ISA level where Altivec can be assumed to be
present.

I'm not sufficiently familiar with the PowerPC ISA to know how bcl
works, but if there's a less expensive solution along those lines
that's compatible with all ISA levels, by all means let's use it.

The same could be done for powerpc-sf (32-bit) and its SPE branches,
too.

Also the add and lwz can be combined into lwzx (indexed load).

Rich
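
As a rough illustration of the gating proposed above, here is a minimal
C sketch, not the assembly itself. It assumes the runtime check boils
down to testing the AltiVec bit in the auxiliary-vector hwcap word. The
patch reads musl's internal __hwcap; getauxval(AT_HWCAP) stands in for
it here. The names sketch_have_altivec and SKETCH_HWCAP_ALTIVEC are
hypothetical, and the bit value is assumed to match
PPC_FEATURE_HAS_ALTIVEC from the Linux powerpc headers.

```c
#include <sys/auxv.h>   /* getauxval, AT_HWCAP */

/* Assumption: same value as PPC_FEATURE_HAS_ALTIVEC in the Linux powerpc
 * headers, repeated under a hypothetical name so the sketch builds on
 * other hosts too. */
#define SKETCH_HWCAP_ALTIVEC 0x10000000UL

/* Hypothetical helper: returns 1 if the vector registers should be
 * saved and restored, 0 otherwise. */
static int sketch_have_altivec(void)
{
#if defined(__ALTIVEC__)
    /* The ISA level guarantees AltiVec: no runtime check, no branch. */
    return 1;
#elif defined(__powerpc__) || defined(__powerpc64__)
    /* Baseline PowerPC build: consult the hwcap bits at run time. */
    return (getauxval(AT_HWCAP) & SKETCH_HWCAP_ALTIVEC) != 0;
#else
    /* Not PowerPC: the bit means something else here, so report absent. */
    return 0;
#endif
}
```

The point of the compile-time branch is that builds targeting an
AltiVec-capable ISA level never see the runtime test at all, which is
what wrapping the branch logic in #ifndef __ALTIVEC__ would achieve in
the .S files.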