Date: Mon, 6 Dec 2021 20:39:32 -0500
From: Rich Felker
To: David Edelsohn
Cc: musl@lists.openwall.com, Florian Weimer, Stijn Tintel
Subject: Re: [musl] [PATCH] ppc64: check for AltiVec in setjmp/longjmp

On Mon, Dec 06, 2021 at 08:15:48PM -0500, David Edelsohn wrote:
> On Mon, Dec 6, 2021 at 7:59 PM Rich Felker wrote:
> >
> > On Tue, Dec 07, 2021 at 01:37:12AM +0100, Florian Weimer wrote:
> > > * Stijn Tintel:
> > >
> > > > diff --git a/src/setjmp/powerpc64/setjmp.s b/src/setjmp/powerpc64/setjmp.s
> > > > index 37683fda..32853693 100644
> > > > --- a/src/setjmp/powerpc64/setjmp.s
> > > > +++ b/src/setjmp/powerpc64/setjmp.s
> > > > @@ -69,7 +69,17 @@ __setjmp_toc:
> > > >          stfd 30, 38*8(3)
> > > >          stfd 31, 39*8(3)
> > > >
> > > > -        # 5) store vector registers v20-v31
> > > > +        # 5) store vector registers v20-v31 if hardware supports AltiVec
> > > > +        mflr 0
> > > > +        bl 1f
> > > > +        .hidden __hwcap
> > > > +        .long __hwcap-.
> > > > +1:      mflr 4
> > >
> > > This de-balances the return stack and probably has quite severe
> > > performance impact. The ISA manual says to use
> > >
> > >   bcl 20,31,$+4
> > >
> > > and you'll have to store the __hwcap offset somewhere else.
> >
> > To begin with, let's change the .s files to .S files and put the whole
> > branch logic inside #ifndef __ALTIVEC__ so that it does not impact
> > normal builds with an ISA level where Altivec can be assumed to be
> > present.
> >
> > I'm not sufficiently familiar with the PowerPC ISA to know how bcl
> > works, but if there's a less expensive solution along those lines
> > that's compatible with all ISA levels, by all means let's use it. The
> > same could be done for powerpc-sf (32-bit) and its SPE branches, too.
>
> bl = branch and link
> bcl = branch conditional and link
>
> link means place the next instruction address in the link register.
> Normally a branch and link would be used for a matching "return"
> instruction, but in this case it is being used to compute a position
> independent code address. As Florian correctly points out, the "bl"
> will corrupt the link stack in the processor used to predict return
> addresses and the recommended sequence is the one that he suggests.
>
> bcl 20,31,addr
>
> which means branch always and, because the condition register bits are
> irrelevant, a special value that instructs the processor to not push
> the address onto the link stack so that the "calls" and "returns"
> remain matched.

Thanks. Am I correct in understanding then that we don't need $+4, but
can instead use the 1f just as now, with inline .long __hwcap-. -- in
other words that "bcl 20,31," is a drop-in replacement for "bl" without
the link stack impact?

Rich