From: David Edelsohn
Date: Tue, 7 Dec 2021 08:39:20 -0500
To: Rich Felker
Cc: musl@lists.openwall.com, Florian Weimer, Stijn Tintel
Subject: Re: [musl] [PATCH] ppc64: check for AltiVec in setjmp/longjmp
References: <20211206234358.2174444-1-stijn@linux-ipv6.be>
 <87tufljlmv.fsf@oldenburg.str.redhat.com>
 <20211207005940.GK7074@brightrain.aerifal.cx>
 <20211207013930.GM7074@brightrain.aerifal.cx>
 <20211207132509.GO7074@brightrain.aerifal.cx>
In-Reply-To: <20211207132509.GO7074@brightrain.aerifal.cx>

On Tue, Dec 7, 2021 at 8:25 AM Rich Felker wrote:
>
> On Mon, Dec 06, 2021 at 08:44:47PM -0500, David Edelsohn wrote:
> > On Mon, Dec 6, 2021 at 8:39 PM Rich Felker wrote:
> > >
> > > On Mon, Dec 06, 2021 at 08:15:48PM -0500, David Edelsohn wrote:
> > > > On Mon, Dec 6, 2021 at 7:59 PM Rich Felker wrote:
> > > > >
> > > > > On Tue, Dec 07, 2021 at 01:37:12AM +0100, Florian Weimer wrote:
> > > > > > * Stijn Tintel:
> > > > > >
> > > > > > > diff --git a/src/setjmp/powerpc64/setjmp.s b/src/setjmp/powerpc64/setjmp.s
> > > > > > > index 37683fda..32853693 100644
> > > > > > > --- a/src/setjmp/powerpc64/setjmp.s
> > > > > > > +++ b/src/setjmp/powerpc64/setjmp.s
> > > > > > > @@ -69,7 +69,17 @@ __setjmp_toc:
> > > > > > >  	stfd 30, 38*8(3)
> > > > > > >  	stfd 31, 39*8(3)
> > > > > > >
> > > > > > > -	# 5) store vector registers v20-v31
> > > > > > > +	# 5) store vector registers v20-v31 if hardware supports AltiVec
> > > > > > > +	mflr 0
> > > > > > > +	bl 1f
> > > > > > > +	.hidden __hwcap
> > > > > > > +	.long __hwcap-.
> > > > > > > +1:	mflr 4
> > > > > >
> > > > > > This de-balances the return stack and probably has quite severe
> > > > > > performance impact.  The ISA manual says to use
> > > > > >
> > > > > >     bcl 20,31,$+4
> > > > > >
> > > > > > and you'll have to store the __hwcap offset somewhere else.
> > > > >
> > > > > To begin with, let's change the .s files to .S files and put the whole
> > > > > branch logic inside #ifndef __ALTIVEC__ so that it does not impact
> > > > > normal builds with an ISA level where Altivec can be assumed to be
> > > > > present.
> > > > >
> > > > > I'm not sufficiently familiar with the PowerPC ISA to know how bcl
> > > > > works, but if there's a less expensive solution along those lines
> > > > > that's compatible with all ISA levels, by all means let's use it. The
> > > > > same could be done for powerpc-sf (32-bit) and its SPE branches, too.
> > > >
> > > > bl = branch and link
> > > > bcl = branch conditional and link
> > > >
> > > > link means place the next instruction address in the link register.
> > > > Normally a branch and link would be used for a matching "return"
> > > > instruction, but in this case it is being used to compute a position
> > > > independent code address.  As Florian correctly points out, the "bl"
> > > > will corrupt the link stack in the processor used to predict return
> > > > addresses and the recommended sequence is the one that he suggests.
> > > >
> > > >     bcl 20,31,addr
> > > >
> > > > which means branch always and, because the condition register bits are
> > > > irrelevant, a special value that instructs the processor to not push
> > > > the address onto the link stack so that the "calls" and "returns"
> > > > remain matched.
> > >
> > > Thanks. Am I correct in understanding then that we don't need $+4, but
> > > can instead use the 1f just as now, with inline .long __hwcap-. -- in
> > > other words that "bcl 20,31," is a drop-in replacement for "bl"
> > > without the link stack impact?
> >
> > It should work, but it's slightly preferred to use $+4 because one
> > explicitly wants the address of the next instruction and labels of the
>
> In this case we don't want the address of the next instruction. We
> want the address of the constant __hwcap-.

.hidden __hwcap is not an instruction.  It will not emit any data.

Thanks, David
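
P.S. For anyone following the link stack argument quoted above, here is a toy model of it in Python. It is only a sketch and is not taken from the patch or from any processor documentation: the class name LinkStack, the function run, and all addresses are made up for illustration, and real implementations differ in depth and replacement policy. It assumes that "bl" always pushes its return address onto the predictor stack and that the "bcl 20,31" form does not. Whether a given core applies that special case to targets other than $+4 is not modeled.

```python
# Toy model of a return-address predictor ("link stack").
# Hypothetical and simplified: it only counts mispredicted returns.

class LinkStack:
    def __init__(self):
        self.stack = []
        self.mispredicts = 0

    def call(self, return_addr, push=True):
        # Assumption: "bl" pushes its return address (push=True), while the
        # "bcl 20,31" form is treated as not pushing (push=False).
        if push:
            self.stack.append(return_addr)

    def ret(self, actual_target):
        # "blr": the predicted target is whatever sits on top of the stack.
        predicted = self.stack.pop() if self.stack else None
        if predicted != actual_target:
            self.mispredicts += 1


def run(get_pc_pushes):
    """Return the number of mispredicted returns for one setjmp call."""
    ls = LinkStack()
    ls.call(0x1000)                      # caller: bl outer, returns to 0x1000
    ls.call(0x2000)                      # outer: bl setjmp, returns to 0x2000
    ls.call(0x3008, push=get_pc_pushes)  # in setjmp: branch-and-link only to read the PC
    ls.ret(0x2000)                       # setjmp returns to outer
    ls.ret(0x1000)                       # outer returns to caller
    return ls.mispredicts


if __name__ == "__main__":
    print("PC read pushes (bl-like):     ", run(get_pc_pushes=True))
    print("PC read does not push (bcl):  ", run(get_pc_pushes=False))
```

In this model the extra push from the PC-reading branch shifts every later prediction by one entry, so both enclosing returns are mispredicted. Without the push, calls and returns stay matched.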