From: David Edelsohn
Date: Tue, 7 Dec 2021 09:48:31 -0500
To: Rich Felker
Cc: musl@lists.openwall.com, Florian Weimer, Stijn Tintel
Subject: Re: [musl] [PATCH] ppc64: check for AltiVec in setjmp/longjmp
In-Reply-To: <20211207144334.GP7074@brightrain.aerifal.cx>
References: <20211206234358.2174444-1-stijn@linux-ipv6.be>
 <87tufljlmv.fsf@oldenburg.str.redhat.com>
 <20211207005940.GK7074@brightrain.aerifal.cx>
 <20211207013930.GM7074@brightrain.aerifal.cx>
 <20211207132509.GO7074@brightrain.aerifal.cx>
 <20211207144334.GP7074@brightrain.aerifal.cx>

On Tue, Dec 7, 2021 at 9:43 AM Rich Felker wrote:
>
> On Tue, Dec 07, 2021 at 08:39:20AM -0500, David Edelsohn wrote:
> > On Tue, Dec 7, 2021 at 8:25 AM Rich Felker wrote:
> > >
> > > On Mon, Dec 06, 2021 at 08:44:47PM -0500, David Edelsohn wrote:
> > > > On Mon, Dec 6, 2021 at 8:39 PM Rich Felker wrote:
> > > > >
> > > > > On Mon, Dec 06, 2021 at 08:15:48PM -0500, David Edelsohn wrote:
> > > > > > On Mon, Dec 6, 2021 at 7:59 PM Rich Felker wrote:
> > > > > > >
> > > > > > > On Tue, Dec 07, 2021 at 01:37:12AM +0100, Florian Weimer wrote:
> > > > > > > > * Stijn Tintel:
> > > > > > > >
> > > > > > > > > diff --git a/src/setjmp/powerpc64/setjmp.s b/src/setjmp/powerpc64/setjmp.s
> > > > > > > > > index 37683fda..32853693 100644
> > > > > > > > > --- a/src/setjmp/powerpc64/setjmp.s
> > > > > > > > > +++ b/src/setjmp/powerpc64/setjmp.s
> > > > > > > > > @@ -69,7 +69,17 @@ __setjmp_toc:
> > > > > > > > >  	stfd 30, 38*8(3)
> > > > > > > > >  	stfd 31, 39*8(3)
> > > > > > > > >
> > > > > > > > > -	# 5) store vector registers v20-v31
> > > > > > > > > +	# 5) store vector registers v20-v31 if hardware supports AltiVec
> > > > > > > > > +	mflr 0
> > > > > > > > > +	bl 1f
> > > > > > > > > +	.hidden __hwcap
> > > > > > > > > +	.long __hwcap-.
> > > > > > > > > +1:	mflr 4
> > > > > > > >
> > > > > > > > This de-balances the return stack and probably has quite severe
> > > > > > > > performance impact. The ISA manual says to use
> > > > > > > >
> > > > > > > >   bcl 20,31,$+4
> > > > > > > >
> > > > > > > > and you'll have to store the __hwcap offset somewhere else.
> > > > > > >
> > > > > > > To begin with, let's change the .s files to .S files and put the whole
> > > > > > > branch logic inside #ifndef __ALTIVEC__ so that it does not impact
> > > > > > > normal builds with an ISA level where Altivec can be assumed to be
> > > > > > > present.
> > > > > > >
> > > > > > > I'm not sufficiently familiar with the PowerPC ISA to know how bcl
> > > > > > > works, but if there's a less expensive solution along those lines
> > > > > > > that's compatible with all ISA levels, by all means let's use it. The
> > > > > > > same could be done for powerpc-sf (32-bit) and its SPE branches, too.
> > > > > >
> > > > > > bl = branch and link
> > > > > > bcl = branch conditional and link
> > > > > >
> > > > > > link means place the next instruction address in the link register.
> > > > > > Normally a branch and link would be used for a matching "return"
> > > > > > instruction, but in this case it is being used to compute a position
> > > > > > independent code address. As Florian correctly points out, the "bl"
> > > > > > will corrupt the link stack in the processor used to predict return
> > > > > > addresses and the recommended sequence is the one that he suggests.
> > > > > >
> > > > > > bcl 20,31,addr
> > > > > >
> > > > > > which means branch always and, because the condition register bits are
> > > > > > irrelevant, a special value that instructs the processor to not push
> > > > > > the address onto the link stack so that the "calls" and "returns"
> > > > > > remain matched.
> > > > >
> > > > > Thanks. Am I correct in understanding then that we don't need $+4, but
> > > > > can instead use the 1f just as now, with inline .long __hwcap-. -- in
> > > > > other words that "bcl 20,31," is a drop-in replacement for "bl"
> > > > > without the link stack impact?
> > > >
> > > > It should work, but it's slightly preferred to use $+4 because one
> > > > explicitly wants the address of the next instruction and labels of the
> > >
> > > In this case we don't want the address of the next instruction. We
> > > want the address of the constant __hwcap-.
> >
> > .hidden __hwcap
> >
> > is not an instruction. It will not emit any data.
>
> Of course it won't. .long __hwcap-. is the directive that does, on the
> next line, which you seem to have missed.

I'm sorry that you don't understand what I am expressing. Snide
comments are not productive. Do what you want.

Thanks, David