Date: Wed, 8 Dec 2021 06:02:03 +0100
From: Markus Wichmann
To: Rich Felker
Cc: Florian Weimer, musl@lists.openwall.com
Message-ID: <20211208050203.GD8506@voyager>
In-Reply-To: <20211207202920.GQ7074@brightrain.aerifal.cx>
Subject: Re: [musl] [PATCH] ppc64: check for AltiVec in setjmp/longjmp

On Tue, Dec 07, 2021 at 03:29:21PM -0500, Rich Felker wrote:
> In general I would prefer the "obvious what it's doing" form over the
> "special cased for performance" form in places where performance can't
> matter -- for example, the ones you cited that execute once per
> program invocation. But if it's easy to read either way, fine -- and
> it probably can be made so.
>

I foresee no issue with readability.
Indeed most avid PPC assembly readers will recognize "bcl 20,31" as
"just getting the instruction pointer" sooner than "bl", but the
functions in question are so small it doesn't really matter either way.

> Note that if the __hwcap-. constant is moved out of line, I think it's
> possible to avoid any added cost. Something along the lines of the
> following:
>
> 	bcl 20,31,1f
> 1:	mflr 4
> 	lwz 5,2f-1b(4)
> 	lwzx 4,4,5
> 	...
> 2:	.long __hwcap-1b
>
> Does this look right?

Seems right to me. David's warning made me remember an article I read
once about branch prediction and cache instructions:

Basically, cache instructions have no execution phase; that is,
architecturally they have no effect. They change no memory and no
registers, only an implementation detail that ought to be transparent
to the programmer. So if a branch is mispredicted to hit a given cache
instruction, that cache instruction will be executed to the fullest even
if the pipeline is flushed (a pipeline flush simply skips the execution
phase, which cache instructions don't have).

Now, the XBox 360 CPU had a special cache instruction (I believe it was
"xdcbt" or so), which could circumvent the L2 cache. Unfortunately, all
access synchronization between CPUs happens through the L2 cache.
Therefore this instruction should not be used on memory that can be
shared between CPUs, which is pretty much all memory in user space (any
thread might be preempted and migrated at any time, so not even the
stack is safe). And with the above-mentioned branch prediction drama,
the instruction can cause issues if it merely shows up in the
instruction stream, even if it is ultimately never executed. They had to
remove every instance of this instruction from their programs to get the
issues to disappear.

Now with your hwcap pointer, you have no idea what instruction it will
end up looking like.
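As a rough illustration of that last point (a sketch only, nothing from
musl; all helper names here are made up): a PowerPC decoder selects the
primary opcode from the six most significant bits of a 32-bit word, and
the data cache instructions such as dcbt and dcbz live under primary
opcode 31 with a 10-bit extended opcode. So one can at least check what a
given value of the `.long` word would decode as:

```python
# Hypothetical helpers (not part of musl) that look at a 32-bit word
# the way a PowerPC instruction decoder would.

# Extended opcodes of some X-form data cache instructions (primary opcode 31).
CACHE_XO = {54: "dcbst", 86: "dcbf", 246: "dcbtst", 278: "dcbt", 1014: "dcbz"}


def offset_word(offset):
    """Return the 32-bit pattern that `.long offset` stores for a signed offset."""
    return offset & 0xFFFFFFFF


def primary_opcode(word):
    """Return the primary opcode: the 6 most significant bits of the word."""
    return (word & 0xFFFFFFFF) >> 26


def extended_opcode(word):
    """Return the 10-bit extended opcode field used by X-form instructions."""
    return (word >> 1) & 0x3FF


def cache_op_name(word):
    """Return the mnemonic if the word decodes as a listed cache op, else None."""
    if primary_opcode(word) != 31:
        return None
    return CACHE_XO.get(extended_opcode(word))
```

For instance, `cache_op_name(0x7C001A2C)` gives `"dcbt"` (that word is
`dcbt 0,3`), while `primary_opcode(offset_word(-0x1234))` is 63. Which
value the real `__hwcap-1b` word takes depends on the final link layout,
so this only shows how to inspect it, not what it will be.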
But if we put the pointer into .rodata, the offset between labels 2 and
1 might be larger than 32kB, which no longer fits into the signed 16-bit
displacement of lwz and would make the code more complicated. You could
put "b ." in front of it, to stop any mispredicted execution from
running into it. I don't know, you figure it out.

Ciao,
Markus