From: Rich Felker
Subject: Re: Thinking about release
Date: Thu, 11 Jul 2013 23:16:15 -0400
To: musl@lists.openwall.com

On Fri, Jul 12, 2013 at 10:34:31AM +1200, Andre Renaud wrote:
> I've rejiggled it a bit, and it appears to be working. I wasn't
> entirely sure what you meant about the proper constraints. There is an
> additional reason why 8*4 was used for the align - to force the whole
> loop to work in cache-line blocks. I've now done this explicitly on
> the lead-in by doing the first few copies as 32-bit, then going to the
> full cache-line asm. This has the same performance as the fully native
> assembler. However, to get that I had to use the same trick that the
> native assembler uses - doing a load of the next block prior to
> storing this one. I'm a bit concerned that this would mean we'd be
> doing a read that was out of bounds, and I can't entirely see why this
> wouldn't be happening with the existing assembler (but I'm presuming
> it doesn't). Any comments on this side of it?

I was unable to measure any performance difference between your
version with the prefetch hack and simply using:

__asm__ __volatile__(
	"ldmia %1!,{a4,v1,v2,v3,v4,v5,v6,v7}\n\t"
	"stmia %0!,{a4,v1,v2,v3,v4,v5,v6,v7}\n\t"
	: "+r"(d), "+r"(s)
	:
	: "a4", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "memory");

in the inner loop.

Rich
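
[Editorial note: for reference, below is a minimal sketch of the copy
structure discussed in this thread - a 32-bit lead-in to reach
cache-line alignment, then the 8-word ldmia/stmia inner loop quoted
above, then a byte tail. It assumes dest and src are both 4-byte
aligned and that the cache line is 32 bytes; the function name
block_copy and the scalar lead-in/tail code are illustrative only, not
musl's actual memcpy. Note that since this variant loads each block
only right before storing it, it never reads past the end of the
source buffer, unlike the prefetch trick that prompted the
out-of-bounds concern above.]

#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch, not musl's memcpy. Assumes dest and src are
 * both 4-byte aligned; ARM, GCC-style inline asm. */
static void block_copy(void *dest, const void *src, size_t n)
{
	uint32_t *d = dest;
	const uint32_t *s = src;

	/* Lead-in: copy 32-bit words until d reaches a 32-byte
	 * (cache-line) boundary. */
	while (((uintptr_t)d & 31) && n >= 4) {
		*d++ = *s++;
		n -= 4;
	}

	/* Inner loop: load and store one 8-word block (one 32-byte
	 * cache line) per iteration, post-incrementing d and s. */
	while (n >= 32) {
		__asm__ __volatile__(
			"ldmia %1!,{a4,v1,v2,v3,v4,v5,v6,v7}\n\t"
			"stmia %0!,{a4,v1,v2,v3,v4,v5,v6,v7}\n\t"
			: "+r"(d), "+r"(s)
			:
			: "a4", "v1", "v2", "v3", "v4",
			  "v5", "v6", "v7", "memory");
		n -= 32;
	}

	/* Tail: copy any remaining bytes one at a time. */
	unsigned char *dc = (unsigned char *)d;
	const unsigned char *sc = (const unsigned char *)s;
	while (n--) *dc++ = *sc++;
}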