From: Rich Felker
To: Denys Vlasenko
Cc: musl
Subject: Re: [PATCH] x86_64/memset: use "small block" code for blocks up to 30 bytes long
Date: Tue, 17 Feb 2015 12:40:45 -0500
Reply-To: musl@lists.openwall.com
Message-ID: <20150217174045.GH23507@brightrain.aerifal.cx>

On Tue, Feb 17, 2015 at 05:51:11PM +0100, Denys Vlasenko wrote:
> On Tue, Feb 17, 2015 at 5:12 PM, Rich Felker wrote:
> > On Tue, Feb 17, 2015 at 02:08:52PM +0100, Denys Vlasenko wrote:
> >> >> Please see attached file.
> >> >
> >> > I tried it and it's ~1 cycle slower for at least sizes 16-30;
> >> > presumably we're seeing the cost of the extra compare/branch at
> >> > these sizes but not at others. What does your timing test show?
> >>
> >> See below.
> >> First column - result of my2.s
> >> Second column - result of vda1.s
> >>
> >> Basically, the "rep stosq" code path got a bit faster, while
> >> small memsets stayed the same.
> >
> > Can you post your test program for me to try out? Here's what I've
> > been using, attached.
>
> With your program I see similar results:
>
> ....
> size 50: min=10, avg=10    min=10, avg=10
> size 52: min=10, avg=10    min=10, avg=10

The ... was the part where mine seemed better. :)

Anyway, thanks; I'll give your test program a run and see what comes
out. I don't think the difference is going to be big either way, but
I suspect mine is slightly faster for small sizes (~1-30) and
slightly slower for large sizes (>126).

BTW, I appreciate your work and interest in improving this. I just
don't like string-op optimization in general, because determining
whether changes are actually a net gain for a wide range of CPUs and
use cases, and not just for one benchmark, turns into a big time
sink. :-( But at least it's fun...

Rich
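
[The test programs attached in this thread aren't reproduced in the
archive. A minimal sketch of the kind of per-size cycle-counting
harness being discussed -- hypothetical, rdtsc-based, and not either
of the actual attachments -- might look like the following. It prints
one "size N: min=..., avg=..." column; comparing two memset
implementations as in the quoted output would mean running the same
loop over each and printing the results side by side.]

#include <stdio.h>
#include <string.h>
#include <stdint.h>

/* Read the x86 time-stamp counter. rdtsc is not a serializing
 * instruction; a stricter harness would pair it with cpuid or use
 * rdtscp. The "memory" clobber keeps the compiler from moving the
 * timed memset across the reads. */
static inline uint64_t rdtsc(void)
{
	uint32_t lo, hi;
	__asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi) : : "memory");
	return (uint64_t)hi << 32 | lo;
}

static unsigned char buf[512];

int main(void)
{
	enum { REPS = 100000 };
	for (size_t size = 1; size <= 128; size++) {
		uint64_t min = (uint64_t)-1, total = 0;
		for (int i = 0; i < REPS; i++) {
			uint64_t t0 = rdtsc();
			memset(buf, i & 0xff, size);
			uint64_t t1 = rdtsc();
			uint64_t d = t1 - t0;
			if (d < min) min = d;
			total += d;
		}
		/* min filters out interrupts and other noise; avg shows
		 * the steady-state cost once branches are predicted. */
		printf("size %zu: min=%llu, avg=%llu\n", size,
		       (unsigned long long)min,
		       (unsigned long long)(total / REPS));
	}
	return 0;
}

[Built with something like "gcc -O2 memset-bench.c"; to benchmark a
candidate memset.s rather than the libc one, the assembly file would
be linked in and called under a different symbol name so the compiler
can't substitute its builtin.]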