From: Rich Felker <dalias@libc.org>
To: Denys Vlasenko <vda.linux@googlemail.com>
Cc: musl <musl@lists.openwall.com>
Subject: Re: [PATCH] x86_64/memset: use "small block" code for blocks up to 30 bytes long
Date: Tue, 17 Feb 2015 11:12:22 -0500
Message-ID: <20150217161222.GF23507@brightrain.aerifal.cx>
In-Reply-To: <CAK1hOcOEznkzfBGDJwWvdaNXTKwEiiz88r=9tEQF--T=CQvXJg@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 580 bytes --]
On Tue, Feb 17, 2015 at 02:08:52PM +0100, Denys Vlasenko wrote:
> >> Please see attached file.
> >
> > I tried it and it's ~1 cycle slower for at least sizes 16-30;
> > presumably we're seeing the cost of the extra compare/branch at these
> > sizes but not at others. What does your timing test show?
>
> See below.
> First column - result of my2.s
> Second column - result of vda1.s
>
> Basically, the "rep stosq" code path got a bit faster, while
> small memsets stayed the same.
Can you post your test program for me to try out? Here's what I've
been using, attached.
Rich
[-- Attachment #2: memset-cycles.c --]
[-- Type: text/plain, Size: 1274 bytes --]
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <time.h>
#include <stdlib.h>
#include <string.h>

/* Low 32 bits of the timestamp counter on x86; elsewhere, the
 * nanosecond field of the process CPU-time clock. */
static inline unsigned rdtsc()
{
#if defined __i386__ || defined __x86_64__
	unsigned x;
	__asm__ __volatile__ ( "rdtsc" : "=a"(x) : : "rdx" );
//	__asm__ __volatile__ ( "cpuid ; rdtsc" : "=a"(x)
//		: : "rbx", "rcx", "rdx" );
	return x;
#else
	struct timespec ts;
	clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts);
	return ts.tv_nsec;
#endif
}

char buf[32768+100];

int main()
{
	unsigned a=0;
	unsigned i, j, t, tmin=-1;
	unsigned long long tmean=0;
	unsigned overhead = -1;
	size_t n;

	/* Estimate the fixed cost of a pair of rdtsc reads (minimum seen). */
	for (i=0; i<0+1*4096; i++) {
		t = rdtsc();
		__asm__ __volatile__("nop");
		t = rdtsc()-t;
		if (t < overhead) overhead = t;
	}
	//overhead = 0;

	/* Sizes: step 2 below 64, step 32 below 512, then doubling. */
	for (n=2; n<32768; n+=(n<64 ? 2 : n<512 ? 32 : n)) {
		tmin = -1;
		tmean = 0;
		for (i=0; i<0+1*4096; i++) {
			__asm__ __volatile__ ("" : : : "memory");
			t = rdtsc();
			/* Time 64 back-to-back calls; the barrier keeps each one. */
			for (j=0; j<64; j++) {
				memset(buf, 0, n);
				__asm__ __volatile__ ("" : : : "memory");
			}
			t = rdtsc()-t;
			__asm__ __volatile__ ("" : : : "memory");
			if (t < tmin) tmin = t;
			tmean += t;
		}
		/* Remove timer overhead, then reduce to cycles per call. */
		tmin -= overhead;
		tmean -= 4096*overhead;
		tmin /= 64;
		tmean /= 64;
		tmean /= 4096;
		printf("size %zu: min=%u, avg=%llu\n", n, tmin, tmean);
	}
}
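The size schedule and the final reduction in memset-cycles.c can be read in isolation. The following is only a sketch of that arithmetic; next_size, count_sizes, struct per_call and reduce are hypothetical names introduced here for illustration and do not appear in the attachment.

```c
#include <stddef.h>

/* Hypothetical helper, not part of memset-cycles.c: next size in the
 * schedule used by the loop over n (step 2 below 64, step 32 below
 * 512, doubling from there on). */
static size_t next_size(size_t n)
{
	return n + (n < 64 ? 2 : n < 512 ? 32 : n);
}

/* Hypothetical helper: number of sizes visited for 2 <= n < limit. */
static unsigned count_sizes(size_t limit)
{
	unsigned count = 0;
	for (size_t n = 2; n < limit; n = next_size(n)) count++;
	return count;
}

struct per_call { unsigned min; unsigned long long avg; };

/* Hypothetical helper: reduce raw cycle deltas, each covering `calls`
 * back-to-back memset calls, to per-call figures. The timer overhead
 * is subtracted once per sample before dividing, mirroring the
 * arithmetic at the end of the outer loop. Assumes samples > 0. */
static struct per_call reduce(const unsigned *t, size_t samples,
                              unsigned calls, unsigned overhead)
{
	unsigned tmin = -1;
	unsigned long long sum = 0;
	for (size_t i = 0; i < samples; i++) {
		if (t[i] < tmin) tmin = t[i];
		sum += t[i];
	}
	struct per_call r;
	r.min = (tmin - overhead) / calls;
	r.avg = (sum - (unsigned long long)samples * overhead) / calls / samples;
	return r;
}
```

With the program's limit of 32768, the schedule stops at 16384, so the largest size is well inside buf. Because the overhead is subtracted once per 64-call sample, any error in the overhead estimate is divided by 64 in the per-call figures.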