mailing list of musl libc
From: Rich Felker <dalias@aerifal.cx>
To: musl@lists.openwall.com
Subject: Re: Optimized C memset [v2]
Date: Tue, 27 Aug 2013 21:24:33 -0400
Message-ID: <20130828012433.GZ20515@brightrain.aerifal.cx>
In-Reply-To: <CAPfzE3ZiR0xngueXCgFSsO0mHFo5wOUjRZ=b3gVKQRgpF4yTpA@mail.gmail.com>

On Wed, Aug 28, 2013 at 12:05:43PM +1200, Andre Renaud wrote:
> Hi Rich,
> 
> On 28 August 2013 04:22, Rich Felker <dalias@aerifal.cx> wrote:
> > Here's version 2 (filename version 6, in honor of glibc ;) of the
> > memset code. I fixed a bug in the logic for coverage of the tail (the
> > part past what's covered by the loop) for some values of n and
> > alignments, and cleaned up the __GNUC__ usage a bit to use less
> > #ifdeffery. The remaining test at the top for the __GNUC__ version is
> > ugly, I admit, and should possibly just be removed and replaced by a
> > configure check to add -D__may_alias__= to the CFLAGS if the compiler
> > defines __GNUC__ but does not recognize __attribute__((__may_alias__))
> > -- opinions on this?
> 
> Can you explain the algorithm a bit - I can't entirely follow the use
> of negation/masking, but it looks like at the end you're doing a loop
> of 64-bit aligned writes, but I don't see how it can work if the tail
> end ends in something that isn't 64-bit aligned? Is this assuming that
> unaligned writes will work ok?

See the version I committed a couple hours ago. It has comments added.
The basic thing you're missing is that the code before the loop fills
from both the beginning and the end, not just the beginning. This
allows for a really effective O(log n) branch strategy to fill n
bytes: essentially, knowing n>=k allows you to fill up to 2*k bytes:
0,1,...,k-1 and n-1,n-2,n-3,...,n-k. If n<2*k, some of these will
overlap, but it doesn't matter.
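
To make the head/tail idea concrete, here is a minimal sketch using byte
stores only. It is an illustration written for this explanation, not the
committed code: the name fill_both_ends is hypothetical, and the plain loop
for larger n stands in for the aligned word loop that a real implementation
would use. Each test on n establishes a lower bound k, and after storing k
bytes from each end everything up to n = 2*k is covered.

```c
#include <stddef.h>

/* Hypothetical illustration of filling from both ends; not musl's memset. */
static void *fill_both_ends(void *dest, int c, size_t n)
{
	unsigned char *s = dest;
	unsigned char b = (unsigned char)c;

	if (n == 0) return dest;

	/* n >= 1: one store from each end covers every n <= 2. */
	s[0] = s[n-1] = b;
	if (n <= 2) return dest;

	/* n >= 3: bytes 0..2 and n-3..n-1 are now set, covering n <= 6.
	 * For n < 6 some stores overlap, which is harmless. */
	s[1] = s[n-2] = b;
	s[2] = s[n-3] = b;
	if (n <= 6) return dest;

	/* n >= 7: bytes 0..3 and n-4..n-1 are set, covering n <= 8. */
	s[3] = s[n-4] = b;
	if (n <= 8) return dest;

	/* Assumption for this sketch: fill the remaining middle bytes with a
	 * simple loop instead of aligned wide stores. */
	for (size_t i = 4; i < n - 4; i++) s[i] = b;
	return dest;
}
```

No branch depends on where the tail ends: the last bytes are written
relative to n before the middle is touched, so the middle fill only has to
cover what the end stores left out.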

Rich

