mailing list of musl libc
From: Andre Renaud <andre@bluewatersys.com>
To: musl@lists.openwall.com
Subject: Re: Thinking about release
Date: Wed, 24 Jul 2013 16:40:16 +1200
Message-ID: <CAPfzE3b3w+wS2RCE_nncAp6c2_TN6Qh9JZWe-At26mqSd+YLjQ@mail.gmail.com>
In-Reply-To: <20130724034843.GP3249@brightrain.aerifal.cx>

Hi Rich,
> It looks buggy as-is; as far as I can tell, it will crash if src/dest
> are aligned with respect to each other but not aligned mod 4, i.e. the
> code starts out copying word-at-a-time rather than byte-at-a-time.

Yes, you are correct, I'd messed that up while looking at the cache
alignment stuff (along with another small size-related bug). Fixing it
is relatively straightforward though:
#define SS sizeof(size_t)
#define ALIGN (SS - 1)
void * noinline my_asm_memcpy(void * restrict dest,
                              const void * restrict src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    if (((uintptr_t)d & ALIGN) != ((uintptr_t)s & ALIGN))
        goto misaligned;

    /* Get them word aligned */
    for (; ((uintptr_t)d & ALIGN) && n; n--) *d++ = *s++;

    /* ARM has 32-byte cache lines, so align to that for performance */
    for (; ((uintptr_t)d & ((8 * SS) - 1)) && n >= SS; n-=SS) {
            *(size_t *)d = *(size_t *)s;
            d += SS;
            s += SS;
    }
    /* Do full cache line read/writes */
    for (; n>=(8 * SS); n-= (8 * SS))
            __asm__ __volatile__(
                            "ldmia %1!,{a4,v1,v2,v3,v4,v5,v6,v7}\n\t"
                            "ldrhi r12, [%1]\n"
                            "stmia %0!,{a4,v1,v2,v3,v4,v5,v6,v7}\n\t"
                            : "+r"(d), "+r"(s) :
                            : "a4", "v1", "v2", "v3", "v4", "v5",
                              "v6", "v7", "r12", "memory");

misaligned:
        for (; n; n--) *d++ = *s++;
    return dest;

}
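
One way to exercise the case Rich describes (src/dest congruent to each
other but not aligned mod 4) is to run every small offset and length
combination through the routine. The following is only a sketch:
sketch_memcpy and check_all_alignments are hypothetical names, and the
ldmia/stmia block is replaced by a plain word loop so it builds on any
host. To test the real routine, swap the call for my_asm_memcpy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SS sizeof(size_t)
#define ALIGN (SS - 1)

/* Hypothetical portable stand-in: same control flow as the routine above,
 * with the cache-line asm block replaced by a word-at-a-time loop. */
static void *sketch_memcpy(void *restrict dest, const void *restrict src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Word copies are only used when both pointers share the same offset
     * within a word; otherwise everything is copied byte by byte. */
    if (((uintptr_t)d & ALIGN) == ((uintptr_t)s & ALIGN)) {
        /* Byte-copy until d (and therefore s) is word aligned. */
        for (; ((uintptr_t)d & ALIGN) && n; n--) *d++ = *s++;
        /* Copy whole words; fixed-size memcpy sidesteps aliasing rules. */
        for (; n >= SS; n -= SS, d += SS, s += SS) {
            size_t w;
            memcpy(&w, s, SS);
            memcpy(d, &w, SS);
        }
    }
    /* Tail bytes, or the whole copy in the non-congruent case. */
    for (; n; n--) *d++ = *s++;
    return dest;
}

/* Returns 0 if every (dest offset, src offset, length) combination copies
 * exactly n bytes and leaves the bytes around the destination untouched. */
static int check_all_alignments(void)
{
    enum { MAXOFF = 16, MAXLEN = 80, GUARD = 16 };
    static unsigned char src[MAXOFF + MAXLEN];
    static unsigned char dst[GUARD + MAXOFF + MAXLEN + GUARD];

    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 7 + 1);

    for (size_t doff = 0; doff < MAXOFF; doff++)
    for (size_t soff = 0; soff < MAXOFF; soff++)
    for (size_t n = 0; n <= MAXLEN; n++) {
        unsigned char *d = dst + GUARD + doff;
        memset(dst, 0xAA, sizeof dst);
        if (sketch_memcpy(d, src + soff, n) != d) return 1;
        if (memcmp(d, src + soff, n) != 0) return 1;
        /* Anything outside [d, d+n) must still hold the fill pattern. */
        for (size_t i = 0; i < sizeof dst; i++) {
            int inside = i >= GUARD + doff && i < GUARD + doff + n;
            if (!inside && dst[i] != 0xAA) return 1;
        }
    }
    return 0;
}
```

Calling check_all_alignments() from a small main and checking for a zero
return covers all 16 offsets on both sides, including the congruent but
unaligned ones.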

> I think the C version would be acceptable if we get the bugs fixed and
> test it well, but I'd also like to still keep the asm under
> consideration. There are lots of cases not covered by the C version,
> like misaligned copies (important for strings, not for much else). Do
> you think these cases are important?

At the moment the misaligned copies perform terribly (18MB/s vs.
100MB/s for glibc). However, the existing C implementation in musl is
no different, so we're not degrading the current system.

We're essentially missing the non-congruent copying stuff from the asm
code. I'll have a look at this and see if I can write a similar C
version.
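
A common technique for non-congruent copies is to align the destination,
read aligned source words, and merge adjacent words with shifts. The
sketch below is an illustration of that idea, not the asm code referred
to above: shift_copy, is_little_endian and check_shift_copy are
hypothetical names, it assumes 32-bit words, and it only takes the word
path on little-endian hosts. It uses fixed-size memcpy calls for the word
loads and stores to stay within C aliasing rules; inside a libc these
would have to be direct word accesses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Runtime endianness probe so the sketch stays correct on any host. */
static int is_little_endian(void)
{
    uint32_t one = 1;
    unsigned char b;
    memcpy(&b, &one, 1);
    return b == 1;
}

/* Hypothetical helper: copy n bytes, using shift-and-merge when the source
 * is not word aligned once the destination is. */
static void *shift_copy(void *restrict dest, const void *restrict src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Byte-copy until the destination is 4-byte aligned. */
    for (; ((uintptr_t)d & 3) && n; n--) *d++ = *s++;

    unsigned off = (unsigned)((uintptr_t)s & 3); /* source offset in its word */
    if (off && n >= 8 && is_little_endian()) {
        unsigned rs = 8 * off, ls = 32 - rs;     /* both in {8, 16, 24} */
        uint32_t lo = 0, hi, out;

        /* Prime lo with the 4-off bytes before the next aligned source
         * word, placed where an aligned load would have put them. */
        memcpy((unsigned char *)&lo + off, s, 4 - off);

        /* n >= 8 guarantees the aligned load below stays inside the source. */
        for (; n >= 8; n -= 4, d += 4, s += 4) {
            memcpy(&hi, s + 4 - off, 4);         /* aligned source word */
            out = (lo >> rs) | (hi << ls);       /* equals bytes s[0..3] */
            memcpy(d, &out, 4);                  /* aligned destination word */
            lo = hi;
        }
    }
    /* Tail bytes; also the whole copy when off == 0 in this sketch. */
    for (; n; n--) *d++ = *s++;
    return dest;
}

/* Returns 0 if shift_copy matches a byte-exact copy for every small
 * (dest offset, src offset, length) combination. */
static int check_shift_copy(void)
{
    enum { MAXOFF = 8, MAXLEN = 64, GUARD = 8 };
    static unsigned char src[MAXOFF + MAXLEN];
    static unsigned char dst[GUARD + MAXOFF + MAXLEN + GUARD];

    for (size_t i = 0; i < sizeof src; i++)
        src[i] = (unsigned char)(i * 13 + 5);

    for (size_t doff = 0; doff < MAXOFF; doff++)
    for (size_t soff = 0; soff < MAXOFF; soff++)
    for (size_t n = 0; n <= MAXLEN; n++) {
        unsigned char *d = dst + GUARD + doff;
        memset(dst, 0x55, sizeof dst);
        if (shift_copy(d, src + soff, n) != d) return 1;
        if (memcmp(d, src + soff, n) != 0) return 1;
        for (size_t i = 0; i < sizeof dst; i++) {
            int inside = i >= GUARD + doff && i < GUARD + doff + n;
            if (!inside && dst[i] != 0x55) return 1;
        }
    }
    return 0;
}
```

The first partial word is loaded bytewise rather than by reading the
aligned word that contains s[0], since that word can start before the
source buffer, which plain C does not permit even though the asm can get
away with it.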

Regards,
Andre



