Re: Query regarding malloc if statement


From: Markus Wichmann <nullplan@gmx.net>
To: musl@lists.openwall.com
Subject: Re: Query regarding malloc if statement
Date: Tue, 20 Jun 2017 06:14:29 +0200
Message-ID: <20170620041429.zjmzwpeyycwwpcvr@voyager>
In-Reply-To: <CY4PR02MB2231108AB4127E78B5AB87EA82C40@CY4PR02MB2231.namprd02.prod.outlook.com>

On Mon, Jun 19, 2017 at 09:02:00PM +0000, Jamie Mccrae wrote:
> My understanding is that doing a read followed by a possible write is slower than always doing a write for the reason that upon doing a read the process will halt
> until the memory is brought into the CPU's cache which isn't a problem when just doing a write. I've just thrown together a simple application to test this (testing on a modern PC running alpine linux 64-bit in a virtualbox VM with 512MB RAM and 1 CPU core) with a normal musl library and a modified one whereby I've removed the 'if' check:
> 

Woah, you're mixing up a few things here. A cache miss and a page fault
are two very different things.

Besides, doesn't a cache miss on write mean that a cache-line for the
write area has to be allocated first?

> #include <time.h>
> #include <stdlib.h>
> #include <stdio.h>
> #include <stdint.h>
> 
> void TimedFunc()
> {
>     uint32_t loops = 64;
>     uint32_t *ptr;
>     while (loops > 0)
>     {
>         ptr = calloc(64, 2);
>         free(ptr);
>         --loops;
>     }
> }
> 
> void main()
> {
>     clock_t stime, etime;
>     stime = clock();
> 
>     uint32_t runs = 0;
>     while (runs < 16384)
>     {
>         TimedFunc();
>         ++runs;
>     }
> 
>     etime = clock();
>     printf("%d loops in %d ms\r\n", runs, ((etime - stime) * 1000 / CLOCKS_PER_SEC));
> }
> 

Hmm... looks about right (except for "void main", but let's not be
pedantic here). But, as I said, the whole thing only works if brk() is
disabled. If you don't want to recompile your kernel, you can use a
seccomp filter to disallow that system call. This forces musl to fall
back to allocating heap with mmap().
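
For anyone wanting to try the seccomp route, here is a minimal sketch of such a filter. The helper name deny_brk() is made up for illustration and is not part of musl or the kernel API. It assumes a Linux kernel with seccomp filter support, and it skips the architecture check that a production filter should perform, so treat it as a starting point rather than a finished tool.

```c
#include <errno.h>
#include <stddef.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

/* Hypothetical helper (not part of musl): make every later brk() call
 * fail with ENOMEM. Returns 0 on success, -1 if the filter could not be
 * installed. Assumption: the process only uses its native syscall ABI,
 * because the filter does not check seccomp_data.arch. */
static int deny_brk(void)
{
    struct sock_filter filter[] = {
        /* Load the system call number from struct seccomp_data. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* brk: fall through to the ERRNO return; anything else: skip it. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_brk, 0, 1),
        BPF_STMT(BPF_RET | BPF_K,
                 SECCOMP_RET_ERRNO | (ENOMEM & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog prog = {
        .len = sizeof filter / sizeof filter[0],
        .filter = filter,
    };

    /* Required so that an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog) != 0)
        return -1;
    return 0;
}
```

Calling deny_brk() at the very top of main(), before the first allocation, should be enough for the benchmark; the filter stays in place for the rest of the process.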

Also, you are allocating 128 bytes, which is too small to trigger the
effect. Try 100kB (if my maths did not fail me, for a 32-bit platform
the mmap threshold is at 112kB, and for a 64-bit platform it is twice
that, so 100kB is well below that).
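
The arithmetic behind those numbers can be sketched as follows. This assumes that musl's malloc.c of the time defines SIZE_ALIGN as 4*sizeof(size_t) and MMAP_THRESHOLD as 0x1c00*SIZE_ALIGN; the function names are invented for illustration, and the real allocator also adds some per-chunk overhead before comparing, so the boundary is approximate.

```c
#include <stddef.h>

/* Assumed to mirror musl's malloc.c of the time:
 *   SIZE_ALIGN     = 4 * sizeof(size_t)
 *   MMAP_THRESHOLD = 0x1c00 * SIZE_ALIGN
 * word_size stands in for sizeof(size_t) on the target:
 * 4 on a 32-bit platform, 8 on a 64-bit platform. */
static size_t mmap_threshold(size_t word_size)
{
    size_t size_align = 4 * word_size;
    return 0x1c00 * size_align;
}

/* Nonzero if a request of n bytes is below the threshold for the given
 * word size. Chunk overhead is ignored here, so this is approximate. */
static int below_threshold(size_t n, size_t word_size)
{
    return n < mmap_threshold(word_size);
}
```

With word_size 4 this gives 114688 bytes (112kB) and with word_size 8 it gives 229376 bytes (224kB), so a 100kB request stays under the threshold on both.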
> 
> Results are 74-148ms for the normal library and 70-72ms when the if statement is removed (about twice as fast). I've also got an original Raspberry Pi with a single CPU and have alpine linux on that so I've performed the same test using 32 loops, calloc(32, 2) and 8192 loops instead and see a similar result although it's much closer: 411-412ms for the normal library and 405-408ms when the if statement is removed.

Interesting. So it appears to not be beneficial, time-wise, for small
allocations.

> Surely a page fault will occur when attempting to read memory not writing it, it doesn't need to bring the page into the cache if no read is taking place therefore a page fault will not occur?

No, not really. See, if Linux is doing the right thing, then it will
always have a zero page handy. If an application requests memory via
mmap() with anonymous pages, what Linux should do is record in the
CPU-facing bits of the page tables that the pages exist, all point to
the zero page, and are read-only. In the OS-facing bits, it needs to
record that those pages are copy-on-write, of course. Then a read of
those pages will return bytes from the zero page (so always zero), and a
write will cause a page fault.

Linux will of course handle that page fault by allocating a fresh
physical page, copying the zero page there, rewriting the page tables,
and invalidating the page table cache (the TLB) before continuing the
program.

Of course, I don't know if Linux really does that. It might just answer
a request for memory with completely inaccessible pages that cause a
fault as soon as they are accessed in any way. The interface would be
fulfilled either way.
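
Which of the two behaviours a given kernel shows can be probed from user space. The sketch below maps anonymous memory, reads one byte from every page, then writes one byte to every page, and reports the minor page faults that getrusage() attributes to each phase. The struct and function names are made up for illustration. The counts depend on the kernel, on transparent huge pages, and on whatever else the process is doing, so they are a hint and not a proof.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std= modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical result record for the probe below. */
struct fault_counts {
    long on_read;   /* minor faults while reading each page once */
    long on_write;  /* minor faults while then writing each page once */
    int all_zero;   /* 1 if the first byte of every page read as zero */
};

static long minor_faults(void)
{
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_minflt;
}

/* Hypothetical probe: returns 0 on success, -1 if mmap() failed. */
static int probe_faults(size_t npages, struct fault_counts *out)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = npages * page;
    /* volatile keeps the compiler from optimising the accesses away. */
    volatile unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    long before = minor_faults();
    unsigned sum = 0;
    for (size_t i = 0; i < len; i += page)
        sum |= p[i];                 /* read phase */
    long after_read = minor_faults();
    for (size_t i = 0; i < len; i += page)
        p[i] = 1;                    /* write phase */
    long after_write = minor_faults();

    out->on_read = after_read - before;
    out->on_write = after_write - after_read;
    out->all_zero = (sum == 0);
    munmap((void *)p, len);
    return 0;
}
```

If the write phase reports roughly one fault per page even though every page has already been read, that would fit the copy-on-write zero page description above. Faults during the read phase would suggest that the mapping is populated lazily on first access instead.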

Oh, and the CPU cache doesn't have anything to do with this. The page
fault mechanism is so slow that a cache miss or two make no odds here.

Ciao,
Markus
