Date: Sat, 4 Apr 2020 14:19:48 -0400
From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: [musl] New malloc tuning for low usage

On Fri, Apr 03, 2020 at 10:55:54PM -0400, Rich Felker wrote:
> In working on this, I noticed that it looks like the coarse size class
> threshold (6) in top-level malloc() is too low. At that threshold, the
> first fine-grained-class group allocation will be roughly a 100%
> increase in memory usage by the class; I'd rather keep the relative
> increase bounded by 50% or less. It should probably be something more
> like 10 or 12 to achieve this. With 12, repeated allocations of 16k
> first produce 7 individual 20k mmaps, then a 3-slot class-37
> (21824-byte slots) group, then a 7-slot class-36 (18704-byte slots)
> group.
>
> One thing that's not clear to me is whether it's useful at all to
> produce the 3-slot class-37 group rather than just going on making
> more individual mmaps until it's time to switch to the larger group.
> It's easy to tune things to do the latter, and it seems to offer more
> flexibility in how memory is used. It also allows slightly more
> fragmentation, but the number of such objects is highly bounded to
> begin with because we use increasingly larger groups as usage goes up,
> so the contribution should be asymptotically irrelevant.

The answer is that it depends on where the sizes fall. At 16k, rounding
up to page size produces 20k of usage (5 pages), but the 3-slot
class-37 group uses 5+1/3 pages per slot, so individual mmaps are
preferable. However, if we requested 20k, an individual mmap would use
24k (6 pages) while the 3-slot group would still use just 5+1/3 pages
per slot, so it would be preferable to switch to the group.

The deciding condition seems to be just whether the
rounded-up-to-whole-pages request size exceeds the slot size: we should
prefer individual mmaps if (1) the rounded-up size is smaller than the
slot size, or (2) using a multi-slot group would increase the class's
usage by more than 50% (or whatever threshold it ends up being tuned
to).

I'll see if I can put together a quick implementation of this and see
how it works.

Rich
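
A minimal C sketch of the condition described above, under assumptions
not stated in the post: a 4k page size, a small fixed per-mmap header
overhead (which is why a 16k request occupies 5 pages), and
illustrative names such as prefer_individual_mmap and class_usage that
are not the actual mallocng internals.

#include <stddef.h>
#include <stdio.h>

#define PGSZ 4096 /* assumed page size */
#define HDR  16   /* assumed per-mmap bookkeeping overhead (illustrative) */

/* Memory an individual mmap would consume for a request: header plus
 * the request, rounded up to whole pages. */
static size_t mmap_usage(size_t req)
{
	return (req + HDR + PGSZ - 1) & ~(size_t)(PGSZ - 1);
}

/* Prefer an individual mmap if (1) it uses less memory than one slot of
 * the candidate class, or (2) mapping the new multi-slot group would
 * grow the class's current usage by more than 50%. */
static int prefer_individual_mmap(size_t req, size_t slot_size,
                                  size_t new_group_size, size_t class_usage)
{
	if (mmap_usage(req) < slot_size) return 1;
	if (new_group_size > class_usage/2) return 1;
	return 0;
}

int main(void)
{
	size_t slot = 21824;     /* class-37 slot size from the post */
	size_t group = 16*PGSZ;  /* 3-slot class-37 group: 5+1/3 pages per slot */
	size_t in_use = 7*20480; /* seven prior individual 20k mmaps */

	/* 16k request: 5 pages (20480) < 21824-byte slot -> 1, keep mmapping */
	printf("16k: %d\n", prefer_individual_mmap(16384, slot, group, in_use));
	/* 20k request: 6 pages (24576) > 21824-byte slot -> 0, use the group */
	printf("20k: %d\n", prefer_individual_mmap(20480, slot, group, in_use));
	return 0;
}

Run as-is, this prints 1 for the 16k case (keep making individual
mmaps) and 0 for the 20k case (switch to the 3-slot group), matching
the arithmetic worked through in the post.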