Date: Sat, 4 Apr 2020 14:19:48 -0400
From: Rich Felker
To: musl@lists.openwall.com
Subject: Re: [musl] New malloc tuning for low usage

On Fri, Apr 03, 2020 at 10:55:54PM -0400, Rich Felker wrote:
> In working on this, I noticed that it looks like the coarse size class
> threshold (6) in top-level malloc() is too low. At that threshold, the
> first fine-grained-class group allocation will be roughly a 100%
> increase in memory usage by the class; I'd rather keep the relative
> increase bounded by 50% or less. It should probably be something more
> like 10 or 12 to achieve this. With 12, repeated allocations of 16k
> first produce 7 individual 20k mmaps, then a 3-slot class-37
> (21824-byte slots) group, then a 7-slot class-36 (18704-byte slots)
> group.
>
> One thing that's not clear to me is whether it's useful at all to
> produce the 3-slot class-37 group rather than just going on making
> more individual mmaps until it's time to switch to the larger group.
> It's easy to tune things to do the latter, and it seems to offer more
> flexibility in how memory is used. It also allows slightly more
> fragmentation, but the number of such objects is highly bounded to
> begin with because we use increasingly larger groups as usage goes up,
> so the contribution should be asymptotically irrelevant.

The answer is that it depends on where the sizes fall. At 16k, rounding
up to page size produces 20k of usage (5 pages), but the 3-slot
class-37 group uses 5+1/3 pages per slot, so individual mmaps are
preferable. However, if we requested 20k, an individual mmap would use
24k (6 pages) while the 3-slot group would still use just 5+1/3 pages
per slot, so it would be preferable to switch to the group.

The deciding condition seems to be just whether the
rounded-up-to-whole-pages request size exceeds the slot size: we should
prefer individual mmaps if (1) the rounded-up size is smaller than the
slot size, or (2) using a multi-slot group would increase the class's
usage by more than 50% (or whatever threshold it ends up being tuned
to).

I'll see if I can put together a quick implementation of this and see
how it works.

Rich
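
A minimal C sketch of the condition described above, under assumptions
not stated in the post: a 4k page size, a small fixed per-mmap header
overhead (which is why a 16k request occupies 5 pages), and
illustrative names such as prefer_individual_mmap and class_usage that
are not the actual mallocng internals.

#include <stddef.h>
#include <stdio.h>

#define PGSZ 4096 /* assumed page size */
#define HDR  16   /* assumed per-mmap bookkeeping overhead (illustrative) */

/* Memory an individual mmap would consume for a request: header plus
 * the request, rounded up to whole pages. */
static size_t mmap_usage(size_t req)
{
	return (req + HDR + PGSZ - 1) & ~(size_t)(PGSZ - 1);
}

/* Prefer an individual mmap if (1) it uses less memory than one slot of
 * the candidate class, or (2) mapping the new multi-slot group would
 * grow the class's current usage by more than 50%. */
static int prefer_individual_mmap(size_t req, size_t slot_size,
                                  size_t new_group_size, size_t class_usage)
{
	if (mmap_usage(req) < slot_size) return 1;
	if (new_group_size > class_usage/2) return 1;
	return 0;
}

int main(void)
{
	size_t slot = 21824;     /* class-37 slot size from the post */
	size_t group = 16*PGSZ;  /* 3-slot class-37 group: 5+1/3 pages per slot */
	size_t in_use = 7*20480; /* seven prior individual 20k mmaps */

	/* 16k request: 5 pages (20480) < 21824-byte slot -> 1, keep mmapping */
	printf("16k: %d\n", prefer_individual_mmap(16384, slot, group, in_use));
	/* 20k request: 6 pages (24576) > 21824-byte slot -> 0, use the group */
	printf("20k: %d\n", prefer_individual_mmap(20480, slot, group, in_use));
	return 0;
}

Run as-is, this prints 1 for the 16k case (keep making individual
mmaps) and 0 for the 20k case (switch to the 3-slot group), matching
the arithmetic worked through in the post.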