From mboxrd@z Thu Jan  1 00:00:00 1970
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Reply-To: musl@lists.openwall.com
To: Rich Felker <dalias@libc.org>
Cc: musl@lists.openwall.com
References: <20200510180934.GV21576@brightrain.aerifal.cx>
 <20200516002912.GN21576@brightrain.aerifal.cx>
 <20200516032901.GO21576@brightrain.aerifal.cx>
 <20200517033025.GQ21576@brightrain.aerifal.cx>
 <20200518185351.GF21576@brightrain.aerifal.cx>
 <4abf07d7-542a-414f-9cc2-2fdf60074462@wwcom.ch>
 <20200525175446.GR1079@brightrain.aerifal.cx>
From: Pirmin Walthert <pirmin.walthert@wwcom.ch>
Message-ID: <cde945ea-54ea-f47a-3e74-313eec10d844@wwcom.ch>
Date: Mon, 25 May 2020 20:13:02 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
 Thunderbird/68.8.0
MIME-Version: 1.0
In-Reply-To: <20200525175446.GR1079@brightrain.aerifal.cx>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Language: en-US
Subject: Re: [musl] mallocng progress and growth chart

On 25.05.20 at 19:54, Rich Felker wrote:

> On Mon, May 25, 2020 at 05:45:33PM +0200, Pirmin Walthert wrote:
>> On 18.05.20 at 20:53, Rich Felker wrote:
>>
>>> On Sat, May 16, 2020 at 11:30:25PM -0400, Rich Felker wrote:
>>>> Another alternative for avoiding eager commit at low usage, which
>>>> works for all but nommu: when adding groups with nontrivial slot count
>>>> at low usage, don't activate all the slots right away. Reserve vm
>>>> space for 7 slots for a 7x4672, but only unprotect the first 2 pages,
>>>> and treat it as a group of just 1 slot until there are no slots free
>>>> and one is needed. Then, unprotect another page (or more if needed to
>>>> fit another slot, as would be needed at larger sizes) and adjust the
>>>> slot count to match. (Conceptually; implementation-wise, the slot
>>>> count would be fixed, and there would just be a limit on the number of
>>>> slots made available when transformed from "freed" to "available" for
>>>> activation.)
>>>>
>>>> Note that this is what happens anyway with physical memory as clean
>>>> anonymous pages are first touched, but (1) doing it without explicit
>>>> unprotect over-counts the not-yet-used slots for commit charge
>>>> purposes and breaks tightly-memory-constrained environments (global
>>>> commit limit or cgroup) and (2) when all slots are initially available
>>>> as they are now, repeated free/malloc cycles for the same size will
>>>> round-robin all the slots, touching them all.
>>>>
>>>> Here, property (2) is of course desirable for hardening at moderate to
>>>> high usage, but at low usage UAF tends to be less of a concern
>>>> (because you don't have complex data structures with complex lifetimes
>>>> if you hardly have any malloc).
>>>>
>>>> Note also that (2) could be solved without addressing (1) just by
>>>> skipping the protection aspect of this idea and only using the
>>>> available-slot-limiting part.
>>> One abstract way of thinking about the above is that it's just a
>>> per-size-class bump allocator, pre-reserving enough virtual address
>>> space to end sufficiently close to a page boundary that there's no
>>> significant memory waste. This is actually fairly elegant, and might
>>> obsolete some of the other measures taken to avoid overly eager
>>> allocation. So this might be a worthwhile direction to pursue.
>> Dear Rich,
>>
>> Currently we use mallocng in production for most applications in our
>> "embedded like" virtualised system setups. It even helped to find
>> some bugs (for example in Asterisk), as mallocng was less forgiving
>> than the old malloc implementation. So if you're interested in real
>> world feedback: everything seems to be running quite smoothly so
>> far. Thanks for this great work.
>>
>> Currently we use the git version of April 24th, so the version
>> before you merged the huge optimization changes. If I understood
>> your "brainstorming mails" correctly, you might rethink a few of
>> these changes, so I'd like to ask: do you think it would be better
>> to use the current git-master version rather than the version of
>> April 24th (we are not THAT memory constrained, so stability is the
>> most important thing), or would it be better to stick with the old
>> version and wait for the next changes to be merged?
> Thanks for the feedback!
>
> Which are the "huge optimization changes" you're wondering about?
> Indeed there's a large series of commits after the version you're
> using but I think you're possibly misattributing them.
>
> A number of the commits are bug fixes -- mostly not for hard bugs, but
> for unwanted and unintended behaviors:
>
> a709dde fix unexpected allocation of 7x144 group in non-power-of-two slot
> dda5a88 fix exact size tracking in memalign
> 915a914 adjust several size classes to fix nested groups of non-power-of-2 size
> 7acd61e allow in-place realloc when ideal size class is off-by-one
> caca917 add support for aligned_alloc alignments 1M and over
>
> There were also quite a few around an idea that didn't go well and was
> mostly reverted, but with major improvements to the original behavior:
>
> 5bff93c overhaul bounce counter to work with map sizes instead of size classes
> 71262cd tune bounce counter to avoid triggering early
> 9601aaa prevent overflow of unmap counter part of bounce counter
> aca1f32 don't let the mmap cache limit grow unboundedly or overflow
> 6fbee31 second partial overhaul of bounce counter system
> 150de6e revert from map cache to old okay_to_free scheme, but improved
> 1e972da initial conversion of bounce counting to use sequence numbers, decay
> e3eecb1 factor bounce/sequence counter logic into meta.h
> 6693738 account seq for individually-mmapped allocations above hard threshold
> 4443f64 fix complete regression (malloc always fails) on variable-pagesize archs
>
> If you don't care about low usage, that whole change series is fairly
> unimportant, but should be harmless. It just changes decisions about
> choices where either choice produces a valid state for the allocator
> but there are tradeoffs between memory usage and performance. The new
> behavior should be better, though.
>
> A few commits were reordering the dependency between memalign and the
> standard memalign-variant functions, which is a minor namespace
> detail:
>
> da4c88e rename aligned_alloc.c
> 04407f7 reverse dependency order of memalign and aligned_alloc
> 74e6657 rename aligned_alloc source file back to its proper name
> c990cb1 rename memalign source file back to its proper name
>
> A couple were hardening:
>
> 5bf4e92 clear group header pointer to meta when freeing groups
> bd04c75 in get_meta, check offset against maplen (minor hardening)
> 77cea57 add support for allocating meta areas via legacy brk
>
> And pretty much all the rest of the changes are tuning behavior for
> "optimization" of some sort or another, which may be what you were
> referring to:
>
> 26143c4 limit slot count growth to 25% instead of 50% of current usage in class
> a9187f0 remove unnecessary optimization tuning flags from Makefile CFLAGS
> 045cc04 move coarse size classing logic to malloc fast path
> 8348a82 eliminate med_twos_tab
> e619034 allow slot count 1 for size classes 3 mod 4 as natural/first-class
> c9d54f4 activate coarse size classing for small classes down to 4 (but not 6)
> 44092d8 improve individual-mmap decision
> d355eaf remove slot count reduction to 1 for size classes 1 mod 3
> c555ebe fix off-by-one in logic to use single-slot groups
> 9d5ec34 switch from MADV_DONTNEED to MADV_FREE for large free slots
> 584c7aa avoid over-use of reduced-count groups due to coarse size classing
> f9bfb0a increase threshold for 3->2 slot reduction to 16 pages
> 20da09e disable coarse size classing for large classes (over 8k)
>
> I don't think any of these changes are potentially obsoleted by
> further ideas in the above thread. I am working on delaying activation
> of slots until they're actually needed, so that we don't dirty pages
> we could avoid touching, but I proposed this as an alternative to
> other more complex tricks that I didn't really like, which have not
> been implemented and probably won't be now.
>
> So, in summary, I don't see any good reason not to go with latest.
>
> Rich

Many thanks for your detailed answer. I'll give it a try then!

Pirmin
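
For readers following the thread, the delayed-activation idea quoted
above (reserve address space for the whole group, but unprotect pages
only as further slots are needed) can be sketched roughly as below.
This is a minimal illustration under stated assumptions, not
mallocng's code: the struct lazy_group type and the lazy_group_init
and lazy_group_activate functions are hypothetical names, the group
header and all metadata are ignored, the size computation is not
checked for overflow, and the page counts in the comments assume
4 KiB pages.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical illustration type; not a mallocng structure. */
struct lazy_group {
	unsigned char *base;  /* start of the reserved region */
	size_t slot_size;     /* bytes per slot, e.g. 4672 */
	size_t total_slots;   /* slots the reservation can hold, e.g. 7 */
	size_t active_slots;  /* slots made usable so far */
	size_t committed;     /* bytes currently readable/writable */
	size_t reserved;      /* bytes of address space reserved */
};

/* Round n up to a multiple of the system page size. */
static size_t page_round_up(size_t n)
{
	size_t ps = (size_t)sysconf(_SC_PAGESIZE);
	return (n + ps - 1) / ps * ps;
}

/* Reserve address space for every slot with PROT_NONE, so that
 * nothing is accessible (or written) yet. Returns 0 on success. */
static int lazy_group_init(struct lazy_group *g, size_t slot_size,
                           size_t total_slots)
{
	assert(slot_size > 0 && total_slots > 0);
	size_t len = page_round_up(slot_size * total_slots);
	void *p = mmap(0, len, PROT_NONE,
	               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED) return -1;
	g->base = p;
	g->slot_size = slot_size;
	g->total_slots = total_slots;
	g->active_slots = 0;
	g->committed = 0;
	g->reserved = len;
	return 0;
}

/* Activate one more slot, unprotecting only the pages needed to
 * cover it. For 7x4672 the first call unprotects 2 pages and the
 * second call 1 more. Returns the slot, or NULL if the group is
 * full or mprotect fails. */
static void *lazy_group_activate(struct lazy_group *g)
{
	if (g->active_slots == g->total_slots) return 0;
	size_t need = page_round_up((g->active_slots + 1) * g->slot_size);
	if (need > g->committed) {
		if (mprotect(g->base + g->committed, need - g->committed,
		             PROT_READ | PROT_WRITE))
			return 0;
		g->committed = need;
	}
	return g->base + g->active_slots++ * g->slot_size;
}
```

A caller would invoke lazy_group_activate only when no free slot
remains in the group. The variant mentioned in the quoted mail that
addresses only point (2) would presumably map the whole region
read/write up front and keep just the active_slots limit.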