Date: Tue, 12 May 2020 22:53:01 -0400
From: Rich Felker
To: musl@lists.openwall.com
Message-ID: <20200513025300.GT21576@brightrain.aerifal.cx>
Subject: [musl] malloc hardening comparison draft

As a review of hardening measures and whether there are further
improvements that should be made, I've started drafting a comparison
of the properties of oldmalloc, mallocng, and hardened_malloc
(https://github.com/GrapheneOS/hardened_malloc), the last being the
source that inspired many of the hardening ideas in mallocng.


Protection of metadata
----------------------

oldmalloc: in-band; consistency of header/footer only

mallocng: nearly(*) all attack surface is out-of-band, protected
below by guard pages. some non-attack-surface metadata is in-band,
traps if inconsistent with out-of-band. (*=double-free of a slot
another group has been nested in is only protected in-band.)

hardened_malloc: all out-of-band, randomized location, determined at
startup so that it is known never to have been used for other purposes

Possible further improvements for mallocng:

Rather than relying on ASLR to assign addresses for out-of-band
metadata areas, we could attempt to choose a candidate address to pass
to mmap with heavier randomization, and hope the kernel will honor it
or allocate close to it. For 64-bit archs this is very strong and
flexible. For 32-bit archs doing any heavy randomization sacrifices
the ability to utilize multiple very large allocations (~1GB rough
order of magnitude) and the best we could do would be picking to aim
high or low with limited randomization. (hardened_malloc avoids this
by not supporting 32-bit.) This kind of improvement could also help
avoid allocating metadata areas in address ranges that were previously
visible to the application.

While I don't see any reasonable out-of-band solution to protect
nested groups from double-free using an old pointer held by the
application, it should be possible to make inexpensive transformations
on the in-band metadata to prevent falsifying it without access to
malloc's random secrets.


Invalid free detection
----------------------

oldmalloc: best-effort via C_INUSE header bit (which is overloaded as
the mmapped flag) and consistency of header/footer.

mallocng: deterministic(*) detection and trap on any attempt to free a
slot that's already free or an address not part of an allocation
obtained by malloc. consistency-check-based detection and trap in case
where slot is in use but last-allocated offset in slot does not match
address passed to free. (*=conditional on attacker not possessing
random secret or write access to oob metadata)

hardened_malloc: deterministic detection of any attempt to free an
address not belonging to the start of a currently valid allocation.
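
The mmap-hint idea under "Protection of metadata" above could look
roughly like the following. This is only a sketch, not mallocng code:
the helper name map_meta_hinted and the choice of 46 random address
bits are assumptions made for illustration, and on 32-bit targets it
simply falls back to a NULL hint.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/random.h>
#include <unistd.h>

/* Hypothetical helper, not mallocng code: map len bytes for metadata,
 * passing the kernel a randomized hint address instead of NULL. */
static void *map_meta_hinted(size_t len)
{
	long pagesize = sysconf(_SC_PAGESIZE);
	assert(pagesize > 0);

	uintptr_t hint = 0;
#if UINTPTR_MAX > 0xffffffffu
	/* Assumption: at least 2^46 bytes of user address space. */
	uintptr_t r;
	if (getrandom(&r, sizeof r, 0) == (ssize_t)sizeof r)
		hint = r & (((uintptr_t)1 << 46) - 1)
		         & ~((uintptr_t)pagesize - 1);
#endif
	/* No MAP_FIXED: the address is only a hint, so the kernel may
	 * place the mapping elsewhere if the range is unavailable. */
	void *p = mmap((void *)hint, len, PROT_READ | PROT_WRITE,
	               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	return p == MAP_FAILED ? NULL : p;
}
```

Since the hint is not binding, the caller cannot assume the returned
address matches it; the result is only as random as what the kernel
chose to honor.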


Write-after-free detection
--------------------------

oldmalloc: possible crash or compromise of allocator state if
application overwrites linked list pointers in free chunk

mallocng: possible detection at subsequent call if in-band metadata
has been made inconsistent by the write.

hardened_malloc: "detection of write-after-free for slab allocations
by verifying zero filling is intact at allocation time"

Possible further improvements for mallocng:

The same approach used in hardened_malloc could be used here. Possibly
just at low offsets. But the cycling of current offset within a slot
complicates this; the address assigned by malloc may be deep into the
range owned by the slot, and in order to cover all possibilities free
would have to zero the whole slot (expensive). If the possible offsets
for the next slot are known, however -- I think they're only 0 and
previous+16 -- then it would be easy to zero just these.


Detection of overflows
----------------------

oldmalloc: via inconsistency of footer with header on free; only when
overflow is by large enough margin to hit footer/boundary with next
chunk.

mallocng: detection and trap of single-byte overflow with any non-zero
value at realloc/free time. detection and trap of overflow into
subsequent slot header at time of realloc/free of this slot or
malloc/realloc/free of subsequent slot, unless effort is made to match
the in-band metadata clobbered by the overflow.

hardened_malloc: detection and trap of overflows with strong random
canary (value per-slab). not clear if detection is only at free of the
overflowed slot or other times as well.

Possible further improvements for mallocng:

For allocations with significant gap between their size and the slot
size, all of the reserved bytes could be checked, rather than just the
first one, and a nonzero canary could be used. Checking all is
probably too expensive, especially at large sizes or alignments
(memalign) where the gap may be very large.
But we could reasonably check up to 8 bytes right after the zero byte.


Guard pages
-----------

oldmalloc: none

mallocng: only below out-of-band metadata areas, but aggressive return
of freed memory to system tends to leave a lot of unmapped/faulting
address ranges.

hardened_malloc: randomly intersperses guard pages between slabs in
slab size class regions, and around metadata.

Possible further improvements for mallocng:

Probably none. Further use of guard pages risks hitting kernel VMA
limits. Limited quarantine may be a possibility though.


Valid-double-free and use-after-free mitigations
------------------------------------------------

oldmalloc: none

mallocng: roughly LRU order to free slot allocation, combined with
cycling of the used offset range within a slot whenever the requested
size is less than the full slot size in its size class, stretches
interval to exact address reuse. attempted free of non-exact address
reuse, even if the same slot was reused, will trap.

hardened_malloc: quarantine of large freed areas so same virtual
address ranges cannot be used again until quarantine fills and pushes
old entries out. random delayed free and random slot selection for
allocations within slabs.

Possible further improvements for mallocng:

Quarantine could be implemented by replacing freed groups with
PROT_NONE maps. With a reasonable limit on quarantine size this should
not lead to serious fragmentation of virtual address space or hit the
VMA limit.

Optionally a build with increased hardening could always use slots
significantly larger than the requested size n, so that offset cycling
period is guaranteed to be greater than 1.


Freed data leak prevention
--------------------------

oldmalloc: none except as a side effect of clobbering first two
pointer-sized words.

mallocng: none.

hardened_malloc: all freed memory is zeroed.

Possible further improvements for mallocng:

It may be practical to zero some initial segment of the freed memory.
This also has the benefit of catching some UAF.


ASLR
----

oldmalloc: usually limited to initial brk gap randomization.

mallocng: matching kernel mmap ASLR.

hardened_malloc: strong randomization in 64-bit address space.
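

A rough sketch of the zero-prefix idea from the write-after-free and
freed data leak sections above: zero an initial segment of a slot on
free, then verify it is still zero before handing the slot out again.
This is not mallocng or hardened_malloc code; the names scrub_on_free
and prefix_intact and the 64-byte ZERO_PREFIX are made up for
illustration, and the offset cycling discussed above is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ZERO_PREFIX 64 /* hypothetical: bytes zeroed at slot start */

/* Number of leading bytes covered by the check for this slot. */
static size_t prefix_len(size_t slot_size)
{
	return slot_size < ZERO_PREFIX ? slot_size : ZERO_PREFIX;
}

/* Called on free: clear the leading bytes of the slot. */
static void scrub_on_free(unsigned char *slot, size_t slot_size)
{
	assert(slot != NULL);
	memset(slot, 0, prefix_len(slot_size));
}

/* Called before reuse: returns 1 if the zeroed prefix is untouched,
 * 0 if something wrote to it while the slot was free. */
static int prefix_intact(const unsigned char *slot, size_t slot_size)
{
	unsigned char acc = 0;
	size_t n = prefix_len(slot_size);
	for (size_t i = 0; i < n; i++)
		acc |= slot[i];
	return acc == 0;
}
```

A real allocator would trap when prefix_intact returns 0. Writes past
the prefix go unnoticed, which is the cost of not zeroing the whole
slot.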