mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] Mallocng algorithm high-level overview
@ 2021-09-30 14:02 Jack Bond-Preston
  2021-09-30 14:50 ` Fangrui Song
  2021-09-30 16:24 ` Rich Felker
  0 siblings, 2 replies; 6+ messages in thread
From: Jack Bond-Preston @ 2021-09-30 14:02 UTC (permalink / raw)
  To: musl

Hello,

I'm currently working on porting mallocng to a new architecture and
could use some assistance understanding the algorithm. From searching
the web, I couldn't seem to find any high-level overview of musl's
mallocng allocator (save for the readme at github/richfelker
/mallocng-draft, which is a little briefer than what I am looking
for). If any such description exists, I would much appreciate
being pointed towards it. If not, would anyone be able to explain
some of the details of the allocator?

Mostly I am interested in a more general high-level overview of how
the allocator works. There are also some specifics I am interested
in, if anyone is able to shine some light on these:
- The uses/purposes of the structures in meta.h. Particularly, meta
   and group, and the relation between the two.
- The general overview of in-band and out-of-band metadata, and how/
   when they are used.
- The purpose/meaning of the UNIT define in meta.h.
- Any assumptions about alignment/pointer size the allocator may
   make.

Thanks very much for your time, I appreciate the request is a bit
broad, but any information is appreciated. Please don't hesitate to
reach out for more information.

Cheers,
Jack

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] Mallocng algorithm high-level overview
  2021-09-30 14:02 [musl] Mallocng algorithm high-level overview Jack Bond-Preston
@ 2021-09-30 14:50 ` Fangrui Song
  2021-09-30 15:35   ` Jack Bond-Preston
  2021-09-30 16:24 ` Rich Felker
  1 sibling, 1 reply; 6+ messages in thread
From: Fangrui Song @ 2021-09-30 14:50 UTC (permalink / raw)
  To: musl

On Thu, Sep 30, 2021 at 10:02 AM Jack Bond-Preston
<jack.bond-preston@arm.com> wrote:
>
> Hello,
>
> I'm currently working on porting mallocng to a new architecture and
> could use some assistance understanding the algorithm. From searching
> the web, I couldn't seem to find any high-level overview of musl's
> mallocng allocator (save for the readme at github/richfelker
> /mallocng-draft, which is a little briefer than what I am looking
> for). If any such description exists, I would much appreciate
> being pointed towards it. If not, would anyone be able to explain
> some of the details of the allocator?

Not answering the specific questions below, but I have some notes on
https://gist.github.com/MaskRay/ac54b26d72452ac77ac578f2e625369f :)

> Mostly I am interested in a more general high-level overview of how
> the allocator works. There are also some specifics I am interested
> in, if anyone is able to shine some light on these:
> - The uses/purposes of the structures in meta.h. Particularly, meta
>    and group, and the relation between the two.
> - The general overview of in-band and out-of-band metadata, and how/
>    when they are used.
> - The purpose/meaning of the UNIT define in meta.h.
> - Any assumptions about alignment/pointer size the allocator may
>    make.
>
> Thanks very much for your time, I appreciate the request is a bit
> broad, but any information is appreciated. Please don't hesitate to
> reach out for more information.
>
> Cheers,
> Jack

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] Mallocng algorithm high-level overview
  2021-09-30 14:50 ` Fangrui Song
@ 2021-09-30 15:35   ` Jack Bond-Preston
  0 siblings, 0 replies; 6+ messages in thread
From: Jack Bond-Preston @ 2021-09-30 15:35 UTC (permalink / raw)
  To: musl

On 9/30/21 3:50 PM, Fangrui Song wrote:
> Not answering the specific questions below, but I have some notes on
> https://gist.github.com/MaskRay/ac54b26d72452ac77ac578f2e625369f  :)

Thanks a lot for this, it's very useful :)

Cheers,
Jack

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] Mallocng algorithm high-level overview
  2021-09-30 14:02 [musl] Mallocng algorithm high-level overview Jack Bond-Preston
  2021-09-30 14:50 ` Fangrui Song
@ 2021-09-30 16:24 ` Rich Felker
  2021-10-04 14:24   ` Jack Bond-Preston
  1 sibling, 1 reply; 6+ messages in thread
From: Rich Felker @ 2021-09-30 16:24 UTC (permalink / raw)
  To: Jack Bond-Preston; +Cc: musl

On Thu, Sep 30, 2021 at 03:02:23PM +0100, Jack Bond-Preston wrote:
> Hello,
> 
> I'm currently working on porting mallocng to a new architecture and
> could use some assistance understanding the algorithm. From searching
> the web, I couldn't seem to find any high-level overview of musl's
> mallocng allocator (save for the readme at github/richfelker
> /mallocng-draft, which is a little briefer than what I am looking
> for). If any such description exists, I would much appreciate
> being pointed towards it. If not, would anyone be able to explain
> some of the details of the allocator?

Some documents:

- README in draft repo, as well as its entire detailed git history:
  https://github.com/richfelker/mallocng-draft

- New malloc - intro, motivation & design goals document:
  https://www.openwall.com/lists/musl/2019/10/22/3

- Review of mallocng motivation/goals:
  https://www.openwall.com/lists/musl/2020/05/18/3

There may be others I'm forgetting; I'll follow up if I think of some.

> Mostly I am interested in a more general high-level overview of how
> the allocator works. There are also some specifics I am interested
> in, if anyone is able to shine some light on these:
> - The uses/purposes of the structures in meta.h. Particularly, meta
>   and group, and the relation between the two.
> 
> - The general overview of in-band and out-of-band metadata, and how/
>   when they are used.

struct group represents the storage for a group of slots allocated
contiguously (something like a slab), with in-band metadata encoded in
the bits of 3 bytes between slots, and a pointer to the out-of-band
metadata at the very beginning. A group may be allocated in memory
obtained from mmap or, for size and count smaller than a page, inside
one slot of a larger group.

struct meta is the out-of-band metadata that's allocated in memory
intended to be difficult to reach/attack. It always contains a pointer
back to the group it goes with for validation, and has a header
containing a random secret at address&-4096 (beginning of 'page' for a
page unit that need not match system page size) that also validates.
This prevents invalid-/double-free bugs (in the absence of other much
more powerful gadgets) from being used to construct fake heap metadata
to produce an inconsistent allocator state.

In-band metadata is treated as low-trust input, and only used for
finding out-of-band metadata and validating lack of out-of-bounds
writes between slots. It also facilitates rotating the used range
within a slot each time it's reused to greatly extend the period for
reuse of identical pointers, and catch UAF/DF.

> - The purpose/meaning of the UNIT define in meta.h.

UNIT is the fundamental allocation unit/alignment. On targets where
alignof(max_align_t)==8 it could in theory be 8 instead of 16, but
some additional tweaks might be needed to actually make this work, at
least on 64-bit archs, due to lack of space for in-band metadata. If
alignof(max_align_t) were larger it would need to be larger, which is
a bad thing for memory usage, but should work without breaking
anything.

I made UNIT a constant 16 for the time being rather than an expression
in terms of sizeof(void *) and alignof(max_align_t) because the
emergent consequences of dropping it to 8, and how that works with
size class thresholds for whole pages, were not invesitated to know if
anything inefficient would come out.

> - Any assumptions about alignment/pointer size the allocator may
>   make.

Mainly that UNIT contains sufficient space for a pointer (to out of
band meta) and the 4 byte header for the first slot.

> Thanks very much for your time, I appreciate the request is a bit
> broad, but any information is appreciated. Please don't hesitate to
> reach out for more information.
> 
> Cheers,
> Jack

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] Mallocng algorithm high-level overview
  2021-09-30 16:24 ` Rich Felker
@ 2021-10-04 14:24   ` Jack Bond-Preston
  2021-10-04 15:28     ` Rich Felker
  0 siblings, 1 reply; 6+ messages in thread
From: Jack Bond-Preston @ 2021-10-04 14:24 UTC (permalink / raw)
  To: musl

Thanks for the reply Rich, it has been a great help.

I have a couple more specific questions, sorry:
- Am I correct in saying IB represents the size (in bytes) of the
in-band metadata between slots?
- Is it assumed that sizeof(struct group) == sizeof(UNIT) throughout the
code (the struct is defined such that this is true)? If the size of the
in-band metadata were to increase (due to additional/larger metadata)
such that UNIT < sizeof(struct group) (due to the size of the group
struct needing to increase to accommodate the larger in-band metadata),
I assume some code would have to be changed to instead use the size of
the new group struct where appropriate (e.g. when allocating some new
group). I noticed there are a lot of expressions containing x +
UNIT/UNIT + x, are these generally to ensure allocations etc. include
enough space for the contents of the group struct, or are there other
reasons for these (e.g. some kind of 1-UNIT buffer between slots)?

Cheers,
Jack

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [musl] Mallocng algorithm high-level overview
  2021-10-04 14:24   ` Jack Bond-Preston
@ 2021-10-04 15:28     ` Rich Felker
  0 siblings, 0 replies; 6+ messages in thread
From: Rich Felker @ 2021-10-04 15:28 UTC (permalink / raw)
  To: Jack Bond-Preston; +Cc: musl

On Mon, Oct 04, 2021 at 03:24:28PM +0100, Jack Bond-Preston wrote:
> Thanks for the reply Rich, it has been a great help.
> 
> I have a couple more specific questions, sorry:
> - Am I correct in saying IB represents the size (in bytes) of the
> in-band metadata between slots?
> - Is it assumed that sizeof(struct group) == sizeof(UNIT) throughout the
> code (the struct is defined such that this is true)?

UNIT must be large enough to:
- Meet any alignment requirement (be divisible by alignof(max_align_t).
- Hold IB bytes of in-band metadata between slots
- Fit a group header in UNIT-IB (pad[] in group header must be >= IB)

With UNIT==16 this leaves 7 bytes free in the group header on 32-bit
archs and 3 bytes free on 64-bit archs.

> If the size of the
> in-band metadata were to increase (due to additional/larger metadata)
> such that UNIT < sizeof(struct group) (due to the size of the group
> struct needing to increase to accommodate the larger in-band metadata),
> I assume some code would have to be changed to instead use the size of
> the new group struct where appropriate (e.g. when allocating some new
> group).

The fit of small sizeclass groups inside slots of larger ones is
dependent on group header fitting in UNIT. While in theory the header
could be made larger without hard breakage (assuming cleanup to ensure
consistent usage of sizeof(struct group) where appropriate), such a
change would break the math that makes things fit together
efficiently.

In terms of design, group header and in-band metadata ~should not~ be
expanded because they are attack surface and are designed to be
minimal for their purpose. The active_idx was only added to reduce the
physical memory cost/pressure of allocating N slots as a group, due to
divisibility properties of the needed sizeclass, when only 1 or 2
slots are wanted. It's not attack surface; clobbering it will just
lead to premature use of slots intended to be inactive.

> I noticed there are a lot of expressions containing x +
> UNIT/UNIT + x, are these generally to ensure allocations etc. include
> enough space for the contents of the group struct, or are there other
> reasons for these (e.g. some kind of 1-UNIT buffer between slots)?

The only space "between" slots is IB. Where UNIT is added it's
normally to account for a group header (lines 152, 206, 210, 234, 241,
256, 260, 261, 271, 309).

Commit d8715a10b32de699080f820c607db027f0b268b2 in the draft repo
documented this somewhat.

Rich

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-10-04 15:28 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-30 14:02 [musl] Mallocng algorithm high-level overview Jack Bond-Preston
2021-09-30 14:50 ` Fangrui Song
2021-09-30 15:35   ` Jack Bond-Preston
2021-09-30 16:24 ` Rich Felker
2021-10-04 14:24   ` Jack Bond-Preston
2021-10-04 15:28     ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).