From: Rich Felker <dalias@libc.org>
To: Dominique MARTINET <dominique.martinet@atmark-techno.com>
Cc: musl@lists.openwall.com
Subject: Re: [musl] infinite loop in mallocng's try_avail
Date: Wed, 25 Jan 2023 00:53:23 -0500
Message-ID: <20230125055323.GK4163@brightrain.aerifal.cx>
In-Reply-To: <Y9B48H0MsvL9yLBv@atmark-techno.com>

On Wed, Jan 25, 2023 at 09:33:52AM +0900, Dominique MARTINET wrote:
> > If this code is being reached, either the allocator state has been
> > corrupted by some UB in the application, or there's a logic bug in
> > mallocng. The sequence of events that seem to have to happen to get
> > there are:
> >
> > 1. Previously active group has no more available slots (line 120).
>
> Right, that one has already likely been dequeued (or at least
> traversed), so I do not see how to look at it but that sounds possible.
>
> > 2. Freed mask of newly activating group (line 131 or 138) is either
> > zero (line 145) or the active_idx (read from in-band memory
> > susceptible to application buffer overflows etc) is wrong and
> > produces zero when its bits are anded with the freed mask (line
> > 145).
>
> m->freed_mask looks like it is zero from values below; I cannot tell if
> that comes from a corruption outside of musl or not.
>
> > > (gdb) p __malloc_context
> > > $94 = {
> > > secret = 15756413639004407235,
> > > init_done = 1,
> > > mmap_counter = 135,
> > > free_meta_head = 0x0,
> > > avail_meta = 0x18a3f70,
> > > avail_meta_count = 6,
> > > avail_meta_area_count = 0,
> > > meta_alloc_shift = 0,
> > > meta_area_head = 0x18a3000,
> > > meta_area_tail = 0x18a3000,
> > > avail_meta_areas = 0x18a4000 <error: Cannot access memory at address 0x18a4000>,
> > > active = {0x18a3e98, 0x18a3eb0, 0x18a3208, 0x18a3280, 0x0, 0x0, 0x0, 0x18a31c0, 0x0, 0x0, 0x0, 0x18a3148, 0x0, 0x0, 0x0, 0x18a3dd8, 0x0, 0x0, 0x0, 0x18a3d90, 0x0,
> > > 0x18a31f0, 0x0, 0x18a3b68, 0x0, 0x18a3f28, 0x0, 0x0, 0x0, 0x18a3238, 0x0 <repeats 18 times>},
> > > usage_by_class = {2580, 600, 10, 7, 0 <repeats 11 times>, 96, 0, 0, 0, 20, 0, 3, 0, 8, 0, 3, 0, 0, 0, 3, 0 <repeats 18 times>},
> > > unmap_seq = '\000' <repeats 31 times>,
> > > bounces = '\000' <repeats 18 times>, "w", '\000' <repeats 12 times>,
> > > seq = 1 '\001',
> > > brk = 25837568
> > > }
> > > (gdb) p *__malloc_context->active[0]
> > > $95 = {
> > > prev = 0x18a3f40,
> > > next = 0x18a3e80,
> > > mem = 0xb6f57b30,
> > > avail_mask = 1073741822,
> > > freed_mask = 0,
> > > last_idx = 29,
> > > freeable = 1,
> > > sizeclass = 0,
> > > maplen = 0
> > > }
> > > (gdb) p *__malloc_context->active[0]->mem
> > > $97 = {
> > > meta = 0x18a3e98,
> > > active_idx = 29 '\035',
> > > pad = "\000\000\000\000\000\000\000\000\377\000",
> > > storage = 0xb6f57b40 ""
> > > }
> >
> > This is really weird, because at the point of the infinite loop, the
> > new group should not yet be activated (line 163), so
> > __malloc_context->active[0] should still point to the old active
> > group. But its avail_mask has all bits set and active_idx is not
> > corrupted, so try_avail should just have obtained an available slot
> > from it without ever entering the block at line 120. So I'm confused
> > how it got to the loop.
>
> try_avail's pm is `__malloc_context->active[0]`, which is overwritten by
> either dequeue(pm, m) or *pm = m (lines 123,128), so the original
> m->avail_mask could have been zero, with the next element having a zero
> freed mask?

No, avail_mask is only supposed to be able to be nonzero after
activate_group, which is only called on the head of an active list
(free.c:86 or malloc.c:163) and which atomically pulls bits off
freed_mask to move them to avail_mask. If we're observing avail_mask
nonzero at the point you saw it, some invariant seems to have been
violated.

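For readers following along, the invariant described above can be sketched roughly as follows. This is a simplified illustration using C11 atomics, not the actual mallocng source: the struct and function names (meta_sketch, activate_group_sketch) are hypothetical, and active_idx is stored directly in the struct here, whereas in the dump quoted above it lives in the in-band group header (m->mem->active_idx).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for mallocng's struct meta. */
struct meta_sketch {
	_Atomic uint32_t freed_mask; /* slots freed but not yet reusable */
	uint32_t avail_mask;         /* slots ready to be handed out */
	unsigned active_idx;         /* assumption: kept here for brevity;
	                                in-band in the real allocator */
};

/* Move the freed slots in the active range over to avail_mask. */
static uint32_t activate_group_sketch(struct meta_sketch *m)
{
	/* Only meaningful when nothing is currently available. */
	assert(!m->avail_mask);

	/* Bits 0..active_idx inclusive. */
	uint32_t act = (2u << m->active_idx) - 1;

	/* Atomically clear the active bits from freed_mask; on failure
	 * the CAS reloads the current value into mask and we retry. */
	uint32_t mask = atomic_load(&m->freed_mask);
	while (!atomic_compare_exchange_weak(&m->freed_mask, &mask, mask & ~act))
		;

	/* mask now holds the value that was successfully replaced. */
	return m->avail_mask = mask & act;
}
```

Under this sketch, a group whose freed_mask is zero would come back with avail_mask zero as well. With active_idx = 29 as in the dump, the active-range mask is 0x3fffffff, and the observed avail_mask of 1073741822 is 0x3ffffffe.
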
> > One odd thing I noticed is that the backtrace pm=0xb6f692e8 does not
> > match the __malloc_context->active[0] address. Were these from
> > different runs?
>
> These were from the same run, I've only observed this single occurrence
> first-hand.
>
> pm is &__malloc_context->active[0], so it's not 0x18a3e98 (first value
> of active) but its address (e.g. __malloc_context+48 as per gdb symbol
> resolution in the backtrace)
> I didn't print __malloc_context but I don't see why gdb would have
> gotten that wrong.
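
The distinction between the slot's address and the value stored in it can be illustrated with a small sketch. The types below (meta_stub, ctx_stub) and the helper are hypothetical stand-ins, far smaller than the real context structure, and the field layout is an assumption made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct meta_stub { int unused; };   /* hypothetical placeholder type */

struct ctx_stub {                   /* hypothetical, reduced context */
	uint64_t secret;
	struct meta_stub *active[4];    /* one list head per size class */
};

/* Returns nonzero if pm is the address of ctx->active[sc],
 * as opposed to the group pointer stored in that slot. */
static int points_at_active_slot(struct ctx_stub *ctx,
                                 struct meta_stub **pm, int sc)
{
	return pm == &ctx->active[sc];
}
```

Here pm compares equal to the context's base address plus the offset of the active array, while *pm is whatever group the slot currently points to, so the two values are not expected to match.
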

Ah, I forgot I was looking at an additional level of indirection here.
It would be nice to know if m is the same active[0] as at entry; that
would help figure out where things went wrong...

Rich