From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/12707
Path: news.gmane.org!.POSTED!not-for-mail
From: Rich Felker <dalias@libc.org>
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: [RFC PATCH] reduce severity of ldso reclaim_gaps hack
Date: Thu, 12 Apr 2018 00:04:08 -0400
Message-ID: <20180412040408.GP3094@brightrain.aerifal.cx>
References: <20170628165243.16502-1-amonakov@ispras.ru>
 <20180411201057.GO3094@brightrain.aerifal.cx>
 <alpine.LNX.2.20.13.1804112347250.24851@monopod.intra.ispras.ru>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: blaine.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: blaine.gmane.org 1523505740 30865 195.159.176.226 (12 Apr 2018 04:02:20 GMT)
X-Complaints-To: usenet@blaine.gmane.org
NNTP-Posting-Date: Thu, 12 Apr 2018 04:02:20 +0000 (UTC)
User-Agent: Mutt/1.5.21 (2010-09-15)
To: musl@lists.openwall.com
Original-X-From: musl-return-12723-gllmg-musl=m.gmane.org@lists.openwall.com Thu Apr 12 06:02:15 2018
Return-path: <musl-return-12723-gllmg-musl=m.gmane.org@lists.openwall.com>
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
	by blaine.gmane.org with smtp (Exim 4.84_2)
	(envelope-from <musl-return-12723-gllmg-musl=m.gmane.org@lists.openwall.com>)
	id 1f6TR8-0007si-Gz
	for gllmg-musl@m.gmane.org; Thu, 12 Apr 2018 06:02:14 +0200
Original-Received: (qmail 16358 invoked by uid 550); 12 Apr 2018 04:04:21 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post: <mailto:musl@lists.openwall.com>
List-Help: <mailto:musl-help@lists.openwall.com>
List-Unsubscribe: <mailto:musl-unsubscribe@lists.openwall.com>
List-Subscribe: <mailto:musl-subscribe@lists.openwall.com>
List-ID: <musl.lists.openwall.com>
Original-Received: (qmail 16340 invoked from network); 12 Apr 2018 04:04:20 -0000
Content-Disposition: inline
In-Reply-To: <alpine.LNX.2.20.13.1804112347250.24851@monopod.intra.ispras.ru>
Original-Sender: Rich Felker <dalias@aerifal.cx>
Xref: news.gmane.org gmane.linux.lib.musl.general:12707
Archived-At: <http://permalink.gmane.org/gmane.linux.lib.musl.general/12707>

On Thu, Apr 12, 2018 at 12:19:24AM +0300, Alexander Monakov wrote:
> > > @@ -410,10 +412,9 @@ void *realloc(void *p, size_t n)
> > >  		size_t newlen = n + extra;
> > >  		/* Crash on realloc of freed chunk */
> > >  		if (extra & 1) a_crash();
> > > -		if (newlen < PAGE_SIZE && (new = malloc(n))) {
> > > -			memcpy(new, p, n-OVERHEAD);
> > > -			free(p);
> > > -			return new;
> > > +		if (newlen < PAGE_SIZE && (new = malloc(n-OVERHEAD))) {
> > > +			n0 = n;
> > > +			goto copy_free_ret;
> > >  		}
> > >  		newlen = (newlen + PAGE_SIZE-1) & -PAGE_SIZE;
> > >  		if (oldlen == newlen) return p;
> > > @@ -456,34 +457,20 @@ copy_realloc:
> > >  	/* As a last resort, allocate a new chunk and copy to it. */
> > >  	new = malloc(n-OVERHEAD);
> > >  	if (!new) return 0;
> > > +copy_free_ret:
> > >  	memcpy(new, p, n0-OVERHEAD);
> > >  	free(CHUNK_TO_MEM(self));
> > >  	return new;
> > >  }
> > 
> > This looks like an independent change that fixes a separate
> > slight-overallocation bug. Is it related?
> 
> No, it would be nicer to commit it separately.

OK.

> > > +#if defined(__GNUC__)
> > > +__attribute__((cold))
> > > +#endif
> > 
> > This can't be used as-is. It would need a configure check (gcc version
> > dependent) and __cold__ if we want to, but unless there's a strong
> > reason to include it I'd just omit it.
> 
> malloc.c is compiled with -O3, causing 'free' to be inlined in all local
> callers. This combats code growth and should also slightly improve code
> layout in the new 'free' (not that it would help much, of course).

I'd like to actually do away with the -O3, so maybe now is a good time
to look at that, at least for malloc. I think the main reason it was
used is already covered by the always_inline, which was to avoid GOT
register reloads at each call frame level on i386 and similar archs.
I suspect it might turn out better to just use -O2 instead of -Os
globally (possibly with -falign-*=1) than doing the per-dir/glob stuff
we're doing now, especially on modern gcc versions, but -O3 might
still be preferable for string functions if we want to get
vectorization there.

Of course I don't want to hold your patch up on this issue, but I
think we can just omit the cold with a plan to investigate -O level
and inlining issues soon.

> > > +static void unmap_chunk(struct chunk *self)
> > > +{
> > > +	size_t extra = self->psize;
> > > +	char *base = (char *)self - extra;
> > > +	size_t len = CHUNK_SIZE(self) + extra;
> > > +	/* Crash on double free */
> > > +	if (extra & 1) a_crash();
> > > +	__munmap(base, len);
> > > +}
> > > +
> > > +void free(void *p)
> > > +{
> > > +	if (!p) return;
> > > +
> > > +	struct chunk *self = MEM_TO_CHUNK(p);
> > > +
> > > +	if (IS_MMAPPED(self))
> > > +		unmap_chunk(self);
> > > +	else
> > > +		bin_chunk(self);
> > > +}
> > > +
> > > +void __malloc_donate(char *start, char *end)
> > > +{
> > 
> > It's confusing to see that this is equivalent to what's being removed
> > from dynlink.c, but I think it may be correct.
> > 
> > > +	if (end - start < 2*OVERHEAD + SIZE_ALIGN) return;
> > 
> > This does not seem like a sufficient bound to ensure the block is
> > usable, but the next check after alignment may cover it.
> 
> Yes, this is to ensure that we don't create invalid pointers when aligning,

OK.

> > > +	start += OVERHEAD + SIZE_ALIGN - 1;
> > > +	start -= (uintptr_t)start & (SIZE_ALIGN - 1);
> > 
> > This looks correct.
> > 
> > > +	end -= (uintptr_t)end & (SIZE_ALIGN - 1);
> > 
> > This does not subtract the OVERHEAD, but I think it's just a
> > notational difference; my "end" pointed to the end of the chunk to be
> > freed, and your "end" points to the beginning of the next
> > (nonexistent) chunk. The MEM_TO_CHUNK below should compensate.
> > 
> > > +	if (end - start < OVERHEAD + SIZE_ALIGN) return;
> > 
> > At this point, start and end both point just past a chunk header,
> > meaning they have to differ by a multiple of SIZE_ALIGN. I don't see
> > why OVERHEAD is needed here too. The check should be against
> > SIZE_ALIGN I think (although by alignment they're equivalent).

I think I was wrong about them being equivalent, but...

> I don't recall if I had a specific reason to spell it like that.

...end-start==SIZE_ALIGN is certainly a valid free chunk size
(yielding SIZE_ALIGN-OVERHEAD == OVERHEAD bytes of storage), albeit
not terribly helpful.

> > > +	if (end - start >= MMAP_THRESHOLD) return;
> > 
> > This does not seem necessary. Free chunks in the last bin can be
> > larger than MMAP_THRESHOLD; they're just broken up to satisfy
> > allocations. Of course it's unlikely to happen anyway.
> 
> Do such oversized chunks appear in normal operation? This seems non-obvious,
> so a comment pointing that out would probably be helpful.

The only way I could see it happening is on an arch ABI that allows
very large pages (and has the ELF load segments aligned accordingly,
as x86_64 does). In this case if the kernel/hardware only supported
large (e.g. 2MB) pages, you'd pretty much always end up with >1.5MB of
reclaimed space per DSO. IMO this is an awful kernel/hardware
constraint to have, very wasteful, but it's exactly the situation
where you'd most care about the gaps getting reclaimed for something
useful.

Rich