From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: max_align_t mess on i386
Date: Sun, 15 Dec 2019 13:51:25 -0500
Message-ID: <20191215185125.GB1666@brightrain.aerifal.cx>
References: <20191214151932.GW1666@brightrain.aerifal.cx> <20191215182314.GB986899@wirbelwind.zhasha.com>
In-Reply-To: <20191215182314.GB986899@wirbelwind.zhasha.com>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com

On Sun, Dec 15, 2019 at 07:23:14PM +0100, Joakim Sindholt wrote:
> On Sun, Dec 15, 2019 at 01:06:29PM -0500, Jeffrey Walton wrote:
> > On Sat, Dec 14, 2019 at 10:19 AM Rich Felker wrote:
> > >
> > > In researching how much memory could be saved, and how practical it
> > > would be, for the new malloc to align only to 8-byte boundaries
> > > instead of 16-byte on archs where
> > > alignof(max_align_t) is 8 (pretty much all 32-bit archs), I
> > > discovered that GCC quietly changed its idea of i386 max_align_t
> > > to 16-byte alignment in GCC 7, to better accommodate the new
> > > _Float128 access via SSE. Presumably (I haven't checked) the
> > > change is reflected with changes in the psABI document to make it
> > > "official".
> >
> > Be careful with policy changes like this. The malloc (3) man page says:
> >
> >     The malloc() and calloc() functions return a pointer to the
> >     allocated memory that is suitably aligned for any kind of variable.
>
> Your man pages are not the standard, but the standard does have this to
> say:
> > The pointer returned if the allocation succeeds shall be suitably
> > aligned so that it may be assigned to a pointer to any type of object
> > and then used to access such an object in the space allocated (until the
> > space is explicitly freed or reallocated).
>
> To me this sounds like my next suggestion is technically disallowed.
>
> > I expect to be able to use a pointer returned by malloc (and friends)
> > in MMX, SSE and AVX functions.
>
> I might agree, but would it not be feasible to have the alignment of the
> returned pointer be dependent on the size of the allocation? That way,
> if you allocate <16 bytes you can get 8 byte alignment. You might even
> be able to go all the way down to 4 byte alignment for <8 byte
> allocations.

This is a nice idea and the bump allocator (simple_malloc) in musl for
static-linked programs that don't use free does pretty much exactly
that. With a nontrivial allocator it gets more complicated though, and
I don't think there's any way to take advantage of this with the new
malloc.

For example, in the new allocator with 4-byte inband slot headers,
16-byte slots don't need 16-byte alignment because the largest object
they can hold is 12 bytes, and the largest alignment such an object
can need is 8-byte.
However, since they're spaced 16 bytes apart, there's no advantage to
being able to misalign them mod 16; as long as the first one in a run
is aligned, all of them are. The same would apply if we had 8-byte
slots, but those are mostly uninteresting with 4 bytes taken for
headers.

Taking advantage of it with dlmalloc-type designs that don't involve
evenly-spaced slots is perhaps more practical, but can lead to messy
split/merge since the small underaligned chunks aren't starting on
valid boundaries to merge with adjacent free chunks. I think they'll
tend to eventually get tied up as unusable space at the bottom of
adjacent chunks, unnecessarily limiting the size of the allocations
just below them.

> It might violate the standard technically speaking, but I don't know of
> any examples of types smaller than 16 bytes that require 16 byte
> alignment.

It doesn't since no object can have size smaller than its alignment.
(As long as pointer types aren't lossy; if some pointer types lost low
bits, then it would be non-conforming.)

Rich
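
The arithmetic behind the two points above can be sketched in C. This
is only an illustration, not code from musl or the new malloc: the
helper names needed_align and slot_offset_mod16 are made up here, and
the 16-byte cap and 16-byte slot spacing are assumptions taken from
the discussion.

```c
#include <assert.h>
#include <stddef.h>

/* sizeof is always a multiple of the alignment, so no object can be
 * smaller than its own alignment requirement. */
static_assert(sizeof(double) % _Alignof(double) == 0, "size is a multiple of alignment");
static_assert(sizeof(long double) >= _Alignof(long double), "size >= alignment");

/* Hypothetical helper: the strictest alignment any object fitting in
 * n bytes could require, i.e. the largest power of two that is <= n,
 * capped at max_align (assumed to be a power of two). */
static size_t needed_align(size_t n, size_t max_align)
{
    size_t a = 1;
    while (a * 2 <= n && a * 2 <= max_align)
        a *= 2;
    return a;
}

/* Hypothetical helper: position mod 16 of slot i in a run of slots
 * spaced 16 bytes apart, given the first slot's position mod 16.
 * The result never depends on i, so nothing is gained by letting
 * individual slots be misaligned mod 16. */
static size_t slot_offset_mod16(size_t first_mod16, size_t i)
{
    return (first_mod16 + 16 * i) % 16;
}
```

With these definitions, needed_align(12, 16) is 8, matching the
12-byte payload case above, and slot_offset_mod16(0, i) is 0 for
every i.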