From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/9193
From: Dan Gohman
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Bits deduplication: current situation
Date: Mon, 25 Jan 2016 11:22:13 -0800
References: <20160125035925.GA2288@brightrain.aerifal.cx>
In-Reply-To: <20160125035925.GA2288@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
To: musl@lists.openwall.com
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary=001a114e47167efaba052a2d7c50

--001a114e47167efaba052a2d7c50
Content-Type: multipart/alternative; boundary=001a114e47167efab6052a2d7c4e

--001a114e47167efab6052a2d7c4e
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Concerning stdint.h, there are a few details beyond just 32-bit vs 64-bit. For example, int64_t can be either "long" or "long long" on an LP64 target. The difference usually doesn't matter, but there are things which end up noticing, like C++ name mangling and C format-string checking.
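
To make the long vs long long point concrete, here is a minimal sketch, assuming a C11 compiler (for _Generic). The ex_ names are made up for illustration and are not part of musl or any proposed patch.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Reports which standard type int64_t aliases on this target.
 * Hypothetical helper; requires C11 for _Generic. */
static const char *ex_int64_underlying(void)
{
    return _Generic((int64_t)0,
        long: "long",
        long long: "long long",
        default: "other");
}

/* Formats v portably. PRId64 expands to the conversion that matches
 * whatever int64_t is typedef'd to, so format-string checking stays
 * quiet. A hard-coded "%ld" would draw a warning on targets where
 * int64_t is long long, even if both types are 64 bits wide. */
static int ex_format_i64(char *buf, size_t len, int64_t v)
{
    return snprintf(buf, len, "%" PRId64, v);
}
```

Both answers from ex_int64_underlying() describe a 64-bit type on LP64, which is why the choice is invisible to most code and only shows up in type-sensitive checks like these.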

GCC >= 4.5 and clang predefine macros providing almost everything stdint.h (and inttypes.h) needs. For example, see the attached file. Would you be interested in a patch which refactors stdint.h to use this approach by default, with a mechanism to support older compilers if needed?
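
As a rough sketch of the shape such a refactoring could take (not the attached file itself), the fragment below builds a 64-bit pair from the compiler's predefined macros and falls back to hand-written definitions otherwise. The ex_/EX_ names are hypothetical so the fragment can sit next to a real <stdint.h>, and the fallback branch assumes long long is 64 bits wide.

```c
/* Signed 64-bit type and limits, taken from the compiler when the
 * predefined macros are available. */
#if defined(__INT64_TYPE__) && defined(__INT64_MAX__)
typedef __INT64_TYPE__ ex_int64_t;
#define EX_INT64_MAX __INT64_MAX__
#else
/* Fallback for older compilers; assumes long long is 64 bits. */
typedef long long ex_int64_t;
#define EX_INT64_MAX 0x7fffffffffffffffLL
#endif
#define EX_INT64_MIN (-EX_INT64_MAX - 1)

/* Unsigned counterpart, same pattern. */
#if defined(__UINT64_TYPE__) && defined(__UINT64_MAX__)
typedef __UINT64_TYPE__ ex_uint64_t;
#define EX_UINT64_MAX __UINT64_MAX__
#else
typedef unsigned long long ex_uint64_t;
#define EX_UINT64_MAX 0xffffffffffffffffULL
#endif

/* Compile-time sanity checks (C11). */
_Static_assert(sizeof(ex_int64_t) == 8, "ex_int64_t must be 64 bits");
_Static_assert(sizeof(ex_uint64_t) == 8, "ex_uint64_t must be 64 bits");
```

One consequence of this design is that the header follows the compiler's choice between long and long long instead of encoding it per arch.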

Dan

On Sun, Jan 24, 2016 at 7:59 PM, Rich Felker <dalias@libc.org> wrote:
I'm about to try starting the bits deduplication, but before getting
started, I took a quick survey of the current bits headers we have:


endian.h: We could have generic ones for little and big, but each arch
that has subarchs with both endians needs its own custom version that
tests the psABI-defined macro.
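
A minimal sketch of what the custom version for a bi-endian arch might look like. The EX_ names are hypothetical; __ARMEB__ and __MIPSEB__ are given only as examples of arch-specific big-endian macros, and the __BYTE_ORDER__ branch is an extra assumption so the fragment also works on other targets with a recent GCC or clang.

```c
#define EX_LITTLE_ENDIAN 1234
#define EX_BIG_ENDIAN    4321

/* Pick the byte order from an arch-specific macro where one applies,
 * otherwise from the compiler's generic macro, otherwise assume
 * little-endian. */
#if defined(__ARMEB__) || defined(__MIPSEB__)
#define EX_BYTE_ORDER EX_BIG_ENDIAN
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
      __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define EX_BYTE_ORDER EX_BIG_ENDIAN
#else
#define EX_BYTE_ORDER EX_LITTLE_ENDIAN
#endif

/* Runtime probe used only to cross-check the macro: on a big-endian
 * machine the first byte of the integer 1 is zero. */
static int ex_runtime_is_big_endian(void)
{
    const unsigned int probe = 1;
    return *(const unsigned char *)&probe == 0;
}
```

A generic little-only or big-only file would reduce to the single matching #define.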

errno.h: Almost all archs can share a generic errno.h. Those that
don't might be able to share some subset (thus benefiting from a more
elaborate bits-header-gen system) but only a couple ugly archs are
affected anyway.

fcntl.h: Not sure how much these differ or how much they could share.
Almost all archs' versions are unique now, but some may only have
cosmetic differences.

fenv.h: We can have a generic softfloat/no-fenv version, but each arch
with hard float basically needs its own version.

float.h: Only 3 generic versions should need to exist: ld64, ld80, and
ld128(ieeequad).
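
For illustration, the three cases can be told apart by the mantissa width of long double. This is a sketch with a hypothetical helper name; it assumes the usual IEEE formats (53 bits for double-sized, 64 for x87 extended, 113 for IEEE quad) and reports anything else as unknown.

```c
#include <float.h>

/* Maps the long double mantissa width onto the three generic
 * float.h variants named above. */
static const char *ex_long_double_format(void)
{
#if LDBL_MANT_DIG == 53
    return "ld64";
#elif LDBL_MANT_DIG == 64
    return "ld80";
#elif LDBL_MANT_DIG == 113
    return "ld128";
#else
    return "unknown";
#endif
}
```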

io.h: Most archs can use a generic empty file.

ioctl.h: Varies highly but it may be possible to have generic versions
(perhaps one 32-bit and one 64-bit) for the clean archs to share.

ipc.h: Lots of trivial variations to account for kernel bugs in
type/padding/etc. Not sure if they can be unified.

limits.h: Varies by page size and 32/64-bit. Not sure if it makes
sense to have generic versions; the logic to pick which one would be
as large as the file. It would be nice to get the #ifdefs out of it
though.

mman.h: Seems to vary but differences may be mostly cosmetic; not
sure.

msg.h: Same deal as ipc.h.

poll.h: Empty except for mips; generic definitions are in top-level
poll.h now. With bits dedup we could move them to a generic bits file
so that top-level doesn't have a nasty #ifndef.

posix.h: Only 2 versions: ILP32 and LP64. They can be generic.
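
A small sketch of how the two data models can be distinguished. The enum and function names are hypothetical, and the byte counts assume 8-bit chars.

```c
/* The two data models relevant here, plus a catch-all. */
enum ex_data_model { EX_ILP32, EX_LP64, EX_OTHER };

/* ILP32: int, long and pointers are all 32 bits.
 * LP64:  int is 32 bits; long and pointers are 64 bits. */
static enum ex_data_model ex_detect_data_model(void)
{
    if (sizeof(int) == 4 && sizeof(long) == 4 && sizeof(void *) == 4)
        return EX_ILP32;
    if (sizeof(int) == 4 && sizeof(long) == 8 && sizeof(void *) == 8)
        return EX_LP64;
    return EX_OTHER;
}
```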

reg.h: Completely arch-specific except in the case of multiple logical
archs for the same ISA (x32).

resource.h: Same deal as poll.h.

sem.h: Same deal as ipc.h.

setjmp.h: Arch-specific, same as reg.h.

shm.h: Same deal as ipc.h.

signal.h: Arch-specific, and currently omits siginfo_t which is
gratuitously different on mips (and thus broken). Moving siginfo_t
into it would add A LOT of duplication and maintenance burden unless
we have an elaborate bits generation system that can piece these
headers together from multiple parts so the siginfo_t part can be
shared by all but mips.

socket.h: The main difference is that workarounds for bogus kernel
definitions of msghdr and cmsghdr are needed on 64-bit archs. A few
archs also have their own definitions of some constants which override
the top-level file's.
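
To illustrate the kind of workaround meant here, the sketch below compares a kernel-style msghdr layout with one that uses the POSIX field types plus padding. It rests on my understanding that the 64-bit kernel uses size_t for msg_iovlen and msg_controllen where POSIX specifies int and socklen_t. All struct and member names are hypothetical stand-ins, and the padding order shown is the little-endian one.

```c
#include <stddef.h>

typedef unsigned ex_socklen_t;  /* stand-in for socklen_t */
struct ex_iovec;                /* opaque stand-in for struct iovec */

/* Field types as the 64-bit kernel is assumed to lay them out. */
struct ex_kernel_msghdr {
    void *msg_name;
    int msg_namelen;
    struct ex_iovec *msg_iov;
    size_t msg_iovlen;
    void *msg_control;
    size_t msg_controllen;
    unsigned msg_flags;
};

/* POSIX field types, with padding members covering the upper half of
 * the kernel's size_t fields (little-endian ordering). */
struct ex_user_msghdr {
    void *msg_name;
    ex_socklen_t msg_namelen;
    struct ex_iovec *msg_iov;
    int msg_iovlen, ex_pad1;
    void *msg_control;
    ex_socklen_t msg_controllen, ex_pad2;
    int msg_flags;
};

/* Returns nonzero when both layouts have the same size and the same
 * offsets for the fields that differ in type. */
static int ex_msghdr_layouts_match(void)
{
    return sizeof(struct ex_user_msghdr) == sizeof(struct ex_kernel_msghdr)
        && offsetof(struct ex_user_msghdr, msg_iovlen)
           == offsetof(struct ex_kernel_msghdr, msg_iovlen)
        && offsetof(struct ex_user_msghdr, msg_control)
           == offsetof(struct ex_kernel_msghdr, msg_control)
        && offsetof(struct ex_user_msghdr, msg_controllen)
           == offsetof(struct ex_kernel_msghdr, msg_controllen)
        && offsetof(struct ex_user_msghdr, msg_flags)
           == offsetof(struct ex_kernel_msghdr, msg_flags);
}
```

On a 32-bit target size_t is already 32 bits, so the padding members would be wrong there and the function returns zero; that is the sense in which the workaround is specific to 64-bit archs.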

stat.h: It varies a lot on current archs, but in principle there's a
generic stat/stat64 that should be used for all new archs on the
kernel side, so perhaps we could have a generic one for that.

statfs.h: Mostly generic, but mips and x32 have quirks.

stdarg.h: Not even used except with ancient/broken compilers. Same on
all archs but i386 where the invalid legacy defs are provided.
Probably should be dropped entirely.

stdint.h: Purely a matter of 32 vs 64 bit, otherwise totally generic.

syscall.h: Arch-specific, except that new kernel archs should use the
generic one, which we can provide as a generic version.

termios.h: Generic except for wacky archs (mips and powerpc).

user.h: Highly arch-specific.


The good news is that there are not a lot of places where there's
value in doing anything elaborate with the deduplication. Just having
a fixed ordered list of include dirs to search while building, and
installation rules to pick the first matching one and install it in
$(includedir)/bits, would probably work.
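
The first-match rule can be sketched as follows. This only illustrates the selection logic, not the actual build rules: the function names are hypothetical, any directory names passed in are up to the caller, and a list of known files (a manifest) stands in for the filesystem so the logic can be exercised without touching disk.

```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if path appears in the manifest of known files. */
static int ex_in_manifest(const char *const *manifest, size_t nfiles,
                          const char *path)
{
    for (size_t i = 0; i < nfiles; i++)
        if (strcmp(manifest[i], path) == 0)
            return 1;
    return 0;
}

/* Searches dirs in order for name. On success writes "<dir>/<name>"
 * to out and returns the index of the matching dir. Returns -1 and
 * sets out to the empty string if no dir provides the file. */
static int ex_pick_first(const char *const *dirs, size_t ndirs,
                         const char *const *manifest, size_t nfiles,
                         const char *name, char *out, size_t outlen)
{
    for (size_t i = 0; i < ndirs; i++) {
        int n = snprintf(out, outlen, "%s/%s", dirs[i], name);
        if (n < 0 || (size_t)n >= outlen)
            continue;  /* candidate path did not fit; treat as absent */
        if (ex_in_manifest(manifest, nfiles, out))
            return (int)i;
    }
    if (outlen > 0)
        out[0] = '\0';
    return -1;
}
```

With an arch-specific directory listed before a generic one, an arch that ships its own fenv.h gets that file, while errno.h falls through to the generic copy.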

It's possible that we could eliminate some bits headers entirely by
having features.h (via a new bits/features.h) expose some parameters
like endianness, ILP32-vs-LP64, etc. which the top-level headers could
then use to define things in a non-arch-specific way. I'm not sure
whether I like doing that though. It simplifies porting and header
maintenance work, but at the cost of some explicitness whereby you can
just open the header file (or the bits header file) and see how
something is defined right away.

A possible compromise is to highly abstract these things at the musl
source level, but generate flat bits files to install, or even flatten
the headers completely to remove bits so that all definitions are
inline and explicit in the top-level headers.

Ideas/requests/preferences/etc.?

Rich

--001a114e47167efab6052a2d7c4e-- --001a114e47167efaba052a2d7c50 Content-Type: text/x-chdr; charset=US-ASCII; name="stdint-generic.h" Content-Disposition: attachment; filename="stdint-generic.h" Content-Transfer-Encoding: base64 X-Attachment-Id: f_ijucmidl0 dHlwZWRlZiBfX0lOVDhfVFlQRV9fIGludDhfdDsKdHlwZWRlZiBfX0lOVDE2X1RZUEVfXyBpbnQx Nl90Owp0eXBlZGVmIF9fSU5UMzJfVFlQRV9fIGludDMyX3Q7CnR5cGVkZWYgX19JTlQ2NF9UWVBF X18gaW50NjRfdDsKdHlwZWRlZiBfX1VJTlQ4X1RZUEVfXyB1aW50OF90Owp0eXBlZGVmIF9fVUlO VDE2X1RZUEVfXyB1aW50MTZfdDsKdHlwZWRlZiBfX1VJTlQzMl9UWVBFX18gdWludDMyX3Q7CnR5 cGVkZWYgX19VSU5UNjRfVFlQRV9fIHVpbnQ2NF90OwoKdHlwZWRlZiBfX0lOVF9GQVNUOF9UWVBF X18gaW50X2Zhc3Q4X3Q7CnR5cGVkZWYgX19JTlRfRkFTVDE2X1RZUEVfXyBpbnRfZmFzdDE2X3Q7 CnR5cGVkZWYgX19JTlRfRkFTVDMyX1RZUEVfXyBpbnRfZmFzdDMyX3Q7CnR5cGVkZWYgX19JTlRf RkFTVDY0X1RZUEVfXyBpbnRfZmFzdDY0X3Q7CnR5cGVkZWYgX19VSU5UX0ZBU1Q4X1RZUEVfXyB1 aW50X2Zhc3Q4X3Q7CnR5cGVkZWYgX19VSU5UX0ZBU1QxNl9UWVBFX18gdWludF9mYXN0MTZfdDsK dHlwZWRlZiBfX1VJTlRfRkFTVDMyX1RZUEVfXyB1aW50X2Zhc3QzMl90Owp0eXBlZGVmIF9fVUlO VF9GQVNUNjRfVFlQRV9fIHVpbnRfZmFzdDY0X3Q7CgojZGVmaW5lIElOVF9GQVNUOF9NSU4gICgt X19VSU5UX0ZBU1Q4X01BWF9fIC0gMSkKI2RlZmluZSBJTlRfRkFTVDE2X01JTiAgKC1fX1VJTlRf RkFTVDE2X01BWF9fIC0gMSkKI2RlZmluZSBJTlRfRkFTVDMyX01JTiAgKC1fX1VJTlRfRkFTVDMy X01BWF9fIC0gMSkKI2RlZmluZSBJTlRfRkFTVDY0X01JTiAgKC1fX1VJTlRfRkFTVDY0X01BWF9f IC0gMSkKCiNkZWZpbmUgSU5UX0ZBU1Q4X01BWCAgX19VSU5UX0ZBU1Q4X01BWF9fCiNkZWZpbmUg SU5UX0ZBU1QxNl9NQVggIF9fVUlOVF9GQVNUMTZfTUFYX18KI2RlZmluZSBJTlRfRkFTVDMyX01B WCAgX19VSU5UX0ZBU1QzMl9NQVhfXwojZGVmaW5lIElOVF9GQVNUNjRfTUFYICBfX1VJTlRfRkFT VDY0X01BWF9fCgojZGVmaW5lIFVJTlRfRkFTVDhfTUFYIF9fVUlOVF9GQVNUOF9NQVgKI2RlZmlu ZSBVSU5UX0ZBU1QxNl9NQVggX19VSU5UX0ZBU1QxNl9NQVgKI2RlZmluZSBVSU5UX0ZBU1QzMl9N QVggX19VSU5UX0ZBU1QzMl9NQVgKI2RlZmluZSBVSU5UX0ZBU1Q2NF9NQVggX19VSU5UX0ZBU1Q2 NF9NQVgKCnR5cGVkZWYgX19JTlRfTEVBU1Q4X1RZUEVfXyBpbnRfbGVhc3Q4X3Q7CnR5cGVkZWYg X19JTlRfTEVBU1QxNl9UWVBFX18gaW50X2xlYXN0MTZfdDsKdHlwZWRlZiBfX0lOVF9MRUFTVDMy 
X1RZUEVfXyBpbnRfbGVhc3QzMl90Owp0eXBlZGVmIF9fSU5UX0xFQVNUNjRfVFlQRV9fIGludF9s ZWFzdDY0X3Q7CnR5cGVkZWYgX19VSU5UX0xFQVNUOF9UWVBFX18gdWludF9sZWFzdDhfdDsKdHlw ZWRlZiBfX1VJTlRfTEVBU1QxNl9UWVBFX18gdWludF9sZWFzdDE2X3Q7CnR5cGVkZWYgX19VSU5U X0xFQVNUMzJfVFlQRV9fIHVpbnRfbGVhc3QzMl90Owp0eXBlZGVmIF9fVUlOVF9MRUFTVDY0X1RZ UEVfXyB1aW50X2xlYXN0NjRfdDsKCiNkZWZpbmUgSU5UX0xFQVNUOF9NSU4gICgtX19VSU5UX0xF QVNUOF9NQVhfXyAtIDEpCiNkZWZpbmUgSU5UX0xFQVNUMTZfTUlOICAoLV9fVUlOVF9MRUFTVDE2 X01BWF9fIC0gMSkKI2RlZmluZSBJTlRfTEVBU1QzMl9NSU4gICgtX19VSU5UX0xFQVNUMzJfTUFY X18gLSAxKQojZGVmaW5lIElOVF9MRUFTVDY0X01JTiAgKC1fX1VJTlRfTEVBU1Q2NF9NQVhfXyAt IDEpCgojZGVmaW5lIElOVF9MRUFTVDhfTUFYICBfX1VJTlRfTEVBU1Q4X01BWF9fCiNkZWZpbmUg SU5UX0xFQVNUMTZfTUFYICBfX1VJTlRfTEVBU1QxNl9NQVhfXwojZGVmaW5lIElOVF9MRUFTVDMy X01BWCAgX19VSU5UX0xFQVNUMzJfTUFYX18KI2RlZmluZSBJTlRfTEVBU1Q2NF9NQVggIF9fVUlO VF9MRUFTVDY0X01BWF9fCgojZGVmaW5lIFVJTlRfTEVBU1Q4X01BWCBfX1VJTlRfTEVBU1Q4X01B WAojZGVmaW5lIFVJTlRfTEVBU1QxNl9NQVggX19VSU5UX0xFQVNUMTZfTUFYCiNkZWZpbmUgVUlO VF9MRUFTVDMyX01BWCBfX1VJTlRfTEVBU1QzMl9NQVgKI2RlZmluZSBVSU5UX0xFQVNUNjRfTUFY IF9fVUlOVF9MRUFTVDY0X01BWAoKI2RlZmluZSBJTlRQVFJfTUlOICAgICAgKC1fX0lOVFBUUl9N QVhfXyAtIDEpCiNkZWZpbmUgSU5UUFRSX01BWCAgICAgIF9fSU5UUFRSX01BWF9fCiNkZWZpbmUg VUlOVFBUUl9NQVggICAgIF9fVUlOVFBUUl9NQVhfXwojZGVmaW5lIFBUUkRJRkZfTUlOICAgICAo LV9fUFRSRElGRl9NQVhfXyAtIDEpCiNkZWZpbmUgUFRSRElGRl9NQVggICAgIF9fUFRSRElGRl9N QVhfXwojZGVmaW5lIFNJWkVfTUFYICAgICAgICBfX1NJWkVfTUFYX18K --001a114e47167efaba052a2d7c50--