Bits deduplication: current situation

mailing list of musl libc

* Bits deduplication: current situation
@ 2016-01-25  3:59 Rich Felker
  2016-01-25  8:08 ` Natanael Copa
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Rich Felker @ 2016-01-25  3:59 UTC (permalink / raw)
  To: musl

I'm about to try starting the bits deduplication, but before getting
started, I took a quick survey of the current bits headers we have:

endian.h: We could have generic ones for little and big, but each arch
that has subarchs with both endians needs its own custom version that
tests the psABI-defined macro.

errno.h: Almost all archs can share a generic errno.h. Those that
don't might be able to share a subset (thus benefiting from a more
elaborate bits-header-gen system) but only a couple ugly archs are
affected anyway.

fcntl.h: Not sure how much these differ or how much they could share.
Almost all archs' versions are unique now, but some may only have
cosmetic differences.

fenv.h: We can have a generic softfloat/no-fenv version, but each arch
with hard float basically needs its own version.

float.h: Only 3 generic versions should need to exist: ld64, ld80, and
ld128(ieeequad).

io.h: Most archs can use a generic empty file.

ioctl.h: Varies highly but it may be possible to have generic versions
(perhaps one 32-bit and one 64-bit) for the clean archs to share.

ipc.h: Lots of trivial variations to account for kernel bugs in
type/padding/etc. Not sure if they can be unified.

limits.h: Varies by page size and 32/64-bit. Not sure if it makes
sense to have generic versions; the logic to pick which one would be
as large as the file. It would be nice to get the #ifdefs out of it
though.

mman.h: Seems to vary but differences may be mostly cosmetic; not
sure.

msg.h: Same deal as ipc.h.

poll.h: Empty except for mips; generic definitions are in top-level
poll.h now. With bits dedup we could move them to a generic bits file
so that top-level doesn't have a nasty #ifndef.

posix.h: Only 2 versions: ILP32 and LP64. They can be generic.

reg.h: Completely arch-specific except in the case of multiple logical
archs for the same ISA (x32).

resource.h: Same deal as poll.h.

sem.h: Same deal as ipc.h.

setjmp.h: Arch-specific, same as reg.h.

shm.h: Same deal as ipc.h.

signal.h: Arch-specific, and currently omits siginfo_t which is
gratuitously different on mips (and thus broken). Moving siginfo_t
into it would add A LOT of duplication and maintenance burden unless
we have an elaborate bits generation system that can piece these
headers together from multiple parts so the siginfo_t part can be
shared by all but mips.

socket.h: The main difference is that workarounds for bogus kernel
definitions of msghdr and cmsghdr are needed on 64-bit archs. A few
archs also have their own definitions of some constants which override
the top-level file's.

stat.h: It varies a lot on current archs, but in principle there's a
generic stat/stat64 that should be used for all new archs on the
kernel side, so perhaps we could have a generic one for that.

statfs.h: Mostly generic, but mips and x32 have quirks.

stdarg.h: Not even used except with ancient/broken compilers. Same on
all archs but i386 where the invalid legacy defs are provided.
Probably should be dropped entirely.

stdint.h: Purely a matter of 32 vs 64 bit, otherwise totally generic.

syscall.h: Arch-specific, except that new kernel archs should use the
generic one, which we can provide as a generic header.

termios.h: Generic except for wacky archs (mips and powerpc).

user.h: Highly arch-specific.

The good news is that there are not a lot of places where there's
value in doing anything elaborate with the deduplication. Just having
a fixed ordered list of include dirs to search while building, and
installation rules to pick the first matching one and install it in
$(includedir)/bits, would probably work.
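
The first-match rule described above can be sketched in C as follows.
This is only an illustration of the lookup order, not the actual build
logic (which would live in the Makefile); the function name pick_first
and the directory names in the usage note are hypothetical.

```c
#include <stdio.h>
#include <unistd.h>

/* Return the index of the first directory in dirs[] that contains a
 * readable file called name, or -1 if none does. On success the full
 * path is left in out. (Hypothetical helper, for illustration only.) */
int pick_first(const char *const *dirs, int ndirs,
               const char *name, char *out, size_t outlen)
{
	for (int i = 0; i < ndirs; i++) {
		int n = snprintf(out, outlen, "%s/%s", dirs[i], name);
		if (n < 0 || (size_t)n >= outlen)
			continue; /* path does not fit in the buffer; skip it */
		if (access(out, R_OK) == 0)
			return i; /* earlier entries win over later ones */
	}
	return -1;
}
```

With an ordered list such as { "arch/mips/bits", "arch/generic/bits" }
(assumed names), an arch only needs to carry the headers where it
differs; everything else falls through to the generic directory.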

It's possible that we could eliminate some bits headers entirely by
having features.h (via a new bits/features.h) expose some parameters
like endianness, ILP32-vs-LP64, etc. which the top-level headers could
then use to define things in a non-arch-specific way. I'm not sure
whether I like doing that though. It simplifies porting and header
maintenance work, but at the cost of some explicitness whereby you can
just open the header file (or the bits header file) and see how
something is defined right away.
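
As a rough sketch of what that could look like, assuming a parameter
macro named SK_LONG_BITS (hypothetical; nothing like it exists in musl
today) and assuming only ILP32 and LP64 need to be covered:

```c
#include <limits.h>

/* Hypothetical parameter that a bits/features.h might expose. It is
 * derived from <limits.h> here only so the sketch builds on any host;
 * a real bits file would simply state the value for its arch. */
#if LONG_MAX == 0x7fffffffffffffffL
#define SK_LONG_BITS 64
#else
#define SK_LONG_BITS 32
#endif

/* What a top-level header could then do without a per-arch bits file.
 * The sk_ prefix avoids clashing with the real <stdint.h> names. */
#if SK_LONG_BITS == 64
typedef long               sk_int64_t;
typedef unsigned long      sk_uint64_t;
typedef long               sk_intptr_t;
#else
typedef long long          sk_int64_t;
typedef unsigned long long sk_uint64_t;
typedef int                sk_intptr_t;
#endif
```

The cost mentioned above is visible even here: to learn what
sk_int64_t is, the reader has to evaluate the conditional rather than
just read one line.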

A possible compromise is to highly abstract these things at the musl
source level, but generate flat bits files to install, or even flatten
the headers completely to remove bits so that all definitions are
inline and explicit in the top-level headers.

Ideas/requests/preferences/etc.?

Rich

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: Bits deduplication: current situation
  2016-01-25  3:59 Bits deduplication: current situation Rich Felker
@ 2016-01-25  8:08 ` Natanael Copa
  2016-01-25 17:17   ` Rich Felker
  2016-01-25 10:46 ` Laurent Bercot
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Natanael Copa @ 2016-01-25  8:08 UTC (permalink / raw)
  To: Rich Felker; +Cc: musl

On Sun, 24 Jan 2016 22:59:25 -0500
Rich Felker <dalias@libc.org> wrote:

> I'm about to try starting the bits deduplication, but before getting
> started, I took a quick survey of the current bits headers we have:

...

> Ideas/requests/preferences/etc.?

It would be nice to be able to build 32-bit boot loaders on 64-bit
hosts with gcc -m32. Currently that does not work because it picks up
the 64-bit inttypes. We have a patch for xen's hvmloader:

http://git.alpinelinux.org/cgit/aports/tree/main/xen/musl-hvmloader-fix-stdint.patch
http://git.alpinelinux.org/cgit/aports/tree/main/xen/stdint_local.h

Introduced with this commit:
http://git.alpinelinux.org/cgit/aports/commit/main/xen/musl-hvmloader-fix-stdint.patch?id=bcf7b52774f1b0a3e405a207c3c4a5342b951f40


This is for stdint.h but I think it's related and I assume it affects
limits.h too.

-nc



* Re: Bits deduplication: current situation
  2016-01-25  3:59 Bits deduplication: current situation Rich Felker
  2016-01-25  8:08 ` Natanael Copa
@ 2016-01-25 10:46 ` Laurent Bercot
  2016-01-25 14:56 ` Ward Willats
  2016-01-25 19:22 ` Dan Gohman
  3 siblings, 0 replies; 14+ messages in thread
From: Laurent Bercot @ 2016-01-25 10:46 UTC (permalink / raw)
  To: musl

On 25/01/2016 04:59, Rich Felker wrote:
> A possible compromise is to highly abstract these things at the musl
> source level, but generate flat bits files to install, or even flatten
> the headers completely to remove bits so that all definitions are
> inline and explicit in the top-level headers.

  Whatever you choose to do, my position is that clarity of the source is
more important than clarity of the installed files. Maintenance effort
goes to the source, not to the installed files. Users who peek at
installed headers to know what's going on will be able to figure it out,
and if they're not, they can always grab the musl source.

  Independently from that, I find it nice, when faced with a tree of
headers, to be able to see at a glance what can be copied as is and
what has been generated (and thus cannot be safely modified or reused
elsewhere). So I'm in favor of a separate bits, no matter what the
files under it look like.

-- 
  Laurent


* Re: Bits deduplication: current situation
  2016-01-25  3:59 Bits deduplication: current situation Rich Felker
  2016-01-25  8:08 ` Natanael Copa
  2016-01-25 10:46 ` Laurent Bercot
@ 2016-01-25 14:56 ` Ward Willats
  2016-01-25 15:37   ` Szabolcs Nagy
  2016-01-25 19:22 ` Dan Gohman
  3 siblings, 1 reply; 14+ messages in thread
From: Ward Willats @ 2016-01-25 14:56 UTC (permalink / raw)
  To: musl

> On Jan 24, 2016, at 7:59 PM, Rich Felker <dalias@libc.org> wrote:
> 
> signal.h: Arch-specific, and currently omits siginfo_t which is
> gratuitously different on mips (and thus broken). Moving siginfo_t
> into it would add A LOT of duplication and maintenance burden unless
> we have an elaborate bits generation system that can piece these
> headers together from multiple parts so the siginfo_t part can be
> shared by all but mips.
> 

Just curious. On our OpenWRT-based MIPS platform where our app uses musl, we include <signal.h> (I believe from <somewhere>/staging_dir/toolchain-mipsel_24kec+dsp_gcc-4.8-linaro_musl-1.1.11/include/signal.h) and it defines a siginfo_t. But when we use it in a handler to catch faults (SEGV, ILL, BUS, FPE), the PC value of the faulting instruction is always non-existent or wrong, as is the errno. The fault subcode is also always zero.

I always figured this was a result of a bad build or bugs on our side, but reading this makes me wonder if the siginfo_t machinery on our MIPS platform is just not trustworthy in the first place? If so, can it be worked around?

Thanks,

-- Ward


* Re: Bits deduplication: current situation
  2016-01-25 14:56 ` Ward Willats
@ 2016-01-25 15:37   ` Szabolcs Nagy
  0 siblings, 0 replies; 14+ messages in thread
From: Szabolcs Nagy @ 2016-01-25 15:37 UTC (permalink / raw)
  To: musl

* Ward Willats <musl@wardco.com> [2016-01-25 06:56:26 -0800]:
> > On Jan 24, 2016, at 7:59 PM, Rich Felker <dalias@libc.org> wrote:
> > 
> > signal.h: Arch-specific, and currently omits siginfo_t which is
> > gratuitously different on mips (and thus broken). Moving siginfo_t
> > into it would add A LOT of duplication and maintenance burden unless
> > we have an elaborate bits generation system that can piece these
> > headers together from multiple parts so the siginfo_t part can be
> > shared by all but mips.
> > 
> 
> Just curious. On our OpenWRT-based MIPS platform where our app uses musl, we include <signal.h> (I believe from <somewhere>/staging_dir/toolchain-mipsel_24kec+dsp_gcc-4.8-linaro_musl-1.1.11/include/signal.h) and it defines a siginfo_t. But when we use it in a handler to catch faults (SEGV, ILL, BUS, FPE), the PC value of the faulting instruction is always non-existent or wrong, as is the errno. The fault subcode is also always zero.
> 
> I always figured this was a result of a bad build or bugs on our side, but reading this makes me wonder if the siginfo_t machinery on our MIPS platform is just not trustworthy in the first place? If so, can it be worked around?
> 

siginfo_t is broken in musl for mips, see this thread:
http://www.openwall.com/lists/musl/2015/12/10/3

si_code and si_errno are swapped in the struct and
some SI_* macros are wrong.

(this happens sometimes in musl because we don't use kernel
headers since they can cause various problems so musl
has to replicate all the kernel uapi quirks.

e.g. bionic uses autogenerated headers to solve this problem
https://android.googlesource.com/platform/bionic/+/master/libc/kernel/README.TXT
but those headers still have a lot of issues and the tools
for generating the headers are fairly complex

i think for musl some test tool to compare against linux
uapi would be better than autogen.)
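
a very small piece of such a test tool might look like the sketch
below. it only checks the member order discussed in this thread; the
function name is made up, and the expected order per arch is an
assumption taken from the description above (si_code and si_errno
swapped on mips), not from reading the kernel headers here.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Returns 1 if the libc's siginfo_t lays out its first three members
 * in the order the kernel is assumed to use on this arch, else 0. */
int siginfo_order_ok(void)
{
	size_t signo = offsetof(siginfo_t, si_signo);
	size_t err   = offsetof(siginfo_t, si_errno);
	size_t code  = offsetof(siginfo_t, si_code);
#if defined(__mips__)
	/* assumed mips order: si_signo, si_code, si_errno */
	return signo < code && code < err;
#else
	/* assumed order elsewhere: si_signo, si_errno, si_code */
	return signo < err && err < code;
#endif
}
```

a real tool would compare sizes, offsets and constants against the
linux uapi headers for every arch instead of hard-coding expectations.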



* Re: Bits deduplication: current situation
  2016-01-25  8:08 ` Natanael Copa
@ 2016-01-25 17:17   ` Rich Felker
  0 siblings, 0 replies; 14+ messages in thread
From: Rich Felker @ 2016-01-25 17:17 UTC (permalink / raw)
  To: musl

On Mon, Jan 25, 2016 at 09:08:03AM +0100, Natanael Copa wrote:
> On Sun, 24 Jan 2016 22:59:25 -0500
> Rich Felker <dalias@libc.org> wrote:
> 
> > I'm about to try starting the bits deduplication, but before getting
> > started, I took a quick survey of the current bits headers we have:
> 
> ....
> 
> > Ideas/requests/preferences/etc.?
> 
> It would be nice to be able to build 32-bit boot loaders on 64-bit
> hosts with gcc -m32. Currently that does not work because it picks up
> the 64-bit inttypes. We have a patch for xen's hvmloader:
> 
> http://git.alpinelinux.org/cgit/aports/tree/main/xen/musl-hvmloader-fix-stdint.patch
> http://git.alpinelinux.org/cgit/aports/tree/main/xen/stdint_local.h
> 
> Introduced with this commit:
> http://git.alpinelinux.org/cgit/aports/commit/main/xen/musl-hvmloader-fix-stdint.patch?id=bcf7b52774f1b0a3e405a207c3c4a5342b951f40
> 
> 
> This is for stdint.h but I think it's related and I assume it affects
> limits.h too.

I don't really see a good way to fix this. musl is not designed for
treating "related" 32- and 64-bit archs as if they were a common arch.
It would probably not be hard to make this one use case work in
practice, but it would be fragile and incomplete.

Is there a reason you can't just pass -nostdinc and then -I the gcc
include dir to use gcc's freestanding headers for a non-native target
like this? IMO the cleanest (albeit somewhat costlier) solution would
be just installing a proper i386 cross compiler.

Alternatively maybe gcc's -m32 could be fixed to use completely
different include paths rather than trying to use the same headers for
different archs. This actually affects third-party installed headers
that are generated for the target arch too, which may be wrongly
picked up if -m32 is used.

Rich


* Re: Bits deduplication: current situation
  2016-01-25  3:59 Bits deduplication: current situation Rich Felker
                   ` (2 preceding siblings ...)
  2016-01-25 14:56 ` Ward Willats
@ 2016-01-25 19:22 ` Dan Gohman
  2016-01-25 21:00   ` Rich Felker
  3 siblings, 1 reply; 14+ messages in thread
From: Dan Gohman @ 2016-01-25 19:22 UTC (permalink / raw)
  To: musl



Concerning stdint.h, there are a few details beyond just 32-bit vs 64-bit.
For example, int64_t can be either "long" or "long long" on an LP64 target.
The difference usually doesn't matter, but there are things which end up
noticing, like C++ name mangling and C format-string checking.
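
To illustrate the format-string side, here is a small C11 sketch. The
names INT64_KIND and format_i64 are made up for this example; the
point is only that "long" and "long long" remain distinct types even
when both are 64 bits wide, and that PRId64 follows whichever one the
libc picked.

```c
#include <inttypes.h>
#include <stdio.h>

/* Name the underlying type of int64_t as this libc defines it. */
#define INT64_KIND _Generic((int64_t)0, \
	long: "long", long long: "long long", default: "other")

/* Format v portably. PRId64 expands to the conversion matching the
 * libc's choice, so format checking passes with either definition,
 * whereas a hard-coded "%ld" or "%lld" would warn on one of them. */
int format_i64(char *buf, size_t len, int64_t v)
{
	return snprintf(buf, len, "%" PRId64, v);
}
```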

GCC >= 4.5 and clang predefine macros providing almost everything stdint.h
(and inttypes.h) needs. For example, see the attached file. Would you be
interested in a patch which refactors stdint.h to use this approach by
default, with a mechanism to support older compilers if needed?

Dan



[-- Attachment #2: stdint-generic.h --]
[-- Type: text/x-chdr, Size: 2322 bytes --]

typedef __INT8_TYPE__ int8_t;
typedef __INT16_TYPE__ int16_t;
typedef __INT32_TYPE__ int32_t;
typedef __INT64_TYPE__ int64_t;
typedef __UINT8_TYPE__ uint8_t;
typedef __UINT16_TYPE__ uint16_t;
typedef __UINT32_TYPE__ uint32_t;
typedef __UINT64_TYPE__ uint64_t;

typedef __INT_FAST8_TYPE__ int_fast8_t;
typedef __INT_FAST16_TYPE__ int_fast16_t;
typedef __INT_FAST32_TYPE__ int_fast32_t;
typedef __INT_FAST64_TYPE__ int_fast64_t;
typedef __UINT_FAST8_TYPE__ uint_fast8_t;
typedef __UINT_FAST16_TYPE__ uint_fast16_t;
typedef __UINT_FAST32_TYPE__ uint_fast32_t;
typedef __UINT_FAST64_TYPE__ uint_fast64_t;

/* limits of the fast types, from the compiler's predefined macros */
#define INT_FAST8_MIN  (-__INT_FAST8_MAX__ - 1)
#define INT_FAST16_MIN  (-__INT_FAST16_MAX__ - 1)
#define INT_FAST32_MIN  (-__INT_FAST32_MAX__ - 1)
#define INT_FAST64_MIN  (-__INT_FAST64_MAX__ - 1)

#define INT_FAST8_MAX  __INT_FAST8_MAX__
#define INT_FAST16_MAX  __INT_FAST16_MAX__
#define INT_FAST32_MAX  __INT_FAST32_MAX__
#define INT_FAST64_MAX  __INT_FAST64_MAX__

#define UINT_FAST8_MAX __UINT_FAST8_MAX__
#define UINT_FAST16_MAX __UINT_FAST16_MAX__
#define UINT_FAST32_MAX __UINT_FAST32_MAX__
#define UINT_FAST64_MAX __UINT_FAST64_MAX__

typedef __INT_LEAST8_TYPE__ int_least8_t;
typedef __INT_LEAST16_TYPE__ int_least16_t;
typedef __INT_LEAST32_TYPE__ int_least32_t;
typedef __INT_LEAST64_TYPE__ int_least64_t;
typedef __UINT_LEAST8_TYPE__ uint_least8_t;
typedef __UINT_LEAST16_TYPE__ uint_least16_t;
typedef __UINT_LEAST32_TYPE__ uint_least32_t;
typedef __UINT_LEAST64_TYPE__ uint_least64_t;

/* limits of the least types, from the compiler's predefined macros */
#define INT_LEAST8_MIN  (-__INT_LEAST8_MAX__ - 1)
#define INT_LEAST16_MIN  (-__INT_LEAST16_MAX__ - 1)
#define INT_LEAST32_MIN  (-__INT_LEAST32_MAX__ - 1)
#define INT_LEAST64_MIN  (-__INT_LEAST64_MAX__ - 1)

#define INT_LEAST8_MAX  __INT_LEAST8_MAX__
#define INT_LEAST16_MAX  __INT_LEAST16_MAX__
#define INT_LEAST32_MAX  __INT_LEAST32_MAX__
#define INT_LEAST64_MAX  __INT_LEAST64_MAX__

#define UINT_LEAST8_MAX __UINT_LEAST8_MAX__
#define UINT_LEAST16_MAX __UINT_LEAST16_MAX__
#define UINT_LEAST32_MAX __UINT_LEAST32_MAX__
#define UINT_LEAST64_MAX __UINT_LEAST64_MAX__

#define INTPTR_MIN      (-__INTPTR_MAX__ - 1)
#define INTPTR_MAX      __INTPTR_MAX__
#define UINTPTR_MAX     __UINTPTR_MAX__
#define PTRDIFF_MIN     (-__PTRDIFF_MAX__ - 1)
#define PTRDIFF_MAX     __PTRDIFF_MAX__
#define SIZE_MAX        __SIZE_MAX__


* Re: Bits deduplication: current situation
  2016-01-25 19:22 ` Dan Gohman
@ 2016-01-25 21:00   ` Rich Felker
  2016-01-25 21:32     ` Szabolcs Nagy
  0 siblings, 1 reply; 14+ messages in thread
From: Rich Felker @ 2016-01-25 21:00 UTC (permalink / raw)
  To: musl

On Mon, Jan 25, 2016 at 11:22:13AM -0800, Dan Gohman wrote:
> Concerning stdint.h, there are a few details beyond just 32-bit vs 64-bit.
> For example, int64_t can be either "long" or "long long" on an LP64 target.
> The difference usually doesn't matter, but there are things which end up
> noticing, like C++ name mangling and C format-string checking.

I'm pretty sure int64_t is long on all LP64 targets we support. Are
there others that differ?

> GCC >= 4.5 and clang predefine macros providing almost everything stdint.h
> (and inttypes.h) needs. For example, see the attached file. Would you be
> interested in a patch which refactors stdint.h to use this approach by
> default, with a mechanism to support older compilers if needed?

No, the intent is that the public headers be compatible with basically
all compilers honoring the ABI, not just gcc and compatible ones.
There are a very small number of things (documented in the outdated
manual) that need extensions in the public headers, mainly _Complex_I,
tgmath.h, and stdarg.h, and in those cases we use the conventions that
gcc and other existing compilers have created.

Also it's musl's intent to be explicit with definitions, and this is
actually helpful with the C++ types issue. IMO it's much better to get
an error that a new compiler you're trying has the ABI wrong than to
silently use different types.

Rich

P.S. Could you follow up replies below the quoted text (if any) when
replying to the list rather than top-posting?






* Re: Bits deduplication: current situation
  2016-01-25 21:00   ` Rich Felker
@ 2016-01-25 21:32     ` Szabolcs Nagy
  2016-01-26  5:03       ` Dan Gohman
  0 siblings, 1 reply; 14+ messages in thread
From: Szabolcs Nagy @ 2016-01-25 21:32 UTC (permalink / raw)
  To: musl

* Rich Felker <dalias@libc.org> [2016-01-25 16:00:05 -0500]:
> On Mon, Jan 25, 2016 at 11:22:13AM -0800, Dan Gohman wrote:
> > Concerning stdint.h, there are a few details beyond just 32-bit vs 64-bit.
> > For example, int64_t can be either "long" or "long long" on an LP64 target.
> > The difference usually doesn't matter, but there are things which end up
> > noticing, like C++ name mangling and C format-string checking.
> 
> I'm pretty sure int64_t is long on all LP64 targets we support. Are
> there others that differ?
> 

the convention is to use the smallest rank integer type with the
right range.

there may be other issues, but in general a c compiler does not
need to know these typedefs.

> > GCC >= 4.5 and clang predefine macros providing almost everything stdint.h
> > (and inttypes.h) needs. For example, see the attached file. Would you be
> > interested in a patch which refactors stdint.h to use this approach by
> > default, with a mechanism to support older compilers if needed?
> 
> No, the intent is that the public headers be compatible with basically
> all compilers honoring the ABI, not just gcc and compatible ones.
> There are a very small number of things (documented in the outdated
> manual) that need extensions in the public headers, mainly _Complex_I,
> tgmath.h, and stdarg.h, and in those cases we use the conventions that
> gcc and other existing compilers have created.
> 
> Also it's musl's intent to be explicit with definitions, and this is
> actually helpful with the C++ types issue. IMO it's much better to get
> an error that a new compiler you're trying has the ABI wrong than to
> silently use different types.

note that the patch is wrong for all released versions of gcc (<=5)
because the *fast types are different on musl vs glibc on 64bit arches.
(fwiw newlib defines these types in yet another way)

this is not visible in the libc abi but matters for third-party
code compiled against musl headers and those should be abi
compat no matter what compiler you used.

(with gcc the difference matters if you use the gcc provided stdatomic.h
or use the gfortran c ffi, but then you probably built a gcc
with musl support anyway and then the types are consistent.)
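
[Editor's sketch] The two points above (int64_t being long or long long, and the *fast types differing between libcs) can be observed directly with C11 _Generic. This is only an illustration: the names TYPE_NAME, int64_underlying, fast16_underlying and fast32_underlying are hypothetical and appear neither in musl nor in the proposed patch.

```c
#include <stdint.h>

/* Map an expression's type to a printable name (C11 _Generic).
   "?" means the type is none of the standard signed integer types
   listed here. */
#define TYPE_NAME(x) _Generic((x), \
    signed char: "signed char",    \
    short:       "short",          \
    int:         "int",            \
    long:        "long",           \
    long long:   "long long",      \
    default:     "?")

/* Which standard type does the libc header use for each typedef? */
const char *int64_underlying(void)  { return TYPE_NAME((int64_t)0); }
const char *fast16_underlying(void) { return TYPE_NAME((int_fast16_t)0); }
const char *fast32_underlying(void) { return TYPE_NAME((int_fast32_t)0); }
```

Printing the three results when building against different libcs on the same 64-bit arch should show where the fast types diverge. C++ name mangling and printf format checking see exactly this underlying type, which is why the choice leaks into third-party code.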



* Re: Bits deduplication: current situation
  2016-01-25 21:32     ` Szabolcs Nagy
@ 2016-01-26  5:03       ` Dan Gohman
  2016-01-26 10:18         ` Szabolcs Nagy
  2016-01-26 20:17         ` Rich Felker
  0 siblings, 2 replies; 14+ messages in thread
From: Dan Gohman @ 2016-01-26  5:03 UTC (permalink / raw)
  To: musl


On Mon, Jan 25, 2016 at 1:32 PM, Szabolcs Nagy <nsz@port70.net> wrote:

> * Rich Felker <dalias@libc.org> [2016-01-25 16:00:05 -0500]:
> > On Mon, Jan 25, 2016 at 11:22:13AM -0800, Dan Gohman wrote:
> > > Concerning stdint.h, there are a few details beyond just 32-bit vs 64-bit.
> > > For example, int64_t can be either "long" or "long long" on an LP64 target.
> > > The difference usually doesn't matter, but there are things which end up
> > > noticing, like C++ name mangling and C format-string checking.
> >
> > I'm pretty sure int64_t is long on all LP64 targets we support. Are
> > there others that differ?
>

I'm working on an architecture which does, though there's no musl support
for it currently.

> note that the patch is wrong for all released versions of gcc (<=5)
> because the *fast types are different on musl vs glibc on 64bit arches.
> (fwiw newlib defines these types in yet another way)
>

> this is not visible in the libc abi but matters for third-party
> code compiled against musl headers and those should be abi
> compat no matter what compiler you used.
>
> (with gcc the difference matters if you use the gcc provided stdatomic.h
> or use the gfortran c ffi, but then you probably built a gcc
> with musl support anyway and then the types are consistent.)
>

Ah, I was unaware that musl and glibc differ here. I agree that that
complicates the patch I had envisioned, so I'll drop the idea for now.

Thanks,

Dan



* Re: Bits deduplication: current situation
  2016-01-26  5:03       ` Dan Gohman
@ 2016-01-26 10:18         ` Szabolcs Nagy
  2016-01-26 15:16           ` Dan Gohman
  2016-01-26 20:17         ` Rich Felker
  1 sibling, 1 reply; 14+ messages in thread
From: Szabolcs Nagy @ 2016-01-26 10:18 UTC (permalink / raw)
  To: musl

* Dan Gohman <sunfish@mozilla.com> [2016-01-25 21:03:54 -0800]:
> On Mon, Jan 25, 2016 at 1:32 PM, Szabolcs Nagy <nsz@port70.net> wrote:
> > * Rich Felker <dalias@libc.org> [2016-01-25 16:00:05 -0500]:
> > >
> > > I'm pretty sure int64_t is long on all LP64 targets we support. Are
> > > there others that differ?
> >
> 
> I'm working on an architecture which does, though there's no musl support
> for it currently.
> 

in gcc stdint.h only depends on libc/os and sizeof(long),
not on architecture.

(e.g. openbsd uses long long, glibc uses long consistently
for all LP64 arch abis.)
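
[Editor's sketch] Because the type behind int64_t depends on the libc/OS as described above, portable code cannot hard-code "%ld" or "%lld" without tripping format-string checking on one side or the other. A minimal sketch using the standard PRId64 macro from <inttypes.h> follows; the helper name format_i64 is hypothetical.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Format v as decimal into buf (capacity n). PRId64 expands to the
   conversion specifier matching whatever type the libc chose for
   int64_t, so the call passes format checking either way.
   Returns snprintf's result: the number of characters the full
   output needs, excluding the terminating NUL. */
int format_i64(char *buf, size_t n, int64_t v)
{
    return snprintf(buf, n, "%" PRId64, v);
}
```

The same source then compiles without format warnings whether int64_t is long or long long.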



* Re: Bits deduplication: current situation
  2016-01-26 10:18         ` Szabolcs Nagy
@ 2016-01-26 15:16           ` Dan Gohman
  2016-01-26 20:26             ` Szabolcs Nagy
  0 siblings, 1 reply; 14+ messages in thread
From: Dan Gohman @ 2016-01-26 15:16 UTC (permalink / raw)
  To: musl


On Tue, Jan 26, 2016 at 2:18 AM, Szabolcs Nagy <nsz@port70.net> wrote:

> * Dan Gohman <sunfish@mozilla.com> [2016-01-25 21:03:54 -0800]:
> > On Mon, Jan 25, 2016 at 1:32 PM, Szabolcs Nagy <nsz@port70.net> wrote:
> > > * Rich Felker <dalias@libc.org> [2016-01-25 16:00:05 -0500]:
> > > >
> > > > I'm pretty sure int64_t is long on all LP64 targets we support. Are
> > > > there others that differ?
> > >
> >
> > I'm working on an architecture which does, though there's no musl support
> > for it currently.
> >
>
> in gcc stdint.h only depends on libc/os and sizeof(long),
> not on architecture.
>
> (e.g. openbsd uses long long, glibc uses long consistently
> for all LP64 arch abis.)
>

I've been assuming that, in the absence of compatibility constraints (for
example on a new architecture), it would be reasonable for hypothetical new
musl, glibc, or newlib ports to arrange to be ABI compatible at the level
of a freestanding implementation (in the C standard sense), which would
include <stdint.h>. Is this an incorrect assumption, from your perspective?

Dan



* Re: Bits deduplication: current situation
  2016-01-26  5:03       ` Dan Gohman
  2016-01-26 10:18         ` Szabolcs Nagy
@ 2016-01-26 20:17         ` Rich Felker
  1 sibling, 0 replies; 14+ messages in thread
From: Rich Felker @ 2016-01-26 20:17 UTC (permalink / raw)
  To: musl

On Mon, Jan 25, 2016 at 09:03:54PM -0800, Dan Gohman wrote:
> On Mon, Jan 25, 2016 at 1:32 PM, Szabolcs Nagy <nsz@port70.net> wrote:
> 
> > * Rich Felker <dalias@libc.org> [2016-01-25 16:00:05 -0500]:
> > > On Mon, Jan 25, 2016 at 11:22:13AM -0800, Dan Gohman wrote:
> > > > Concerning stdint.h, there are a few details beyond just 32-bit vs 64-bit.
> > > > For example, int64_t can be either "long" or "long long" on an LP64 target.
> > > > The difference usually doesn't matter, but there are things which end up
> > > > noticing, like C++ name mangling and C format-string checking.
> > >
> > > I'm pretty sure int64_t is long on all LP64 targets we support. Are
> > > there others that differ?
> 
> I'm working on an architecture which does, though there's no musl support
> for it currently.

Is this for Linux to run on, or for non-Linux use? I'm pretty sure GCC
wants the above policy for LP64 to be followed on all Linux targets,
and generally considers this part of the general ABI descended from
sysv. Doing it differently is surely a gratuitous incompatibility and
pain for implementations.

Rich



* Re: Bits deduplication: current situation
  2016-01-26 15:16           ` Dan Gohman
@ 2016-01-26 20:26             ` Szabolcs Nagy
  0 siblings, 0 replies; 14+ messages in thread
From: Szabolcs Nagy @ 2016-01-26 20:26 UTC (permalink / raw)
  To: musl

* Dan Gohman <sunfish@mozilla.com> [2016-01-26 07:16:08 -0800]:
> On Tue, Jan 26, 2016 at 2:18 AM, Szabolcs Nagy <nsz@port70.net> wrote:
> > * Dan Gohman <sunfish@mozilla.com> [2016-01-25 21:03:54 -0800]:
> > > On Mon, Jan 25, 2016 at 1:32 PM, Szabolcs Nagy <nsz@port70.net> wrote:
> > > > * Rich Felker <dalias@libc.org> [2016-01-25 16:00:05 -0500]:
> > > > >
> > > > > I'm pretty sure int64_t is long on all LP64 targets we support. Are
> > > > > there others that differ?
> > > >
> > >
> > > I'm working on an architecture which does, though there's no musl support
> > > for it currently.
> > >
> >
> > in gcc stdint.h only depends on libc/os and sizeof(long),
> > not on architecture.
> >
> > (e.g. openbsd uses long long, glibc uses long consistently
> > for all LP64 arch abis.)
> >
> 
> I've been assuming that, in the absence of compatibility constraints (for
> example on a new architecture), it would be reasonable for hypothetical new
> musl, glibc, or newlib ports to arrange to be ABI compatible at the level
> of a freestanding implementation (in the C standard sense), which would
> include <stdint.h>. Is this an incorrect assumption, from your perspective?

it is correct in principle, but it means a bit more toolchain
work to support an inconsistent arch and it can bite you if
you work with historical code with invalid assumptions.

gcc/musl/glibc/linux all use consistent typedefs for all
64bit archs, most likely other projects do the same.
in most cases it should be easy to do the typedefs
differently for a new arch, but there might be caveats.

i think if you want to design a freestanding c language abi
for your arch then it makes sense to follow what's already
there (unless you have some specific reason to deviate).
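
[Editor's sketch] One way to notice when a libc's explicit typedefs and the compiler's idea of the ABI disagree is a compile-time check against the type macros that GCC >= 4.5 and clang predefine (__INT64_TYPE__, __INTPTR_TYPE__). This is an assumption-laden illustration, not something musl does; SAME_TYPE, HAVE_COMPILER_TYPES and abi_types_checked are hypothetical names.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* 1 if the header typedef and the compiler's predefined type macro
   name the same type, 0 otherwise. */
#define SAME_TYPE(typedef_name, builtin) \
    _Generic((typedef_name)0, builtin: 1, default: 0)

#if defined(__INT64_TYPE__) && defined(__INTPTR_TYPE__)
/* A mismatch is a hard error instead of a silent ABI difference. */
static_assert(SAME_TYPE(int64_t, __INT64_TYPE__),
              "libc int64_t differs from the compiler's __INT64_TYPE__");
static_assert(SAME_TYPE(intptr_t, __INTPTR_TYPE__),
              "libc intptr_t differs from the compiler's __INTPTR_TYPE__");
#define HAVE_COMPILER_TYPES 1
#else
/* Compiler does not predefine these macros; nothing to compare. */
#define HAVE_COMPILER_TYPES 0
#endif

/* Reports whether the comparison above was actually performed. */
int abi_types_checked(void) { return HAVE_COMPILER_TYPES; }
```

A check like this belongs in a test or a port's bring-up code rather than in public headers, since the headers are meant to work with compilers that do not provide these macros.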



