When building with -fshort-wchar the definition of wchar_t is
incorrect. Get the correct definition from the compiler if available.

This is useful when reusing the freestanding parts of musl on a
bare-metal target that uses -fshort-wchar.
---
 arch/arm/bits/alltypes.h.in | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm/bits/alltypes.h.in b/arch/arm/bits/alltypes.h.in
index d62bd7bd..9596466b 100644
--- a/arch/arm/bits/alltypes.h.in
+++ b/arch/arm/bits/alltypes.h.in
@@ -12,8 +12,12 @@
 #define __LONG_MAX 0x7fffffffL
 
 #ifndef __cplusplus
+#ifdef __WCHAR_TYPE__
+TYPEDEF __WCHAR_TYPE__ wchar_t;
+#else
 TYPEDEF unsigned wchar_t;
 #endif
+#endif
 
 TYPEDEF float float_t;
 TYPEDEF double double_t;
-- 
2.39.1.519.gcb327c4b5f-goog
On Sat Feb 4, 2023 at 7:30 AM CET, Peter Collingbourne wrote:
> When building with -fshort-wchar the definition of wchar_t is
> incorrect. Get the correct definition from the compiler if available.
>
> This is useful when reusing the freestanding parts of musl on a
> bare-metal target that uses -fshort-wchar.

somebody talked about this in 2015, see
https://www.openwall.com/lists/musl/2015/02/18/2
for the previous discussion.

i understand in this case it's proposed a little different-
"reusing freestanding parts" as opposed to building a whole libc.so, but in
that case you could most likely patch this in when reusing it standalone only?

it doesn't seem a good idea for it to be there, in general.

> ---
>  arch/arm/bits/alltypes.h.in | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/arm/bits/alltypes.h.in b/arch/arm/bits/alltypes.h.in
> index d62bd7bd..9596466b 100644
> --- a/arch/arm/bits/alltypes.h.in
> +++ b/arch/arm/bits/alltypes.h.in
> @@ -12,8 +12,12 @@
>  #define __LONG_MAX 0x7fffffffL
>
>  #ifndef __cplusplus
> +#ifdef __WCHAR_TYPE__
> +TYPEDEF __WCHAR_TYPE__ wchar_t;
> +#else
>  TYPEDEF unsigned wchar_t;
>  #endif
> +#endif
>
>  TYPEDEF float float_t;
>  TYPEDEF double double_t;
> --
> 2.39.1.519.gcb327c4b5f-goog
On Sat, Feb 04, 2023 at 08:08:36AM +0100, alice wrote:
> On Sat Feb 4, 2023 at 7:30 AM CET, Peter Collingbourne wrote:
> > When building with -fshort-wchar the definition of wchar_t is
> > incorrect. Get the correct definition from the compiler if available.
> >
> > This is useful when reusing the freestanding parts of musl on a
> > bare-metal target that uses -fshort-wchar.
>
> somebody talked about this in 2015, see
> https://www.openwall.com/lists/musl/2015/02/18/2
> for the previous discussion.
>
> i understand in this case it's proposed a little different-
> "reusing freestanding parts" as opposed to building a whole libc.so, but in
> that case you could most likely patch this in when reusing it standalone only?
>
> it doesn't seem a good idea for it to be there, in general.

Seconded. A lot of code in musl depends on wchar_t being able to hold
the current maximum Unicode codepoint of 0x10FFFF at least, so the type
must be at least 21 bits.

Ciao,
Markus
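
The 21-bit figure above can be checked with a short sketch. The names
bits_needed, wchar_holds_all_unicode and UNICODE_MAX are made up for
illustration and are not taken from musl; the code only assumes a hosted
C compiler that provides WCHAR_MAX in <wchar.h>.

```c
#include <stdint.h>
#include <wchar.h>

/* Highest Unicode code point, as quoted in the message above. */
#define UNICODE_MAX 0x10FFFFUL

/* Number of value bits needed to represent `value` (0 needs 0 bits). */
unsigned bits_needed(unsigned long value)
{
    unsigned bits = 0;
    while (value != 0) {
        bits++;
        value >>= 1;
    }
    return bits;
}

/* Nonzero if wchar_t on this target can hold every Unicode code point.
 * The result depends on the target ABI and on flags such as
 * -fshort-wchar. */
int wchar_holds_all_unicode(void)
{
    return (unsigned long)WCHAR_MAX >= UNICODE_MAX;
}
```

bits_needed(UNICODE_MAX) gives 21, while a 16-bit wchar_t tops out at
0xFFFF, so wchar_holds_all_unicode() would be expected to return 0 when
the same file is built with -fshort-wchar.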
On Sun, Feb 05, 2023 at 09:00:03PM +0100, Markus Wichmann wrote:
> On Sat, Feb 04, 2023 at 08:08:36AM +0100, alice wrote:
> > On Sat Feb 4, 2023 at 7:30 AM CET, Peter Collingbourne wrote:
> > > When building with -fshort-wchar the definition of wchar_t is
> > > incorrect. Get the correct definition from the compiler if available.
> > >
> > > This is useful when reusing the freestanding parts of musl on a
> > > bare-metal target that uses -fshort-wchar.
> >
> > somebody talked about this in 2015, see
> > https://www.openwall.com/lists/musl/2015/02/18/2
> > for the previous discussion.
> >
> > i understand in this case it's proposed a little different-
> > "reusing freestanding parts" as opposed to building a whole libc.so, but in
> > that case you could most likely patch this in when reusing it standalone only?
> >
> > it doesn't seem a good idea for it to be there, in general.
>
> Seconded. A lot of code in musl depends on wchar_t being able to hold
> the current maximum Unicode codepoint of 0x10FFFF at least, so the type
> must be at least 21 bits.

Absolutely. -fshort-wchar requests a different ABI that is
fundamentally incompatible with libc and with use of the libc headers,
and also fundamentally incompatible with Unicode and the requirements
of the C language (unless you only want to support the BMP) -- C does
not allow "multi-wchar_t characters".

If you're targeting freestanding environment not using libc, you
should use -nostdinc and provide headers suitable to your environment
instead of the libc ones. But really you should fix the offending code
not to use wchar_t for UTF-16, and not use -fshort-wchar. Modern C has
a char16_t type for this purpose.

Rich
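
As a rough illustration of the char16_t suggestion above, here is a
minimal sketch of UTF-16 encoding that leaves wchar_t alone. The struct
utf16_seq and the function utf16_encode are hypothetical names invented
for this example; the code assumes a C11 toolchain whose headers provide
<uchar.h>.

```c
#include <stddef.h>
#include <stdint.h>
#include <uchar.h>

/* Hypothetical helper type: one code point encoded as UTF-16. */
struct utf16_seq {
    char16_t unit[2]; /* unit[1] is only meaningful when len == 2 */
    size_t len;       /* 1 or 2 code units, 0 if the input was invalid */
};

/* Encode a single Unicode code point as UTF-16 code units. */
struct utf16_seq utf16_encode(uint_least32_t cp)
{
    struct utf16_seq seq = { { 0, 0 }, 0 };

    /* Reject values above U+10FFFF and the surrogate range itself. */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return seq;

    if (cp < 0x10000) {
        /* BMP code point: a single code unit. */
        seq.unit[0] = (char16_t)cp;
        seq.len = 1;
        return seq;
    }

    /* Supplementary plane: split the 20-bit offset into a surrogate pair. */
    cp -= 0x10000;
    seq.unit[0] = (char16_t)(0xD800 + (cp >> 10));
    seq.unit[1] = (char16_t)(0xDC00 + (cp & 0x3FF));
    seq.len = 2;
    return seq;
}
```

Code points outside the BMP take two char16_t units here, which is
exactly the "multi-unit character" situation that C does not allow for
wchar_t.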
On Sun, Feb 5, 2023 at 3:49 PM Rich Felker <dalias@libc.org> wrote:
>
> On Sun, Feb 05, 2023 at 09:00:03PM +0100, Markus Wichmann wrote:
> > On Sat, Feb 04, 2023 at 08:08:36AM +0100, alice wrote:
> > > On Sat Feb 4, 2023 at 7:30 AM CET, Peter Collingbourne wrote:
> > > > When building with -fshort-wchar the definition of wchar_t is
> > > > incorrect. Get the correct definition from the compiler if available.
> > > >
> > > > This is useful when reusing the freestanding parts of musl on a
> > > > bare-metal target that uses -fshort-wchar.
> > >
> > > somebody talked about this in 2015, see
> > > https://www.openwall.com/lists/musl/2015/02/18/2
> > > for the previous discussion.
> > >
> > > i understand in this case it's proposed a little different-
> > > "reusing freestanding parts" as opposed to building a whole libc.so, but in
> > > that case you could most likely patch this in when reusing it standalone only?
> > >
> > > it doesn't seem a good idea for it to be there, in general.
> >
> > Seconded. A lot of code in musl depends on wchar_t being able to hold
> > the current maximum Unicode codepoint of 0x10FFFF at least, so the type
> > must be at least 21 bits.
>
> Absolutely. -fshort-wchar requests a different ABI that is
> fundamentally incompatible with libc and with use of the libc headers,
> and also fundamentally incompatible with Unicode and the requirements
> of the C language (unless you only want to support the BMP) -- C does
> not allow "multi-wchar_t characters".
>
> If you're targeting freestanding environment not using libc, you
> should use -nostdinc and provide headers suitable to your environment
> instead of the libc ones. But really you should fix the offending code
> not to use wchar_t for UTF-16, and not use -fshort-wchar. Modern C has
> a char16_t type for this purpose.

Thanks, I agree with this and the other replies that I got. It did
seem at first that musl could be used unmodified in projects that
build with -fshort-wchar, but given the implications of a UTF-16
wchar_t for the code that implements <wchar.h>, it makes more sense
for this flag to be unsupported by musl and for any utilizing projects
to be fixed to not require -fshort-wchar.

Currently we accidentally "support" -fshort-wchar on architectures
that happen to use __WCHAR_TYPE__ to define wchar_t. Would it make
sense to add something like a static assert to alltypes.h that checks
that sizeof(wchar_t) >= 4?

Peter
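
For reference, a check of the kind asked about above might look roughly
like the following. This is only a sketch of the idea and not anything
present in musl: it assumes a C11 compiler for _Static_assert, and
wchar_is_wide_enough is a made-up helper that mirrors the condition at
run time.

```c
#include <stddef.h> /* wchar_t */

/* Hypothetical compile-time guard: refuse to build when wchar_t is
 * narrower than 4 bytes, as it would be under -fshort-wchar. */
_Static_assert(sizeof(wchar_t) >= 4,
               "wchar_t is narrower than 4 bytes (built with -fshort-wchar?)");

/* Run-time mirror of the same condition, so it can be exercised in a
 * test on a target where the assertion above passes. */
int wchar_is_wide_enough(void)
{
    return sizeof(wchar_t) >= 4;
}
```

On a target where wchar_t is 2 bytes, the translation unit would fail to
compile with the message given in the assertion.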
On Mon, Feb 06, 2023 at 05:15:08PM -0800, Peter Collingbourne wrote:
> On Sun, Feb 5, 2023 at 3:49 PM Rich Felker <dalias@libc.org> wrote:
> >
> > On Sun, Feb 05, 2023 at 09:00:03PM +0100, Markus Wichmann wrote:
> > > On Sat, Feb 04, 2023 at 08:08:36AM +0100, alice wrote:
> > > > On Sat Feb 4, 2023 at 7:30 AM CET, Peter Collingbourne wrote:
> > > > > When building with -fshort-wchar the definition of wchar_t is
> > > > > incorrect. Get the correct definition from the compiler if available.
> > > > >
> > > > > This is useful when reusing the freestanding parts of musl on a
> > > > > bare-metal target that uses -fshort-wchar.
> > > >
> > > > somebody talked about this in 2015, see
> > > > https://www.openwall.com/lists/musl/2015/02/18/2
> > > > for the previous discussion.
> > > >
> > > > i understand in this case it's proposed a little different-
> > > > "reusing freestanding parts" as opposed to building a whole libc.so, but in
> > > > that case you could most likely patch this in when reusing it standalone only?
> > > >
> > > > it doesn't seem a good idea for it to be there, in general.
> > >
> > > Seconded. A lot of code in musl depends on wchar_t being able to hold
> > > the current maximum Unicode codepoint of 0x10FFFF at least, so the type
> > > must be at least 21 bits.
> >
> > Absolutely. -fshort-wchar requests a different ABI that is
> > fundamentally incompatible with libc and with use of the libc headers,
> > and also fundamentally incompatible with Unicode and the requirements
> > of the C language (unless you only want to support the BMP) -- C does
> > not allow "multi-wchar_t characters".
> >
> > If you're targeting freestanding environment not using libc, you
> > should use -nostdinc and provide headers suitable to your environment
> > instead of the libc ones. But really you should fix the offending code
> > not to use wchar_t for UTF-16, and not use -fshort-wchar. Modern C has
> > a char16_t type for this purpose.
>
> Thanks, I agree with this and the other replies that I got. It did
> seem at first that musl could be used unmodified in projects that
> build with -fshort-wchar, but given the implications of a UTF-16
> wchar_t for the code that implements <wchar.h>, it makes more sense
> for this flag to be unsupported by musl and for any utilizing projects
> to be fixed to not require -fshort-wchar.
>
> Currently we accidentally "support" -fshort-wchar on architectures
> that happen to use __WCHAR_TYPE__ to define wchar_t. Would it make
> sense to add something like a static assert to alltypes.h that checks
> that sizeof(wchar_t) >= 4?

If you count target-specific options, GCC probably has hundreds of
options that produce incompatible/broken ABIs. We certainly don't have
the means to trap all or even most of them. In the case of most,
including -fshort-wchar, GCC documents this:

    "Warning: the -fshort-wchar switch causes GCC to generate code
    that is not binary compatible with code generated without that
    switch. Use it to conform to a non-default application binary
    interface."

so I don't really think any action is needed.

Rich