mailing list of musl libc
* type of wchar_t
@ 2012-11-15  9:09 Yuri Kozlov
  2012-11-15 11:53 ` Szabolcs Nagy
  0 siblings, 1 reply; 4+ messages in thread
From: Yuri Kozlov @ 2012-11-15  9:09 UTC (permalink / raw)
  To: musl

Hello.

arch/x86_64/bits/alltypes.h.sh
#ifndef __cplusplus
TYPEDEF int wchar_t;
#endif


arch/i386/bits/alltypes.h.sh
#ifndef __cplusplus
#ifdef __WCHAR_TYPE__
TYPEDEF __WCHAR_TYPE__ wchar_t;
#else
TYPEDEF long wchar_t;
#endif
#endif

(__WCHAR_TYPE__ is not defined everywhere, hence the fallback TYPEDEF long wchar_t;)

arch/arm/bits/alltypes.h.sh
#ifndef __cplusplus
TYPEDEF unsigned wchar_t;
#endif


Why does the type of wchar_t differ so much?

-- 
Best Regards,
Yuri Kozlov




* Re: type of wchar_t
  2012-11-15  9:09 type of wchar_t Yuri Kozlov
@ 2012-11-15 11:53 ` Szabolcs Nagy
  2012-11-15 12:36   ` Yuri Kozlov
  0 siblings, 1 reply; 4+ messages in thread
From: Szabolcs Nagy @ 2012-11-15 11:53 UTC (permalink / raw)
  To: musl

* Yuri Kozlov <yuray@komyakino.ru> [2012-11-15 13:09:50 +0400]:
> 
> arch/x86_64/bits/alltypes.h.sh
> #ifndef __cplusplus
> TYPEDEF int wchar_t;
> #endif
> 
> 
> arch/i386/bits/alltypes.h.sh
> #ifndef __cplusplus
> #ifdef __WCHAR_TYPE__
> TYPEDEF __WCHAR_TYPE__ wchar_t;
> #else
> TYPEDEF long wchar_t;
> #endif
> #endif
> 
> (__WCHAR_TYPE__ is not defined everywhere, hence the fallback TYPEDEF long wchar_t;)
> 
> arch/arm/bits/alltypes.h.sh
> #ifndef __cplusplus
> TYPEDEF unsigned wchar_t;
> #endif
> 
> 
> Why does the type of wchar_t differ so much?
> 

because wchar_t is a broken concept and platform
abis and compilers have gratuitous incompatibilities

you cannot have an arbitrary definition because the L'x'
character constant and the L"" string literal have a given
type in the compiler and you should use the same type in
the wchar_t typedef
(different int types are not compatible, they can be
converted if the range is ok, but eg. calling a function
through an incompatible function pointer type is
undefined behaviour)
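
A minimal sketch of the "not compatible but convertible" point, assuming
a C11 compiler (it relies on _Generic, which selects on type identity).
The function names are hypothetical and only serve this illustration.
Even on a target where int and long have the same size and range, the
compiler still treats int * and long * as different types, while a plain
value conversion from int to long always works because long can hold
every int value:

```c
#include <limits.h>

/* 1 if the compiler considers int* and long* the same type, else 0.
 * _Generic picks a branch by type identity, not by size or range. */
int int_and_long_pointers_match(void)
{
    return _Generic((long *)0, int *: 1, default: 0);
}

/* 1 if int and long happen to cover the same range on this target. */
int int_and_long_same_range(void)
{
    return INT_MAX == LONG_MAX && INT_MIN == LONG_MIN;
}

/* Converting the value is fine: every int value fits in a long. */
long widen(int v)
{
    return v;
}
```

On a typical 32-bit target int_and_long_same_range() would be expected
to return 1 while int_and_long_pointers_match() still returns 0, which
is why the typedef cannot be chosen by size alone.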

in c++ wchar_t is a keyword because otherwise
polymorphism (overloading) and strict type checking
of int vs wchar_t would not work
(wchar_t must be distinct from any other int type)

in c99 the compiler could be loose and allow pointer to
any sufficiently aligned+sized+signed integer type
to work with L"", so eg. wchar_t could be long or int
as well on a 32bit platform

c11 has generics (implemented in the compiler) so the
compiler must have a type internally for L'' or L""[0]
and wchar_t must be defined as that type
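
A short sketch of how c11 generics expose that internal type, assuming a
C11 compiler and a libc whose <stddef.h> follows it. TYPE_NAME and the
function names are hypothetical helpers, and the list of candidate types
is only a guess at what common abis use:

```c
#include <stddef.h>  /* wchar_t */
#include <string.h>  /* strcmp */

/* Name the integer type the compiler assigned to an expression. */
#define TYPE_NAME(x) _Generic((x), \
    int: "int", \
    unsigned int: "unsigned int", \
    long: "long", \
    unsigned long: "unsigned long", \
    unsigned short: "unsigned short", \
    default: "other")

const char *wide_char_constant_type(void)  { return TYPE_NAME(L'x'); }
const char *wide_string_element_type(void) { return TYPE_NAME(L""[0]); }
const char *wchar_typedef_type(void)       { return TYPE_NAME((wchar_t)0); }

/* 1 if the libc typedef agrees with both kinds of wide literal. */
int wide_literals_match_typedef(void)
{
    return strcmp(wide_char_constant_type(), wchar_typedef_type()) == 0
        && strcmp(wide_string_element_type(), wchar_typedef_type()) == 0;
}
```

If a libc defined wchar_t as a different type of the same size, the
three names would disagree and wide_literals_match_typedef() would
return 0.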

so we either use the __WCHAR_TYPE__ defined by the
compiler (when it's defined), or use the abi specs
(which gives the align+size+sign information and
hopefully compilers agree on a single int type when
there are multiple choices)
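
The selection rule can be sketched in plain C as follows. This is not
the actual alltypes.h.sh output: my_wchar_t and the helper names are
hypothetical, and the long fallback is simply the i386 one quoted in
this thread. The check at the end assumes a C11 compiler:

```c
#include <stddef.h>  /* the libc's own wchar_t, kept for comparison */

/* my_wchar_t is a hypothetical name used only in this sketch. */
#ifdef __WCHAR_TYPE__
typedef __WCHAR_TYPE__ my_wchar_t;  /* trust what the compiler reports */
#define MY_WCHAR_FROM_COMPILER 1
#else
typedef long my_wchar_t;            /* abi-based fallback (i386 case) */
#define MY_WCHAR_FROM_COMPILER 0
#endif

/* 1 if the typedef came from the compiler's __WCHAR_TYPE__ macro. */
int my_wchar_from_compiler(void)
{
    return MY_WCHAR_FROM_COMPILER;
}

/* 1 if the chosen typedef is the same type as the L'x' constant. */
int my_wchar_matches_literals(void)
{
    return _Generic(L'x', my_wchar_t: 1, default: 0);
}
```

When the macro is available the two should always agree. In the
fallback case the result depends on whether the abi guess matches what
the compiler uses internally.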




* Re: type of wchar_t
  2012-11-15 11:53 ` Szabolcs Nagy
@ 2012-11-15 12:36   ` Yuri Kozlov
  2012-11-15 13:26     ` Rich Felker
  0 siblings, 1 reply; 4+ messages in thread
From: Yuri Kozlov @ 2012-11-15 12:36 UTC (permalink / raw)
  To: musl

On Thu, 15 Nov 2012 12:53:40 +0100
Szabolcs Nagy <nsz@port70.net> wrote:

> * Yuri Kozlov <yuray@komyakino.ru> [2012-11-15 13:09:50 +0400]:
> > 
> > arch/x86_64/bits/alltypes.h.sh
> > #ifndef __cplusplus
> > TYPEDEF int wchar_t;
> > #endif
> > 
> > 
> > arch/i386/bits/alltypes.h.sh
> > #ifndef __cplusplus
> > #ifdef __WCHAR_TYPE__
> > TYPEDEF __WCHAR_TYPE__ wchar_t;
> > #else
> > TYPEDEF long wchar_t;
> > #endif
> > #endif
> > 
> > (__WCHAR_TYPE__ is not defined everywhere, hence the fallback TYPEDEF long wchar_t;)
> > 
> > arch/arm/bits/alltypes.h.sh
> > #ifndef __cplusplus
> > TYPEDEF unsigned wchar_t;
> > #endif
> > 
> > 
> > Why does the type of wchar_t differ so much?
> > 
[...]
> c11 has generics (implemented in the compiler) so the
> compiler must have a type internally for L'' or L""[0]
> and wchar_t must be defined as that type
> 
> so we either use the __WCHAR_TYPE__ defined by the
> compiler (when it's defined), or use the abi specs
> (which gives the align+size+sign information and
> hopefully compilers agree on a single int type when
> there are multiple choices)

Thanks for the clarification.
Hah, gcc defines __WCHAR_TYPE__ as unsigned for arm. Wow.
$ arm-linux-gnueabi-gcc -dM -E - < /dev/null |grep __WCHAR_T
#define __WCHAR_TYPE__ unsigned int
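
The same property can be checked from inside a C program instead of
reading the predefined macros. A minimal sketch, with hypothetical
function names, assuming only that <wchar.h> provides WCHAR_MIN as the
C standard requires:

```c
#include <wchar.h>   /* wchar_t, WCHAR_MIN */

/* 1 if wchar_t is a signed type on this target, 0 if it is unsigned. */
int wchar_is_signed(void)
{
    return (wchar_t)-1 < 0;
}

/* WCHAR_MIN is 0 exactly when wchar_t is an unsigned type. */
int wchar_min_is_zero(void)
{
    return WCHAR_MIN == 0;
}
```

With the arm toolchain above, wchar_is_signed() would be expected to
return 0, and 1 on targets where wchar_t is int or long.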


-- 
Best Regards,
Yuri Kozlov




* Re: type of wchar_t
  2012-11-15 12:36   ` Yuri Kozlov
@ 2012-11-15 13:26     ` Rich Felker
  0 siblings, 0 replies; 4+ messages in thread
From: Rich Felker @ 2012-11-15 13:26 UTC (permalink / raw)
  To: musl

On Thu, Nov 15, 2012 at 04:36:31PM +0400, Yuri Kozlov wrote:
> > so we either use the __WCHAR_TYPE__ defined by the
> > compiler (when it's defined), or use the abi specs
> > (which gives the align+size+sign information and
> > hopefully compilers agree on a single int type when
> > there are multiple choices)
> 
> Thanks for the clarification.
> Hah, gcc defines __WCHAR_TYPE__ as unsigned for arm. Wow.
> $ arm-linux-gnueabi-gcc -dM -E - < /dev/null |grep __WCHAR_T
> #define __WCHAR_TYPE__ unsigned int

Yes. Whoever designed this aspect of the ARM EABI did not know what
they were doing. They probably came from a Windows background where
wchar_t is unsigned short (to be able to represent all of the Unicode
BMP) and did not realize that making it unsigned is unnecessary and
even harmful when it's 32-bit and thus able to store all of Unicode
(and much more) in a signed type.
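
The range argument can be written down as a few comparisons. A small
sketch, assuming the usual Unicode limits (U+10FFFF as the highest code
point, U+FFFF as the end of the BMP). The macro and function names are
hypothetical:

```c
#include <stdint.h>  /* INT16_MAX, UINT16_MAX, INT32_MAX */

#define UNICODE_MAX 0x10FFFFL  /* highest Unicode code point */
#define BMP_MAX     0xFFFFL    /* highest code point in the BMP */

/* A signed 32-bit type has room for every Unicode code point. */
int unicode_fits_signed_32(void) { return UNICODE_MAX <= INT32_MAX; }

/* A signed 16-bit type cannot hold the whole BMP ... */
int bmp_fits_signed_16(void)     { return BMP_MAX <= INT16_MAX; }

/* ... but an unsigned 16-bit type can, hence the Windows choice. */
int bmp_fits_unsigned_16(void)   { return BMP_MAX <= UINT16_MAX; }
```

So unsignedness only buys something at 16 bits. At 32 bits the signed
range is already far larger than Unicode needs.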

As already explained, I wanted to just always use a signed type on
musl, but since L"" must match the type of wchar_t* (otherwise,
passing L"" to a function that expects wchar_t* is a constraint
violation and the compiler should throw an error), we need the
definition to agree with whatever the compiler thinks it is, and
real-world compilers follow the EABI document that defines it as
unsigned.
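
A minimal sketch of the call that has to keep working. wide_length and
demo_length are hypothetical helpers, not musl functions:

```c
#include <stddef.h>  /* wchar_t, size_t */

/* Counts the wide characters before the terminating L'\0'. */
size_t wide_length(const wchar_t *s)
{
    size_t n = 0;
    while (s[n] != L'\0')
        n++;
    return n;
}

size_t demo_length(void)
{
    /* No cast is needed only because L"..." is an array of wchar_t.
     * If the libc typedef named a different integer type, this call
     * would be a constraint violation. */
    return wide_length(L"musl");
}
```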

Rich

