From: 罗勇刚 (Yonggang Luo)
Newsgroups: gmane.comp.standards.posix.austin.general, gmane.linux.lib.musl.general, gmane.comp.compilers.clang.devel
Subject: Re: [cfe-dev] Is that getting wchar_t to be 32bit on win32 a good idea for compatible with Unix world by implement posix layer on win32 API?
Date: Sat, 9 May 2015 19:19:14 +0800
References: <20150509103645.GG29035@port70.net>
Reply-To: luoyonggang-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org
To: John Sully, 勇刚 罗 (Yonggang Luo), Karsten Blees, musl-ZwoEplunGu1jrUoiu81ncdBPR1lH4CV8@public.gmane.org, dplakosh-etTNj8cnB6w@public.gmane.org, austin-group-l-7882/jkIBncuagvECLh61g@public.gmane.org, hsutter-0li6OtcxBFHby3iVrkZq2A@public.gmane.org, Clang Dev, James McNellis

2015-05-09 18:36 GMT+08:00 Szabolcs Nagy:
> * John Sully [2015-05-09 00:55:12 -0700]:
>> In my opinion you almost never want 32-bit wide characters once you learn
>> of their limitations. Most people assume that if they use them they can
>> return to the one character -> one glyph idiom like ASCII.
>> But Unicode is
>
> wchar_t must be at least 21 bits on a system that supports unicode
> in any locale: it has to be able to represent all code points of the
> supported character set.
>
> in practice this means that the only conforming definition to iso c
> (and thus posix, c++ and other standards based on c) is a 32bit wchar_t
> (the signedness can be chosen freely).
>
> so the definition is not based on what "you almost never want" or what
> "most people assume".
>
> if the goal is to provide a posix implementation then 16bit wchar_t
> is not an option (assuming the system wants to be able to communicate
> with the external world that uses unicode text).

wchar_t is not the only way to communicate with the external world, and
it is not well suited to that purpose either. The C11 standard does not
fix the width of wchar_t, and most POSIX APIs are implemented in terms of
UTF-8. Indeed, Windows needs a POSIX layer mainly for those UTF-8 APIs,
not for the wchar_t APIs. So communicating with the external world is not
a good reason to make wchar_t 32 bits on Win32.

Portable text-processing applications and libraries (including those that
target Win32) should never depend on wchar_t being 32 bits wide. C11 and
C++11 already provide the cross-platform types char16_t and char32_t
(through uchar.h in C), so there is no reason to make wchar_t 32 bits on
Win32 just to support POSIX there.

We intend to create a usable POSIX layer on Win32, not a theoretical
POSIX layer that would be useless. On Win32 we should take the de facto
conventions of the platform into account.

--
Yours sincerely,
Yonggang Luo (罗勇刚)
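
P.S. To make the point about not depending on the width of wchar_t concrete, here is a minimal sketch in C. It assumes a C11 toolchain that ships uchar.h. The function names `wchar_holds_any_code_point` and `encode_utf16` are hypothetical, chosen only for illustration, and are not part of any standard or library. The first function reports whether the platform's wchar_t can hold every Unicode code point in a single unit; the second converts a code point to UTF-16 using only char32_t and char16_t, so it behaves the same whether wchar_t is 16 or 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <uchar.h>   /* char16_t, char32_t (C11) */
#include <wchar.h>   /* WCHAR_MAX */

/* Hypothetical helper: returns 1 if one wchar_t can hold any code point
 * up to U+10FFFF on this platform, 0 otherwise. */
int wchar_holds_any_code_point(void)
{
    return (unsigned long)WCHAR_MAX >= 0x10FFFFul;
}

/* Hypothetical helper: encode the code point cp as UTF-16 into out[0..1].
 * Returns the number of code units written (1 or 2), or 0 if cp is not a
 * valid Unicode scalar value (a surrogate, or above U+10FFFF). */
size_t encode_utf16(char32_t cp, char16_t out[2])
{
    assert(out != NULL);
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    if (cp < 0x10000) {
        out[0] = (char16_t)cp;          /* BMP: a single code unit */
        return 1;
    }
    cp -= 0x10000;                      /* 20 bits remain */
    out[0] = (char16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (char16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

Code written this way never inspects sizeof(wchar_t); a caller would only consult `wchar_holds_any_code_point()` at the boundary where it has to hand text to a wchar_t-based API.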