From: 罗勇刚 (Yonggang Luo)
Newsgroups: gmane.linux.lib.musl.general, gmane.comp.standards.posix.austin.general, gmane.comp.compilers.clang.devel
Subject: Re: Is that getting wchar_t to be 32bit on win32 a good idea for compatible with Unix world by implement posix layer on win32 API?
Date: Sat, 9 May 2015 11:36:44 +0800
To: Rich Felker
Cc: musl@lists.openwall.com, James McNellis, austin-group-l@opengroup.org, Clang Dev, blees@dcon.de, dplakosh@cert.org, hsutter@microsoft.com, writeonce@midipix.org
Reply-To: musl@lists.openwall.com
In-Reply-To: <20150509033232.GG17573@brightrain.aerifal.cx>

2015-05-09 11:32 GMT+08:00 Rich Felker:
> On Sat, May 09, 2015 at 11:16:37AM +0800, 罗勇刚(Yonggang Luo) wrote:
>> Two solution:
>> 1、Change the width of wchar_t to 16 bit, I guess that would broken a
>> lot of things that exist on Win32 world.
>> 2、Or we should preserve wchar_t to be 16 bit on win32, and add the
>> char16_t and char32_t
>> variant API for all API that have both narrow and wide version?
>>
>>
>> I support for the second one, even if the second option is not
>> applicable. the first option would cause a lot problems, the first
>> thing is all Windows API use wchar_t and dependent on the wchar_t to
>> be 2 byte width. Second is, there is open source libraries that
>> dependent the de facto that wchar_t to be 16 bit, such as Qt,
>> Git(maybe).
>> Almost exist open source libraries that already ported to Win32 are
>> dependent the the fact wchar_t to be 16 bit, cygwin is also discussed
>> if getting wchar_t to be 32bit on win32
>>
>> https://www.cygwin.com/ml/cygwin/2011-02/msg00037.html
>
> Well, which option is an easier path forward depends on your main
> usage case.
> If you're most concerned about building existing
> Windows-targetted code unmodified, obviously doing the same thing MSVC
> does, even if it's a bad design, achieves that.
>
> On the other hand, if your goal is building software that was written
> for POSIX or POSIX-like systems on Windows with little or no
> modification, it's more complicated. Code that currently has no
> Windows support certainly will work best (full Unicode support) with
> 32-bit wchar_t. Code that already has Windows-specific workarounds
> (assuming wchar_t is 16-bit on Windows) needs those undone to make it
> work. But such code _should_ be checking WCHAR_MAX instead of assuming
> Windows is 16-bit. I believe midipix is dealing with this issue simply
> by not predefining _WIN32 or whatever, so that none of the Windows
> workarounds will get activated.
>
> I really suspect most Windows code interfacing with WINAPI is using
> WCHAR, not wchar_t, for its UTF-16 strings. So fixing wchar_t to be

This is a misunderstanding. WCHAR is not an independent type: it is
defined in winnt.h as follows:

    #ifndef _MAC
    typedef wchar_t WCHAR;    // wc, 16-bit UNICODE character
    #else
    // some Macintosh compilers don't define wchar_t in a convenient location, or define it as a char
    typedef unsigned short WCHAR;    // wc, 16-bit UNICODE character
    #endif

So on Windows WCHAR is only an alias for wchar_t. Making wchar_t 32-bit
would make WCHAR 32-bit as well, unless that header is changed too.

> 32-bit and leaving WCHAR alone is the best solution in my opinion.
> Note that you're still left with the issue that L"xxx" strings will
> not work with WCHAR, but this really only matters if you're trying to
> use existing Windows-targetted code unmodified, and it's easily fixed
> by s/L"/u"/g across the source (making them char16_t[] literals rather
> than wchar_t[] literals).
>
> I don't think adding lots of functions for char16_t and char32_t is
> useful. The format you want programs to be using is UTF-8.
> With midipix all of the standard C functions, just like in straight musl,
> always work in UTF-8, and there are also wrappers for the WINAPI that
> convert UTF-8 to UTF-16 transparently. This allows you to just work in
> char[] strings and pass them to WINAPI functions like you would if you
> were working in "ANSI codepage" mode, except that you actually have
> full Unicode available. I strongly support this approach and hope
> you'll adopt it.
>
> Rich

-- 
Yours sincerely,
Yonggang Luo (罗勇刚)
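
P.S. To make the WCHAR point concrete, here is a minimal C sketch. It
assumes the non-_MAC typedef from winnt.h quoted above and reproduces it
locally so that it compiles without the Windows headers.
wchar_units_for is a hypothetical helper name used only for
illustration; it is not part of any library. It also shows the
WCHAR_MAX check that Rich suggests in place of testing _WIN32.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* uint32_t */
#include <wchar.h>    /* wchar_t, WCHAR_MAX */

/* Assumption: mirrors the non-_MAC branch of winnt.h quoted in the mail. */
typedef wchar_t WCHAR;

/* WCHAR is only an alias, so its size always follows wchar_t. */
static_assert(sizeof(WCHAR) == sizeof(wchar_t), "WCHAR follows wchar_t");

/* Hypothetical helper: how many wchar_t units one Unicode code point
 * needs, decided from WCHAR_MAX rather than from _WIN32. */
int wchar_units_for(uint32_t cp)
{
    (void)cp; /* unused when wchar_t is wider than 16 bits */
#if WCHAR_MAX <= 0xFFFF
    /* 16-bit wchar_t: code points above the BMP need a surrogate pair. */
    return cp > 0xFFFF ? 2 : 1;
#else
    /* Wider wchar_t: every code point fits in a single unit. */
    return 1;
#endif
}
```

Where WCHAR_MAX is 0xFFFF, wchar_units_for(0x1F600) gives 2; where
wchar_t is 32-bit it gives 1, and with the typedef above WCHAR changes
width together with it.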