From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/7900
Path: news.gmane.org!not-for-mail
From: Rich Felker
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: [PATCH] Byte-based C locale, draft 1
Date: Sat, 6 Jun 2015 22:50:25 -0400
Message-ID: <20150607025025.GC17573@brightrain.aerifal.cx>
References: <20150606214007.GA17398@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: ger.gmane.org 1433645445 14523 80.91.229.3 (7 Jun 2015 02:50:45 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Sun, 7 Jun 2015 02:50:45 +0000 (UTC)
To: musl@lists.openwall.com
Original-X-From: musl-return-7913-gllmg-musl=m.gmane.org@lists.openwall.com Sun Jun 07 04:50:45 2015
Return-path:
Envelope-to: gllmg-musl@m.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
	by plane.gmane.org with smtp (Exim 4.69)
	(envelope-from )
	id 1Z1QfX-0008Jz-KJ
	for gllmg-musl@m.gmane.org; Sun, 07 Jun 2015 04:50:39 +0200
Original-Received: (qmail 30298 invoked by uid 550); 7 Jun 2015 02:50:38 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
Original-Received: (qmail 30280 invoked from network); 7 Jun 2015 02:50:37 -0000
Content-Disposition: inline
In-Reply-To: <20150606214007.GA17398@brightrain.aerifal.cx>
User-Agent: Mutt/1.5.21 (2010-09-15)
Original-Sender: Rich Felker
Xref: news.gmane.org gmane.linux.lib.musl.general:7900
Archived-At:

On Sat, Jun 06, 2015 at 05:40:07PM -0400, Rich Felker wrote:
> Attached is the first draft of a proposed byte-based C locale. The
> patch is about 400 lines but most of it is context, because it's
> basically a lot of tiny changes spread out over lots of files.
> [...]

If we go forward with this, I think I can factor it into 3 parts:

1. Add checks for MB_CUR_MAX==1 and the bytelocale support they would
   activate, and the CODEUNIT/IS_CODEUNIT macros needed for these code
   paths. This patch would be a complete nop and would not even affect
   codegen with a decent compiler since MB_CUR_MAX==4 is a constant
   right now.

2. Introduce stdio saving of active LC_CTYPE at the time of stream
   orientation (fwide) and save/restore of current locale around stdio
   ops that need it (fputwc, fgetwc, ungetwc) and iconv usage of
   multibyte functions. This patch would increase code size in a few
   places but would not change behavior.

3. Replace the constant MB_CUR_MAX macro with a runtime-variable value
   dependent on CURRENT_LOCALE->cat[LC_CTYPE]. This would actually
   activate the byte-based C locale support. locale_impl.h is actually
   already doing this, so I think I should remove that definition
   before making any changes and only bring it back if/when stage 3
   here is committed.

In principle stages 1 and 2 could be committed in either order;
they're independent. Stage 3 is also independent in what it touches,
but if it's already committed before stage 1/2, then committing stage
1 without stage 2 is a functional regression (stdio functions no
longer behave according to spec; iconv stops working in C locale).

Rich
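
A minimal sketch of how the stage 1 check and the stage 3 runtime value
could interact, for illustration only. It is not the patch: the variable
sketch_mb_cur_max, the sketch_* functions, and the macro bodies given
here for CODEUNIT/IS_CODEUNIT are all assumptions (high bytes are mapped
into the range 0xdf80..0xdfff), and the UTF-8 path is cut down to 2-byte
sequences to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical stand-in for the runtime MB_CUR_MAX of stage 3:
 * 4 means UTF-8, 1 means the byte-based C locale. */
static int sketch_mb_cur_max = 4;

/* Assumed definitions (not taken from the patch): map a high byte
 * 0x80..0xff to a wchar_t value in 0xdf80..0xdfff, and test whether
 * a wchar_t value lies in that range. */
#define CODEUNIT(c) (0xdfff & (signed char)(c))
#define IS_CODEUNIT(c) ((unsigned)(c) - 0xdf80 < 0x80)

/* Decode one character; returns bytes consumed, or (size_t)-1 on error. */
size_t sketch_mbtowc(wchar_t *wc, const char *s, size_t n)
{
	unsigned char b0, b1;
	if (!n) return (size_t)-1;
	b0 = (unsigned char)s[0];
	if (b0 < 0x80) { *wc = b0; return 1; }
	if (sketch_mb_cur_max == 1) {	/* the stage 1 style check */
		*wc = CODEUNIT(b0);
		return 1;
	}
	/* UTF-8: only 2-byte sequences are handled in this sketch. */
	if (n < 2 || b0 < 0xc2 || b0 > 0xdf) return (size_t)-1;
	b1 = (unsigned char)s[1];
	if ((b1 & 0xc0) != 0x80) return (size_t)-1;
	*wc = (wchar_t)((b0 & 0x1f) << 6 | (b1 & 0x3f));
	return 2;
}

/* Encode one character; returns bytes written, or (size_t)-1 on error. */
size_t sketch_wctomb(char *s, wchar_t wc)
{
	if ((unsigned)wc < 0x80) { s[0] = (char)wc; return 1; }
	if (sketch_mb_cur_max == 1) {
		/* Only the reserved range maps back to a single byte. */
		if (!IS_CODEUNIT(wc)) return (size_t)-1;
		s[0] = (char)(unsigned char)wc;
		return 1;
	}
	if ((unsigned)wc >= 0x800) return (size_t)-1;	/* longer forms omitted */
	s[0] = (char)(0xc0 | (wc >> 6));
	s[1] = (char)(0x80 | (wc & 0x3f));
	return 2;
}

/* Round-trips the byte 0xe9 in both modes; returns 0 if all checks pass. */
int sketch_demo(void)
{
	wchar_t wc;
	char out[2];

	sketch_mb_cur_max = 1;		/* byte-based C locale */
	assert(sketch_mbtowc(&wc, "\xe9", 1) == 1);
	assert(IS_CODEUNIT(wc));
	assert(sketch_wctomb(out, wc) == 1 && (unsigned char)out[0] == 0xe9);

	sketch_mb_cur_max = 4;		/* UTF-8 */
	assert(sketch_mbtowc(&wc, "\xe9", 1) == (size_t)-1);
	assert(sketch_mbtowc(&wc, "\xc3\xa9", 2) == 2 && wc == 0xe9);
	return 0;
}
```

While sketch_mb_cur_max is fixed at 4, the byte-locale branches are
dead code. That is why a stage 1 style change can be a no-op until the
value becomes runtime-variable, as stage 3 proposes.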