From: Christian Neukirchen
To: Rich Felker
Cc: musl@lists.openwall.com
Reply-To: musl@lists.openwall.com
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Revisiting byte-based C locale
Date: Fri, 05 Jun 2015 10:58:10 +0200
Message-ID: <876172cz19.fsf@gmail.com>
In-Reply-To: <20150605013911.GT17573@brightrain.aerifal.cx>
References: <20150522022203.GA26651@brightrain.aerifal.cx>
	<20150604205332.GS17573@brightrain.aerifal.cx>
	<87eglrchph.fsf@gmail.com>
	<20150605013911.GT17573@brightrain.aerifal.cx>

Rich Felker writes:

> On Thu, Jun 04, 2015 at 11:00:10PM +0200, Christian Neukirchen wrote:
>> Rich Felker writes:
>>
>> > On Thu, May 21, 2015 at 10:22:03PM -0400, Rich Felker wrote:
>> >> Any new opinions on the topic? Or interest in re-emphasizing a
>> >> previously stated opinion? :)
>> >
>> > No new opinions on this? I've tentatively added drafting a new
>> > proposed byte-based C locale patch as a roadmap item for this release
>> > cycle, not necessarily to commit it, but as a way to re-evaluate
>> > whether it's still costly to implement.
>>
>> Will it support regexec on 8-bit binary data?
>
> Yes, as long as the program has done one of the following:
>
> - Not called setlocale at all.
> - Called setlocale with an explicit "C" argument or in environment.
> - Called uselocale with a locale_t for "C".

AFAICS it does:

in main:
	(void)setlocale(LC_CTYPE, "");

protected int
file_regcomp(file_regex_t *rx, const char *pat, int flags)
{
#ifdef USE_C_LOCALE
	rx->c_lc_ctype = newlocale(LC_CTYPE_MASK, "C", 0);
	assert(rx->c_lc_ctype != NULL);
	rx->old_lc_ctype = uselocale(rx->c_lc_ctype);
	assert(rx->old_lc_ctype != NULL);
#endif
	rx->pat = pat;

	return rx->rc = regcomp(&rx->rx, pat, flags);
}

>> We found out file(1)
>> needs this.
>
> Indeed, aside from the Austin Group issue 663, having this topic come
> up several times in real-world usage is the motivation for
> reconsidering it. I believe file(1) _attempts_ to do this right,
> making use of uselocale.

A strong +1 from me then.  I'll be glad to help testing it on Void Linux.
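
For reference, the third condition above (uselocale with a locale_t for
"C") can be sketched as a small standalone helper. This is only an
illustration of the pattern, not the file(1) code: the function name
match_in_c_locale and its return convention are made up here, and it
assumes a libc providing the POSIX.1-2008 newlocale/uselocale interfaces.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not taken from file(1) or musl): match `text`
 * against the extended regex `pattern` while LC_CTYPE is forced to "C"
 * for the calling thread only.
 *
 * Returns 1 on match, 0 on no match, -1 on any error (locale switch
 * failed, pattern did not compile, or regexec reported an error).
 */
int match_in_c_locale(const char *pattern, const char *text)
{
    assert(pattern != NULL && text != NULL);

    /* Build a locale object whose LC_CTYPE category is "C". */
    locale_t c_loc = newlocale(LC_CTYPE_MASK, "C", (locale_t)0);
    if (c_loc == (locale_t)0)
        return -1;

    /* Install it for this thread; the global locale is left untouched. */
    locale_t old_loc = uselocale(c_loc);
    if (old_loc == (locale_t)0) {
        freelocale(c_loc);
        return -1;
    }

    int result = -1;
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) == 0) {
        int rc = regexec(&re, text, 0, NULL, 0);
        if (rc == 0)
            result = 1;
        else if (rc == REG_NOMATCH)
            result = 0;
        regfree(&re);
    }

    /* Restore the previous thread locale before freeing ours. */
    uselocale(old_loc);
    freelocale(c_loc);
    return result;
}
```

A caller that has already done setlocale(LC_CTYPE, "") can still use
this, since only the thread locale is swapped for the duration of the
call. With a byte-based C locale, something like
match_in_c_locale("^a.c$", "a\xff" "c") would be expected to match;
whether it does on a given libc is exactly what this thread is about.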
-- 
Christian Neukirchen  http://chneukirchen.org