From: u-igbb@aetey.se
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Locale bikeshed time
Date: Sat, 26 Jul 2014 11:38:05 +0200
Message-ID: <20140726093805.GS16795@example.net>
In-Reply-To: <20140726080327.GJ4038@brightrain.aerifal.cx>
To: musl@lists.openwall.com

On Sat, Jul 26, 2014 at 04:03:27AM -0400, Rich Felker wrote:
> > So I would say it is indeed stupid to localize data meant for
> > interchange. Nevertheless it may still be meaningful to format numbers
> > for the user's taste when the data presentation is only meant for some
> > kind of a "local" context.
>
> The problem is that the vast majority of actual printing and parsing
> of floating point numbers is for interchange purposes, not mere visual
> pretty-printing, and the existence of alternate radix characters
> introduces subtle bugs into programs that are not tested in such
> locales. Very few programs or libraries I've seen go to the trouble to
> obtain a usable LC_NUMERIC locale in a portable, thread-safe, and
> library-safe way before calling snprintf or strtod. And lots of broken
> gui libraries set LC_NUMERIC behind the application's back even if the
> application only wanted to set other categories.

Ok, the reality is that locale is not being used in a reasonable way,
so we do not have to bother implementing it for proper use. Instead
we are obliged to try to reduce the harm by being non-conforming
in a partially compensating fashion. Sigh.

Well, locale is a mess by design...

> > Is there any evidence that "." is more widely used than "," ?
>
> Well, 2/3 of the world's population is in India and China and they all
> use ".", so I think that pretty much covers the question of which is
> "more widely used".

Ah indeed. That's sufficient evidence.

> > locale is not about
> > representing data for computers, but for humans - and I would love to
> > have a best possible internationally useful locale as the default.
>
> This goes back to the question about modern versus old tradition.
> Alternate radix points are a cultural convention that's (seemingly,
> hopefully) on the way out due to computers and information
> interchange. Maybe in some sense this is cultural imperialism (or just
> globalization or whatnot) but it's certainly a lot less negative than
> the "everyone should use English" attitude. Nobody's saying "don't use
> your language", just "don't gratuitously break things for a one-pixel
> difference". :-)

:-D

In practice this calls for "eo_ZZ@decimal_dot" - which actually would
make sense.

This reminds me that we have an unsettled issue of naming the variants.
I wonder which schemes exist, which are standardized (?), and which are
in use?

The GNU gettext manual states
"
The ‘@variant’ can denote any kind of characteristics that is not
already implied by the language ll and the country CC. It can denote a
particular monetary unit. For example, on glibc systems, ‘de_DE@euro’
denotes the locale that uses the Euro currency, in contrast to the older
locale ‘de_DE’ which implies the use of the currency before 2002. It can
also denote a dialect of the language, or the script used to write text
(for example, ‘sr_RS@latin’ uses the Latin script, whereas ‘sr_RS’ uses
the Cyrillic script to write Serbian), or the orthography rules, or
similar.
"

I read this as "there is no structure on variant naming and all kinds
of variations share the same name space".

Then it is the (hopefully present) comment in the locale definition file
which apparently has to be consulted to know what a certain variant is
about.

Fine with me, but I would like to see this stated somewhere (instead of
my _guess_ after reading the above documentation - it does _not_ say
a word about how one can learn the actual semantics of the variant,
aka the intention of the locale submitter).

A straightforward attempt to learn what a certain installed locale is
about, on a Debian Linux system:

$ locale -a | grep en
en_US.utf8
$ apropos en_US
en_US: nothing appropriate.
$

On a RedHat Linux system with "@Everything":

$ locale -a | grep en
... lots of en_SOMETHING including en_US ...
$ apropos en_US
strlen_user (9) - Get the size of a string in user space
strnlen_user (9) - Get the size of a string in user space
$

IOW one has nice prerequisites for keeping the messy thing in a messy
state :)

Rune