From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.org/gmane.linux.lib.musl.general/5562
Path: news.gmane.org!not-for-mail
From: u-igbb@aetey.se
Newsgroups: gmane.linux.lib.musl.general
Subject: Re: Locale bikeshed time
Date: Tue, 22 Jul 2014 22:10:08 +0200
Message-ID: <20140722201008.GC16795@example.net>
References: <20140722184932.GA4914@brightrain.aerifal.cx>
Reply-To: musl@lists.openwall.com
NNTP-Posting-Host: plane.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: ger.gmane.org 1406059839 29071 80.91.229.3 (22 Jul 2014 20:10:39 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Tue, 22 Jul 2014 20:10:39 +0000 (UTC)
To: musl@lists.openwall.com
Original-X-From: musl-return-5567-gllmg-musl=m.gmane.org@lists.openwall.com Tue Jul 22 22:10:32 2014
Envelope-to: gllmg-musl@plane.gmane.org
Original-Received: from mother.openwall.net ([195.42.179.200])
	by plane.gmane.org with smtp (Exim 4.69) (envelope-from )
	id 1X9gOO-0006RU-AO for gllmg-musl@plane.gmane.org;
	Tue, 22 Jul 2014 22:10:32 +0200
Original-Received: (qmail 26557 invoked by uid 550); 22 Jul 2014 20:10:31 -0000
Mailing-List: contact musl-help@lists.openwall.com; run by ezmlm
Precedence: bulk
Original-Received: (qmail 26549 invoked from network); 22 Jul 2014 20:10:31 -0000
X-T2-Spam-Status: No, hits=0.0 required=5.0
Received-SPF: none receiver=mailfe01.swip.net; client-ip=171.25.193.235; envelope-from=u-igbb@aetey.se
Content-Disposition: inline
In-Reply-To: <20140722184932.GA4914@brightrain.aerifal.cx>
User-Agent: Mutt/1.5.23 (2014-03-12)
Xref: news.gmane.org gmane.linux.lib.musl.general:5562

On Tue, Jul 22, 2014 at 02:49:32PM -0400, Rich Felker wrote:
> Overall, my plan at this point is to disallow any absolute/relative
> pathnames in the LC_* vars and restrict them purely to locale names,
> and have the path in a separate variable outside the scope of the
> standard.

+1

> So, the first bikeshed decision to be made is what environment
> variable to use for the locale path, and what fallback should be if
> it's not set. Glibc uses $LOCPATH. On the one hand it would be nice to
> use the same var (since apps are already aware of the need to treat it
> specially), but on the other it's undesirable to have them tied
> together (e.g. if you're using musl as a non-root installation and
> can't write to /usr/lib) and to avoid clashing with glibc's files we

This issue is not crucial for my usage pattern: here it is easy to
assign values of this kind per binary rather than per process tree
(in contrast to the locale names, which I want to be settable by the
user and inheritable regardless of which library happens to interpret
them).

Speaking more generally, using the same variable as glibc would
introduce a substantial risk of confusion, because the semantics of
the variable would become context-dependent (i.e. dependent on which
library a given binary is linked against). This confusion is somewhat
hidden in monolithic distros, where all binaries are expected to have
been built by tightly cooperating parties using the same libraries,
but the general case includes using binaries built under different
premises.

A musl-specific variable name would be a better/cleaner choice.

> would need to choose a subdirectory under $LOCPATH rather than using
> it directly. All of these aspects make it a lot less attractive.

+1

> The second issue is how locale categories are split up. Glibc has each
> category in a separate file, except for the "locale-archive" file
> which stores everything in one file for easy mapping. My leaning so

By the way, please do not go the way of a single big file.

For systems which rely on file boundaries to reflect data clustering
(i.e. which data is most likely to be used together), it is very
useful to let the files correspond to the data structure. Otherwise
some cheap and efficient distributed data access optimizations become
impossible.
The Coda file system uses a file as its unit of transmission and
caching, which is quite efficient because a file very often
corresponds to an "information unit" that is needed as a whole.
Glibc's locale archive forces a big, wasteful transfer and a large
cache footprint for very little actual use.

> far is to put the whole locale -- time format and translations,
> message translations, ... in a single file. This avoids the need for
> multiple mappings (and syscall overhead, and vma overhead, ...) if
> you're using the same value for all categories. But on the other hand,
> if you wanted to have lots of subtle variants of a locale, you might
> end up with largely-duplicate files on disk. Fortunately I think
> they'll all be very small anyway so this may not matter.

I actually do mix categories from different locales. No problem as
long as the files are small.

Rune
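
P.S. To make the scheme under discussion concrete, here is a minimal
sketch in C, not musl's actual code. It assumes that LC_* values are
plain names (anything containing a '/' is rejected) and that the search
path comes from a separate, colon-separated variable. The function
names locale_name_ok and locale_file_path are hypothetical, and so is
the variable name EXAMPLE_LOCPATH mentioned in the comments, since the
real name is exactly what is being bikeshedded here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: accept only plain locale names.
 * Rejects NULL, the empty string, anything containing '/',
 * and the special directory names "." and "..".
 * Returns 1 if the name is acceptable, 0 otherwise. */
static int locale_name_ok(const char *name)
{
    if (!name || !*name) return 0;
    if (strchr(name, '/')) return 0;
    if (!strcmp(name, ".") || !strcmp(name, "..")) return 0;
    return 1;
}

/* Hypothetical helper: write "<dir>/<name>" into buf, where <dir> is
 * the n-th (0-based) element of the colon-separated search path.
 * The caller would typically pass getenv("EXAMPLE_LOCPATH") as path;
 * that variable name is an assumption made for this sketch.
 * Returns 0 on success, -1 if the name is rejected, the element does
 * not exist or is empty, or the result does not fit into buf. */
static int locale_file_path(const char *path, size_t n, const char *name,
                            char *buf, size_t buflen)
{
    const char *p = path;

    if (!path || !locale_name_ok(name)) return -1;

    for (;;) {
        size_t len = strcspn(p, ":");   /* length of current element */
        if (n == 0) {
            int r;
            if (len == 0) return -1;    /* empty element: nothing to use */
            r = snprintf(buf, buflen, "%.*s/%s", (int)len, p, name);
            return (r < 0 || (size_t)r >= buflen) ? -1 : 0;
        }
        if (!p[len]) return -1;         /* ran out of path elements */
        p += len + 1;                   /* skip past the ':' */
        n--;
    }
}
```

Mixing categories from different locales then just means calling
locale_file_path once per category with a different name; each small
file is opened (and, on something like Coda, fetched and cached) on its
own.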