mailing list of musl libc
* Locale bikeshed time
@ 2014-07-22 18:49 Rich Felker
From: Rich Felker @ 2014-07-22 18:49 UTC (permalink / raw)
  To: musl

I've got the next phase of the locale work pretty much ready to
commit, but since it needs some policy for how to load locales, I want
to continue the discussion first rather than having commits that
change the behavior back and forth as we discuss this.

Overall, my plan at this point is to disallow any absolute/relative
pathnames in the LC_* vars and restrict them purely to locale names,
and have the path in a separate variable outside the scope of the
standard. This is basically how glibc does it, and the idea is that
you can allow locale names from an untrusted source (e.g. for suid,
for remote apps acting on behalf of a user such as web apps or
gitolite, or for apps that process mixed-locale data with uselocale
and have locale names in their data) as long as the locale path does
not contain malicious locales.
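
A minimal sketch of the "names only, no pathnames" rule described above. This is an illustration under stated assumptions, not musl's actual code: the helper names `locale_name_is_safe` and `locale_file_path` are hypothetical, and rejecting names that start with a dot is an extra precaution assumed here rather than something the proposal specifies.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: accept plain locale names only, never pathnames. */
static int locale_name_is_safe(const char *name)
{
    if (!name || !*name) return 0;      /* empty or missing name */
    if (name[0] == '.') return 0;       /* assumed extra rule: no ".", "..", dotfiles */
    if (strchr(name, '/')) return 0;    /* any slash makes it a pathname */
    return 1;
}

/* Hypothetical helper: build "<dir>/<name>" in buf.
 * Returns 0 on success, -1 if the name is rejected or buf is too small. */
static int locale_file_path(char *buf, size_t size,
                            const char *dir, const char *name)
{
    int n;
    if (!locale_name_is_safe(name)) return -1;
    n = snprintf(buf, size, "%s/%s", dir, name);
    if (n < 0 || (size_t)n >= size) return -1;
    return 0;
}
```

With a check like this, an untrusted LC_* value can only select a file inside a directory the administrator controls; it cannot point the loader at an arbitrary file.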

So, the first bikeshed decision to be made is what environment
variable to use for the locale path, and what the fallback should be if
it's not set. Glibc uses $LOCPATH. On the one hand it would be nice to
use the same var (since apps are already aware of the need to treat it
specially), but on the other it's undesirable to have the two tied
together (e.g. if you're using musl as a non-root installation and
can't write to /usr/lib), and to avoid clashing with glibc's files we
would need to choose a subdirectory under $LOCPATH rather than using
it directly. These drawbacks make sharing $LOCPATH a lot less attractive.
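
A rough sketch of how a path variable with a fallback could be consumed. Everything named here is an assumption for illustration: the variable name is passed in by the caller because the message leaves it undecided, `DEMO_DEFAULT_LOCPATH` is a made-up fallback directory, and the colon-separated list format is borrowed from how $LOCPATH is conventionally written rather than taken from the proposal.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fallback used when the variable is unset or empty. */
#define DEMO_DEFAULT_LOCPATH "/usr/share/demo-locale"

/* Return the search path from the named environment variable,
 * or the built-in fallback if it is unset or empty. */
static const char *locale_search_path(const char *varname)
{
    const char *p = getenv(varname);
    return (p && *p) ? p : DEMO_DEFAULT_LOCPATH;
}

/* Copy the next ':'-separated component of path into buf.
 * Returns a pointer to the remainder of the list, or NULL when the
 * list is exhausted. A component that does not fit in buf is stored
 * as an empty string so the caller can skip it. */
static const char *next_dir(const char *path, char *buf, size_t size)
{
    size_t len;
    if (!path || !*path || size == 0) return NULL;
    len = strcspn(path, ":");
    if (len >= size) {
        buf[0] = 0;                 /* too long: report as empty */
    } else {
        memcpy(buf, path, len);
        buf[len] = 0;
    }
    path += len;
    if (*path == ':') path++;       /* step over the separator */
    return path;
}
```

A caller would loop with `for (p = locale_search_path(var); (p = next_dir(p, dir, sizeof dir)); )`, skip empty components, and try the locale name in each directory in turn.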

The second issue is how locale categories are split up. Glibc has each
category in a separate file, except for the "locale-archive" file
which stores everything in one file for easy mapping. My leaning so
far is to put the whole locale (time format and translations,
message translations, ...) in a single file. This avoids the need for
multiple mappings (and syscall overhead, and vma overhead, ...) if
you're using the same value for all categories. But on the other hand,
if you wanted to have lots of subtle variants of a locale, you might
end up with largely-duplicate files on disk. Fortunately I think
they'll all be very small anyway so this may not matter.

Of course making this work is contingent on finding a good way to
encode LC_MONETARY and LC_COLLATE data in a .mo file, since if the
whole locale is unified into one file, it would be a .mo file. My
leaning is to simply use "int_cur_symbol", etc. as gettext keys for
the string fields of LC_MONETARY and then put all the numeric fields
of lconv into a single string that could be parsed with scanf or a
tiny integer parser in localeconv() on the first usage. While not the
most efficient, it avoids needing nasty special tools to generate
locale files; a po-to-mo converter is all you need. For LC_COLLATE,
obviously one solution would be to have keys for each collation
element and use gettext to convert collation elements to the symbols
strxfrm is supposed to output. I'm not sure if the efficiency of this
method is tolerable however. We could go with it for now and later add
something more advanced if needed (e.g. mapping to a DFA represented
as a byte array that does the conversions).
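
The "tiny integer parser" idea for the numeric lconv fields could look roughly like the sketch below. The encoding is an assumption made for illustration (space-separated non-negative decimal integers, in whatever field order the locale file format would define), and `parse_fields` is a hypothetical name, not an existing musl function.

```c
#include <limits.h>
#include <stddef.h>

/* Parse up to n space-separated non-negative decimal integers from s
 * into out[], one char-sized value per field (the numeric members of
 * struct lconv are chars). Parsing stops at the first token that is
 * not a number or that exceeds CHAR_MAX.
 * Returns the number of fields stored. */
static size_t parse_fields(const char *s, char *out, size_t n)
{
    size_t i = 0;
    while (i < n) {
        int v = 0;
        while (*s == ' ') s++;                  /* skip separators */
        if (*s < '0' || *s > '9') break;        /* end of string or junk */
        while (*s >= '0' && *s <= '9') {
            v = v * 10 + (*s++ - '0');
            if (v > CHAR_MAX) return i;         /* out of range: stop */
        }
        out[i++] = (char)v;
    }
    return i;
}
```

The caller would fetch the packed string through the same lookup used for message translations, parse it once on the first localeconv() call, and treat a short field count as a malformed locale file.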

I probably have some more issues to discuss with this too but I'll
just go ahead and send now to get discussion started, and hopefully
get back to adding some more code first.

Rich



