mailing list of musl libc
* Information on locale system in musl 1.1.4
@ 2014-08-01  5:29 Rich Felker
From: Rich Felker @ 2014-08-01  5:29 UTC (permalink / raw)
  To: musl

The major new feature in musl 1.1.4 is the locale system. In
accordance with the long-term plan it was based on, it's designed to:

1. Be lightweight -- calling setlocale pulls in around 2k of code when
   static linking on i386.

2. Meet the minimum needs for applications to provide an interface in
   the user's preferred natural language using the official and de
   facto standard interfaces for doing so -- the standard C/POSIX
   locale API and gettext translation API.

3. Avoid complicating the libc or applications that call setlocale in
   ways that impact security, introduce bugs that only occur in
   unusual locales, or discourage developers of light applications
   from calling setlocale.

The version of the locale system in musl 1.1.4 is still incomplete and
experimental. However, its experimental status should not impact use
on existing deployments; locales are not loaded at all unless the
MUSL_LOCPATH environment variable is set.

The features presently supported are:

- The setting of the LC_MESSAGES locale category is recorded
  regardless of whether a libc locale file is available to be loaded.
  The recorded setting is used by the gettext interfaces if the
  application uses gettext message translation, and the application
  can retrieve it by calling setlocale(LC_MESSAGES, 0).

- Message translation for most messages produced by libc, including
  error and signal name strings, controlled by LC_MESSAGES.

- Translated day/month names and appropriate date/time format strings,
  controlled by LC_TIME.
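
As a rough illustration of the three features above, here is a minimal
sketch using only standard C/POSIX calls. The helper names
(apply_locale, weekday_name, show_localized) are made up for this
example and are not part of musl; what actually gets printed depends on
the environment and on which locale files, if any, are installed.

```c
#include <errno.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper: apply `name` to all categories and return the
 * recorded LC_MESSAGES setting, or NULL if setlocale failed. */
const char *apply_locale(const char *name)
{
    if (!setlocale(LC_ALL, name))
        return NULL;
    return setlocale(LC_MESSAGES, NULL);
}

/* Hypothetical helper: format the full weekday name for tm_wday == wday
 * (0 = Sunday) under the current LC_TIME setting. Returns the number of
 * bytes written, or 0 if the buffer is too small. */
size_t weekday_name(int wday, char *buf, size_t size)
{
    struct tm tm = {0};
    tm.tm_wday = wday;
    return strftime(buf, size, "%A", &tm);
}

/* Print one string per feature: the recorded LC_MESSAGES name, a libc
 * error message, and a day name. */
void show_localized(void)
{
    char day[64];
    printf("LC_MESSAGES: %s\n", setlocale(LC_MESSAGES, NULL));
    printf("ENOENT:      %s\n", strerror(ENOENT));
    if (weekday_name(0, day, sizeof day))
        printf("weekday 0:   %s\n", day);
}
```

A program would typically call apply_locale("") once at startup, so that
the settings come from the environment, and then show_localized().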

The key missing features which will definitely be added at some time
in the future are collation rules (LC_COLLATE) and currency
information and monetary numeric formatting (LC_MONETARY).

Finding locale files:

If the MUSL_LOCPATH environment variable is set, it's treated as a
colon-delimited list of directories to search for locale files. The
locale file must have the exact same name as the locale setting being
requested. Locale names greater than 15 bytes in length, starting with
a '.', or containing the '/' character are rejected.
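
The rules above can be sketched in C as follows. This is only an
illustration of the documented behavior, not musl's actual code: the
function names are hypothetical, and how empty entries in the list are
handled is an assumption (the sketch does not treat them specially).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical check mirroring the stated rejection rules: longer than
 * 15 bytes, leading '.', or any '/' in the name. */
int locale_name_acceptable(const char *name)
{
    if (strlen(name) > 15)
        return 0;
    if (name[0] == '.')
        return 0;
    if (strchr(name, '/'))
        return 0;
    return 1;
}

/* Hypothetical helper: write "<dir>/<name>" for the n-th (0-based)
 * directory of a colon-delimited list into `out`. Returns 1 on success,
 * 0 if the list has fewer entries or `out` is too small. */
int nth_candidate(const char *locpath, size_t n, const char *name,
                  char *out, size_t outsz)
{
    const char *p = locpath;
    for (;;) {
        size_t len = strcspn(p, ":");   /* length of this entry */
        if (n == 0) {
            int r = snprintf(out, outsz, "%.*s/%s", (int)len, p, name);
            return r >= 0 && (size_t)r < outsz;
        }
        if (p[len] == '\0')             /* no further entries */
            return 0;
        p += len + 1;                   /* skip past the ':' */
        n--;
    }
}
```

With a list of "/opt/a:/opt/b" and the name "de_DE", the candidates in
search order would be /opt/a/de_DE and then /opt/b/de_DE.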

In the future, musl will probably ignore everything after the dot when
the locale name contains a dot, since by convention this component
reflects a character encoding, whereas musl always uses UTF-8. Other
characters may also be rejected in the future; to be safe, locale names
should be restricted to using alphanumeric characters, the underscore,
and the at sign.
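
For anyone naming locale files now, the conservative character set above
could be checked with something like the following. Both functions are
hypothetical helpers written for this example; in particular,
base_name_length only shows what "ignoring everything after the dot"
might look like and does not describe current musl behavior.

```c
#include <string.h>

/* Hypothetical check: non-empty and made up only of ASCII letters,
 * digits, '_' and '@'. Explicit ranges are used so the result does not
 * depend on the current locale. */
int locale_name_is_conservative(const char *name)
{
    if (name[0] == '\0')
        return 0;
    for (const char *p = name; *p; p++) {
        char c = *p;
        int ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                 (c >= '0' && c <= '9') || c == '_' || c == '@';
        if (!ok)
            return 0;
    }
    return 1;
}

/* Hypothetical helper: length of the name up to, but not including,
 * the first '.', or the full length if there is no dot. */
size_t base_name_length(const char *name)
{
    return strcspn(name, ".");
}
```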

In programs running with elevated privileges (setuid/setgid/etc.), the
MUSL_LOCPATH environment variable is not honored. At present, this
means there is no way to use the locale functionality with such
programs. This deficiency will be addressed in a future release.

Unrecognized locale names:

Any locale name that is not usable for any reason (file not found,
name rejected, error loading, etc.) is treated as an alias for the
built-in C.UTF-8 locale. The motivation for this behavior is to avoid
possibly breaking UTF-8 support when the application depends on
setlocale success for UTF-8 to work; this may be a bigger issue in the
future if musl adopts an abstract 8-bit C locale.
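
To make the motivation concrete, here is a sketch of the kind of
application pattern in question: the program only trusts multibyte
decoding after setlocale reports success. The helper names are
hypothetical, and the comparison against 0xE9 assumes wchar_t holds
Unicode code points. On libcs that fail setlocale for unknown names,
try_locale would return 0 for such a name; per the description above,
musl would instead fall back to C.UTF-8.

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if the C library accepted `name` for
 * LC_CTYPE, 0 if setlocale reported failure. */
int try_locale(const char *name)
{
    return setlocale(LC_CTYPE, name) != NULL;
}

/* Hypothetical helper: returns 1 if the two-byte UTF-8 sequence for
 * U+00E9 decodes to a single wide character under the current locale,
 * 0 otherwise. */
int decodes_utf8(void)
{
    mbstate_t st;
    wchar_t wc = 0;
    memset(&st, 0, sizeof st);
    size_t r = mbrtowc(&wc, "\xc3\xa9", 2, &st);
    return r == 2 && wc == 0xE9;
}
```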

Locale file format:

A locale file for use by musl is simply a .mo format file like the
ones used by gettext, and can be created with the msgfmt utility from
the GNU gettext package, gettext-tiny, or possibly other versions.
Translations for message strings and LC_TIME strings (day names, month
names, strftime-style date/time format strings) all go in the same
translation file. The format for monetary and collation data will be
specified at a later time, but will be stored in the same type of
file.
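
As a small sanity check when preparing locale files, one can test for
the magic number that the GNU .mo format defines in the first four
bytes (0x950412de, or 0xde120495 when written with the opposite byte
order). The function below is a hypothetical helper, not part of musl,
and this message does not say whether musl accepts both byte orders.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the buffer starts with the GNU .mo
 * magic number in either byte order, 0 otherwise. */
int looks_like_mo(const unsigned char *buf, size_t len)
{
    if (len < 4)
        return 0;
    /* Assemble the first four bytes as a little-endian 32-bit value. */
    uint32_t le = (uint32_t)buf[0] |
                  (uint32_t)buf[1] << 8 |
                  (uint32_t)buf[2] << 16 |
                  (uint32_t)buf[3] << 24;
    return le == 0x950412deu || le == 0xde120495u;
}
```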

Using gettext:

The gettext translation functions are largely compatible with the
documented interfaces in the GNU gettext manual. This does not include
some more recent, undocumented, ill-designed features in GNU gettext
which are used mostly (only?) by some GNU packages so far. The main
deviation from GNU gettext in the outward behavior is that the
LANGUAGE environment variable is not honored; that topic is covered in
a separate message to the musl list. Also, there is no default path
for translation files, but this should not affect applications since
the documented usage is that calling bindtextdomain is required.
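
For reference, the documented usage pattern looks roughly like this.
The helper name init_translations is made up for the example, and the
domain and directory are whatever the application passes in; they are
not defaults of any kind.

```c
#include <libintl.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: select the locale from the environment, bind
 * `domain` to the translation directory `dir`, and make it the default
 * domain. Returns 1 on success, 0 if either gettext call fails. */
int init_translations(const char *domain, const char *dir)
{
    setlocale(LC_ALL, "");
    /* Explicit binding is required; there is no default path. */
    if (!bindtextdomain(domain, dir))
        return 0;
    if (!textdomain(domain))
        return 0;
    return 1;
}

/* Print a message through gettext; when no translation is found the
 * original string is returned unchanged. */
void greet(void)
{
    puts(gettext("Hello"));
}
```

An application would call init_translations once at startup with its
own domain name and install directory, then wrap user-visible strings
in gettext as in greet().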

