mailing list of musl libc
* Status towards next release (1.1.4)
@ 2014-07-12  5:10 Rich Felker
  2014-07-12  6:02 ` Isaac Dunham
                   ` (3 more replies)
  0 siblings, 4 replies; 18+ messages in thread
From: Rich Felker @ 2014-07-12  5:10 UTC (permalink / raw)
  To: musl

I think we're pretty well on-schedule for the next release. Here's a
summary of progress so far:

- Private futex support, not committed. If we can demonstrate any
  performance benefit, it can be committed, but otherwise I'm inclined
  to throw it out. There's no point in adding complexity with no
  evidence of benefit.

- Locale framework. Right now this is mostly just a framework and does
  nothing useful.

- Byte-based C locale, not committed. As discussed previously, this is
  non-essential for conforming to current standards, so I'm inclined
  to omit it for now. But if there's demand for it we can consider
  adding it.

- Gettext/mo file lookup core. This is not integrated with libc yet,
  but tested and working.

- Openrisc (or1k) port. Stefan Kristiansson's work seems basically
  complete and is in the testing phase now. I'm hoping to merge it
  in the next few days.

There are several things we need to focus on now:

- The Big Bikeshed: where to find locale files? These will be somewhat
  musl-specific (to the extent that no other implementation uses the
  design I have in mind, though it would be easy for others to use
  it), so there's no existing practice to simply adopt. The files are
  not machine-specific (we'll support either endianness .mo file) so
  /usr/share (or other prefix variants) is the natural base location.

- Minor coding tasks for locale. Really, this is minor. The policy of
  where to find the files is a much bigger issue to work out.

- Adding non-stub public gettext API. I'd like this to happen along
  with the locale work since it uses the same core operation, but it
  may turn out that there are various bloated gettext features which
  applications use which we don't want in the core that libc itself uses
  for locale, in which case we'd end up with two implementations.

- What to do with if_nameindex and getifaddrs? This issue has been
  deferred for a couple releases now so I really want to solve it this
  time.

The other items on the roadmap are all secondary and related to ports.
I'll be happy if we can just get or1k into this release, since it's a
nice way to draw some publicity for both projects (musl and openrisc).
But if there's time, I might do the bits refactoring (and other
port-related cleanup) in this release cycle once or1k is committed.

Rich



* Re: Status towards next release (1.1.4)
  2014-07-12  5:10 Status towards next release (1.1.4) Rich Felker
@ 2014-07-12  6:02 ` Isaac Dunham
  2014-07-12 14:26   ` Rich Felker
  2014-07-12  7:24 ` u-igbb
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 18+ messages in thread
From: Isaac Dunham @ 2014-07-12  6:02 UTC (permalink / raw)
  To: musl

On Sat, Jul 12, 2014 at 01:10:35AM -0400, Rich Felker wrote:
> I think we're pretty well on-schedule for the next release. Here's a
> summary of progress so far:
<snip> 
> - Locale framework. Right now this is mostly just a framework and does
>   nothing useful.
> 
> - Byte-based C locale, not committed. As discussed previously, this is
>   non-essential for conforming to current standards, so I'm inclined
>   to omit it for now. But if there's demand for it we can consider
>   adding it.

I'd like to at least test this to see how well it works.
I just discovered that sword built with C++11 regex support dies with
complaints related to the locale:
terminate called after throwing an instance of 'std::runtime_error'
  what():  locale::facet::_S_create_c_locale name not valid

...so I'm wondering whether this will improve compatibility.
(I'm not eager to go hunt down the issue right now; I expect it's
some variant of the usual locale issues.)

> - Gettext/mo file lookup core. This is not integrated with libc yet,
>   but tested and working.
> 
> - Openrisc (or1k) port. Stefan Kristiansson's work seems basically
>   complete and is in the testing phase now. I'm hoping to merge it
>   in the next few days.
> 
> There are several things we need to focus on now:
> 
> - The Big Bikeshed: where to find locale files? These will be somewhat
>   musl-specific (to the extent that no other implementation uses the
>   design I have in mind, though it would be easy for others to use
>   it), so there's no existing practice to simply adopt. The files are
>   not machine-specific (we'll support either endianness .mo file) so
>   /usr/share (or other prefix variants) is the natural base location.

/usr/share/muslnls is awkward, maybe newnls?
I don't care exactly what gets decided, but a couple issues come to mind:
-the name should NOT be .../"locale" or any other name in use on Linux
systems. otherwise parallel installs break.
-it would be nice if the end of the path is at most 6 chars, since
paths have to be stored somewhere...
(Actually, this implies that 4 chars would be ideal:
"/usr/share/" is 11 non-zero bytes, then add 4, then NUL, making 16 bytes,
which shouldn't need any padding. This is, of course, decidedly premature
optimization. ;-) )
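
The byte count above can be checked with a small C sketch. The directory
name "mnls" is purely hypothetical (any 4-character name gives the same
result); it is not a proposal from this thread.

```c
#include <stddef.h>

/* Hypothetical 4-character directory name, used only to check the arithmetic. */
static const char locale_dir[] = "/usr/share/mnls";

/* sizeof counts the terminating NUL: 11 bytes of prefix + 4 of name + 1 NUL. */
size_t locale_dir_bytes(void)
{
    return sizeof locale_dir;
}
```

Calling locale_dir_bytes() yields 16, matching the count given above.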
> 
> Rich

Thanks,
Isaac Dunham



* Re: Status towards next release (1.1.4)
  2014-07-12  5:10 Status towards next release (1.1.4) Rich Felker
  2014-07-12  6:02 ` Isaac Dunham
@ 2014-07-12  7:24 ` u-igbb
  2014-07-12  8:44   ` Laurent Bercot
  2014-07-12 14:55   ` Rich Felker
  2014-07-12 14:41 ` Matias A. Fonzo
  2014-07-12 15:03 ` Rich Felker
  3 siblings, 2 replies; 18+ messages in thread
From: u-igbb @ 2014-07-12  7:24 UTC (permalink / raw)
  To: musl

On Sat, Jul 12, 2014 at 01:10:35AM -0400, Rich Felker wrote:
> - The Big Bikeshed: where to find locale files? These will be somewhat
>   musl-specific (to the extent that no other implementation uses the
>   design I have in mind, though it would be easy for others to use
>   it), so there's no existing practice to simply adopt. The files are
>   not machine-specific (we'll support either endianness .mo file) so
>   /usr/share (or other prefix variants) is the natural base location.

For me it looks like you take a wrong kind of responsibility and try to
make a decision which does not belong to a library developer.

This is an "integrator" decision, the one who knows how the library will
be used and what is the corresponding environment's policy of placing
stuff around in the file system.

In other words, as long as it is configurable, any "default" goes.
You can not know (and imho do _not_ have to pretend to) what is best or
sensible for the actual deployment.

As an "integrator" I am concerned in the following way:

- If locale is mostly static (additions or changes to locale
  can be done at the same time as library recompilations/upgrades)
  then a "default" placement is totally irrelevant, but I must be
  able to choose the actual one at compilation time - I guess this is
  expected and hence a non-issue

With the paranoia-hat on:

- if locale data is supposed to be available from more sources than the
  library upstream (then potentially even with different licenses)
  and/or if it is supposed to change often, then

  I'd badly need a possibility to tell an application at runtime where
  to look for the data (presumably via an environment variable specific
  to musl).

  Hope such kind of locale data is not expected to exist.

Regards,
Rune




* Re: Status towards next release (1.1.4)
  2014-07-12  7:24 ` u-igbb
@ 2014-07-12  8:44   ` Laurent Bercot
  2014-07-12 14:55   ` Rich Felker
  1 sibling, 0 replies; 18+ messages in thread
From: Laurent Bercot @ 2014-07-12  8:44 UTC (permalink / raw)
  To: musl

On 12/07/2014 08:24, u-igbb@aetey.se wrote:
> For me it looks like you take a wrong kind of responsibility and try to
> make a decision which does not belong to a library developer.
>
> This is an "integrator" decision, the one who knows how the library will
> be used and what is the corresponding environment's policy of placing
> stuff around in the file system.
>
> In other words, as long as it is configurable, any "default" goes.
> You can not know (and imho do _not_ have to pretend to) what is best or
> sensible for the actual deployment.
>
> As an "integrator" I am concerned in the following way:
>
> - If locale is mostly static (additions or changes to locale
>    can be done at the same time as library recompilations/upgrades)
>    then a "default" placement is totally irrelevant, but I must be
>    able to choose the actual one at compilation time - I guess this is
>    expected and hence a non-issue

  +1 and QFT.
  Policy should not be included in software, but delegated to the user
(sysadmins and distributors). There should be a reasonable default, but
configurability is a lot more important than the exact default value.

-- 
  Laurent




* Re: Status towards next release (1.1.4)
  2014-07-12  6:02 ` Isaac Dunham
@ 2014-07-12 14:26   ` Rich Felker
  2014-07-12 19:13     ` Isaac Dunham
  0 siblings, 1 reply; 18+ messages in thread
From: Rich Felker @ 2014-07-12 14:26 UTC (permalink / raw)
  To: musl

On Fri, Jul 11, 2014 at 11:02:28PM -0700, Isaac Dunham wrote:
> On Sat, Jul 12, 2014 at 01:10:35AM -0400, Rich Felker wrote:
> > I think we're pretty well on-schedule for the next release. Here's a
> > summary of progress so far:
> <snip> 
> > - Locale framework. Right now this is mostly just a framework and does
> >   nothing useful.
> > 
> > - Byte-based C locale, not committed. As discussed previously, this is
> >   non-essential for conforming to current standards, so I'm inclined
> >   to omit it for now. But if there's demand for it we can consider
> >   adding it.
> 
> I'd like to at least test this to see how well it works.
> I just discovered that sword built with C++11 regex support dies with
> complaints related to the locale:
> terminate called after throwing an instance of 'std::runtime_error'
>   what():  locale::facet::_S_create_c_locale name not valid

What musl version? (1.1.3 or git?) I doubt this has anything to do
with musl's actual locale implementation, which has essentially no
outwardly-visible behavior right now, but we can check.

If you're not using git, see if git fixes it. 1.1.3 and earlier
rejected unknown locale names (anything but C, C.UTF-8, or POSIX).
Now, any name is accepted, and unknown names are all aliases for
C.UTF-8.
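
A quick way to see which of the two behaviours a given libc build has is
to ask setlocale directly. The sketch below uses only the standard
setlocale call; the helper name locale_name_accepted is made up for
illustration. Note that a successful call changes the process locale as
a side effect.

```c
#include <locale.h>
#include <stddef.h>

/*
 * Returns 1 if the C library accepts the locale name, 0 if it rejects it.
 * Side effect: on success the process-wide locale is switched to `name`.
 */
int locale_name_accepted(const char *name)
{
    return setlocale(LC_ALL, name) != NULL;
}
```

On a libc that rejects unknown names, a call such as
locale_name_accepted("xx_YY.UTF-8") would be expected to return 0; with
the behaviour described above for git it would return 1. The names "C"
and "POSIX" should be accepted either way.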

> > - The Big Bikeshed: where to find locale files? These will be somewhat
> >   musl-specific (to the extent that no other implementation uses the
> >   design I have in mind, though it would be easy for others to use
> >   it), so there's no existing practice to simply adopt. The files are
> >   not machine-specific (we'll support either endianness .mo file) so
> >   /usr/share (or other prefix variants) is the natural base location.
> 
> /usr/share/muslnls is awkward, maybe newnls?

FWIW, on glibc it's a mix of /usr/share/locale (messages,
non-machine-specific) and /usr/lib/locale (nasty machine-specific
binary stuff for other locale categories).

> I don't care exactly what gets decided, but a couple issues come to mind:
> -the name should NOT be .../"locale" or any other name in use on Linux
> systems. otherwise parallel installs break.

Agreed. We should not use a pathname with existing precedent for an
incompatible purpose.

> -it would be nice if the end of the path is at most 6 chars, since
> paths have to be stored somewhere...
> (Actually, this implies that 4 chars would be ideal:
> "/usr/share/" is 11 non-zero bytes, then add 4, then NUL, making 16 bytes,
> which shouldn't need any padding. This is, of course, decidedly premature
> optimization. ;-) )

Yes, I think it's premature optimization. I'd rather the name be clean
and reasonable to users than needlessly short. There's no fundamental
reason strings need padding to 16-byte boundaries anyway; if they are
padded as such, it's a toolchain issue and we should try to fix it at
the toolchain level.

Rich



* Re: Status towards next release (1.1.4)
  2014-07-12  5:10 Status towards next release (1.1.4) Rich Felker
  2014-07-12  6:02 ` Isaac Dunham
  2014-07-12  7:24 ` u-igbb
@ 2014-07-12 14:41 ` Matias A. Fonzo
  2014-07-12 14:58   ` Rich Felker
  2014-07-12 15:03 ` Rich Felker
  3 siblings, 1 reply; 18+ messages in thread
From: Matias A. Fonzo @ 2014-07-12 14:41 UTC (permalink / raw)
  To: musl

Hello,
On Sat, July 12, 2014, 2:10 am, Rich Felker wrote:
> [..]
>
> - The Big Bikeshed: where to find locale files? These will be somewhat
> musl-specific (to the extent that no other implementation uses the design I
> have in mind, though it would be easy for others to use it), so there's no
> existing practice to simply adopt. The files are not machine-specific
> (we'll support either endianness .mo file) so
> /usr/share (or other prefix variants) is the natural base location.

Existing locations:
/usr/lib/locale (glibc)
/usr/share/locale - /usr/local/share/locale (most programs)
/usr/share/X11/locale (X11 programs)
/usr/share/nls (Message catalogs for Native language support)

The default for musl can be /usr/local/share/musl/locale






* Re: Status towards next release (1.1.4)
  2014-07-12  7:24 ` u-igbb
  2014-07-12  8:44   ` Laurent Bercot
@ 2014-07-12 14:55   ` Rich Felker
  2014-07-12 16:29     ` u-igbb
  1 sibling, 1 reply; 18+ messages in thread
From: Rich Felker @ 2014-07-12 14:55 UTC (permalink / raw)
  To: musl

On Sat, Jul 12, 2014 at 09:24:09AM +0200, u-igbb@aetey.se wrote:
> On Sat, Jul 12, 2014 at 01:10:35AM -0400, Rich Felker wrote:
> > - The Big Bikeshed: where to find locale files? These will be somewhat
> >   musl-specific (to the extent that no other implementation uses the
> >   design I have in mind, though it would be easy for others to use
> >   it), so there's no existing practice to simply adopt. The files are
> >   not machine-specific (we'll support either endianness .mo file) so
> >   /usr/share (or other prefix variants) is the natural base location.
> 
> For me it looks like you take a wrong kind of responsibility and try to
> make a decision which does not belong to a library developer.
> 
> This is an "integrator" decision, the one who knows how the library will
> be used and what is the corresponding environment's policy of placing
> stuff around in the file system.
> 
> In other words, as long as it is configurable, any "default" goes.
> You can not know (and imho do _not_ have to pretend to) what is best or
> sensible for the actual deployment.

I understand that configuring this matters for your usage case where
you're configuring ALL of the paths where configuration/data/etc. is
read from to isolate each program in its own bubble. However I don't
see any value in configuring this one location when other things (like
the place timezones are searched for) are fixed.

> As an "integrator" I am concerned in the following way:
> 
> - If locale is mostly static (additions or changes to locale
>   can be done at the same time as library recompilations/upgrades)
>   then a "default" placement is totally irrelevant, but I must be
>   able to choose the actual one at compilation time - I guess this is
>   expected and hence a non-issue

No, the intent is that they're produced independently of musl, or at
least independently of my part of the development/maintenance process.
I don't want to be a locale maintainer. BTW, locale definitions are a
much bigger "imposing policy" issue than a standard pathname.

> With the paranoia-hat on:
> 
> - if locale data is supposed to be available from more sources than the
>   library upstream (then potentially even with different licenses)
>   and/or if it is supposed to change often, then
> 
>   I'd badly need a possibility to tell an application at runtime where
>   to look for the data (presumably via an environment variable specific
>   to musl).
> 
>   Hope such kind of locale data is not expected to exist.

Runtime configuration of the path is a big problem for many usage
cases, possibly even if it's blocked for suid. The recent glibc
CVE-2014-0475 has me concerned and wanting to avoid any dubious
practices with how locales are searched out. This is potentially a
much bigger issue than timezones, because for timezones, invalid data
probably results in compromises no worse than a crash or information
leak. With locales, invalid data can result in full code execution
(via injection of %n into format strings, and possibly other ways).
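
To illustrate the %n concern, here is a minimal sketch of the safer
pattern: the format string stays a fixed literal and the translated text
is only ever passed as a %s argument. The function name format_error and
the message layout are invented for this example and are not musl code.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * The format string is a fixed literal. The (possibly untrusted)
 * translated text is passed only as a %s argument, so any '%' sequence
 * it contains is copied to the output instead of being interpreted.
 */
int format_error(char *buf, size_t size, const char *translated, int code)
{
    return snprintf(buf, size, "%s (error %d)", translated, code);
}
```

With this pattern a hostile "translation" such as "%n%n" simply shows up
verbatim in the message. Passing the translated string as the format
argument itself is the case that allows injection.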

On the other hand, runtime configuration is something I'd really like
to have, so that users can use locales that are not installed by the
system administrator and develop/test/debug locales without
installing. But this is a sufficiently big opening for environmental
state to alter the behavior of the program that I'm very concerned
about the safety of it and frustrated by the whole process...

Rich



* Re: Status towards next release (1.1.4)
  2014-07-12 14:41 ` Matias A. Fonzo
@ 2014-07-12 14:58   ` Rich Felker
  0 siblings, 0 replies; 18+ messages in thread
From: Rich Felker @ 2014-07-12 14:58 UTC (permalink / raw)
  To: musl

On Sat, Jul 12, 2014 at 11:41:43AM -0300, Matias A. Fonzo wrote:
> Hello,
> On Sat, July 12, 2014, 2:10 am, Rich Felker wrote:
> > [..]
> >
> > - The Big Bikeshed: where to find locale files? These will be somewhat
> > musl-specific (to the extent that no other implementation uses the design I
> > have in mind, though it would be easy for others to use it), so there's no
> > existing practice to simply adopt. The files are not machine-specific
> > (we'll support either endianness .mo file) so
> > /usr/share (or other prefix variants) is the natural base location.
> 
> Existing locations:
> /usr/lib/locale (glibc)
> /usr/share/locale - /usr/local/share/locale (most programs)
> /usr/share/X11/locale (X11 programs)
> /usr/share/nls (Message catalogs for Native language support)
> 
> The default for musl can be /usr/local/share/musl/locale

The location definitely shouldn't be something under /usr/local unless
it's just based on the prefix. That would only make sense for installs
in /usr/local/musl as a non-default libc for use with the wrapper, but
not for musl-based systems or deployment in various mixed
environments.

Rich



* Re: Status towards next release (1.1.4)
  2014-07-12  5:10 Status towards next release (1.1.4) Rich Felker
                   ` (2 preceding siblings ...)
  2014-07-12 14:41 ` Matias A. Fonzo
@ 2014-07-12 15:03 ` Rich Felker
  2014-07-12 16:41   ` Locale path and security [Was: Status towards next release (1.1.4)] Rich Felker
  2014-07-12 17:04   ` Status towards next release (1.1.4) u-igbb
  3 siblings, 2 replies; 18+ messages in thread
From: Rich Felker @ 2014-07-12 15:03 UTC (permalink / raw)
  To: musl

On Sat, Jul 12, 2014 at 01:10:35AM -0400, Rich Felker wrote:
> - The Big Bikeshed: where to find locale files? These will be somewhat
>   musl-specific (to the extent that no other implementation uses the
>   design I have in mind, though it would be easy for others to use
>   it), so there's no existing practice to simply adopt. The files are
>   not machine-specific (we'll support either endianness .mo file) so
>   /usr/share (or other prefix variants) is the natural base location.

One idea for this: just don't accept anything except the built-in
locales (C, C.UTF-8, POSIX) and absolute pathnames. For suid programs,
the latter could be rejected completely (the safest and probably what
we should do) or restricted to a set of reasonable paths where each
path component is checked for permissions.
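
The first idea could look roughly like the sketch below. It models the
stricter variant (absolute pathnames rejected outright for suid
programs). The function name and the `secure` parameter, which stands in
for whatever suid/AT_SECURE flag libc keeps internally, are assumptions
made for illustration.

```c
#include <string.h>

/*
 * Hypothetical policy check: returns 1 if `name` may be used as a
 * locale, 0 otherwise. `secure` is nonzero for suid-style programs.
 */
int locale_name_allowed(const char *name, int secure)
{
    if (!strcmp(name, "C") || !strcmp(name, "C.UTF-8") || !strcmp(name, "POSIX"))
        return 1;   /* built-in locales are always acceptable */
    if (name[0] != '/')
        return 0;   /* no search path: non-absolute names are rejected */
    return !secure; /* absolute pathnames only when not privileged */
}
```

The other variant mentioned above, checking each path component for
permissions, would replace the final line with a walk over the path and
is not shown here.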

Another idea is pulling the search path from /etc/musl-locale.conf or
similar. Obviously this is not the most friendly to Rune's usage case,
but it would just be one more hard-coded path to override in the
custom build, or if absolute pathnames were also accepted for locales
the support for /etc/musl-locale.conf could just be stripped out.

Rich



* Re: Status towards next release (1.1.4)
  2014-07-12 14:55   ` Rich Felker
@ 2014-07-12 16:29     ` u-igbb
  2014-07-12 17:00       ` Rich Felker
  0 siblings, 1 reply; 18+ messages in thread
From: u-igbb @ 2014-07-12 16:29 UTC (permalink / raw)
  To: musl

On Sat, Jul 12, 2014 at 10:55:25AM -0400, Rich Felker wrote:
> > You can not know (and imho do _not_ have to pretend to) what is best or
> > sensible for the actual deployment.
> 
> I understand that configuring this matters for your usage case where
> you're configuring ALL of the paths where configuration/data/etc. is
> read from to isolate each program in its own bubble. However I don't
> see any value in configuring this one location when other things (like
> the place timezones are searched for) are fixed.

Exactly! :) It is hardly tenable to hardcode the path to any database,
including the timezone one. Fortunately TZ syntax allows escaping the trap
(so actually per design it is not strictly enforced how the user may
supply the timezone information, at least according to the gnu description).

> > - If locale is mostly static (additions or changes to locale
> >   can be done at the same time as library recompilations/upgrades)
> >   then a "default" placement is totally irrelevant, but I must be
> >   able to choose the actual one at compilation time - I guess this is
> >   expected and hence a non-issue
> 
> No, the intent is that they're produced independently of musl, or at
> least independently of my part of the development/maintenance process.

> I don't want to be a locale maintainer. BTW, locale definitions are a
> much bigger "imposing policy" issue than a standard pathname.

Then the library should not postulate nor hardcode the location, given
that the expected maintenance routines for the data are unclear.

> Runtime configuration of the path is a big problem for many usage
> cases, possibly even if it's blocked for suid. The recent glibc
> CVE-2014-0475 has me concerned and wanting to avoid any dubious
> practices with how locales are searched out. This is potentially a

I understand your concern about security but disallowing something at
the library level just to prevent a certain possible mode of failure of
a third party's flawed security model? This feels almost like designing
flats without windows [no pun] to prevent children from falling out.

> much bigger issue than timezones, because for timezones, invalid data
> probably results in compromises no worse than a crash or information
> leak. With locales, invalid data can result in full code execution
> (via injection of %n into format strings, and possibly other ways).

Allowing a user to set environment variables is giving her freedom to
control her applications iow a policy question. The low level library has
no proper knowledge to make policy decisions.

Again, I feel you assume more responsibility for musl than is due.

The policy enforcer (ssh) would fare perfectly fine - just don't list
the hypothetical MUSL_LOCALE_DIR in the variables allowed to be set,
this will end the issue. Of course the Big Brother has to properly set
the variable if locales are supposed to be available - or compile in
the path to where he stores the "approved" locale definitions. Not worse
than this and safe - unless the policy maker wants "allow all variables
except a list", which is inherently unsafe.

So this doesn't look like a security concern for musl.

> On the other hand, runtime configuration is something I'd really like
> to have, so that users can use locales that are not installed by the
> system administrator and develop/test/debug locales without
> installing. But this is a sufficiently big opening for environmental
> state to alter the behavior of the program that I'm very concerned
> about the safety of it and frustrated by the whole process...

:(

If you strongly feel for providing hardwired and immutable behaviour then
let the run-time envvar-driven choices be compile-time conditionals. This
also will save several bytes for the control freaks :) while still allowing
flexible deployment.

Most of the traditional paranoia about the code being misled by the user
comes from the role of setuid in *nix which implies hardcoded references
"as much as possible". In a setuid-free milieu (which we always have in
a distributed/global context) this is a pure nuisance.

By the way, it is easy to wrap binaries, resetting/protecting/checking
variables accordingly to the actual purpose.

This means the extra protection in a more complete form is available
when needed, without putting it into the library and sacrificing
functionality.
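
As a rough sketch of the wrapper idea: a small launcher could scrub the
variable before handing control to the real binary. MUSL_LOCALE_DIR is
the hypothetical variable name used earlier in this message, and
scrub_locale_env is a name made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>

/*
 * Remove the hypothetical MUSL_LOCALE_DIR variable from the environment.
 * Returns 0 on success, -1 on failure (as unsetenv does).
 */
int scrub_locale_env(void)
{
    return unsetenv("MUSL_LOCALE_DIR");
}
```

A wrapper would call this first and then start the wrapped program with
one of the exec functions, so the program never sees a user-supplied
value.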

Thanks for listening Rich, the decisions are yours anyway.

Regards,
Rune




* Locale path and security [Was: Status towards next release (1.1.4)]
  2014-07-12 15:03 ` Rich Felker
@ 2014-07-12 16:41   ` Rich Felker
  2014-07-12 17:04   ` Status towards next release (1.1.4) u-igbb
  1 sibling, 0 replies; 18+ messages in thread
From: Rich Felker @ 2014-07-12 16:41 UTC (permalink / raw)
  To: musl

On Sat, Jul 12, 2014 at 11:03:33AM -0400, Rich Felker wrote:
> On Sat, Jul 12, 2014 at 01:10:35AM -0400, Rich Felker wrote:
> > - The Big Bikeshed: where to find locale files? These will be somewhat
> >   musl-specific (to the extent that no other implementation uses the
> >   design I have in mind, though it would be easy for others to use
> >   it), so there's no existing practice to simply adopt. The files are
> >   not machine-specific (we'll support either endianness .mo file) so
> >   /usr/share (or other prefix variants) is the natural base location.
> 
> One idea for this: just don't accept anything except the built-in
> locales (C, C.UTF-8, POSIX) and absolute pathnames. For suid programs,
> the latter could be rejected completely (the safest and probably what
> we should do) or restricted to a set of reasonable paths where each
> path component is checked for permissions.
> 
> Another idea is pulling the search path from /etc/musl-locale.conf or
> similar. Obviously this is not the most friendly to Rune's usage case,
> but it would just be one more hard-coded path to override in the
> custom build, or if absolute pathnames were also accepted for locales
> the support for /etc/musl-locale.conf could just be stripped out.

From a usability standpoint, I think it's desirable to have some sort
of search path, even if absolute pathnames are also supported.
Consider mixed environments where the user has something like
LANG=fr_FR.UTF-8 for glibc programs; assuming the corresponding locale
is also installed for musl, the reasonable user expectation is that
musl-linked programs also use French messages, time formatting,
collation, etc.

glibc honors the non-POSIX environment variable LOCPATH to control its
search for locales. While this is something of a consideration for
applications trying to avoid unwanted environment-influenced behavior
for security purposes or otherwise, it's not a big conformance problem
since setlocale already depends on the environment anyway (and thus
can't be called safely in parallel with modifications to the
environment, per POSIX). We could honor the same variable and just
append "/musl/" to the value (this would be nice from the standpoint
of not introducing another variable apps have to be aware of when they
want to filter it) but that's somewhat ugly since the glibc one is
intended to point to a "lib" (arch-specific) dir whereas musl's is
portable data. Using a separate variable might be preferable if we
even want to support an environment variable as a way to configure
this at runtime -- and I think doing so may be valuable since users
may want locales that are not installed by the system administrator.
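
For illustration only, the "append /musl/ to LOCPATH" option might be
sketched as follows. The helper name is hypothetical, the value is
treated as a single directory for brevity, and names containing '/' are
rejected as a precaution so that a locale name cannot step outside the
directory.

```c
#include <stdio.h>
#include <stddef.h>
#include <string.h>

/*
 * Build "<locpath>/musl/<name>" into buf.
 * Returns 0 on success, -1 if the name contains '/' or the result
 * does not fit in `size` bytes.
 */
int musl_locale_path(char *buf, size_t size, const char *locpath, const char *name)
{
    if (strchr(name, '/'))
        return -1;
    int n = snprintf(buf, size, "%s/musl/%s", locpath, name);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

A caller would pass the value of getenv("LOCPATH"), or of a separate
musl-specific variable if that route were chosen instead.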

In light of glibc CVE-2014-0475, which I'm not sure is even really a
proper "vulnerability" but rather just a complication of the standard
locale semantics that makes it hard to write secure programs without
filtering out locale vars from untrusted sources, a major goal I'd
like to pursue is minimizing the potential security impact of an
untrusted/malicious locale file. Obviously suid/AT_SECURE programs
should not even honor locale files except possibly from a hard-coded
trusted source, but ideally even programs without formally elevated
privileges -- think gitolite type setups with ssh authorized_keys --
would not yield code execution or information leak when fed a
malicious locale file.

Here are the security aspects I have in mind:

- For libc itself (obviously we can't control application use of
  gettext), only translate literal strings, never printf/scanf format
  strings. For dlerror this requires some refactoring of the message
  strings but otherwise I think this property is easy to satisfy. The
  purpose of this property is to prevent format string injection via
  locales and limit the scope of bad messages to literal copying of
  those messages into the program output.

- Avoid loading as a locale any file which was not intended to be a
  locale. This entails checking the magic number, sanity-checking the
  headers, and also doing a single gettext-type string lookup for a
  key string associated with our locale file format (a specialization
  of general mo files). If the key is not found, the file can be
  rejected; it's probably a mo file but not one that satisfies the
  needs of libc for the requested locale category. The purpose of this
  check is to prevent disclosure of contents of files that were not
  intended to be locales.

- During gettext lookup (binary search), validate all offsets as lying
  within the address range of the mapping. The purpose of this check
  is to preclude information disclosure due to reading strings from
  locations outside the mapping.
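
The magic-number check and the offset validation could be sketched as
below, assuming the standard GNU .mo layout (magic 0x950412de, string
count at byte 8, original and translation table offsets at bytes 12 and
16, each table entry a length/offset pair). This is an illustrative
sketch with made-up names, not the code that is in or planned for musl.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Read a 32-bit word at byte offset `off`, byte-swapping if `swap` is set. */
static uint32_t rd32(const unsigned char *p, size_t off, int swap)
{
    uint32_t v;
    memcpy(&v, p + off, 4);
    if (swap)
        v = (v >> 24) | ((v >> 8) & 0xff00) | ((v << 8) & 0xff0000) | (v << 24);
    return v;
}

/* Return string `i` of the table at `tab`, or NULL if it is out of bounds. */
static const char *mo_entry(const unsigned char *p, size_t size,
                            uint32_t tab, uint32_t i, int swap)
{
    uint32_t len = rd32(p, tab + 8 * (size_t)i, swap);
    uint32_t off = rd32(p, tab + 8 * (size_t)i + 4, swap);
    /* the string and its terminating NUL must lie inside the mapping */
    if (off >= size || len >= size - off || p[(size_t)off + len] != 0)
        return NULL;
    return (const char *)p + off;
}

/* Look up `key` in a .mo image of `size` bytes; NULL if absent or invalid. */
const char *mo_lookup(const void *map, size_t size, const char *key)
{
    const unsigned char *p = map;
    int swap;
    if (size < 28)
        return NULL;
    uint32_t magic = rd32(p, 0, 0);
    if (magic == 0x950412de) swap = 0;        /* same endianness as host */
    else if (magic == 0xde120495) swap = 1;   /* opposite endianness */
    else return NULL;                         /* not a .mo file */
    uint32_t n = rd32(p, 8, swap);
    uint32_t o = rd32(p, 12, swap), t = rd32(p, 16, swap);
    /* both tables (8 bytes per entry) must fit entirely in the mapping */
    if (n > size / 8 || o > size - 8 * (size_t)n || t > size - 8 * (size_t)n)
        return NULL;
    uint32_t lo = 0, hi = n;
    while (lo < hi) {                         /* binary search on sorted keys */
        uint32_t mid = lo + (hi - lo) / 2;
        const char *orig = mo_entry(p, size, o, mid, swap);
        if (!orig)
            return NULL;
        int c = strcmp(key, orig);
        if (c == 0)
            return mo_entry(p, size, t, mid, swap);
        if (c < 0) hi = mid; else lo = mid + 1;
    }
    return NULL;
}
```

The same function could serve the second check in the list: after
mapping a candidate file, look up a key string reserved for the locale
format and reject the file if the lookup returns NULL.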

Obviously as long as mmap is used, there is a possibility of DoS via
file truncation and SIGBUS. I don't think it's worth trying to work
around this since the scope is limited to crashing your own programs
(or allowing someone else to crash them if you use a locale file
writable by someone else). As previously discussed for zoneinfo, one
option would be to malloc, read, and validate (rather than mmap), but
IMO this is cost-prohibitive.

Rich



* Re: Status towards next release (1.1.4)
  2014-07-12 16:29     ` u-igbb
@ 2014-07-12 17:00       ` Rich Felker
  2014-07-12 17:15         ` u-igbb
                           ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Rich Felker @ 2014-07-12 17:00 UTC (permalink / raw)
  To: musl

On Sat, Jul 12, 2014 at 06:29:44PM +0200, u-igbb@aetey.se wrote:
> > Runtime configuration of the path is a big problem for many usage
> > cases, possibly even if it's blocked for suid. The recent glibc
> > CVE-2014-0475 has me concerned and wanting to avoid any dubious
> > practices with how locales are searched out. This is potentially a
> 
> I understand your concern about security but disallowing something at
> the library level just to prevent a certain possible mode of failure of
> a third party's flawed security model? This feels almost like designing
> flats without windows [no pun] to prevent children from falling out.
> 
> > much bigger issue than timezones, because for timezones, invalid data
> > probably results in compromises no worse than a crash or information
> > leak. With locales, invalid data can result in full code execution
> > (via injection of %n into format strings, and possibly other ways).
> 
> Allowing a user to set environment variables is giving her freedom to
> control her applications iow a policy question. The low level library has
> no proper knowledge to make policy decisions.
> 
> Again, I feel you assume more responsibility for musl than is due.

I partly agree with you here, and that's why I've raised a question on
oss-security as to whether CVE-2014-0475 was even a valid
vulnerability rather than just an ordinary non-security bug.

However, format string vulnerabilities are also a sufficiently serious
issue that extra precautions need to be taken to avoid introducing
them in situations where it might be at all non-obvious that they
could arise. This is why (see my other email in the thread spun off
this one) I'm working on a design that avoids the format string issue
entirely.
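
To make the %n concern concrete, here is a rough sketch of one possible precaution: scanning a translated string for a %n conversion before it is ever used as a format. This is only an illustration under simplifying assumptions (the set of flag and length characters is approximate, and fmt_has_n is a hypothetical name); it is not the design referred to above.

```c
#include <string.h>

/* Return 1 if the printf-style format s contains a %n conversion. */
int fmt_has_n(const char *s)
{
    for (; *s; s++) {
        if (*s != '%') continue;
        s++;
        if (*s == '%') continue;             /* literal percent sign */
        /* Skip flags, width, precision, positional and length parts. */
        while (*s && strchr("-+ #0'123456789*$.hlLqjzt", *s)) s++;
        if (*s == 'n') return 1;
        if (!*s) break;                      /* format ended after '%' */
    }
    return 0;
}
```

A loader could refuse any translation for which fmt_has_n returns 1, but that only covers this one injection route.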

I think we'll be able to work something out where locale path is
configurable locally (per-process), or at least where absolute paths
are allowed. Of course in suid processes both need to be forbidden;
until we can be sure of what's safe, it might be necessary just to
forbid all non-builtin locales for suid (libc.secure) programs.
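
The conservative policy in the last sentence could be sketched as follows. The function name and its parameters are hypothetical; the flag stands in for whatever the library uses to detect a suid process.

```c
#include <stddef.h>

/* secure: nonzero when the process runs with elevated privileges.
 * env_value: value of the (hypothetical) path variable, or NULL.
 * Returns the search path to use, or NULL for builtin locales only. */
const char *locale_search_path(int secure, const char *env_value)
{
    if (secure) return NULL;                 /* never trust the environment */
    if (!env_value || !*env_value) return NULL;
    return env_value;
}
```

On Linux, an application-level equivalent of the flag could be obtained with getauxval(AT_SECURE) from <sys/auxv.h>.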

Rich



* Re: Status towards next release (1.1.4)
  2014-07-12 15:03 ` Rich Felker
  2014-07-12 16:41   ` Locale path and security [Was: Status towards next release (1.1.4)] Rich Felker
@ 2014-07-12 17:04   ` u-igbb
  1 sibling, 0 replies; 18+ messages in thread
From: u-igbb @ 2014-07-12 17:04 UTC (permalink / raw)
  To: musl

On Sat, Jul 12, 2014 at 11:03:33AM -0400, Rich Felker wrote:
> Another idea is pulling the search path from /etc/musl-locale.conf or
> similar. Obviously this is not the most friendly to Rune's usage case,

Thanks for the thought :)

Such a file would enforce the configuration to be strictly "one per computer".
No differences between different users on the computer would be allowed.

Hardcoding a reference to a _globally_ placed file like this (you know,
we have no local places to count on - our programs work without
bothering the local admin of an unknown distro) would be disastrous.

(Any change in the file or in the data in the paths given there would
instantly affect a potentially infinite set of users. And of course
different users have different needs, one size does not fit all.)

> but it would just be one more hard-coded path to override in the
> custom build, or if absolute pathnames were also accepted for locales
> the support for /etc/musl-locale.conf could just be stripped out.

Absolute locale names can not replace using short (standard) locale
names with adjustable (not necessarily standard) databases.

Please leave a possibility to specify the directory containing the
locale definitions (or a path to search if you prefer) at run time,
per application instance - I am not aware of anything adequate besides
a dedicated environment variable.

Rune




* Re: Status towards next release (1.1.4)
  2014-07-12 17:00       ` Rich Felker
@ 2014-07-12 17:15         ` u-igbb
  2014-07-13  8:46         ` Weldon Goree
  2014-07-14 17:55         ` Rich Felker
  2 siblings, 0 replies; 18+ messages in thread
From: u-igbb @ 2014-07-12 17:15 UTC (permalink / raw)
  To: musl

On Sat, Jul 12, 2014 at 01:00:08PM -0400, Rich Felker wrote:
> Of course in suid processes both need to be forbidden;
> until we can be sure of what's safe, it might be necessary just to
> forbid all non-builtin locales for suid (libc.secure) programs.

+1

Rune




* Re: Status towards next release (1.1.4)
  2014-07-12 14:26   ` Rich Felker
@ 2014-07-12 19:13     ` Isaac Dunham
  0 siblings, 0 replies; 18+ messages in thread
From: Isaac Dunham @ 2014-07-12 19:13 UTC (permalink / raw)
  To: musl

On Sat, Jul 12, 2014 at 10:26:06AM -0400, Rich Felker wrote:
> On Fri, Jul 11, 2014 at 11:02:28PM -0700, Isaac Dunham wrote:
> > I'd like to at least test this to see how well it works.
> > I just discovered that sword built with C++11 regex support dies with
> > complaints related to the locale:
> > terminate called after throwing an instance of 'std::runtime_error'
> >   what():  locale::facet::_S_create_c_locale name not valid
> 
> What musl version? (1.1.3 or git?) I doubt this has anything to do
> with musl's actual locale implementation, which has essentially no
> outwardly-visible behavior right now, but we can check.
> 
> If you're not using git, see if git fixes it. 1.1.3 and earlier
> rejected unknown locale names (anything but C, C.UTF-8, or POSIX).
> Now, any name is accepted, and unknown names are all aliases for
> C.UTF-8.

I was using Alpine's package (1.1.3 and cherry-picked fixes).
But after running git pull; ./configure; make; the new libc.so does not
fix this problem (tried with both LANG and LC_ALL set to each of C,
C.UTF-8, and POSIX).

Also, this error happens with mongodb on glibc systems where
localization isn't properly set up, so the error happens somewhere in the C++
toolchain/library stack (libstdc++ or perhaps icu?).

Thanks,
Isaac Dunham



* Re: Status towards next release (1.1.4)
  2014-07-12 17:00       ` Rich Felker
  2014-07-12 17:15         ` u-igbb
@ 2014-07-13  8:46         ` Weldon Goree
  2014-07-14  3:48           ` Rich Felker
  2014-07-14 17:55         ` Rich Felker
  2 siblings, 1 reply; 18+ messages in thread
From: Weldon Goree @ 2014-07-13  8:46 UTC (permalink / raw)
  To: musl

Just because I figure someone should propose the most brute possible
strategy: what about storing the .mo data in the library itself? Port
the built-ins to the format, and you have a single code path for locale
access, and it doesn't involve persistent storage. If I'm understanding
your idea right and you're talking about the equivalent of
SYS_LC_MESSAGES and parts of LC_TIME and LC_COLLATE, this isn't nearly
as bloated as it sounds at first (particularly if one is putting, say, 4
locales in a given build rather than 446).

Now, obviously maintainers wouldn't like the choice of either 1 bloated
binary or 446 non-bloated binaries (or God forbid the Cartesian product
of all the possible locale combinations), and this kind of violates the
basic idea of locale that you shouldn't need to recompile software to
get it to speak French, but I just wanted to throw that idea out there.

Weldon



* Re: Status towards next release (1.1.4)
  2014-07-13  8:46         ` Weldon Goree
@ 2014-07-14  3:48           ` Rich Felker
  0 siblings, 0 replies; 18+ messages in thread
From: Rich Felker @ 2014-07-14  3:48 UTC (permalink / raw)
  To: musl

On Sun, Jul 13, 2014 at 02:16:30PM +0530, Weldon Goree wrote:
> Just because I figure someone should propose the most brute possible
> strategy: what about storing the .mo data in the library itself? Port
> the built-ins to the format, and you have a single code path for locale
> access, and it doesn't involve persistent storage. If I'm understanding
> your idea right and you're talking about the equivalent of
> SYS_LC_MESSAGES and parts of LC_TIME and LC_COLLATE, this isn't nearly
> as bloated as it sounds at first (particularly if one is putting, say, 4
> locales in a given build rather than 446).
> 
> Now, obviously maintainers wouldn't like the choice of either 1 bloated
> binary or 446 non-bloated binaries (or God forbid the Cartesian product
> of all the possible locale combinations), and this kind of violates the
> basic idea of locale that you shouldn't need to recompile software to
> get it to speak French, but I just wanted to throw that idea out there.

Indeed, this idea violates that and many other principles:

- That support for Unicode should be cheap (your idea makes
  setlocale(), which any portable program supporting non-ASCII text
  needs to call, pull in a huge part of the library)

- That the person compiling the software generally has no idea what
  languages the user will care about.

- That while character encodings and character identity are
  essentially finished, settled matters that won't change, language
  and culture are fluid. A program with locale data hard-linked into
  it is basically guaranteed not only to be incomplete in the future,
  but outright WRONG in the future. The best analogy I can think of
  would be hard-coding timezones into the binary.

And probably many others. So I don't think this idea is viable.

Rich



* Re: Status towards next release (1.1.4)
  2014-07-12 17:00       ` Rich Felker
  2014-07-12 17:15         ` u-igbb
  2014-07-13  8:46         ` Weldon Goree
@ 2014-07-14 17:55         ` Rich Felker
  2 siblings, 0 replies; 18+ messages in thread
From: Rich Felker @ 2014-07-14 17:55 UTC (permalink / raw)
  To: musl

On Sat, Jul 12, 2014 at 01:00:08PM -0400, Rich Felker wrote:
> > Allowing a user to set environment variables is giving her freedom to
> > control her applications iow a policy question. The low level library has
> > no proper knowledge to make policy decisions.
> > 
> > Again, I feel you assume more responsibility for musl than is due.
> 
> I partly agree with you here, and that's why I've raised a question on
> oss-security as to whether CVE-2014-0475 was even a valid
> vulnerability rather than just an ordinary non-security bug.

See the answer from glibc's side here:

http://www.openwall.com/lists/oss-security/2014/07/14/3

They consider absolute pathnames or directory traversal outside of the
locale base to be a vulnerability, but allow the base to be overridden
via LOCPATH which also comes from the environment. To me this seems a
bit contradictory, but I _think_ the idea is that they see it as
important to accept and trust LC_* from the user even when the source
of these vars is a different privilege domain, so that a properly
localized environment can be provided to the user. (Presumably,
LOCPATH and other arbitrary non-whitelisted env vars would not be
accepted in such situations, and suid programs would not honor
LOCPATH.)

If we want to follow a similar approach, I think we should at least
consider using the same var (LOCPATH) and having the musl locale data
reside in a directory under that base, since this would avoid adding
new vars that users need to be aware of that could affect the behavior
and safety (but hopefully, we've covered all the safety issues) of
programs.

By the way, as far as absolute pathnames go, we're under no obligation
from POSIX to support them, since we do not support the POSIX
localedef system (leading / means the LC_* refers to a locale
definition built by the localedef utility). If we do decide we want to
support them, on the other hand, we should use a different syntax
so as not to overlap with the form for POSIX localedef (which we don't
support).
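
As a rough illustration of keeping lookups inside the locale base, the sketch below rejects absolute names and anything that could traverse directories before joining the name to a base directory. The function names, the 255-byte cap and the idea of a single base directory are assumptions made for the example, not a settled design.

```c
#include <stdio.h>
#include <string.h>

/* Accept only a plain file name: non-empty, no '/', no leading '.'. */
int locale_name_ok(const char *name)
{
    if (!name || !*name || name[0] == '.') return 0;
    if (strlen(name) > 255) return 0;        /* hypothetical length cap */
    return strchr(name, '/') == NULL;
}

/* Join base and name into buf. Returns 0 on success, -1 if the name
 * is rejected or the result does not fit. */
int locale_path(char *buf, size_t len, const char *base, const char *name)
{
    int n;
    if (!locale_name_ok(name)) return -1;
    n = snprintf(buf, len, "%s/%s", base, name);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Because a leading slash is rejected outright, a name such as /etc/passwd never reaches the filesystem, which also sidesteps the overlap with the POSIX localedef form.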

Rich



end of thread, other threads:[~2014-07-14 17:55 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-12  5:10 Status towards next release (1.1.4) Rich Felker
2014-07-12  6:02 ` Isaac Dunham
2014-07-12 14:26   ` Rich Felker
2014-07-12 19:13     ` Isaac Dunham
2014-07-12  7:24 ` u-igbb
2014-07-12  8:44   ` Laurent Bercot
2014-07-12 14:55   ` Rich Felker
2014-07-12 16:29     ` u-igbb
2014-07-12 17:00       ` Rich Felker
2014-07-12 17:15         ` u-igbb
2014-07-13  8:46         ` Weldon Goree
2014-07-14  3:48           ` Rich Felker
2014-07-14 17:55         ` Rich Felker
2014-07-12 14:41 ` Matias A. Fonzo
2014-07-12 14:58   ` Rich Felker
2014-07-12 15:03 ` Rich Felker
2014-07-12 16:41   ` Locale path and security [Was: Status towards next release (1.1.4)] Rich Felker
2014-07-12 17:04   ` Status towards next release (1.1.4) u-igbb
