[-- Attachment #1: Type: text/plain, Size: 3683 bytes --] Hello, This is a follow-up to a discussion on the IRC channel regarding musl's tzset() implementation. I did some tests of the tzset() code. The code basically provides parsers for two kinds of input formats: 1. TZ env values (e.g. in the TZ POSIX format [1]) 2. Zoneinfo files as provided by the IANA [2] As part of my performed tests I found various spatial memory safety violations in these two parsers. Both of them parse input through a pointer which is continuously advanced using pointer arithmetic and then dereferenced to access individual fields of the input formats. Unfortunately, the parsing code is largely lacking boundary checks to ensure that the pointer is still in bounds when dereferenced. As an example, consider the attached zoneinfo file. This file will cause musl to perform an out-of-bounds memory read which likely results in a segmentation fault on most systems: $ TZ=./zonefile-musl-crash.bin busybox date Segmentation fault (core dumped) This particular zoneinfo file causes a segmentation fault due to the calculation of the types pointer: types = index + zi_read32(trans-12); (gdb) p zi_read32(trans-12) $1 = 4286962800 The resulting types pointer value is outside the bounds of the mapped zoneinfo file (which is only 2048 bytes large). The value is also dereferenced later as part of the following code: for (p=types; p<abbrevs; p+=6) { if (!p[4] && !__tzname[0]) { // … } // … } Accessing p[4] will likely cause a segmentation fault. The problem here is simply that the entire parsing code for the zonefile does not perform any boundary checks at all. The TZ parsing code has similar issues. For example, consider a TZ value such as `TZ=FOO-,M`. The musl implementation interprets this as a TZ value in POSIX format. As such, it expects 3 integers after the 'M' character (Mm.n.d). However, the getrule code does not check whether any characters are actually present after the 'M' character: if (r=='M') { ++*p; rule[1] = getint(p); ++*p; rule[2] = getint(p); ++*p; rule[3] = getint(p); } As such, this code reads beyond the bounds of the TZ environment variable. This will likely not cause a crash but still illustrates the lack of bounds checks in the parser. These are just two examples, there are actually more spatial memory safety violations in that code. I have not checked whether any of these spatial memory safety violations could be exploited. From the aforementioned discussion on IRC, the underlying assumption seems to be that both the TZ value as well as any parsed zonefiles are always well-formated. Keep in mind that paths to arbitrary zonefiles can be passed using the `TZ=:${file}` syntax. For this reason, I don't think that this a good assumption to make. For example, calendar application such as calcurse [3] set $TZ to values extracted from .ical files which may be attacker-controlled [4]. Of course it would be desirable for applications to validate this data but they don't always do so. Even if they do, the validation code may be buggy and not be 100% aligned with the inherent assumptions musl's tzset() implementation makes about the format of these values. As such, I believe that musl should never perform, potentially exploitable, spatial memory safety violations on arbitrary TZ and zoneinfo file input values. Greetings, Sören [1]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03 [2]: https://www.iana.org/time-zones [3]: https://calcurse.org/ [4]: https://github.com/lfos/calcurse/blob/2b766040a9a5684aa71757a0ad96fb37b5333e09/src/utils.c#L437 [-- Attachment #2: zonefile-musl-crash.bin --] [-- Type: application/octet-stream, Size: 2048 bytes --]
Hi all, I don't see any security issues here, only QoI issues. The user setting TZ is also the one getting the crashes. The assumption is less that the input is always valid, but more that if it is invalid, the user will only be hacking themselves. Which is pointless. The user can at any point provide a good definition of TZ, even if the site admin is a BOFH that is deliberatly putting bad zone definitions into the zoneinfo database. That said, the user is prevented from doing so if the login shell crashes after a successful hack of the system, which is where the QoI and security domains start to rub up against each other. Then again, an attacker capable of implanting bad zone files has at least root access, and can therefore just disable user accounts and change passwords. And an attacker capable of setting the user's default TZ variable has user access and can probably just create an RC file that quits the shell or something. So a successful attacker has no need to detain themselves with zone files or TZ parsers. Ciao, Markus
Markus Wichmann <nullplan@gmx.net> wrote: > Hi all, Hi, > The user setting TZ is also the one getting the crashes. The > assumption is less that the input is always valid, but more that if it > is invalid, the user will only be hacking themselves. This is only true under the assumption that $TZ itself is never set to a value derived from an untrusted source. See the aforementioned calcurse code for example where this assumption does not hold. If you search GitHub or use codesearch.debian.net you will find plenty additional examples where people "abuse" $TZ to query information about zonefiles by name. Please also note that these spatial memory safety violations are UB and thus might or might not lead to a crash. Greetings, Sören
On Sun, Sep 05, 2021 at 11:26:08PM +0200, Sören Tempel wrote:
> Markus Wichmann <nullplan@gmx.net> wrote:
> > Hi all,
>
> Hi,
>
> > The user setting TZ is also the one getting the crashes. The
> > assumption is less that the input is always valid, but more that if it
> > is invalid, the user will only be hacking themselves.
>
> This is only true under the assumption that $TZ itself is never set to a
> value derived from an untrusted source. See the aforementioned calcurse
> code for example where this assumption does not hold. If you search
> GitHub or use codesearch.debian.net you will find plenty additional
> examples where people "abuse" $TZ to query information about zonefiles
> by name. Please also note that these spatial memory safety violations
> are UB and thus might or might not lead to a crash.
There are all sorts of reasons this is a really bad idea, especially
if no filtering on the accepted names is performed. For instance, TZ
strings taken from malicious data files could cause device files with
side effects to be opened (including acquisition of a controlling
tty), or coule potentially leak private data. If applications are
doing this, they really should be ensuring that the string fits a
reasonably "safe" form -- for example, not starting with ./ or / and
not containing .. components, or even more restrictive like just
/[[:alnum:]]+(/[[:alnum:]]+)?/
I think we could and should do some basic things on the musl side to
ensure at least memory safety (bounding all accesses within the mapped
region) but as long as we're using mmap on systems that lack MAP_COPY,
that's about the limit of what can be done. And I want to approach
this topic with some care not to give the impression that, after
changes are made, it will be "safe" to use untrusted zone files.
Rich