mailing list of musl libc
 help / color / mirror / code / Atom feed
* [musl] Handling of non-location specific TZ values
@ 2021-04-25 17:46 Sören Tempel
  2021-04-25 18:38 ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: Sören Tempel @ 2021-04-25 17:46 UTC (permalink / raw)
  To: musl

Hi,

While debugging a test failure of the calendar application calcurse on
Alpine I noticed that musl does not support TZ values which don't
include area/location information, e.g. TZ=CET [0]. This is contrary to
the documentation from the musl wiki which states the following [1]:

	The zoneinfo file is interpreted as an absolute pathname if it
	begins with a slash, a relative pathname if it begins with a
	dot, and otherwise is searched in /usr/share/zoneinfo,
	/share/zoneinfo, and /etc/zoneinfo.

Since commit 5c4f11d995cf178b3146cde0734d6988c145f243 musl only consults
the zoneinfo database if the value of the TZ environment variable starts
with a colon or contains a slash [2]. As such, the zoneinfo database is
not consulted for TZ=CET thereby causing musl to not determine DST etc.
correctly for such TZ values. TZ values such as Europe/Kaliningrad are
correctly looked up in the zoneinfo database though.

The aforementioned commit claims that this strict check is necessary
since "TZ=name is reserved for POSIX-form" which consists of a mandatory
timezone name (std), an offset, and some more optional information [3].
**If** TZ values adhering to the POSIX format should not be looked up in
the zoneinfo database, it would be necessary to somehow determine if a
given string adheres to the POSIX timezone format before performing the
lookup.

The glibc implementation seems to unconditionally consult the zoneinfo
database and falls back to the POSIX format (__tzset_parse_tz) if no
corresponding file was found in the database [4]. Apart from glibc, the
non-POSIX TZ format with TZ=<timezone> is also supported BSDs, e.g.
OpenBSD [5].

Any thoughts on how this could fixed in musl?

Greetings,
Sören

[0]: https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/19695
[1]: https://wiki.musl-libc.org/environment-variables.html
[2]: https://git.musl-libc.org/cgit/musl/commit/?id=5c4f11d995cf178b3146cde0734d6988c145f243
[3]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_03
[4]: https://sourceware.org/git/?p=glibc.git;a=blob;f=time/tzset.c;h=2fc51194b63bda8eeaff4f2ac111291ff43c28ad;hb=24f261f27fb8fd19ae294ff2a13bc5b7a0bafc91#l405
[5]: https://github.com/openbsd/src/blob/2207c4325726fdc5c4bcd0011af0fdf7d3dab137/lib/libc/time/Theory

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [musl] Handling of non-location specific TZ values
  2021-04-25 17:46 [musl] Handling of non-location specific TZ values Sören Tempel
@ 2021-04-25 18:38 ` Rich Felker
  2021-05-02 20:28   ` Sören Tempel
  0 siblings, 1 reply; 5+ messages in thread
From: Rich Felker @ 2021-04-25 18:38 UTC (permalink / raw)
  To: Sören Tempel; +Cc: musl

On Sun, Apr 25, 2021 at 07:46:51PM +0200, Sören Tempel wrote:
> Hi,
> 
> While debugging a test failure of the calendar application calcurse on
> Alpine I noticed that musl does not support TZ values which don't
> include area/location information, e.g. TZ=CET [0]. This is contrary to
> the documentation from the musl wiki which states the following [1]:
> 
> 	The zoneinfo file is interpreted as an absolute pathname if it
> 	begins with a slash, a relative pathname if it begins with a
> 	dot, and otherwise is searched in /usr/share/zoneinfo,
> 	/share/zoneinfo, and /etc/zoneinfo.
> 
> Since commit 5c4f11d995cf178b3146cde0734d6988c145f243 musl only consults
> the zoneinfo database if the value of the TZ environment variable starts
> with a colon or contains a slash [2]. As such, the zoneinfo database is
> not consulted for TZ=CET thereby causing musl to not determine DST etc.
> correctly for such TZ values. TZ values such as Europe/Kaliningrad are
> correctly looked up in the zoneinfo database though.

The behavior was the same before that commit except that a leading :
was not able to override the behavior and force inspection of a file.

> The aforementioned commit claims that this strict check is necessary
> since "TZ=name is reserved for POSIX-form" which consists of a mandatory
> timezone name (std), an offset, and some more optional information [3].
> **If** TZ values adhering to the POSIX format should not be looked up in
> the zoneinfo database, it would be necessary to somehow determine if a
> given string adheres to the POSIX timezone format before performing the
> lookup.

Yes. I suspect we can get by with calling getname with a dummy output
array, then checking if the next character is one of +, -, or a digit.
If not (in particular, if it's a null character) then we can attempt
loading it as a file.

It might be worth adding a special exception for "UTC" and "GMT" so
that they are always interpreted as "UTC0" and "GMT0" and can't be
overridden by a bogus file in the zoneinfo path, for the sake of
software that does "TZ=UTC cmd" to avoid any timezone shenanigans.

I wonder if we should also check the validity of the string after
loading as a file fails, so that if you have TZ=CET but no file named
CET, it doesn't get interpreted as if it were CET0 but instead as
UTC0. That would avoid bad surprises when the program prints %Z.

> The glibc implementation seems to unconditionally consult the zoneinfo
> database and falls back to the POSIX format (__tzset_parse_tz) if no
> corresponding file was found in the database [4]. Apart from glibc, the
> non-POSIX TZ format with TZ=<timezone> is also supported BSDs, e.g.
> OpenBSD [5].

glibc really should not be doing this...

Rich

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [musl] Handling of non-location specific TZ values
  2021-04-25 18:38 ` Rich Felker
@ 2021-05-02 20:28   ` Sören Tempel
  2021-06-02 15:41     ` Sören Tempel
  0 siblings, 1 reply; 5+ messages in thread
From: Sören Tempel @ 2021-05-02 20:28 UTC (permalink / raw)
  To: musl

Rich Felker <dalias@libc.org> wrote:
> Yes. I suspect we can get by with calling getname with a dummy output
> array, then checking if the next character is one of +, -, or a digit.
> If not (in particular, if it's a null character) then we can attempt
> loading it as a file.

Maybe something along the following? Not too familiar with the musl code
base so not sure if including ctype.h is allowed etc.

diff --git a/src/time/__tz.c b/src/time/__tz.c
index 09a6317e..6bc183d0 100644
--- a/src/time/__tz.c
+++ b/src/time/__tz.c
@@ -3,6 +3,7 @@
 #include <limits.h>
 #include <stdlib.h>
 #include <string.h>
+#include <ctype.h>
 #include <sys/mman.h>
 #include "libc.h"
 #include "lock.h"
@@ -125,13 +126,13 @@ static size_t zi_dotprod(const unsigned char *z, const unsigned char *v, size_t
 static void do_tzset()
 {
 	char buf[NAME_MAX+25], *pathname=buf+24;
-	const char *try, *s, *p;
+	const char *try, *s, *orig;
 	const unsigned char *map = 0;
 	size_t i;
 	static const char search[] =
 		"/usr/share/zoneinfo/\0/share/zoneinfo/\0/etc/zoneinfo/\0";

-	s = getenv("TZ");
+	s = orig = getenv("TZ");
 	if (!s) s = "/etc/localtime";
 	if (!*s) s = __utc;

@@ -154,11 +155,19 @@ static void do_tzset()
 	}
 	if (old_tz) memcpy(old_tz, s, i+1);

-	/* Non-suid can use an absolute tzfile pathname or a relative
-	 * pathame beginning with "."; in secure mode, only the
-	 * standard path will be searched. */
-	if (*s == ':' || ((p=strchr(s, '/')) && !memchr(s, ',', p-s))) {
+	/* The TZ format specified by POSIX consists of a mandatory
+	 * time zone name and a mandatory offset. We determine the
+	 * name using getname, if the next character cannot constitute
+	 * a valid offset (or the TZ value starts with a colon) we
+	 * interpret the TZ environment variable as a zoneinfo file name. */
+	getname(std_name, &s);
+	if (*s == ':' || (!isdigit(*s) && *s != '+' && *s != '-')) {
 		if (*s == ':') s++;
+		else if (orig) s = orig;
+
+		/* Non-suid can use an absolute tzfile pathname or a relative
+		 * pathame beginning with "."; in secure mode, only the
+		 * standard path will be searched. */
 		if (*s == '/' || *s == '.') {
 			if (!libc.secure || !strcmp(s, "/etc/localtime"))
 				map = __map_file(s, &map_size);

Patch can be tested using date(1). For instance, compare the output of
`TZ=CET date` on a patched and unpatched system with an installed
zoneinfo database.

> It might be worth adding a special exception for "UTC" and "GMT" so
> that they are always interpreted as "UTC0" and "GMT0" and can't be
> overridden by a bogus file in the zoneinfo path, for the sake of
> software that does "TZ=UTC cmd" to avoid any timezone shenanigans.

Maybe that can be done in a separate commit?

Greetings,
Sören

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [musl] Handling of non-location specific TZ values
  2021-05-02 20:28   ` Sören Tempel
@ 2021-06-02 15:41     ` Sören Tempel
  2021-06-05 15:52       ` Rich Felker
  0 siblings, 1 reply; 5+ messages in thread
From: Sören Tempel @ 2021-06-02 15:41 UTC (permalink / raw)
  To: musl

Ping.

Would be willing to adjust the patch as needed. In any case, it would be
nice to get this fixed as it currently causes some test failures of
Alpine packages.

Greetings,
Sören

Sören Tempel <soeren@soeren-tempel.net> wrote:
> Rich Felker <dalias@libc.org> wrote:
> > Yes. I suspect we can get by with calling getname with a dummy output
> > array, then checking if the next character is one of +, -, or a digit.
> > If not (in particular, if it's a null character) then we can attempt
> > loading it as a file.
> 
> Maybe something along the following? Not too familiar with the musl code
> base so not sure if including ctype.h is allowed etc.
> 
> diff --git a/src/time/__tz.c b/src/time/__tz.c
> index 09a6317e..6bc183d0 100644
> --- a/src/time/__tz.c
> +++ b/src/time/__tz.c
> @@ -3,6 +3,7 @@
>  #include <limits.h>
>  #include <stdlib.h>
>  #include <string.h>
> +#include <ctype.h>
>  #include <sys/mman.h>
>  #include "libc.h"
>  #include "lock.h"
> @@ -125,13 +126,13 @@ static size_t zi_dotprod(const unsigned char *z, const unsigned char *v, size_t
>  static void do_tzset()
>  {
>  	char buf[NAME_MAX+25], *pathname=buf+24;
> -	const char *try, *s, *p;
> +	const char *try, *s, *orig;
>  	const unsigned char *map = 0;
>  	size_t i;
>  	static const char search[] =
>  		"/usr/share/zoneinfo/\0/share/zoneinfo/\0/etc/zoneinfo/\0";
> 
> -	s = getenv("TZ");
> +	s = orig = getenv("TZ");
>  	if (!s) s = "/etc/localtime";
>  	if (!*s) s = __utc;
> 
> @@ -154,11 +155,19 @@ static void do_tzset()
>  	}
>  	if (old_tz) memcpy(old_tz, s, i+1);
> 
> -	/* Non-suid can use an absolute tzfile pathname or a relative
> -	 * pathame beginning with "."; in secure mode, only the
> -	 * standard path will be searched. */
> -	if (*s == ':' || ((p=strchr(s, '/')) && !memchr(s, ',', p-s))) {
> +	/* The TZ format specified by POSIX consists of a mandatory
> +	 * time zone name and a mandatory offset. We determine the
> +	 * name using getname, if the next character cannot constitute
> +	 * a valid offset (or the TZ value starts with a colon) we
> +	 * interpret the TZ environment variable as a zoneinfo file name. */
> +	getname(std_name, &s);
> +	if (*s == ':' || (!isdigit(*s) && *s != '+' && *s != '-')) {
>  		if (*s == ':') s++;
> +		else if (orig) s = orig;
> +
> +		/* Non-suid can use an absolute tzfile pathname or a relative
> +		 * pathame beginning with "."; in secure mode, only the
> +		 * standard path will be searched. */
>  		if (*s == '/' || *s == '.') {
>  			if (!libc.secure || !strcmp(s, "/etc/localtime"))
>  				map = __map_file(s, &map_size);
> 
> Patch can be tested using date(1). For instance, compare the output of
> `TZ=CET date` on a patched and unpatched system with an installed
> zoneinfo database.
> 
> > It might be worth adding a special exception for "UTC" and "GMT" so
> > that they are always interpreted as "UTC0" and "GMT0" and can't be
> > overridden by a bogus file in the zoneinfo path, for the sake of
> > software that does "TZ=UTC cmd" to avoid any timezone shenanigans.
> 
> Maybe that can be done in a separate commit?
> 
> Greetings,
> Sören

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [musl] Handling of non-location specific TZ values
  2021-06-02 15:41     ` Sören Tempel
@ 2021-06-05 15:52       ` Rich Felker
  0 siblings, 0 replies; 5+ messages in thread
From: Rich Felker @ 2021-06-05 15:52 UTC (permalink / raw)
  To: Sören Tempel; +Cc: musl

On Wed, Jun 02, 2021 at 05:41:30PM +0200, Sören Tempel wrote:
> Ping.
> 
> Would be willing to adjust the patch as needed. In any case, it would be
> nice to get this fixed as it currently causes some test failures of
> Alpine packages.

Thanks for the ping. I had an alternate approach in draft that I want
to look back at and compare. I'll try to get this fixed one way or the
other before rolling the release.

Rich


> Sören Tempel <soeren@soeren-tempel.net> wrote:
> > Rich Felker <dalias@libc.org> wrote:
> > > Yes. I suspect we can get by with calling getname with a dummy output
> > > array, then checking if the next character is one of +, -, or a digit.
> > > If not (in particular, if it's a null character) then we can attempt
> > > loading it as a file.
> > 
> > Maybe something along the following? Not too familiar with the musl code
> > base so not sure if including ctype.h is allowed etc.
> > 
> > diff --git a/src/time/__tz.c b/src/time/__tz.c
> > index 09a6317e..6bc183d0 100644
> > --- a/src/time/__tz.c
> > +++ b/src/time/__tz.c
> > @@ -3,6 +3,7 @@
> >  #include <limits.h>
> >  #include <stdlib.h>
> >  #include <string.h>
> > +#include <ctype.h>
> >  #include <sys/mman.h>
> >  #include "libc.h"
> >  #include "lock.h"
> > @@ -125,13 +126,13 @@ static size_t zi_dotprod(const unsigned char *z, const unsigned char *v, size_t
> >  static void do_tzset()
> >  {
> >  	char buf[NAME_MAX+25], *pathname=buf+24;
> > -	const char *try, *s, *p;
> > +	const char *try, *s, *orig;
> >  	const unsigned char *map = 0;
> >  	size_t i;
> >  	static const char search[] =
> >  		"/usr/share/zoneinfo/\0/share/zoneinfo/\0/etc/zoneinfo/\0";
> > 
> > -	s = getenv("TZ");
> > +	s = orig = getenv("TZ");
> >  	if (!s) s = "/etc/localtime";
> >  	if (!*s) s = __utc;
> > 
> > @@ -154,11 +155,19 @@ static void do_tzset()
> >  	}
> >  	if (old_tz) memcpy(old_tz, s, i+1);
> > 
> > -	/* Non-suid can use an absolute tzfile pathname or a relative
> > -	 * pathame beginning with "."; in secure mode, only the
> > -	 * standard path will be searched. */
> > -	if (*s == ':' || ((p=strchr(s, '/')) && !memchr(s, ',', p-s))) {
> > +	/* The TZ format specified by POSIX consists of a mandatory
> > +	 * time zone name and a mandatory offset. We determine the
> > +	 * name using getname, if the next character cannot constitute
> > +	 * a valid offset (or the TZ value starts with a colon) we
> > +	 * interpret the TZ environment variable as a zoneinfo file name. */
> > +	getname(std_name, &s);
> > +	if (*s == ':' || (!isdigit(*s) && *s != '+' && *s != '-')) {
> >  		if (*s == ':') s++;
> > +		else if (orig) s = orig;
> > +
> > +		/* Non-suid can use an absolute tzfile pathname or a relative
> > +		 * pathame beginning with "."; in secure mode, only the
> > +		 * standard path will be searched. */
> >  		if (*s == '/' || *s == '.') {
> >  			if (!libc.secure || !strcmp(s, "/etc/localtime"))
> >  				map = __map_file(s, &map_size);
> > 
> > Patch can be tested using date(1). For instance, compare the output of
> > `TZ=CET date` on a patched and unpatched system with an installed
> > zoneinfo database.
> > 
> > > It might be worth adding a special exception for "UTC" and "GMT" so
> > > that they are always interpreted as "UTC0" and "GMT0" and can't be
> > > overridden by a bogus file in the zoneinfo path, for the sake of
> > > software that does "TZ=UTC cmd" to avoid any timezone shenanigans.
> > 
> > Maybe that can be done in a separate commit?
> > 
> > Greetings,
> > Sören

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2021-06-05 15:52 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-25 17:46 [musl] Handling of non-location specific TZ values Sören Tempel
2021-04-25 18:38 ` Rich Felker
2021-05-02 20:28   ` Sören Tempel
2021-06-02 15:41     ` Sören Tempel
2021-06-05 15:52       ` Rich Felker

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).