* [musl] strftime %Z behavior with manually populated struct tm @ 2020-08-10 13:31 Nikita Popov 2020-08-10 16:53 ` Rich Felker 2020-08-12 12:57 ` [musl] " Nikita Popov 0 siblings, 2 replies; 5+ messages in thread From: Nikita Popov @ 2020-08-10 13:31 UTC (permalink / raw) To: musl [-- Attachment #1: Type: text/plain, Size: 1084 bytes --] Hi, Currently, strftime() %Z will print an empty string if the provided tm_zone does not originate from musl. It appears that this behavior is explicitly implemented in __tm_to_tzname(). Would it be possible to instead print any non-NULL tm_zone as provided? For consumers manually populating the struct tm structure, it is non-trivial to work around the current behavior. Python introduced a work around in https://github.com/python/cpython/commit/163eca34c48f1b25e1504e37f4656773fd0fdc78, but it is not easy to generalize. From what I gathered, the original concern here was that a consumer manually initializing struct tm may not initialize the tm_zone field. As struct tm is only specified to contain "at least" certain members, manual initialization of this structure is already on shaky ground anyway and I think it is more useful to assume the initialization is correct than try to deal with garbage data. Because other libc implementation do not try to validate the pointer either (beyond being non-NULL), incorrect initialization is unlikely in practice. Regards, Nikita [-- Attachment #2: Type: text/html, Size: 2154 bytes --] ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [musl] strftime %Z behavior with manually populated struct tm 2020-08-10 13:31 [musl] strftime %Z behavior with manually populated struct tm Nikita Popov @ 2020-08-10 16:53 ` Rich Felker 2020-08-12 12:57 ` [musl] " Nikita Popov 1 sibling, 0 replies; 5+ messages in thread From: Rich Felker @ 2020-08-10 16:53 UTC (permalink / raw) To: musl On Mon, Aug 10, 2020 at 03:31:23PM +0200, Nikita Popov wrote: > Hi, > > Currently, strftime() %Z will print an empty string if the provided tm_zone > does not originate from musl. It appears that this behavior is explicitly > implemented in __tm_to_tzname(). Would it be possible to instead print any > non-NULL tm_zone as provided? I don't think so. The problem is that %Z is specified, albeit underspecified, as something portable applications can pass to strftime, but tm_zone is not standard and may be uninitialized in a structure the caller passes. Attempting to honor it would mean crashing on inputs where it's not present, which is nonconforming and unacceptable. I haven't checked, but I believe most implementations just print the zone name from the current timezone, using tm_isdst to decide whether to print the standard or daylight version of the name. This is insufficient with zoneinfo for zones where the name changed over time, where it would print the wrong name for historical times. So instead we support printing any one of the zone names from the current zone, if the tm_zone member points to one of them, and blank otherwise. > For consumers manually populating the struct tm structure, it is > non-trivial to work around the current behavior. Python introduced a work > around in > https://github.com/python/cpython/commit/163eca34c48f1b25e1504e37f4656773fd0fdc78, > but it is not easy to generalize. I think you have the wrong commit link; that one does not look related. If you have the real one I'd be interested in seeing what they did. > From what I gathered, the original concern here was that a consumer > manually initializing struct tm may not initialize the tm_zone field. As > struct tm is only specified to contain "at least" certain members, manual > initialization of this structure is already on shaky ground anyway and I This is not the case. There is nothing shaky about using the interface in the standard-specified way. The specification already labels which struct tm fields each format uses, and it's perfectly valid to call strftime with junk in the rest; this is even fairly common usage (e.g. not filling in the derived fields that you'd need mktime to get). On the other hand, it's already dubious doing anything more than ignoring tm_zone and just using tm_isdst (which is the field the spec labels %Z as depending on). I only did this because it's impossible to give culturally-correct results in all zones with just tm_isdst and because the degree of underspecification made it seem defensible. > think it is more useful to assume the initialization is correct than try to > deal with garbage data. Because other libc implementation do not try to > validate the pointer either (beyond being non-NULL), incorrect > initialization is unlikely in practice. I think if applications want to use zones other than the actual configured zone with strftime, they need to just do something like expand the %Z themselves with the string they want before calling strftime (note: this requires quoting any % in the name). I looked hard for a better solution that wouldn't crash valid applications, and couldn't find one at the time. There is a proposal for a new C time API that allows zone objects (like locale_t objects for locales, but hopefully better-designed) and functions that take the zone as an argument. This would be a really good solution once it's established, but I don't know the status on it. Rich ^ permalink raw reply [flat|nested] 5+ messages in thread
* [musl] Re: strftime %Z behavior with manually populated struct tm 2020-08-10 13:31 [musl] strftime %Z behavior with manually populated struct tm Nikita Popov 2020-08-10 16:53 ` Rich Felker @ 2020-08-12 12:57 ` Nikita Popov 2020-08-12 14:05 ` Nikita Popov 2020-08-12 15:17 ` Rich Felker 1 sibling, 2 replies; 5+ messages in thread From: Nikita Popov @ 2020-08-12 12:57 UTC (permalink / raw) To: musl [-- Attachment #1: Type: text/plain, Size: 1833 bytes --] > I haven't checked, but I believe most implementations just print the > zone name from the current timezone, using tm_isdst to decide whether > to print the standard or daylight version of the name. This is > insufficient with zoneinfo for zones where the name changed over time, > where it would print the wrong name for historical times. So instead > we support printing any one of the zone names from the current zone, > if the tm_zone member points to one of them, and blank otherwise. You are right. I was under the impression that glibc uses tm_zone, but double checking the implementation right now, it doesn't. So the behavioral discrepancy here comes from the fact that musl checks tm_zone at all, not the other way around. Sorry for looking in the completely wrong direction here. > I think you have the wrong commit link; that one does not look > related. If you have the real one I'd be interested in seeing what > they did. Indeed, I meant to link to https://github.com/python/cpython/commit/5633c4f342d957df2ef0d67b9bfb472a0d28a76b. What Python does is to retrieve musl's string for UTC using gmtime_r, as well as trying to match against the current tzname. > I think if applications want to use zones other than the actual > configured zone with strftime, they need to just do something like > expand the %Z themselves with the string they want before calling > strftime (note: this requires quoting any % in the name). With your explanation in mind, explicitly expanding %Z does seem like a good approach. I'll consider doing that. > I looked > hard for a better solution that wouldn't crash valid applications, and > couldn't find one at the time. Possibly it would make sense to fall back to using the current tzname if the pointer given in tm_zone is invalid, rather than an empty string? Regards, Nikita [-- Attachment #2: Type: text/html, Size: 2268 bytes --] ^ permalink raw reply [flat|nested] 5+ messages in thread
* [musl] Re: strftime %Z behavior with manually populated struct tm 2020-08-12 12:57 ` [musl] " Nikita Popov @ 2020-08-12 14:05 ` Nikita Popov 2020-08-12 15:17 ` Rich Felker 1 sibling, 0 replies; 5+ messages in thread From: Nikita Popov @ 2020-08-12 14:05 UTC (permalink / raw) To: musl [-- Attachment #1: Type: text/plain, Size: 1429 bytes --] On Wed, Aug 12, 2020 at 2:57 PM Nikita Popov <nikita.ppv@gmail.com> wrote: > > I haven't checked, but I believe most implementations just print the > > zone name from the current timezone, using tm_isdst to decide whether > > to print the standard or daylight version of the name. This is > > insufficient with zoneinfo for zones where the name changed over time, > > where it would print the wrong name for historical times. So instead > > we support printing any one of the zone names from the current zone, > > if the tm_zone member points to one of them, and blank otherwise. > > You are right. I was under the impression that glibc uses tm_zone, but > double checking the implementation right now, it doesn't. So the behavioral > discrepancy here comes from the fact that musl checks tm_zone at all, not > the other way around. Sorry for looking in the completely wrong direction > here. > Nevermind, I just confused myself further here. glibc does use tm_zone, with the following comment: /* The POSIX test suite assumes that setting the environment variable TZ to a new value before calling strftime() will influence the result (the %Z format) even if the information in TP is computed with a totally different time zone. This is bogus: though POSIX allows bad behavior like this, POSIX does not require it. Do the right thing instead. */ zone = (const char *) tp->tm_zone; Regards, Nikita [-- Attachment #2: Type: text/html, Size: 1955 bytes --] ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [musl] Re: strftime %Z behavior with manually populated struct tm 2020-08-12 12:57 ` [musl] " Nikita Popov 2020-08-12 14:05 ` Nikita Popov @ 2020-08-12 15:17 ` Rich Felker 1 sibling, 0 replies; 5+ messages in thread From: Rich Felker @ 2020-08-12 15:17 UTC (permalink / raw) To: musl On Wed, Aug 12, 2020 at 02:57:08PM +0200, Nikita Popov wrote: > > I haven't checked, but I believe most implementations just print the > > zone name from the current timezone, using tm_isdst to decide whether > > to print the standard or daylight version of the name. This is > > insufficient with zoneinfo for zones where the name changed over time, > > where it would print the wrong name for historical times. So instead > > we support printing any one of the zone names from the current zone, > > if the tm_zone member points to one of them, and blank otherwise. > > You are right. I was under the impression that glibc uses tm_zone, but > double checking the implementation right now, it doesn't. So the behavioral > discrepancy here comes from the fact that musl checks tm_zone at all, not > the other way around. Sorry for looking in the completely wrong direction > here. > > > I think you have the wrong commit link; that one does not look > > related. If you have the real one I'd be interested in seeing what > > they did. > > Indeed, I meant to link to > https://github.com/python/cpython/commit/5633c4f342d957df2ef0d67b9bfb472a0d28a76b. > What Python does is to retrieve musl's string for UTC using gmtime_r, as > well as trying to match against the current tzname. > > > I think if applications want to use zones other than the actual > > configured zone with strftime, they need to just do something like > > expand the %Z themselves with the string they want before calling > > strftime (note: this requires quoting any % in the name). > > With your explanation in mind, explicitly expanding %Z does seem like a > good approach. I'll consider doing that. > > > I looked > > hard for a better solution that wouldn't crash valid applications, and > > couldn't find one at the time. > > Possibly it would make sense to fall back to using the current tzname if > the pointer given in tm_zone is invalid, rather than an empty string? This might make sense. I was thinking from a standpoint of most reasonable behavior for an uninitialized field, but but can always give junk anyway (if you happen to have the exact wrong pointer value that points to the current timezone's string area) so it's really not useful to care about that case except for it not crashing. Using tm_isdst and __timezone if the caller zeros the field would still be useful potentially. Rich ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2020-08-12 15:18 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-08-10 13:31 [musl] strftime %Z behavior with manually populated struct tm Nikita Popov 2020-08-10 16:53 ` Rich Felker 2020-08-12 12:57 ` [musl] " Nikita Popov 2020-08-12 14:05 ` Nikita Popov 2020-08-12 15:17 ` Rich Felker
Code repositories for project(s) associated with this public inbox https://git.vuxu.org/mirror/musl/ This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).