mailing list of musl libc
 help / color / mirror / code / Atom feed
* Bugs in strftime
@ 2018-02-02 17:07 Dennis Wölfing
  2018-02-05 17:51 ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Dennis Wölfing @ 2018-02-02 17:07 UTC (permalink / raw)
  To: musl

Hi,

I recently wrote a test for strftime and ran it on multiple
implementations. The source code of the test is available at [1]. The
test results (also for other implementations) are available at [2].

On musl my test currently reports 8 failures. These are caused by two bugs:

1. For the + flag, musl does currently only prefix the output by a plus
when the number without padding consumes more the 4 bytes (2 for %C).
However according to the POSIX standard, there should be a leading plus
when "the field being produced consumes more than four bytes" (2 for
%C). The padding is part of the field being produced.
So for example %+3C should always have a leading plus for any
non-negative years because the field then always has a width of at least
3 bytes. (There is also an example in the POSIX standard where "%+5Y"
produces "+0270" for the year 270.)

This bug is causing these failures:
> "%+3C": expected "+20", got "020"
> "%+11F": expected "+2016-01-03", got "02016-01-03"
> "%+5G": expected "+2015", got "02015"
> "%+5Y": expected "+2016", got "02016"
> "%+5Y": expected "+0000", got "00000"

2. %F produces incorrect results for the year 0 when a field width is
specified. It seems like in this case strftime does not output the year
and applies the padding to the month.

This bug is causing these failures:
> "%01F": expected "0-02-23", got "2-23"
> "%06F": expected "0-02-23", got "002-23"
> "%010F": expected "0000-02-23", got "0000002-23"

The tests were run on musl 1.1.18 on Alpine Linux.

[1] https://github.com/dennis95/dennix/blob/master/libc/test/test-strftime.c

[2]
https://gist.github.com/dennis95/b4869b5cbb3c21e15e409afb827354a5#file-musl-1-1-18-alpine-linux-x86_64


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bugs in strftime
  2018-02-02 17:07 Bugs in strftime Dennis Wölfing
@ 2018-02-05 17:51 ` Rich Felker
  2018-02-06 16:10   ` Dennis Wölfing
  2018-02-06 17:46   ` Rich Felker
  0 siblings, 2 replies; 7+ messages in thread
From: Rich Felker @ 2018-02-05 17:51 UTC (permalink / raw)
  To: musl

On Fri, Feb 02, 2018 at 06:07:38PM +0100, Dennis Wölfing wrote:
> Hi,
> 
> I recently wrote a test for strftime and ran it on multiple
> implementations. The source code of the test is available at [1]. The
> test results (also for other implementations) are available at [2].

Great! This has been a test deficiency we've had for a long time and
was a big part of what kept me from reviewing and merging recent
patches to strftime in a timely manner.

> On musl my test currently reports 8 failures. These are caused by two bugs:
> 
> 1. For the + flag, musl does currently only prefix the output by a plus
> when the number without padding consumes more the 4 bytes (2 for %C).
> However according to the POSIX standard, there should be a leading plus
> when "the field being produced consumes more than four bytes" (2 for
> %C). The padding is part of the field being produced.

I've actually discussed this before, being doubtful about whether the
current behavior was correct, but was unable to find any authoritative
interpretation. Do you know if there is one?

> So for example %+3C should always have a leading plus for any
> non-negative years because the field then always has a width of at least
> 3 bytes. (There is also an example in the POSIX standard where "%+5Y"
> produces "+0270" for the year 270.)

While rationale is not itself authoritative, it looks like it contains
enough information to differentiate the intent of the ambiguous text.
Thanks!

> This bug is causing these failures:
> > "%+3C": expected "+20", got "020"
> > "%+11F": expected "+2016-01-03", got "02016-01-03"
> > "%+5G": expected "+2015", got "02015"
> > "%+5Y": expected "+2016", got "02016"
> > "%+5Y": expected "+0000", got "00000"

I'll need to look over how to change the logic to match the desired
behavior but it shouldn't be hard.

> 2. %F produces incorrect results for the year 0 when a field width is
> specified. It seems like in this case strftime does not output the year
> and applies the padding to the month.

Ah, it's the code in the top-level function that strips leading sign
and zeros to do the padding:

	for (; *t=='+' || *t=='-' || (*t=='0'&&t[1]); t++, k--);

I think instead if should do something like:

	if (*t=='+' || *t=='-') t++, k--;
	for (; (*t=='0'&&t[1]); t++, k--);

In other words, only strip + or - from the first character, not later
in the string.

> This bug is causing these failures:
> > "%01F": expected "0-02-23", got "2-23"
> > "%06F": expected "0-02-23", got "002-23"
> > "%010F": expected "0000-02-23", got "0000002-23"
> 
> The tests were run on musl 1.1.18 on Alpine Linux.
> 
> [1] https://github.com/dennis95/dennix/blob/master/libc/test/test-strftime.c
> 
> [2]
> https://gist.github.com/dennis95/b4869b5cbb3c21e15e409afb827354a5#file-musl-1-1-18-alpine-linux-x86_64

Thanks again for doing this testing and reporting it. Would you be
interested in helping get these tests into our libc-test package?

Rich


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bugs in strftime
  2018-02-05 17:51 ` Rich Felker
@ 2018-02-06 16:10   ` Dennis Wölfing
  2018-02-06 17:30     ` Rich Felker
  2018-02-06 17:46   ` Rich Felker
  1 sibling, 1 reply; 7+ messages in thread
From: Dennis Wölfing @ 2018-02-06 16:10 UTC (permalink / raw)
  To: musl

On 05.02.2018 18:51, Rich Felker wrote:
> I've actually discussed this before, being doubtful about whether the
> current behavior was correct, but was unable to find any authoritative
> interpretation. Do you know if there is one?

Unfortunately I don't know of any.

I don't think that the standard explicitly defines what "field" means.
However the standard also uses the term "minimum field width".
It would be weird to interpret the text in a way that "minimum field
width" refers to a different "field" than "the field being produced".

> Thanks again for doing this testing and reporting it. Would you be
> interested in helping get these tests into our libc-test package?

Sure. What do I need to do for that?


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bugs in strftime
  2018-02-06 16:10   ` Dennis Wölfing
@ 2018-02-06 17:30     ` Rich Felker
  0 siblings, 0 replies; 7+ messages in thread
From: Rich Felker @ 2018-02-06 17:30 UTC (permalink / raw)
  To: musl

On Tue, Feb 06, 2018 at 05:10:02PM +0100, Dennis Wölfing wrote:
> On 05.02.2018 18:51, Rich Felker wrote:
> > I've actually discussed this before, being doubtful about whether the
> > current behavior was correct, but was unable to find any authoritative
> > interpretation. Do you know if there is one?
> 
> Unfortunately I don't know of any.

OK, I guess the examples will have to suffice.

> I don't think that the standard explicitly defines what "field" means.
> However the standard also uses the term "minimum field width".
> It would be weird to interpret the text in a way that "minimum field
> width" refers to a different "field" than "the field being produced".
> 
> > Thanks again for doing this testing and reporting it. Would you be
> > interested in helping get these tests into our libc-test package?
> 
> Sure. What do I need to do for that?

The repo is here:

http://nsz.repo.hu/git/?p=libc-test

Patches for it are generally just sent to this list. Szabolcs Nagy is
the maintainer.

Rich


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bugs in strftime
  2018-02-05 17:51 ` Rich Felker
  2018-02-06 16:10   ` Dennis Wölfing
@ 2018-02-06 17:46   ` Rich Felker
  2018-02-06 20:24     ` Dennis Wölfing
  1 sibling, 1 reply; 7+ messages in thread
From: Rich Felker @ 2018-02-06 17:46 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 614 bytes --]

On Mon, Feb 05, 2018 at 12:51:24PM -0500, Rich Felker wrote:
> > This bug is causing these failures:
> > > "%+3C": expected "+20", got "020"
> > > "%+11F": expected "+2016-01-03", got "02016-01-03"
> > > "%+5G": expected "+2015", got "02015"
> > > "%+5Y": expected "+2016", got "02016"
> > > "%+5Y": expected "+0000", got "00000"
> 
> I'll need to look over how to change the logic to match the desired
> behavior but it shouldn't be hard.

Attaching a patch I intend to push if I don't find any problems, and
the one I've already got queued up for the %F issue. Let me know if
you see any issues with them.

Rich

[-- Attachment #2: 0001-fix-strftime-field-widths-with-F-format-and-zero-yea.patch --]
[-- Type: text/plain, Size: 1250 bytes --]

From 596207aa38b0db33f222c9924a1310fee3de88b5 Mon Sep 17 00:00:00 2001
From: Rich Felker <dalias@aerifal.cx>
Date: Mon, 5 Feb 2018 13:36:04 -0500
Subject: [PATCH 1/2] fix strftime field widths with %F format and zero year
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

the code to strip initial sign and leading zeros inadvertently
stripped all the zeros and the subsequent '-' separating the month.
instead, only strip sign characters from the very first position, and
only strip zeros when they are followed by another digit.

based on testing by Dennis Wölfing.
---
 src/time/strftime.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/time/strftime.c b/src/time/strftime.c
index d1ca7ca..16b3bb2 100644
--- a/src/time/strftime.c
+++ b/src/time/strftime.c
@@ -251,7 +251,8 @@ size_t __strftime_l(char *restrict s, size_t n, const char *restrict f, const st
 		t = __strftime_fmt_1(&buf, &k, *f, tm, loc, pad);
 		if (!t) break;
 		if (width) {
-			for (; *t=='+' || *t=='-' || (*t=='0'&&t[1]); t++, k--);
+			if (*t=='+' || *t=='-') t++, k--;
+			for (; *t=='0' && t[1]-'0'<10U; t++, k--);
 			width--;
 			if (plus && tm->tm_year >= 10000-1900)
 				s[l++] = '+';
-- 
2.10.0


[-- Attachment #3: 0002-adjust-strftime-modifier-to-match-apparent-intent-of.patch --]
[-- Type: text/plain, Size: 2043 bytes --]

From c7f0da4134d4e7f2efd295e7fb738c65c469fbd1 Mon Sep 17 00:00:00 2001
From: Rich Felker <dalias@aerifal.cx>
Date: Tue, 6 Feb 2018 12:31:06 -0500
Subject: [PATCH 2/2] adjust strftime + modifier to match apparent intent of
 POSIX
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

it's unclear from the specification whether the word "consumes" in
"consumes more than four bytes to represent a year" refers just to
significant places or includes leading zeros due to field width
padding. however the examples in the rationale indicate that the
latter was the intent. in particular, the year 270 is shown being
formatted by %+5Y as +0270 rather than 00270.

previously '+' prefixing was implemented just by comparing the year
against 10000. instead, count the number of significant digits and
padding bytes to be added, and use the total to determine whether to
apply the '+' prefix.

based on testing by Dennis Wölfing.
---
 src/time/strftime.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/src/time/strftime.c b/src/time/strftime.c
index 16b3bb2..708875e 100644
--- a/src/time/strftime.c
+++ b/src/time/strftime.c
@@ -251,15 +251,21 @@ size_t __strftime_l(char *restrict s, size_t n, const char *restrict f, const st
 		t = __strftime_fmt_1(&buf, &k, *f, tm, loc, pad);
 		if (!t) break;
 		if (width) {
+			/* Trim off any sign and leading zeros, then
+			 * count remaining digits to determine behavior
+			 * for the + flag. */
 			if (*t=='+' || *t=='-') t++, k--;
 			for (; *t=='0' && t[1]-'0'<10U; t++, k--);
-			width--;
-			if (plus && tm->tm_year >= 10000-1900)
-				s[l++] = '+';
-			else if (tm->tm_year < -1900)
+			if (width < k) width = k;
+			size_t d;
+			for (d=0; t[d]-'0'<10U; d++);
+			if (tm->tm_year < -1900) {
 				s[l++] = '-';
-			else
-				width++;
+				width--;
+			} else if (plus && d+(width-k) >= (*p=='C'?3:5)) {
+				s[l++] = '+';
+				width--;
+			}
 			for (; width > k && l < n; width--)
 				s[l++] = '0';
 		}
-- 
2.10.0


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bugs in strftime
  2018-02-06 17:46   ` Rich Felker
@ 2018-02-06 20:24     ` Dennis Wölfing
  2018-02-07 20:35       ` Szabolcs Nagy
  0 siblings, 1 reply; 7+ messages in thread
From: Dennis Wölfing @ 2018-02-06 20:24 UTC (permalink / raw)
  To: musl

[-- Attachment #1: Type: text/plain, Size: 455 bytes --]

On 06.02.2018 18:46, Rich Felker wrote:
> Attaching a patch I intend to push if I don't find any problems, and
> the one I've already got queued up for the %F issue. Let me know if
> you see any issues with them.

With these two patches applied I no longer see any failures.

I have attached the patch for libc-test.
The test in the patch now uses the test.h header from libc-test and is
now indented by tabs. Otherwise the test is still mostly the same.

[-- Attachment #2: 0001-add-test-for-strftime.patch --]
[-- Type: text/x-patch, Size: 5709 bytes --]

From 11bc9641b443162afb336d7d9635c9ec16c51c60 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dennis=20W=C3=B6lfing?= <denniswoelfing@gmx.de>
Date: Tue, 6 Feb 2018 20:54:00 +0100
Subject: [PATCH] add test for strftime

---
 src/functional/strftime.c | 181 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 181 insertions(+)
 create mode 100644 src/functional/strftime.c

diff --git a/src/functional/strftime.c b/src/functional/strftime.c
new file mode 100644
index 0000000..25e1172
--- /dev/null
+++ b/src/functional/strftime.c
@@ -0,0 +1,181 @@
+#include <limits.h>
+#include <stdlib.h>
+#include <string.h>
+#include <time.h>
+#include "test.h"
+
+static char buffer[100];
+
+static void checkStrftime(const char* format, const struct tm* tm,
+		const char* expected) {
+	size_t resultLength = strftime(buffer, sizeof(buffer), format, tm);
+
+	if (resultLength != 0 && strcmp(buffer, expected) != 0) {
+		t_error("\"%s\": expected \"%s\", got \"%s\"\n", format, expected, buffer);
+	} else if (resultLength == 0 && strlen(expected) != 0) {
+		t_error("\"%s\": expected \"%s\", got nothing\n", format, expected);
+	}
+}
+
+static struct tm tm1 = {
+	.tm_sec = 45,
+	.tm_min = 23,
+	.tm_hour = 13,
+	.tm_mday = 3,
+	.tm_mon = 0,
+	.tm_year = 2016 - 1900,
+	.tm_wday = 0,
+	.tm_yday = 2,
+	.tm_isdst = 0
+};
+
+static struct tm tm2 = {
+	.tm_sec = 53,
+	.tm_min = 17,
+	.tm_hour = 5,
+	.tm_mday = 5,
+	.tm_mon = 0,
+	.tm_year = 10009 - 1900,
+	.tm_wday = 1,
+	.tm_yday = 4,
+	.tm_isdst = 0
+};
+
+static struct tm tm3 = {
+	.tm_sec = 0,
+	.tm_min = 0,
+	.tm_hour = 12,
+	.tm_mday = 23,
+	.tm_mon = 1,
+	.tm_year = 0 - 1900,
+	.tm_wday = 3,
+	.tm_yday = 53,
+	.tm_isdst = 0
+};
+
+static struct tm tm4 = {
+	.tm_sec = 0,
+	.tm_min = 0,
+	.tm_hour = 0,
+	.tm_mday = 1,
+	.tm_mon = 0,
+	.tm_year = -123 - 1900,
+	.tm_wday = 1,
+	.tm_yday = 0,
+	.tm_isdst = 0
+};
+
+static struct tm tm5 = {
+	.tm_sec = 0,
+	.tm_min = 0,
+	.tm_hour = 0,
+	.tm_mday = 1,
+	.tm_mon = 0,
+	.tm_year = INT_MAX,
+	.tm_wday = 3,
+	.tm_yday = 0,
+	.tm_isdst = 0
+};
+
+int main() {
+	setenv("TZ", "UTC0", 1);
+
+	checkStrftime("%c", &tm1, "Sun Jan  3 13:23:45 2016");
+	checkStrftime("%c", &tm2, "Mon Jan  5 05:17:53 +10009");
+	checkStrftime("%c", &tm3, "Wed Feb 23 12:00:00 0000");
+
+	// The POSIX.1-2008 standard does not specify the padding character for
+	// "%C". The C standard requires that the number is padded by '0'.
+	// See also http://austingroupbugs.net/view.php?id=1184
+	checkStrftime("%C", &tm1, "20");
+	checkStrftime("%03C", &tm1, "020");
+	checkStrftime("%+3C", &tm1, "+20");
+	checkStrftime("%C", &tm2, "100");
+	checkStrftime("%C", &tm3, "00");
+	checkStrftime("%01C", &tm3, "0");
+
+	checkStrftime("%F", &tm1, "2016-01-03");
+	checkStrftime("%012F", &tm1, "002016-01-03");
+	checkStrftime("%+10F", &tm1, "2016-01-03");
+	checkStrftime("%+11F", &tm1, "+2016-01-03");
+	checkStrftime("%F", &tm2, "+10009-01-05");
+	checkStrftime("%011F", &tm2, "10009-01-05");
+	checkStrftime("%F", &tm3, "0000-02-23");
+	checkStrftime("%01F", &tm3, "0-02-23");
+	checkStrftime("%06F", &tm3, "0-02-23");
+	checkStrftime("%010F", &tm3, "0000-02-23");
+	checkStrftime("%F", &tm4, "-123-01-01");
+	checkStrftime("%011F", &tm4, "-0123-01-01");
+
+	checkStrftime("%g", &tm1, "15");
+	checkStrftime("%g", &tm2, "09");
+
+	checkStrftime("%G", &tm1, "2015");
+	checkStrftime("%+5G", &tm1, "+2015");
+	checkStrftime("%04G", &tm2, "10009");
+
+	checkStrftime("%r", &tm1, "01:23:45 PM");
+	checkStrftime("%r", &tm2, "05:17:53 AM");
+	checkStrftime("%r", &tm3, "12:00:00 PM");
+	checkStrftime("%r", &tm4, "12:00:00 AM");
+
+	// The "%s" specifier was accepted by the Austin Group for the next POSIX.1
+	// revision. See http://austingroupbugs.net/view.php?id=169
+	checkStrftime("%s", &tm1, "1451827425");
+	if (sizeof(time_t) * CHAR_BIT >= 64) {
+		checkStrftime("%s", &tm2, "253686748673");
+	}
+
+	checkStrftime("%T", &tm1, "13:23:45");
+	checkStrftime("%T", &tm2, "05:17:53");
+	checkStrftime("%T", &tm3, "12:00:00");
+	checkStrftime("%T", &tm4, "00:00:00");
+
+	checkStrftime("%U", &tm1, "01");
+	checkStrftime("%U", &tm2, "01");
+	checkStrftime("%U", &tm3, "08");
+
+	checkStrftime("%V", &tm1, "53");
+	checkStrftime("%V", &tm2, "02");
+	checkStrftime("%V", &tm3, "08");
+
+	checkStrftime("%W", &tm1, "00");
+	checkStrftime("%W", &tm2, "01");
+	checkStrftime("%W", &tm3, "08");
+
+	checkStrftime("%x", &tm1, "01/03/16");
+	checkStrftime("%X", &tm1, "13:23:45");
+	checkStrftime("%y", &tm1, "16");
+
+	// There is no standard that explicitly specifies the exact format of "%Y".
+	// The C standard says that "%F" is equivalent to "%Y-%m-%d". The
+	// POSIX.1-2008 standard says that "%F" is equivalent to "%+4Y-%m-%d".
+	// This implies that to conform to both standards "%Y" needs to be
+	// equivalent to "%+4Y".
+	// See also http://austingroupbugs.net/view.php?id=739
+	checkStrftime("%Y", &tm1, "2016");
+	checkStrftime("%05Y", &tm1, "02016");
+	checkStrftime("%+4Y", &tm1, "2016");
+	checkStrftime("%+5Y", &tm1, "+2016");
+	checkStrftime("%Y", &tm2, "+10009");
+	checkStrftime("%05Y", &tm2, "10009");
+	checkStrftime("%Y", &tm3, "0000");
+	checkStrftime("%02Y", &tm3, "00");
+	checkStrftime("%+5Y", &tm3, "+0000");
+	checkStrftime("%Y", &tm4, "-123");
+	checkStrftime("%+4Y", &tm4, "-123");
+	checkStrftime("%+5Y", &tm4, "-0123");
+
+	if (INT_MAX == 0x7FFFFFFF) {
+		// The standard does not specify any range for tm_year, so INT_MAX
+		// should be valid.
+		checkStrftime("%y", &tm5, "47");
+		checkStrftime("%Y", &tm5, "+2147485547");
+		checkStrftime("%011Y", &tm5, "02147485547");
+		if (sizeof(time_t) * CHAR_BIT >= 64) {
+			checkStrftime("%s", &tm5, "67768036160140800");
+		}
+	}
+
+	return t_status;
+}
-- 
2.15.1


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bugs in strftime
  2018-02-06 20:24     ` Dennis Wölfing
@ 2018-02-07 20:35       ` Szabolcs Nagy
  0 siblings, 0 replies; 7+ messages in thread
From: Szabolcs Nagy @ 2018-02-07 20:35 UTC (permalink / raw)
  To: musl

* Dennis Wölfing <denniswoelfing@gmx.de> [2018-02-06 21:24:12 +0100]:
> On 06.02.2018 18:46, Rich Felker wrote:
> > Attaching a patch I intend to push if I don't find any problems, and
> > the one I've already got queued up for the %F issue. Let me know if
> > you see any issues with them.
> 
> With these two patches applied I no longer see any failures.
> 
> I have attached the patch for libc-test.
> The test in the patch now uses the test.h header from libc-test and is
> now indented by tabs. Otherwise the test is still mostly the same.

thanks, applied


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-02-07 20:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-02 17:07 Bugs in strftime Dennis Wölfing
2018-02-05 17:51 ` Rich Felker
2018-02-06 16:10   ` Dennis Wölfing
2018-02-06 17:30     ` Rich Felker
2018-02-06 17:46   ` Rich Felker
2018-02-06 20:24     ` Dennis Wölfing
2018-02-07 20:35       ` Szabolcs Nagy

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/musl/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).