Date: Fri, 4 Mar 2011 23:24:37 +0100
From: Ingo Schwarze
To: tech@mdocml.bsd.lv
Cc: Tim van der Molen, jmc@openbsd.org
Subject: Re: clean up date handling
Message-ID: <20110304222437.GA21838@iris.usta.de>
References: <20110304001451.GA8690@iris.usta.de>
In-Reply-To: <20110304001451.GA8690@iris.usta.de>

Hi,

replying to myself...

> Tim recently noticed inconsistencies between mandoc date handling
> and what the manuals say about it. Having a closer look, i came
> to the conclusion that this is one of the rare fields where
> compatibility with new groff really makes no sense because groff
> makes too many bad decisions. Besides, it is well-known that
> this area is one of the worst sources of warnings, and i feel
> these warnings are not really useful.
>
> So, here is a radical cleanup of date handling (-99 +69).

Updated version -97 +85.

> Basically, it stores all dates as strings everywhere, which simplifies
> handling a lot, in particular in the frontends, and it tries the three
> most common formats everywhere; in case of mismatch it just copies the
> date string verbatim, or if there is no string at all, it uses the
> current date.
>
> This needs more testing, but i won't continue on it tonight, so i'm
> showing it around for comments.
>
> Note that i removed all warnings. Maybe that's ok.
> Maybe we should sparingly restore a few.
> I'm going to reconsider that aspect.

I have restored some warnings - both those issued when the date is
missing completely and those issued when it cannot be parsed, but
i suggest accepting all formats everywhere to reduce noise.
It's hard to see where exactly date format rules come from, and
practical usage varies wildly. In case we find other common input
formats in the wild, we can simply add them to mandoc_normdate().

This is now tested as well, both by systematically exercising all
code paths and by running a build of all OpenBSD manuals.

OK to commit?

> I'm planning to look at the manuals once we agree about the
> functionality we want.

I'm now going to look at the manuals to document the changes,
as far as that's required.

Yours,
  Ingo

P.S.
By the way, Tim's original suggestion to remove ewarn_ge1
from posts_dd is now contained in the diff as well,-)
because the warning about a missing date is now handled
in mandoc_normdate().

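For readers who want the fallback order described above in one place,
here is a minimal, self-contained sketch. It is not the code from the
diff: norm_date() and parse_date() are hypothetical names, the sketch
assumes that a2time() is a thin wrapper around strptime(3) that insists
on consuming the whole string, it leaves out the message callback, and
it returns a heap-allocated string that the caller has to free.

```c
#define _XOPEN_SOURCE 700	/* strptime(), strdup(), localtime_r() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper, assumed to behave like a2time() in the diff:
 * succeed only if fmt consumes the whole input string.
 */
static int
parse_date(struct tm *tm, const char *fmt, const char *p)
{
	const char	*end;

	assert(tm != NULL && fmt != NULL && p != NULL);
	memset(tm, 0, sizeof(*tm));
	end = strptime(p, fmt, tm);
	return end != NULL && *end == '\0';
}

/*
 * Hypothetical stand-in for mandoc_normdate(), without the message
 * callback.  Returns a malloc'ed string (caller frees) or NULL on
 * allocation or formatting failure.
 */
char *
norm_date(const char *in)
{
	/* "$" is kept separate so CVS does not expand the keyword. */
	static const char *const fmts[] = {
		"$" "Mdocdate: %b %d %Y $",
		"%b %d, %Y",
		"%Y-%m-%d"
	};
	struct tm	 tm;
	char		 month[32], buf[80];
	time_t		 now;
	size_t		 i;
	int		 ok = 0;

	if (in == NULL || *in == '\0' ||
	    strcmp(in, "$" "Mdocdate$") == 0) {
		/* No usable date at all: use today's date. */
		now = time(NULL);
		if (localtime_r(&now, &tm) == NULL)
			return NULL;
		ok = 1;
	} else {
		/* Try the known input formats in order. */
		for (i = 0; !ok && i < sizeof(fmts) / sizeof(fmts[0]); i++)
			ok = parse_date(&tm, fmts[i], in);
	}

	/* Unknown format: hand the input back verbatim. */
	if (!ok)
		return strdup(in);

	/* Canonical output form: "Month D, YYYY". */
	if (strftime(month, sizeof(month), "%B", &tm) == 0)
		return NULL;
	snprintf(buf, sizeof(buf), "%s %d, %d",
	    month, tm.tm_mday, tm.tm_year + 1900);
	return strdup(buf);
}
```

In the default C locale, norm_date("2011-03-04") and
norm_date("Mar 4, 2011") should both come back as "March 4, 2011",
while a string that matches none of the formats comes back unchanged.
Adding another input format then only means adding one entry to fmts[].
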
Index: libmandoc.h
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/libmandoc.h,v
retrieving revision 1.8
diff -u -p -r1.8 libmandoc.h
--- libmandoc.h	3 Jan 2011 22:27:21 -0000	1.8
+++ libmandoc.h	4 Mar 2011 22:07:55 -0000
@@ -25,11 +25,7 @@ char *mandoc_strdup(const char *);
 void *mandoc_malloc(size_t);
 void *mandoc_realloc(void *, size_t);
 char *mandoc_getarg(char **, mandocmsg, void *, int, int *);
-time_t mandoc_a2time(int, const char *);
-#define MTIME_CANONICAL (1 << 0)
-#define MTIME_REDUCED (1 << 1)
-#define MTIME_MDOCDATE (1 << 2)
-#define MTIME_ISO_8601 (1 << 3)
+char *mandoc_normdate(char *, mandocmsg, void *, int, int);
 int mandoc_eos(const char *, size_t, int);
 int mandoc_hyph(const char *, const char *);

Index: main.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/main.c,v
retrieving revision 1.72
diff -u -p -r1.72 main.c
--- main.c	6 Feb 2011 17:33:20 -0000	1.72
+++ main.c	4 Mar 2011 22:07:56 -0000
@@ -113,7 +113,8 @@ static const char * const mandocerrs[MAN
 	"no title in document",
 	"document title should be all caps",
 	"unknown manual section",
-	"cannot parse date argument",
+	"date missing, using today's date",
+	"cannot parse date, using it verbatim",
 	"prologue macros out of order",
 	"duplicate prologue macro",
 	"macro not allowed in prologue",
Index: man.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/man.c,v
retrieving revision 1.55
diff -u -p -r1.55 man.c
--- man.c	10 Feb 2011 00:06:30 -0000	1.55
+++ man.c	4 Mar 2011 22:07:56 -0000
@@ -142,8 +142,8 @@ man_free1(struct man *man)
 		free(man->meta.title);
 	if (man->meta.source)
 		free(man->meta.source);
-	if (man->meta.rawdate)
-		free(man->meta.rawdate);
+	if (man->meta.date)
+		free(man->meta.date);
 	if (man->meta.vol)
 		free(man->meta.vol);
 	if (man->meta.msec)
Index: man.h
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/man.h,v
retrieving revision 1.34
diff -u -p -r1.34 man.h
--- man.h	16 Jan 2011 02:56:47 -0000	1.34
+++ man.h	4 Mar 2011 22:07:56 -0000
@@ -75,8 +75,7 @@ enum man_type {
  */
 struct man_meta {
 	char *msec; /* `TH' section (1, 3p, etc.) */
-	time_t date; /* `TH' normalised date */
-	char *rawdate; /* raw `TH' date */
+	char *date; /* `TH' normalised date */
 	char *vol; /* `TH' volume */
 	char *title; /* `TH' title (e.g., FOO) */
 	char *source; /* `TH' source (e.g., GNU) */
Index: man_html.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/man_html.c,v
retrieving revision 1.34
diff -u -p -r1.34 man_html.c
--- man_html.c	17 Jan 2011 00:15:19 -0000	1.34
+++ man_html.c	4 Mar 2011 22:07:56 -0000
@@ -330,12 +330,6 @@ man_root_post(MAN_ARGS)
 {
 	struct htmlpair tag[3];
 	struct tag *t, *tt;
-	char b[DATESIZ];
-
-	if (m->rawdate)
-		strlcpy(b, m->rawdate, DATESIZ);
-	else
-		time2a(m->date, b, DATESIZ);

 	PAIR_SUMMARY_INIT(&tag[0], "Document Footer");
 	PAIR_CLASS_INIT(&tag[1], "foot");
@@ -353,7 +347,7 @@ man_root_post(MAN_ARGS)
 	PAIR_CLASS_INIT(&tag[0], "foot-date");
 	print_otag(h, TAG_TD, 1, tag);

-	print_text(h, b);
+	print_text(h, m->date);
 	print_stagq(h, tt);

 	PAIR_CLASS_INIT(&tag[0], "foot-os");
Index: man_term.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/man_term.c,v
retrieving revision 1.64
diff -u -p -r1.64 man_term.c
--- man_term.c	25 Jan 2011 12:35:07 -0000	1.64
+++ man_term.c	4 Mar 2011 22:07:56 -0000
@@ -939,24 +939,18 @@ print_man_nodelist(DECL_ARGS)
 static void
 print_man_foot(struct termp *p, const void *arg)
 {
-	char buf[DATESIZ];
 	const struct man_meta *meta;

 	meta = (const struct man_meta *)arg;

 	term_fontrepl(p, TERMFONT_NONE);

-	if (meta->rawdate)
-		strlcpy(buf, meta->rawdate, DATESIZ);
-	else
-		time2a(meta->date, buf, DATESIZ);
-
 	term_vspace(p);
 	term_vspace(p);
 	term_vspace(p);

 	p->flags |= TERMP_NOSPACE | TERMP_NOBREAK;
-	p->rmargin = p->maxrmargin - term_strlen(p, buf);
+	p->rmargin = p->maxrmargin - term_strlen(p, meta->date);
 	p->offset = 0;

 	/* term_strlen() can return zero. */
@@ -974,7 +968,7 @@ print_man_foot(struct termp *p, const vo
 	p->rmargin = p->maxrmargin;
 	p->flags &= ~TERMP_NOBREAK;

-	term_word(p, buf);
+	term_word(p, meta->date);
 	term_flushln(p);
 }

Index: man_validate.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/man_validate.c,v
retrieving revision 1.40
diff -u -p -r1.40 man_validate.c
--- man_validate.c	17 Jan 2011 00:15:19 -0000	1.40
+++ man_validate.c	4 Mar 2011 22:07:56 -0000
@@ -187,8 +187,9 @@ check_root(CHKARGS)
 		 */

 		m->meta.title = mandoc_strdup("unknown");
-		m->meta.date = time(NULL);
 		m->meta.msec = mandoc_strdup("1");
+		m->meta.date = mandoc_normdate(NULL,
+		    m->msg, m->data, n->line, n->pos);
 	}

 	return(1);
@@ -370,6 +371,7 @@ static int
 post_TH(CHKARGS)
 {
 	const char *p;
+	int line, pos;

 	if (m->meta.title)
 		free(m->meta.title);
@@ -379,12 +381,13 @@ post_TH(CHKARGS)
 		free(m->meta.source);
 	if (m->meta.msec)
 		free(m->meta.msec);
-	if (m->meta.rawdate)
-		free(m->meta.rawdate);
+	if (m->meta.date)
+		free(m->meta.date);

-	m->meta.title = m->meta.vol = m->meta.rawdate =
+	line = n->line;
+	pos = n->pos;
+	m->meta.title = m->meta.vol = m->meta.date =
 	    m->meta.msec = m->meta.source = NULL;
-	m->meta.date = 0;

 	/* ->TITLE<- MSEC DATE SOURCE VOL */

@@ -412,24 +415,12 @@ post_TH(CHKARGS)

 	/* TITLE MSEC ->DATE<- SOURCE VOL */

-	/*
-	 * Try to parse the date. If this works, stash the epoch (this
-	 * is optimal because we can reformat it in the canonical form).
-	 * If it doesn't parse, isn't specified at all, or is an empty
-	 * string, then use the current date.
-	 */
-
 	if (n)
 		n = n->next;
-	if (n && n->string && *n->string) {
-		m->meta.date = mandoc_a2time
-			(MTIME_ISO_8601, n->string);
-		if (0 == m->meta.date) {
-			man_nmsg(m, n, MANDOCERR_BADDATE);
-			m->meta.rawdate = mandoc_strdup(n->string);
-		}
-	} else
-		m->meta.date = time(NULL);
+	if (n)
+		pos = n->pos;
+	m->meta.date = mandoc_normdate(n ? n->string : NULL,
+	    m->msg, m->data, line, pos);

 	/* TITLE MSEC DATE ->SOURCE<- VOL */

Index: mandoc.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mandoc.c,v
retrieving revision 1.21
diff -u -p -r1.21 mandoc.c
--- mandoc.c	3 Jan 2011 22:27:21 -0000	1.21
+++ mandoc.c	4 Mar 2011 22:07:56 -0000
@@ -27,8 +27,10 @@
 #include "mandoc.h"
 #include "libmandoc.h"

-static int a2time(time_t *, const char *, const char *);
+#define DATESIZE 32

+static int a2time(time_t *, const char *, const char *);
+static char *time2a(time_t);

 int
 mandoc_special(char *p)
@@ -376,38 +378,55 @@ a2time(time_t *t, const char *fmt, const
 }


-/*
- * Convert from a manual date string (see mdoc(7) and man(7)) into a
- * date according to the stipulated date type.
- */
-time_t
-mandoc_a2time(int flags, const char *p)
+static char *
+time2a(time_t t)
+{
+	struct tm tm;
+	char buf[DATESIZE];
+	char *p;
+	size_t nsz, rsz;
+	int isz;
+
+	localtime_r(&t, &tm);
+
+	p = buf;
+	rsz = DATESIZE;
+
+	if (0 == (nsz = strftime(p, rsz, "%B ", &tm)))
+		return(NULL);
+
+	p += (int)nsz;
+	rsz -= nsz;
+
+	if (-1 == (isz = snprintf(p, rsz, "%d, ", tm.tm_mday)))
+		return(NULL);
+
+	p += isz;
+	rsz -= isz;
+
+	return(strftime(p, rsz, "%Y", &tm) ? buf : NULL);
+}
+
+
+char *
+mandoc_normdate(char *in, mandocmsg msg, void *data, int ln, int pos)
 {
+	char *out;
 	time_t t;

-	if (MTIME_MDOCDATE & flags) {
-		if (0 == strcmp(p, "$" "Mdocdate$"))
-			return(time(NULL));
-		if (a2time(&t, "$" "Mdocdate: %b %d %Y $", p))
-			return(t);
-	}
-
-	if (MTIME_CANONICAL & flags || MTIME_REDUCED & flags)
-		if (a2time(&t, "%b %d, %Y", p))
-			return(t);
-
-	if (MTIME_ISO_8601 & flags)
-		if (a2time(&t, "%Y-%m-%d", p))
-			return(t);
-
-	if (MTIME_REDUCED & flags) {
-		if (a2time(&t, "%d, %Y", p))
-			return(t);
-		if (a2time(&t, "%Y", p))
-			return(t);
+	if (NULL == in || '\0' == *in ||
+	    0 == strcmp(in, "$" "Mdocdate$")) {
+		(*msg)(MANDOCERR_NODATE, data, ln, pos, NULL);
+		time(&t);
+	}
+	else if (!a2time(&t, "$" "Mdocdate: %b %d %Y $", in) &&
+	    !a2time(&t, "%b %d, %Y", in) &&
+	    !a2time(&t, "%Y-%m-%d", in)) {
+		(*msg)(MANDOCERR_BADDATE, data, ln, pos, NULL);
+		t = 0;
 	}
-
-	return(0);
+	out = t ? time2a(t) : NULL;
+	return(mandoc_strdup(out ? out : in));
 }


Index: mandoc.h
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mandoc.h,v
retrieving revision 1.33
diff -u -p -r1.33 mandoc.h
--- mandoc.h	10 Feb 2011 00:06:30 -0000	1.33
+++ mandoc.h	4 Mar 2011 22:07:56 -0000
@@ -50,7 +50,8 @@ enum mandocerr {
 	MANDOCERR_NOTITLE, /* no title in document */
 	MANDOCERR_UPPERCASE, /* document title should be all caps */
 	MANDOCERR_BADMSEC, /* unknown manual section */
-	MANDOCERR_BADDATE, /* cannot parse date argument */
+	MANDOCERR_NODATE, /* date missing, using today's date */
+	MANDOCERR_BADDATE, /* cannot parse date, using it verbatim */
 	MANDOCERR_PROLOGOOO, /* prologue macros out of order */
 	MANDOCERR_PROLOGREP, /* duplicate prologue macro */
 	MANDOCERR_BADPROLOG, /* macro not allowed in prologue */
Index: mdoc.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mdoc.c,v
retrieving revision 1.79
diff -u -p -r1.79 mdoc.c
--- mdoc.c	10 Feb 2011 00:06:30 -0000	1.79
+++ mdoc.c	4 Mar 2011 22:07:57 -0000
@@ -314,8 +314,9 @@ mdoc_macro(MACRO_PROT_ARGS)
 			m->meta.vol = mandoc_strdup("LOCAL");
 		if (NULL == m->meta.os)
 			m->meta.os = mandoc_strdup("LOCAL");
-		if (0 == m->meta.date)
-			m->meta.date = time(NULL);
+		if (NULL == m->meta.date)
+			m->meta.date = mandoc_normdate(NULL,
+			    m->msg, m->data, line, ppos);
 		m->flags |= MDOC_PBODY;
 	}

Index: mdoc.h
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mdoc.h,v
retrieving revision 1.43
diff -u -p -r1.43 mdoc.h
--- mdoc.h	30 Jan 2011 17:41:59 -0000	1.43
+++ mdoc.h	4 Mar 2011 22:07:57 -0000
@@ -231,7 +231,7 @@ struct mdoc_meta {
 	char *msec; /* `Dt' section (1, 3p, etc.) */
 	char *vol; /* `Dt' volume (implied) */
 	char *arch; /* `Dt' arch (i386, etc.) */
-	time_t date; /* `Dd' normalised date */
+	char *date; /* `Dd' normalised date */
 	char *title; /* `Dt' title (FOO, etc.) */
 	char *os; /* `Os' system (OpenBSD, etc.) */
 	char *name; /* leading `Nm' name */
Index: mdoc_html.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mdoc_html.c,v
retrieving revision 1.52
diff -u -p -r1.52 mdoc_html.c
--- mdoc_html.c	6 Feb 2011 22:56:45 -0000	1.52
+++ mdoc_html.c	4 Mar 2011 22:07:57 -0000
@@ -488,9 +488,6 @@ mdoc_root_post(MDOC_ARGS)
 {
 	struct htmlpair tag[3];
 	struct tag *t, *tt;
-	char b[DATESIZ];
-
-	time2a(m->date, b, DATESIZ);

 	PAIR_SUMMARY_INIT(&tag[0], "Document Footer");
 	PAIR_CLASS_INIT(&tag[1], "foot");
@@ -510,7 +507,7 @@ mdoc_root_post(MDOC_ARGS)
 	PAIR_CLASS_INIT(&tag[0], "foot-date");
 	print_otag(h, TAG_TD, 1, tag);

-	print_text(h, b);
+	print_text(h, m->date);
 	print_stagq(h, tt);

 	PAIR_CLASS_INIT(&tag[0], "foot-os");
Index: mdoc_term.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mdoc_term.c,v
retrieving revision 1.129
diff -u -p -r1.129 mdoc_term.c
--- mdoc_term.c	6 Feb 2011 22:56:45 -0000	1.129
+++ mdoc_term.c	4 Mar 2011 22:07:57 -0000
@@ -404,7 +404,6 @@ print_mdoc_node(DECL_ARGS)
 static void
 print_mdoc_foot(struct termp *p, const void *arg)
 {
-	char buf[DATESIZ], os[BUFSIZ];
 	const struct mdoc_meta *m;

 	m = (const struct mdoc_meta *)arg;
@@ -419,24 +418,21 @@ print_mdoc_foot(struct termp *p, const v
 	 * SYSTEM DATE SYSTEM
 	 */

-	time2a(m->date, buf, DATESIZ);
-	strlcpy(os, m->os, BUFSIZ);
-
 	term_vspace(p);

 	p->offset = 0;
 	p->rmargin = (p->maxrmargin -
-	    term_strlen(p, buf) + term_len(p, 1)) / 2;
+	    term_strlen(p, m->date) + term_len(p, 1)) / 2;
 	p->flags |= TERMP_NOSPACE | TERMP_NOBREAK;

-	term_word(p, os);
+	term_word(p, m->os);
 	term_flushln(p);

 	p->offset = p->rmargin;
-	p->rmargin = p->maxrmargin - term_strlen(p, os);
+	p->rmargin = p->maxrmargin - term_strlen(p, m->os);
 	p->flags |= TERMP_NOLPAD | TERMP_NOSPACE;

-	term_word(p, buf);
+	term_word(p, m->date);
 	term_flushln(p);

 	p->offset = p->rmargin;
@@ -444,7 +440,7 @@ print_mdoc_foot(struct termp *p, const v
 	p->flags &= ~TERMP_NOBREAK;
 	p->flags |= TERMP_NOLPAD | TERMP_NOSPACE;

-	term_word(p, os);
+	term_word(p, m->os);
 	term_flushln(p);

 	p->offset = 0;
Index: mdoc_validate.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mdoc_validate.c,v
retrieving revision 1.88
diff -u -p -r1.88 mdoc_validate.c
--- mdoc_validate.c	6 Feb 2011 17:33:21 -0000	1.88
+++ mdoc_validate.c	4 Mar 2011 22:07:58 -0000
@@ -1,6 +1,7 @@
 /* $Id: mdoc_validate.c,v 1.88 2011/02/06 17:33:21 schwarze Exp $ */
 /*
  * Copyright (c) 2008, 2009, 2010, 2011 Kristaps Dzonsons
+ * Copyright (c) 2011 Ingo Schwarze
  *
  * Permission to use, copy, modify, and distribute this software for any
  * purpose with or without fee is hereby granted, provided that the above
@@ -135,7 +136,7 @@ static v_post posts_bx[] = { post_bx, N
 static v_post posts_bool[] = { ebool, NULL };
 static v_post posts_eoln[] = { post_eoln, NULL };
 static v_post posts_defaults[] = { post_defaults, NULL };
-static v_post posts_dd[] = { ewarn_ge1, post_dd, post_prol, NULL };
+static v_post posts_dd[] = { post_dd, post_prol, NULL };
 static v_post posts_dl[] = { post_literal, bwarn_ge1, NULL };
 static v_post posts_dt[] = { post_dt, post_prol, NULL };
 static v_post posts_fo[] = { hwarn_eq1, bwarn_ge1, NULL };
@@ -216,7 +217,7 @@ const struct valids mdoc_valids[MDOC_MAX
 	{ NULL, posts_text }, /* Xr */
 	{ NULL, posts_text }, /* %A */
 	{ NULL, posts_text }, /* %B */ /* FIXME: can be used outside Rs/Re. */
-	{ NULL, posts_text }, /* %D */ /* FIXME: check date with mandoc_a2time(). */
+	{ NULL, posts_text }, /* %D */
 	{ NULL, posts_text }, /* %I */
 	{ NULL, posts_text }, /* %J */
 	{ NULL, posts_text }, /* %N */
@@ -910,7 +911,7 @@ static int
 pre_dt(PRE_ARGS)
 {

-	if (0 == mdoc->meta.date || mdoc->meta.os)
+	if (NULL == mdoc->meta.date || mdoc->meta.os)
 		mdoc_nmsg(mdoc, n, MANDOCERR_PROLOGOOO);

 	if (mdoc->meta.title)
@@ -923,7 +924,7 @@ static int
 pre_os(PRE_ARGS)
 {

-	if (NULL == mdoc->meta.title || 0 == mdoc->meta.date)
+	if (NULL == mdoc->meta.title || NULL == mdoc->meta.date)
 		mdoc_nmsg(mdoc, n, MANDOCERR_PROLOGOOO);

 	if (mdoc->meta.os)
@@ -1962,23 +1963,21 @@ post_dd(POST_ARGS)
 	char buf[DATESIZE];
 	struct mdoc_node *n;

-	n = mdoc->last;
+	if (mdoc->meta.date)
+		free(mdoc->meta.date);

-	if (NULL == n->child) {
-		mdoc->meta.date = time(NULL);
+	n = mdoc->last;
+	if (NULL == n->child || '\0' == n->child->string[0]) {
+		mdoc->meta.date = mandoc_normdate(NULL,
+		    mdoc->msg, mdoc->data, n->line, n->pos);
 		return(1);
 	}

 	if ( ! concat(mdoc, buf, n->child, DATESIZE))
 		return(0);

-	mdoc->meta.date = mandoc_a2time
-		(MTIME_MDOCDATE | MTIME_CANONICAL, buf);
-
-	if (0 == mdoc->meta.date) {
-		mdoc_nmsg(mdoc, n, MANDOCERR_BADDATE);
-		mdoc->meta.date = time(NULL);
-	}
+	mdoc->meta.date = mandoc_normdate(buf,
+	    mdoc->msg, mdoc->data, n->line, n->pos);

 	return(1);
 }
Index: out.h
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/out.h,v
retrieving revision 1.8
diff -u -p -r1.8 out.h
--- out.h	30 Jan 2011 16:05:29 -0000	1.8
+++ out.h	4 Mar 2011 22:07:58 -0000
@@ -17,8 +17,6 @@
 #ifndef OUT_H
 #define OUT_H

-#define DATESIZ 24
-
 __BEGIN_DECLS

 struct roffcol {
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv