From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp1.rz.uni-karlsruhe.de (Debian-exim@smtp1.rz.uni-karlsruhe.de [129.13.185.217]) by krisdoz.my.domain (8.14.3/8.14.3) with ESMTP id p02LFt8r019018 for ; Sun, 2 Jan 2011 16:15:55 -0500 (EST) Received: from hekate.usta.de (asta-nat.asta.uni-karlsruhe.de [172.22.63.82]) by smtp1.rz.uni-karlsruhe.de with esmtp (Exim 4.63 #1) id 1PZVHN-0003Yk-Ka; Sun, 02 Jan 2011 22:15:53 +0100 Received: from donnerwolke.usta.de ([172.24.96.3]) by hekate.usta.de with esmtp (Exim 4.72) (envelope-from ) id 1PZVHN-0004Re-JE for tech@mdocml.bsd.lv; Sun, 02 Jan 2011 22:15:53 +0100 Received: from iris.usta.de ([172.24.96.5] helo=usta.de) by donnerwolke.usta.de with esmtp (Exim 4.69) (envelope-from ) id 1PZVHN-0001nE-Hu for tech@mdocml.bsd.lv; Sun, 02 Jan 2011 22:15:53 +0100 Received: from schwarze by usta.de with local (Exim 4.72) (envelope-from ) id 1PZVHN-0001Q5-CP for tech@mdocml.bsd.lv; Sun, 02 Jan 2011 22:15:53 +0100 Date: Sun, 2 Jan 2011 22:15:53 +0100 From: Ingo Schwarze To: tech@mdocml.bsd.lv Subject: [PATCH] partial cleanup of mdoc(7) arg count validation Message-ID: <20110102211552.GB21085@iris.usta.de> X-Mailinglist: mdocml-tech Reply-To: tech@mdocml.bsd.lv MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Hi, when i ran into a few segfaults, i had a look at argument count validation in libmdoc, and it revealed itself as a can of worms. First, i did some framework improvements: * Added MANDOCERR_ARGCWARN to support real warnings. * Removed CHECK_FATAL because it was unused, and i can't see how a wrong number of arguments on any macro line could ever force us to abort the parser: For missing arguments, one can assume defaults, and excessive ones can be discarded. * Use the correct level in check_count(). * Improve the wording: "more than N children", not "greater than". Then, the bug fixes that made me look at this: * Don't segfault on empty .Rs. * Don't segfault on empty .Sm and .Db; do the required checks in ebool(). * Don't segfault on empty .St; do the required checks in post_st(). While there, i did lots of message level adjustments, all downgrades from error to warnings, because none of these is likely to loose relevant information. Some macros had their own validation function anyway, so use these to do all checks, making the code easier to follow: * .An -split with additional arguments * .Lb with wrong number of arguments (all checks in post_lb) * .Rs with header arguments * .Rs with an empty body (all checks in post_rs) * posts_text1: These three macros (.In, .Pf, .%U) have more problems: Empty .In and .%U should be removed, and the first word after .Pf should not be parsed, but i don't touch that now. * .sp with excessive arguments The macros .Dl and .D1 don't need their HEAD children check: These macros are blk_part_imp, so they cannot have HEAD children; we could make it an assertion, but they have no dedicated validation functions, so i guess it's not worth the effort. After doing all this, i could emove the now unused functions eerr_eq0, eerr_eq1, eerr_le1, herr_eq0, herr_ge1, only adding ewarn_eq1, ewarn_le1, hwarn_ge1. The bad news is that eerr_ge1 still needs to be checked. Probably, it can be changed to ewarn_ge1; but after finding so many issues with the other functions, i'm reluctant to do that blindly, so i'm postponing it for now. It is only used by .Ad .Cd .Er .Fn .Ic .%A .%B .%D .%I .%J .%N .%O .%P .%R .%T .%V .Ms .Sx .Sy .Tn .Lk .%C .%Q .Vt, so that should be a quick check. ;-/ OK for what i have so far? Yours, Ingo Index: main.c =================================================================== RCS file: /cvs/src/usr.bin/mandoc/main.c,v retrieving revision 1.62 diff -u -p -r1.62 main.c --- main.c 21 Dec 2010 01:22:00 -0000 1.62 +++ main.c 2 Jan 2011 21:14:22 -0000 @@ -138,6 +138,7 @@ static const char * const mandocerrs[MAN /* related to missing macro arguments */ "skipping empty macro", + "argument count wrong", "missing display type", "list type must come first", "tag lists require a width argument", Index: mandoc.h =================================================================== RCS file: /cvs/src/usr.bin/mandoc/mandoc.h,v retrieving revision 1.25 diff -u -p -r1.25 mandoc.h --- mandoc.h 9 Dec 2010 23:01:18 -0000 1.25 +++ mandoc.h 2 Jan 2011 21:14:22 -0000 @@ -75,6 +75,7 @@ enum mandocerr { /* related to missing macro arguments */ MANDOCERR_MACROEMPTY, /* skipping empty macro */ + MANDOCERR_ARGCWARN, /* argument count wrong */ MANDOCERR_DISPTYPE, /* missing display type */ MANDOCERR_LISTFIRST, /* list type must come first */ MANDOCERR_NOWIDTHARG, /* tag lists require a width argument */ Index: mdoc_validate.c =================================================================== RCS file: /cvs/src/usr.bin/mandoc/mdoc_validate.c,v retrieving revision 1.82 diff -u -p -r1.82 mdoc_validate.c --- mdoc_validate.c 29 Dec 2010 00:47:31 -0000 1.82 +++ mdoc_validate.c 2 Jan 2011 21:14:22 -0000 @@ -53,7 +53,6 @@ enum check_ineq { enum check_lvl { CHECK_WARN, CHECK_ERROR, - CHECK_FATAL }; typedef int (*v_pre)(PRE_ARGS); @@ -78,16 +77,14 @@ static int concat(struct mdoc *, char * static int ebool(POST_ARGS); static int berr_ge1(POST_ARGS); static int bwarn_ge1(POST_ARGS); -static int eerr_eq0(POST_ARGS); -static int eerr_eq1(POST_ARGS); static int eerr_ge1(POST_ARGS); -static int eerr_le1(POST_ARGS); static int ewarn_eq0(POST_ARGS); +static int ewarn_eq1(POST_ARGS); static int ewarn_ge1(POST_ARGS); -static int herr_eq0(POST_ARGS); -static int herr_ge1(POST_ARGS); +static int ewarn_le1(POST_ARGS); static int hwarn_eq0(POST_ARGS); static int hwarn_eq1(POST_ARGS); +static int hwarn_ge1(POST_ARGS); static int hwarn_le1(POST_ARGS); static int post_an(POST_ARGS); @@ -138,29 +135,29 @@ static v_post posts_bd[] = { post_liter static v_post posts_bf[] = { hwarn_le1, post_bf, NULL }; static v_post posts_bk[] = { hwarn_eq0, bwarn_ge1, NULL }; static v_post posts_bl[] = { bwarn_ge1, post_bl, NULL }; -static v_post posts_bool[] = { eerr_eq1, ebool, NULL }; +static v_post posts_bool[] = { ebool, NULL }; static v_post posts_eoln[] = { post_eoln, NULL }; static v_post posts_defaults[] = { post_defaults, NULL }; static v_post posts_dd[] = { ewarn_ge1, post_dd, post_prol, NULL }; -static v_post posts_dl[] = { post_literal, bwarn_ge1, herr_eq0, NULL }; +static v_post posts_dl[] = { post_literal, bwarn_ge1, NULL }; static v_post posts_dt[] = { post_dt, post_prol, NULL }; static v_post posts_fo[] = { hwarn_eq1, bwarn_ge1, NULL }; static v_post posts_it[] = { post_it, NULL }; -static v_post posts_lb[] = { eerr_eq1, post_lb, NULL }; +static v_post posts_lb[] = { post_lb, NULL }; static v_post posts_nd[] = { berr_ge1, NULL }; static v_post posts_nm[] = { post_nm, NULL }; static v_post posts_notext[] = { ewarn_eq0, NULL }; static v_post posts_os[] = { post_os, post_prol, NULL }; -static v_post posts_rs[] = { berr_ge1, herr_eq0, post_rs, NULL }; -static v_post posts_sh[] = { post_ignpar, herr_ge1, bwarn_ge1, post_sh, NULL }; -static v_post posts_sp[] = { eerr_le1, NULL }; -static v_post posts_ss[] = { post_ignpar, herr_ge1, bwarn_ge1, NULL }; -static v_post posts_st[] = { eerr_eq1, post_st, NULL }; +static v_post posts_rs[] = { post_rs, NULL }; +static v_post posts_sh[] = { post_ignpar, hwarn_ge1, bwarn_ge1, post_sh, NULL }; +static v_post posts_sp[] = { ewarn_le1, NULL }; +static v_post posts_ss[] = { post_ignpar, hwarn_ge1, bwarn_ge1, NULL }; +static v_post posts_st[] = { post_st, NULL }; static v_post posts_std[] = { post_std, NULL }; static v_post posts_text[] = { eerr_ge1, NULL }; -static v_post posts_text1[] = { eerr_eq1, NULL }; +static v_post posts_text1[] = { ewarn_eq1, NULL }; static v_post posts_vt[] = { post_vt, NULL }; -static v_post posts_wline[] = { bwarn_ge1, herr_eq0, NULL }; +static v_post posts_wline[] = { bwarn_ge1, NULL }; static v_post posts_wtext[] = { ewarn_ge1, NULL }; static v_pre pres_an[] = { pre_an, NULL }; static v_pre pres_bd[] = { pre_display, pre_bd, pre_literal, pre_par, NULL }; @@ -380,6 +377,7 @@ check_count(struct mdoc *m, enum mdoc_ty enum check_lvl lvl, enum check_ineq ineq, int val) { const char *p; + enum mandocerr t; if (m->last->type != type) return(1); @@ -391,7 +389,7 @@ check_count(struct mdoc *m, enum mdoc_ty return(1); break; case (CHECK_GT): - p = "greater than "; + p = "more than "; if (m->last->nchild > val) return(1); break; @@ -405,18 +403,10 @@ check_count(struct mdoc *m, enum mdoc_ty /* NOTREACHED */ } - if (CHECK_WARN == lvl) { - return(mdoc_vmsg(m, MANDOCERR_ARGCOUNT, - m->last->line, m->last->pos, - "want %s%d children (have %d)", - p, val, m->last->nchild)); - } + t = lvl == CHECK_WARN ? MANDOCERR_ARGCWARN : MANDOCERR_ARGCOUNT; - /* FIXME: THIS IS THE SAME AS THE ABOVE. */ - - return(mdoc_vmsg(m, MANDOCERR_ARGCOUNT, - m->last->line, m->last->pos, - "require %s%d children (have %d)", + return(mdoc_vmsg(m, t, m->last->line, m->last->pos, + "want %s%d children (have %d)", p, val, m->last->nchild)); } @@ -424,7 +414,7 @@ static int berr_ge1(POST_ARGS) { - return(check_count(mdoc, MDOC_BODY, CHECK_FATAL, CHECK_GT, 0)); + return(check_count(mdoc, MDOC_BODY, CHECK_ERROR, CHECK_GT, 0)); } static int @@ -434,27 +424,9 @@ bwarn_ge1(POST_ARGS) } static int -eerr_eq0(POST_ARGS) -{ - return(check_count(mdoc, MDOC_ELEM, CHECK_FATAL, CHECK_EQ, 0)); -} - -static int -eerr_eq1(POST_ARGS) -{ - return(check_count(mdoc, MDOC_ELEM, CHECK_FATAL, CHECK_EQ, 1)); -} - -static int eerr_ge1(POST_ARGS) { - return(check_count(mdoc, MDOC_ELEM, CHECK_FATAL, CHECK_GT, 0)); -} - -static int -eerr_le1(POST_ARGS) -{ - return(check_count(mdoc, MDOC_ELEM, CHECK_FATAL, CHECK_LT, 2)); + return(check_count(mdoc, MDOC_ELEM, CHECK_ERROR, CHECK_GT, 0)); } static int @@ -464,21 +436,21 @@ ewarn_eq0(POST_ARGS) } static int -ewarn_ge1(POST_ARGS) +ewarn_eq1(POST_ARGS) { - return(check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_GT, 0)); + return(check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 1)); } static int -herr_eq0(POST_ARGS) +ewarn_ge1(POST_ARGS) { - return(check_count(mdoc, MDOC_HEAD, CHECK_FATAL, CHECK_EQ, 0)); + return(check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_GT, 0)); } static int -herr_ge1(POST_ARGS) +ewarn_le1(POST_ARGS) { - return(check_count(mdoc, MDOC_HEAD, CHECK_FATAL, CHECK_GT, 0)); + return(check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_LT, 2)); } static int @@ -494,6 +466,12 @@ hwarn_eq1(POST_ARGS) } static int +hwarn_ge1(POST_ARGS) +{ + return(check_count(mdoc, MDOC_HEAD, CHECK_WARN, CHECK_GT, 0)); +} + +static int hwarn_le1(POST_ARGS) { return(check_count(mdoc, MDOC_HEAD, CHECK_WARN, CHECK_LT, 2)); @@ -1046,6 +1024,8 @@ post_lb(POST_ARGS) char *buf; size_t sz; + check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 1); + assert(mdoc->last->child); assert(MDOC_TEXT == mdoc->last->child->type); @@ -1238,8 +1218,10 @@ post_an(POST_ARGS) struct mdoc_node *np; np = mdoc->last; - if (AUTH__NONE != np->norm->An.auth && np->child) - return(eerr_eq0(mdoc)); + if (AUTH__NONE != np->norm->An.auth && np->child) { + check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 0); + return(1); + } /* * FIXME: make this ewarn and make sure that the front-ends @@ -1581,8 +1563,12 @@ static int ebool(struct mdoc *mdoc) { - if (NULL == mdoc->last->child) + if (NULL == mdoc->last->child) { + mdoc_nmsg(mdoc, mdoc->last, MANDOCERR_MACROEMPTY); + mdoc_node_delete(mdoc, mdoc->last); return(1); + } + check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 1); assert(MDOC_TEXT == mdoc->last->child->type); @@ -1631,18 +1617,23 @@ post_root(POST_ARGS) static int post_st(POST_ARGS) { - const char *p; + struct mdoc_node *ch; + const char *p; - assert(MDOC_TEXT == mdoc->last->child->type); + if (NULL == (ch = mdoc->last->child)) { + mdoc_nmsg(mdoc, mdoc->last, MANDOCERR_MACROEMPTY); + mdoc_node_delete(mdoc, mdoc->last); + return(1); + } - p = mdoc_a2st(mdoc->last->child->string); + assert(MDOC_TEXT == ch->type); - if (p == NULL) { + if (NULL == (p = mdoc_a2st(ch->string))) { mdoc_nmsg(mdoc, mdoc->last, MANDOCERR_BADSTANDARD); mdoc_node_delete(mdoc, mdoc->last); } else { - free(mdoc->last->child->string); - mdoc->last->child->string = mandoc_strdup(p); + free(ch->string); + ch->string = mandoc_strdup(p); } return(1); @@ -1654,8 +1645,18 @@ post_rs(POST_ARGS) struct mdoc_node *nn, *next, *prev; int i, j; - if (MDOC_BODY != mdoc->last->type) + switch (mdoc->last->type) { + case (MDOC_HEAD): + check_count(mdoc, MDOC_HEAD, CHECK_WARN, CHECK_EQ, 0); return(1); + case (MDOC_BODY): + if (mdoc->last->child) + break; + check_count(mdoc, MDOC_BODY, CHECK_WARN, CHECK_GT, 0); + return(1); + default: + return(1); + } /* * Make sure only certain types of nodes are allowed within the -- To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv