tech@mandoc.bsd.lv
 help / color / mirror / Atom feed
From: Ingo Schwarze <schwarze@usta.de>
To: tech@mdocml.bsd.lv
Subject: [PATCH] partial cleanup of mdoc(7) arg count validation
Date: Sun, 2 Jan 2011 22:15:53 +0100	[thread overview]
Message-ID: <20110102211552.GB21085@iris.usta.de> (raw)

Hi,

when i ran into a few segfaults, i had a look at argument count
validation in libmdoc, and it revealed itself as a can of worms.

First, i did some framework improvements:
 * Added MANDOCERR_ARGCWARN to support real warnings.
 * Removed CHECK_FATAL because it was unused, and i can't see how
   a wrong number of arguments on any macro line could ever force
   us to abort the parser:  For missing arguments, one can assume
   defaults, and excessive ones can be discarded.
 * Use the correct level in check_count().
 * Improve the wording: "more than N children", not "greater than".

Then, the bug fixes that made me look at this:
 * Don't segfault on empty .Rs.
 * Don't segfault on empty .Sm and .Db; do the required checks in ebool().
 * Don't segfault on empty .St; do the required checks in post_st().

While there, i did lots of message level adjustments, all downgrades
from error to warnings, because none of these is likely to loose relevant
information.  Some macros had their own validation function anyway,
so use these to do all checks, making the code easier to follow:
 * .An -split with additional arguments
 * .Lb with wrong number of arguments (all checks in post_lb)
 * .Rs with header arguments
 * .Rs with an empty body (all checks in post_rs)
 * posts_text1: These three macros (.In, .Pf, .%U) have more problems:
   Empty .In and .%U should be removed, and the first word after .Pf
   should not be parsed, but i don't touch that now.
 * .sp with excessive arguments

The macros .Dl and .D1 don't need their HEAD children check:
These macros are blk_part_imp, so they cannot have HEAD children;
we could make it an assertion, but they have no dedicated validation
functions, so i guess it's not worth the effort.

After doing all this, i could emove the now unused functions
eerr_eq0, eerr_eq1, eerr_le1, herr_eq0, herr_ge1,
only adding ewarn_eq1, ewarn_le1, hwarn_ge1.

The bad news is that eerr_ge1 still needs to be checked.
Probably, it can be changed to ewarn_ge1; but after finding so many
issues with the other functions, i'm reluctant to do that blindly,
so i'm postponing it for now.  It is only used by .Ad .Cd .Er .Fn
.Ic .%A .%B .%D .%I .%J .%N .%O .%P .%R .%T .%V .Ms .Sx .Sy .Tn .Lk
.%C .%Q .Vt, so that should be a quick check.  ;-/

OK for what i have so far?

Yours,
  Ingo


Index: main.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/main.c,v
retrieving revision 1.62
diff -u -p -r1.62 main.c
--- main.c	21 Dec 2010 01:22:00 -0000	1.62
+++ main.c	2 Jan 2011 21:14:22 -0000
@@ -138,6 +138,7 @@ static	const char * const	mandocerrs[MAN
 
 	/* related to missing macro arguments */
 	"skipping empty macro",
+	"argument count wrong",
 	"missing display type",
 	"list type must come first",
 	"tag lists require a width argument",
Index: mandoc.h
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mandoc.h,v
retrieving revision 1.25
diff -u -p -r1.25 mandoc.h
--- mandoc.h	9 Dec 2010 23:01:18 -0000	1.25
+++ mandoc.h	2 Jan 2011 21:14:22 -0000
@@ -75,6 +75,7 @@ enum	mandocerr {
 
 	/* related to missing macro arguments */
 	MANDOCERR_MACROEMPTY, /* skipping empty macro */
+	MANDOCERR_ARGCWARN, /* argument count wrong */
 	MANDOCERR_DISPTYPE, /* missing display type */
 	MANDOCERR_LISTFIRST, /* list type must come first */
 	MANDOCERR_NOWIDTHARG, /* tag lists require a width argument */
Index: mdoc_validate.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mdoc_validate.c,v
retrieving revision 1.82
diff -u -p -r1.82 mdoc_validate.c
--- mdoc_validate.c	29 Dec 2010 00:47:31 -0000	1.82
+++ mdoc_validate.c	2 Jan 2011 21:14:22 -0000
@@ -53,7 +53,6 @@ enum	check_ineq {
 enum	check_lvl {
 	CHECK_WARN,
 	CHECK_ERROR,
-	CHECK_FATAL
 };
 
 typedef	int	(*v_pre)(PRE_ARGS);
@@ -78,16 +77,14 @@ static	int	 concat(struct mdoc *, char *
 static	int	 ebool(POST_ARGS);
 static	int	 berr_ge1(POST_ARGS);
 static	int	 bwarn_ge1(POST_ARGS);
-static	int	 eerr_eq0(POST_ARGS);
-static	int	 eerr_eq1(POST_ARGS);
 static	int	 eerr_ge1(POST_ARGS);
-static	int	 eerr_le1(POST_ARGS);
 static	int	 ewarn_eq0(POST_ARGS);
+static	int	 ewarn_eq1(POST_ARGS);
 static	int	 ewarn_ge1(POST_ARGS);
-static	int	 herr_eq0(POST_ARGS);
-static	int	 herr_ge1(POST_ARGS);
+static	int	 ewarn_le1(POST_ARGS);
 static	int	 hwarn_eq0(POST_ARGS);
 static	int	 hwarn_eq1(POST_ARGS);
+static	int	 hwarn_ge1(POST_ARGS);
 static	int	 hwarn_le1(POST_ARGS);
 
 static	int	 post_an(POST_ARGS);
@@ -138,29 +135,29 @@ static	v_post	 posts_bd[] = { post_liter
 static	v_post	 posts_bf[] = { hwarn_le1, post_bf, NULL };
 static	v_post	 posts_bk[] = { hwarn_eq0, bwarn_ge1, NULL };
 static	v_post	 posts_bl[] = { bwarn_ge1, post_bl, NULL };
-static	v_post	 posts_bool[] = { eerr_eq1, ebool, NULL };
+static	v_post	 posts_bool[] = { ebool, NULL };
 static	v_post	 posts_eoln[] = { post_eoln, NULL };
 static	v_post	 posts_defaults[] = { post_defaults, NULL };
 static	v_post	 posts_dd[] = { ewarn_ge1, post_dd, post_prol, NULL };
-static	v_post	 posts_dl[] = { post_literal, bwarn_ge1, herr_eq0, NULL };
+static	v_post	 posts_dl[] = { post_literal, bwarn_ge1, NULL };
 static	v_post	 posts_dt[] = { post_dt, post_prol, NULL };
 static	v_post	 posts_fo[] = { hwarn_eq1, bwarn_ge1, NULL };
 static	v_post	 posts_it[] = { post_it, NULL };
-static	v_post	 posts_lb[] = { eerr_eq1, post_lb, NULL };
+static	v_post	 posts_lb[] = { post_lb, NULL };
 static	v_post	 posts_nd[] = { berr_ge1, NULL };
 static	v_post	 posts_nm[] = { post_nm, NULL };
 static	v_post	 posts_notext[] = { ewarn_eq0, NULL };
 static	v_post	 posts_os[] = { post_os, post_prol, NULL };
-static	v_post	 posts_rs[] = { berr_ge1, herr_eq0, post_rs, NULL };
-static	v_post	 posts_sh[] = { post_ignpar, herr_ge1, bwarn_ge1, post_sh, NULL };
-static	v_post	 posts_sp[] = { eerr_le1, NULL };
-static	v_post	 posts_ss[] = { post_ignpar, herr_ge1, bwarn_ge1, NULL };
-static	v_post	 posts_st[] = { eerr_eq1, post_st, NULL };
+static	v_post	 posts_rs[] = { post_rs, NULL };
+static	v_post	 posts_sh[] = { post_ignpar, hwarn_ge1, bwarn_ge1, post_sh, NULL };
+static	v_post	 posts_sp[] = { ewarn_le1, NULL };
+static	v_post	 posts_ss[] = { post_ignpar, hwarn_ge1, bwarn_ge1, NULL };
+static	v_post	 posts_st[] = { post_st, NULL };
 static	v_post	 posts_std[] = { post_std, NULL };
 static	v_post	 posts_text[] = { eerr_ge1, NULL };
-static	v_post	 posts_text1[] = { eerr_eq1, NULL };
+static	v_post	 posts_text1[] = { ewarn_eq1, NULL };
 static	v_post	 posts_vt[] = { post_vt, NULL };
-static	v_post	 posts_wline[] = { bwarn_ge1, herr_eq0, NULL };
+static	v_post	 posts_wline[] = { bwarn_ge1, NULL };
 static	v_post	 posts_wtext[] = { ewarn_ge1, NULL };
 static	v_pre	 pres_an[] = { pre_an, NULL };
 static	v_pre	 pres_bd[] = { pre_display, pre_bd, pre_literal, pre_par, NULL };
@@ -380,6 +377,7 @@ check_count(struct mdoc *m, enum mdoc_ty
 		enum check_lvl lvl, enum check_ineq ineq, int val)
 {
 	const char	*p;
+	enum mandocerr	 t;
 
 	if (m->last->type != type)
 		return(1);
@@ -391,7 +389,7 @@ check_count(struct mdoc *m, enum mdoc_ty
 			return(1);
 		break;
 	case (CHECK_GT):
-		p = "greater than ";
+		p = "more than ";
 		if (m->last->nchild > val)
 			return(1);
 		break;
@@ -405,18 +403,10 @@ check_count(struct mdoc *m, enum mdoc_ty
 		/* NOTREACHED */
 	}
 
-	if (CHECK_WARN == lvl) {
-		return(mdoc_vmsg(m, MANDOCERR_ARGCOUNT,
-				m->last->line, m->last->pos,
-				"want %s%d children (have %d)",
-				p, val, m->last->nchild));
-	}
+	t = lvl == CHECK_WARN ? MANDOCERR_ARGCWARN : MANDOCERR_ARGCOUNT;
 
-	/* FIXME: THIS IS THE SAME AS THE ABOVE. */
-
-	return(mdoc_vmsg(m, MANDOCERR_ARGCOUNT,
-			m->last->line, m->last->pos,
-			"require %s%d children (have %d)",
+	return(mdoc_vmsg(m, t, m->last->line, m->last->pos,
+			"want %s%d children (have %d)",
 			p, val, m->last->nchild));
 }
 
@@ -424,7 +414,7 @@ static int
 berr_ge1(POST_ARGS)
 {
 
-	return(check_count(mdoc, MDOC_BODY, CHECK_FATAL, CHECK_GT, 0));
+	return(check_count(mdoc, MDOC_BODY, CHECK_ERROR, CHECK_GT, 0));
 }
 
 static int
@@ -434,27 +424,9 @@ bwarn_ge1(POST_ARGS)
 }
 
 static int
-eerr_eq0(POST_ARGS)
-{
-	return(check_count(mdoc, MDOC_ELEM, CHECK_FATAL, CHECK_EQ, 0));
-}
-
-static int
-eerr_eq1(POST_ARGS)
-{
-	return(check_count(mdoc, MDOC_ELEM, CHECK_FATAL, CHECK_EQ, 1));
-}
-
-static int
 eerr_ge1(POST_ARGS)
 {
-	return(check_count(mdoc, MDOC_ELEM, CHECK_FATAL, CHECK_GT, 0));
-}
-
-static int
-eerr_le1(POST_ARGS)
-{
-	return(check_count(mdoc, MDOC_ELEM, CHECK_FATAL, CHECK_LT, 2));
+	return(check_count(mdoc, MDOC_ELEM, CHECK_ERROR, CHECK_GT, 0));
 }
 
 static int
@@ -464,21 +436,21 @@ ewarn_eq0(POST_ARGS)
 }
 
 static int
-ewarn_ge1(POST_ARGS)
+ewarn_eq1(POST_ARGS)
 {
-	return(check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_GT, 0));
+	return(check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 1));
 }
 
 static int
-herr_eq0(POST_ARGS)
+ewarn_ge1(POST_ARGS)
 {
-	return(check_count(mdoc, MDOC_HEAD, CHECK_FATAL, CHECK_EQ, 0));
+	return(check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_GT, 0));
 }
 
 static int
-herr_ge1(POST_ARGS)
+ewarn_le1(POST_ARGS)
 {
-	return(check_count(mdoc, MDOC_HEAD, CHECK_FATAL, CHECK_GT, 0));
+	return(check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_LT, 2));
 }
 
 static int
@@ -494,6 +466,12 @@ hwarn_eq1(POST_ARGS)
 }
 
 static int
+hwarn_ge1(POST_ARGS)
+{
+	return(check_count(mdoc, MDOC_HEAD, CHECK_WARN, CHECK_GT, 0));
+}
+
+static int
 hwarn_le1(POST_ARGS)
 {
 	return(check_count(mdoc, MDOC_HEAD, CHECK_WARN, CHECK_LT, 2));
@@ -1046,6 +1024,8 @@ post_lb(POST_ARGS)
 	char		*buf;
 	size_t		 sz;
 
+	check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 1);
+
 	assert(mdoc->last->child);
 	assert(MDOC_TEXT == mdoc->last->child->type);
 
@@ -1238,8 +1218,10 @@ post_an(POST_ARGS)
 	struct mdoc_node *np;
 
 	np = mdoc->last;
-	if (AUTH__NONE != np->norm->An.auth && np->child)
-		return(eerr_eq0(mdoc));
+	if (AUTH__NONE != np->norm->An.auth && np->child) {
+		check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 0);
+		return(1);
+	}
 
 	/* 
 	 * FIXME: make this ewarn and make sure that the front-ends
@@ -1581,8 +1563,12 @@ static int
 ebool(struct mdoc *mdoc)
 {
 
-	if (NULL == mdoc->last->child)
+	if (NULL == mdoc->last->child) {
+		mdoc_nmsg(mdoc, mdoc->last, MANDOCERR_MACROEMPTY);
+		mdoc_node_delete(mdoc, mdoc->last);
 		return(1);
+	}
+	check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 1);
 
 	assert(MDOC_TEXT == mdoc->last->child->type);
 
@@ -1631,18 +1617,23 @@ post_root(POST_ARGS)
 static int
 post_st(POST_ARGS)
 {
-	const char	*p;
+	struct mdoc_node	 *ch;
+	const char		 *p;
 
-	assert(MDOC_TEXT == mdoc->last->child->type);
+	if (NULL == (ch = mdoc->last->child)) {
+		mdoc_nmsg(mdoc, mdoc->last, MANDOCERR_MACROEMPTY);
+		mdoc_node_delete(mdoc, mdoc->last);
+		return(1);
+	}
 
-	p = mdoc_a2st(mdoc->last->child->string);
+	assert(MDOC_TEXT == ch->type);
 
-	if (p == NULL) {
+	if (NULL == (p = mdoc_a2st(ch->string))) {
 		mdoc_nmsg(mdoc, mdoc->last, MANDOCERR_BADSTANDARD);
 		mdoc_node_delete(mdoc, mdoc->last);
 	} else {
-		free(mdoc->last->child->string);
-		mdoc->last->child->string = mandoc_strdup(p);
+		free(ch->string);
+		ch->string = mandoc_strdup(p);
 	}
 
 	return(1);
@@ -1654,8 +1645,18 @@ post_rs(POST_ARGS)
 	struct mdoc_node *nn, *next, *prev;
 	int		  i, j;
 
-	if (MDOC_BODY != mdoc->last->type)
+	switch (mdoc->last->type) {
+	case (MDOC_HEAD):
+		check_count(mdoc, MDOC_HEAD, CHECK_WARN, CHECK_EQ, 0);
 		return(1);
+	case (MDOC_BODY):
+		if (mdoc->last->child)
+			break;
+		check_count(mdoc, MDOC_BODY, CHECK_WARN, CHECK_GT, 0);
+		return(1);
+	default:
+		return(1);
+	}
 
 	/*
 	 * Make sure only certain types of nodes are allowed within the
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

             reply	other threads:[~2011-01-02 21:15 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-02 21:15 Ingo Schwarze [this message]
2011-01-02 21:35 ` Kristaps Dzonsons
2011-01-04  0:06   ` Ingo Schwarze
2011-01-20 20:22     ` [PATCH] remaining " Ingo Schwarze
2011-01-21 18:31       ` Kristaps Dzonsons
2011-01-22 16:08         ` better explain roff(7) macro argument quoting Ingo Schwarze

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110102211552.GB21085@iris.usta.de \
    --to=schwarze@usta.de \
    --cc=tech@mdocml.bsd.lv \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).