tech@mandoc.bsd.lv
 help / color / mirror / Atom feed
* [PATCH] partial cleanup of mdoc(7) arg count validation
@ 2011-01-02 21:15 Ingo Schwarze
  2011-01-02 21:35 ` Kristaps Dzonsons
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Schwarze @ 2011-01-02 21:15 UTC (permalink / raw)
  To: tech

Hi,

when i ran into a few segfaults, i had a look at argument count
validation in libmdoc, and it revealed itself as a can of worms.

First, i did some framework improvements:
 * Added MANDOCERR_ARGCWARN to support real warnings.
 * Removed CHECK_FATAL because it was unused, and i can't see how
   a wrong number of arguments on any macro line could ever force
   us to abort the parser:  For missing arguments, one can assume
   defaults, and excessive ones can be discarded.
 * Use the correct level in check_count().
 * Improve the wording: "more than N children", not "greater than".

Then, the bug fixes that made me look at this:
 * Don't segfault on empty .Rs.
 * Don't segfault on empty .Sm and .Db; do the required checks in ebool().
 * Don't segfault on empty .St; do the required checks in post_st().

While there, i did lots of message level adjustments, all downgrades
from error to warnings, because none of these is likely to loose relevant
information.  Some macros had their own validation function anyway,
so use these to do all checks, making the code easier to follow:
 * .An -split with additional arguments
 * .Lb with wrong number of arguments (all checks in post_lb)
 * .Rs with header arguments
 * .Rs with an empty body (all checks in post_rs)
 * posts_text1: These three macros (.In, .Pf, .%U) have more problems:
   Empty .In and .%U should be removed, and the first word after .Pf
   should not be parsed, but i don't touch that now.
 * .sp with excessive arguments

The macros .Dl and .D1 don't need their HEAD children check:
These macros are blk_part_imp, so they cannot have HEAD children;
we could make it an assertion, but they have no dedicated validation
functions, so i guess it's not worth the effort.

After doing all this, i could emove the now unused functions
eerr_eq0, eerr_eq1, eerr_le1, herr_eq0, herr_ge1,
only adding ewarn_eq1, ewarn_le1, hwarn_ge1.

The bad news is that eerr_ge1 still needs to be checked.
Probably, it can be changed to ewarn_ge1; but after finding so many
issues with the other functions, i'm reluctant to do that blindly,
so i'm postponing it for now.  It is only used by .Ad .Cd .Er .Fn
.Ic .%A .%B .%D .%I .%J .%N .%O .%P .%R .%T .%V .Ms .Sx .Sy .Tn .Lk
.%C .%Q .Vt, so that should be a quick check.  ;-/

OK for what i have so far?

Yours,
  Ingo


Index: main.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/main.c,v
retrieving revision 1.62
diff -u -p -r1.62 main.c
--- main.c	21 Dec 2010 01:22:00 -0000	1.62
+++ main.c	2 Jan 2011 21:14:22 -0000
@@ -138,6 +138,7 @@ static	const char * const	mandocerrs[MAN
 
 	/* related to missing macro arguments */
 	"skipping empty macro",
+	"argument count wrong",
 	"missing display type",
 	"list type must come first",
 	"tag lists require a width argument",
Index: mandoc.h
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mandoc.h,v
retrieving revision 1.25
diff -u -p -r1.25 mandoc.h
--- mandoc.h	9 Dec 2010 23:01:18 -0000	1.25
+++ mandoc.h	2 Jan 2011 21:14:22 -0000
@@ -75,6 +75,7 @@ enum	mandocerr {
 
 	/* related to missing macro arguments */
 	MANDOCERR_MACROEMPTY, /* skipping empty macro */
+	MANDOCERR_ARGCWARN, /* argument count wrong */
 	MANDOCERR_DISPTYPE, /* missing display type */
 	MANDOCERR_LISTFIRST, /* list type must come first */
 	MANDOCERR_NOWIDTHARG, /* tag lists require a width argument */
Index: mdoc_validate.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mdoc_validate.c,v
retrieving revision 1.82
diff -u -p -r1.82 mdoc_validate.c
--- mdoc_validate.c	29 Dec 2010 00:47:31 -0000	1.82
+++ mdoc_validate.c	2 Jan 2011 21:14:22 -0000
@@ -53,7 +53,6 @@ enum	check_ineq {
 enum	check_lvl {
 	CHECK_WARN,
 	CHECK_ERROR,
-	CHECK_FATAL
 };
 
 typedef	int	(*v_pre)(PRE_ARGS);
@@ -78,16 +77,14 @@ static	int	 concat(struct mdoc *, char *
 static	int	 ebool(POST_ARGS);
 static	int	 berr_ge1(POST_ARGS);
 static	int	 bwarn_ge1(POST_ARGS);
-static	int	 eerr_eq0(POST_ARGS);
-static	int	 eerr_eq1(POST_ARGS);
 static	int	 eerr_ge1(POST_ARGS);
-static	int	 eerr_le1(POST_ARGS);
 static	int	 ewarn_eq0(POST_ARGS);
+static	int	 ewarn_eq1(POST_ARGS);
 static	int	 ewarn_ge1(POST_ARGS);
-static	int	 herr_eq0(POST_ARGS);
-static	int	 herr_ge1(POST_ARGS);
+static	int	 ewarn_le1(POST_ARGS);
 static	int	 hwarn_eq0(POST_ARGS);
 static	int	 hwarn_eq1(POST_ARGS);
+static	int	 hwarn_ge1(POST_ARGS);
 static	int	 hwarn_le1(POST_ARGS);
 
 static	int	 post_an(POST_ARGS);
@@ -138,29 +135,29 @@ static	v_post	 posts_bd[] = { post_liter
 static	v_post	 posts_bf[] = { hwarn_le1, post_bf, NULL };
 static	v_post	 posts_bk[] = { hwarn_eq0, bwarn_ge1, NULL };
 static	v_post	 posts_bl[] = { bwarn_ge1, post_bl, NULL };
-static	v_post	 posts_bool[] = { eerr_eq1, ebool, NULL };
+static	v_post	 posts_bool[] = { ebool, NULL };
 static	v_post	 posts_eoln[] = { post_eoln, NULL };
 static	v_post	 posts_defaults[] = { post_defaults, NULL };
 static	v_post	 posts_dd[] = { ewarn_ge1, post_dd, post_prol, NULL };
-static	v_post	 posts_dl[] = { post_literal, bwarn_ge1, herr_eq0, NULL };
+static	v_post	 posts_dl[] = { post_literal, bwarn_ge1, NULL };
 static	v_post	 posts_dt[] = { post_dt, post_prol, NULL };
 static	v_post	 posts_fo[] = { hwarn_eq1, bwarn_ge1, NULL };
 static	v_post	 posts_it[] = { post_it, NULL };
-static	v_post	 posts_lb[] = { eerr_eq1, post_lb, NULL };
+static	v_post	 posts_lb[] = { post_lb, NULL };
 static	v_post	 posts_nd[] = { berr_ge1, NULL };
 static	v_post	 posts_nm[] = { post_nm, NULL };
 static	v_post	 posts_notext[] = { ewarn_eq0, NULL };
 static	v_post	 posts_os[] = { post_os, post_prol, NULL };
-static	v_post	 posts_rs[] = { berr_ge1, herr_eq0, post_rs, NULL };
-static	v_post	 posts_sh[] = { post_ignpar, herr_ge1, bwarn_ge1, post_sh, NULL };
-static	v_post	 posts_sp[] = { eerr_le1, NULL };
-static	v_post	 posts_ss[] = { post_ignpar, herr_ge1, bwarn_ge1, NULL };
-static	v_post	 posts_st[] = { eerr_eq1, post_st, NULL };
+static	v_post	 posts_rs[] = { post_rs, NULL };
+static	v_post	 posts_sh[] = { post_ignpar, hwarn_ge1, bwarn_ge1, post_sh, NULL };
+static	v_post	 posts_sp[] = { ewarn_le1, NULL };
+static	v_post	 posts_ss[] = { post_ignpar, hwarn_ge1, bwarn_ge1, NULL };
+static	v_post	 posts_st[] = { post_st, NULL };
 static	v_post	 posts_std[] = { post_std, NULL };
 static	v_post	 posts_text[] = { eerr_ge1, NULL };
-static	v_post	 posts_text1[] = { eerr_eq1, NULL };
+static	v_post	 posts_text1[] = { ewarn_eq1, NULL };
 static	v_post	 posts_vt[] = { post_vt, NULL };
-static	v_post	 posts_wline[] = { bwarn_ge1, herr_eq0, NULL };
+static	v_post	 posts_wline[] = { bwarn_ge1, NULL };
 static	v_post	 posts_wtext[] = { ewarn_ge1, NULL };
 static	v_pre	 pres_an[] = { pre_an, NULL };
 static	v_pre	 pres_bd[] = { pre_display, pre_bd, pre_literal, pre_par, NULL };
@@ -380,6 +377,7 @@ check_count(struct mdoc *m, enum mdoc_ty
 		enum check_lvl lvl, enum check_ineq ineq, int val)
 {
 	const char	*p;
+	enum mandocerr	 t;
 
 	if (m->last->type != type)
 		return(1);
@@ -391,7 +389,7 @@ check_count(struct mdoc *m, enum mdoc_ty
 			return(1);
 		break;
 	case (CHECK_GT):
-		p = "greater than ";
+		p = "more than ";
 		if (m->last->nchild > val)
 			return(1);
 		break;
@@ -405,18 +403,10 @@ check_count(struct mdoc *m, enum mdoc_ty
 		/* NOTREACHED */
 	}
 
-	if (CHECK_WARN == lvl) {
-		return(mdoc_vmsg(m, MANDOCERR_ARGCOUNT,
-				m->last->line, m->last->pos,
-				"want %s%d children (have %d)",
-				p, val, m->last->nchild));
-	}
+	t = lvl == CHECK_WARN ? MANDOCERR_ARGCWARN : MANDOCERR_ARGCOUNT;
 
-	/* FIXME: THIS IS THE SAME AS THE ABOVE. */
-
-	return(mdoc_vmsg(m, MANDOCERR_ARGCOUNT,
-			m->last->line, m->last->pos,
-			"require %s%d children (have %d)",
+	return(mdoc_vmsg(m, t, m->last->line, m->last->pos,
+			"want %s%d children (have %d)",
 			p, val, m->last->nchild));
 }
 
@@ -424,7 +414,7 @@ static int
 berr_ge1(POST_ARGS)
 {
 
-	return(check_count(mdoc, MDOC_BODY, CHECK_FATAL, CHECK_GT, 0));
+	return(check_count(mdoc, MDOC_BODY, CHECK_ERROR, CHECK_GT, 0));
 }
 
 static int
@@ -434,27 +424,9 @@ bwarn_ge1(POST_ARGS)
 }
 
 static int
-eerr_eq0(POST_ARGS)
-{
-	return(check_count(mdoc, MDOC_ELEM, CHECK_FATAL, CHECK_EQ, 0));
-}
-
-static int
-eerr_eq1(POST_ARGS)
-{
-	return(check_count(mdoc, MDOC_ELEM, CHECK_FATAL, CHECK_EQ, 1));
-}
-
-static int
 eerr_ge1(POST_ARGS)
 {
-	return(check_count(mdoc, MDOC_ELEM, CHECK_FATAL, CHECK_GT, 0));
-}
-
-static int
-eerr_le1(POST_ARGS)
-{
-	return(check_count(mdoc, MDOC_ELEM, CHECK_FATAL, CHECK_LT, 2));
+	return(check_count(mdoc, MDOC_ELEM, CHECK_ERROR, CHECK_GT, 0));
 }
 
 static int
@@ -464,21 +436,21 @@ ewarn_eq0(POST_ARGS)
 }
 
 static int
-ewarn_ge1(POST_ARGS)
+ewarn_eq1(POST_ARGS)
 {
-	return(check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_GT, 0));
+	return(check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 1));
 }
 
 static int
-herr_eq0(POST_ARGS)
+ewarn_ge1(POST_ARGS)
 {
-	return(check_count(mdoc, MDOC_HEAD, CHECK_FATAL, CHECK_EQ, 0));
+	return(check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_GT, 0));
 }
 
 static int
-herr_ge1(POST_ARGS)
+ewarn_le1(POST_ARGS)
 {
-	return(check_count(mdoc, MDOC_HEAD, CHECK_FATAL, CHECK_GT, 0));
+	return(check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_LT, 2));
 }
 
 static int
@@ -494,6 +466,12 @@ hwarn_eq1(POST_ARGS)
 }
 
 static int
+hwarn_ge1(POST_ARGS)
+{
+	return(check_count(mdoc, MDOC_HEAD, CHECK_WARN, CHECK_GT, 0));
+}
+
+static int
 hwarn_le1(POST_ARGS)
 {
 	return(check_count(mdoc, MDOC_HEAD, CHECK_WARN, CHECK_LT, 2));
@@ -1046,6 +1024,8 @@ post_lb(POST_ARGS)
 	char		*buf;
 	size_t		 sz;
 
+	check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 1);
+
 	assert(mdoc->last->child);
 	assert(MDOC_TEXT == mdoc->last->child->type);
 
@@ -1238,8 +1218,10 @@ post_an(POST_ARGS)
 	struct mdoc_node *np;
 
 	np = mdoc->last;
-	if (AUTH__NONE != np->norm->An.auth && np->child)
-		return(eerr_eq0(mdoc));
+	if (AUTH__NONE != np->norm->An.auth && np->child) {
+		check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 0);
+		return(1);
+	}
 
 	/* 
 	 * FIXME: make this ewarn and make sure that the front-ends
@@ -1581,8 +1563,12 @@ static int
 ebool(struct mdoc *mdoc)
 {
 
-	if (NULL == mdoc->last->child)
+	if (NULL == mdoc->last->child) {
+		mdoc_nmsg(mdoc, mdoc->last, MANDOCERR_MACROEMPTY);
+		mdoc_node_delete(mdoc, mdoc->last);
 		return(1);
+	}
+	check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 1);
 
 	assert(MDOC_TEXT == mdoc->last->child->type);
 
@@ -1631,18 +1617,23 @@ post_root(POST_ARGS)
 static int
 post_st(POST_ARGS)
 {
-	const char	*p;
+	struct mdoc_node	 *ch;
+	const char		 *p;
 
-	assert(MDOC_TEXT == mdoc->last->child->type);
+	if (NULL == (ch = mdoc->last->child)) {
+		mdoc_nmsg(mdoc, mdoc->last, MANDOCERR_MACROEMPTY);
+		mdoc_node_delete(mdoc, mdoc->last);
+		return(1);
+	}
 
-	p = mdoc_a2st(mdoc->last->child->string);
+	assert(MDOC_TEXT == ch->type);
 
-	if (p == NULL) {
+	if (NULL == (p = mdoc_a2st(ch->string))) {
 		mdoc_nmsg(mdoc, mdoc->last, MANDOCERR_BADSTANDARD);
 		mdoc_node_delete(mdoc, mdoc->last);
 	} else {
-		free(mdoc->last->child->string);
-		mdoc->last->child->string = mandoc_strdup(p);
+		free(ch->string);
+		ch->string = mandoc_strdup(p);
 	}
 
 	return(1);
@@ -1654,8 +1645,18 @@ post_rs(POST_ARGS)
 	struct mdoc_node *nn, *next, *prev;
 	int		  i, j;
 
-	if (MDOC_BODY != mdoc->last->type)
+	switch (mdoc->last->type) {
+	case (MDOC_HEAD):
+		check_count(mdoc, MDOC_HEAD, CHECK_WARN, CHECK_EQ, 0);
 		return(1);
+	case (MDOC_BODY):
+		if (mdoc->last->child)
+			break;
+		check_count(mdoc, MDOC_BODY, CHECK_WARN, CHECK_GT, 0);
+		return(1);
+	default:
+		return(1);
+	}
 
 	/*
 	 * Make sure only certain types of nodes are allowed within the
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] partial cleanup of mdoc(7) arg count validation
  2011-01-02 21:15 [PATCH] partial cleanup of mdoc(7) arg count validation Ingo Schwarze
@ 2011-01-02 21:35 ` Kristaps Dzonsons
  2011-01-04  0:06   ` Ingo Schwarze
  0 siblings, 1 reply; 6+ messages in thread
From: Kristaps Dzonsons @ 2011-01-02 21:35 UTC (permalink / raw)
  To: tech

> when i ran into a few segfaults, i had a look at argument count
> validation in libmdoc, and it revealed itself as a can of worms.
>
> First, i did some framework improvements:
>   * Added MANDOCERR_ARGCWARN to support real warnings.
>   * Removed CHECK_FATAL because it was unused, and i can't see how
>     a wrong number of arguments on any macro line could ever force
>     us to abort the parser:  For missing arguments, one can assume
>     defaults, and excessive ones can be discarded.
>   * Use the correct level in check_count().
>   * Improve the wording: "more than N children", not "greater than".
>
> Then, the bug fixes that made me look at this:
>   * Don't segfault on empty .Rs.
>   * Don't segfault on empty .Sm and .Db; do the required checks in ebool().
>   * Don't segfault on empty .St; do the required checks in post_st().
>
> While there, i did lots of message level adjustments, all downgrades
> from error to warnings, because none of these is likely to loose relevant
> information.  Some macros had their own validation function anyway,
> so use these to do all checks, making the code easier to follow:
>   * .An -split with additional arguments
>   * .Lb with wrong number of arguments (all checks in post_lb)
>   * .Rs with header arguments
>   * .Rs with an empty body (all checks in post_rs)
>   * posts_text1: These three macros (.In, .Pf, .%U) have more problems:
>     Empty .In and .%U should be removed, and the first word after .Pf
>     should not be parsed, but i don't touch that now.
>   * .sp with excessive arguments
>
> The macros .Dl and .D1 don't need their HEAD children check:
> These macros are blk_part_imp, so they cannot have HEAD children;
> we could make it an assertion, but they have no dedicated validation
> functions, so i guess it's not worth the effort.
>
> After doing all this, i could emove the now unused functions
> eerr_eq0, eerr_eq1, eerr_le1, herr_eq0, herr_ge1,
> only adding ewarn_eq1, ewarn_le1, hwarn_ge1.
>
> The bad news is that eerr_ge1 still needs to be checked.
> Probably, it can be changed to ewarn_ge1; but after finding so many
> issues with the other functions, i'm reluctant to do that blindly,
> so i'm postponing it for now.  It is only used by .Ad .Cd .Er .Fn
> .Ic .%A .%B .%D .%I .%J .%N .%O .%P .%R .%T .%V .Ms .Sx .Sy .Tn .Lk
> .%C .%Q .Vt, so that should be a quick check.  ;-/
>
> OK for what i have so far?

Ingo,

These look fine by me, as the ERROR->WARN downgrade doesn't effect a 
semantic difference.

In general I encourage two functions instead of one (e.g., checking 
whether an argument exist (1) and checking whether it's bool (2)) to 
avoid code duplication.  If your way doesn't bloat the code any, 
however, I'm not opposed to it.

And by all means, feel free to remove those superfluous HEAD checks...

Thanks,

Kristaps
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] partial cleanup of mdoc(7) arg count validation
  2011-01-02 21:35 ` Kristaps Dzonsons
@ 2011-01-04  0:06   ` Ingo Schwarze
  2011-01-20 20:22     ` [PATCH] remaining " Ingo Schwarze
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Schwarze @ 2011-01-04  0:06 UTC (permalink / raw)
  To: tech

Hi Kristaps,

> These look fine by me, as the ERROR->WARN downgrade doesn't effect a
> semantic difference.

Thanks, these are in.

> In general I encourage two functions instead of one (e.g., checking
> whether an argument exist (1) and checking whether it's bool (2)) to
> avoid code duplication.  If your way doesn't bloat the code any,
> however, I'm not opposed to it.

No, it doesn't bloat.

With two functions, we typically have:
 * an entry in the function table
 * an xwarn_opN() wrapper function
 * that one just calls check_count() and returns (one line)
 * the main post_macro() validator function

With one function (in the cases where i removed the second
function), we have:
 * one fewer entry in the function table
 * sometimes the wrapper could be deleted completely
 * one additional line to call check_count() from post_macro()

What would bloat the code would be to introduce a new post_macro()
function just to call check_count() a few times, but i didn't
do that.

Yours,
  Ingo
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH] remaining cleanup of mdoc(7) arg count validation
  2011-01-04  0:06   ` Ingo Schwarze
@ 2011-01-20 20:22     ` Ingo Schwarze
  2011-01-21 18:31       ` Kristaps Dzonsons
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Schwarze @ 2011-01-20 20:22 UTC (permalink / raw)
  To: tech

Hi,

recently, we downgraded several argument count issues in mdoc(7)
from ERROR to WARNING.  Not all cases could be handled in the first
batch.  Now i had another look and cleaned up the rest, which turned
out to be a subset of all mdoc macros handled by the mdoc_macro.c
function in_line(), so i tested the argument count checking of all
in_line() macros.

The changes are:
 * Most empty in_line() macros are already removed by the parser,
   so there is no need to check again in mdoc_validate.c.
 * posts_wtest[] can be consolidated into posts_text[].
 * All uses of eerr_ge1() go away, so we can delete that function.
 * For .An, handle both cases (-[no]split with more arguments
   and no arguments at all) as argument count warnings.

This patch leaves on single argument count ERROR in mdoc(7):
.Nd without arguments.  That issue causes output unusable by
makewhatis(8), so i feel calling that output "seriously garbled"
and throwing an ERROR is warranted.

OK?

Yours,
  Ingo


Index: mdoc_validate.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mdoc_validate.c,v
retrieving revision 1.84
diff -u -p -r1.84 mdoc_validate.c
--- mdoc_validate.c	4 Jan 2011 22:28:17 -0000	1.84
+++ mdoc_validate.c	20 Jan 2011 20:01:57 -0000
@@ -73,7 +73,6 @@ static	int	 concat(struct mdoc *, char *
 static	int	 ebool(POST_ARGS);
 static	int	 berr_ge1(POST_ARGS);
 static	int	 bwarn_ge1(POST_ARGS);
-static	int	 eerr_ge1(POST_ARGS);
 static	int	 ewarn_eq0(POST_ARGS);
 static	int	 ewarn_eq1(POST_ARGS);
 static	int	 ewarn_ge1(POST_ARGS);
@@ -149,11 +148,10 @@ static	v_post	 posts_sp[] = { ewarn_le1,
 static	v_post	 posts_ss[] = { post_ignpar, hwarn_ge1, bwarn_ge1, NULL };
 static	v_post	 posts_st[] = { post_st, NULL };
 static	v_post	 posts_std[] = { post_std, NULL };
-static	v_post	 posts_text[] = { eerr_ge1, NULL };
+static	v_post	 posts_text[] = { ewarn_ge1, NULL };
 static	v_post	 posts_text1[] = { ewarn_eq1, NULL };
 static	v_post	 posts_vt[] = { post_vt, NULL };
 static	v_post	 posts_wline[] = { bwarn_ge1, NULL };
-static	v_post	 posts_wtext[] = { ewarn_ge1, NULL };
 static	v_pre	 pres_an[] = { pre_an, NULL };
 static	v_pre	 pres_bd[] = { pre_display, pre_bd, pre_literal, pre_par, NULL };
 static	v_pre	 pres_bl[] = { pre_bl, pre_par, NULL };
@@ -185,21 +183,21 @@ const	struct valids mdoc_valids[MDOC_MAX
 	{ pres_bl, posts_bl },			/* Bl */ 
 	{ NULL, NULL },				/* El */
 	{ pres_it, posts_it },			/* It */
-	{ NULL, posts_text },			/* Ad */ 
+	{ NULL, NULL },				/* Ad */ 
 	{ pres_an, posts_an },			/* An */ 
 	{ NULL, posts_defaults },		/* Ar */
-	{ NULL, posts_text },			/* Cd */ 
+	{ NULL, NULL },				/* Cd */ 
 	{ NULL, NULL },				/* Cm */
 	{ NULL, NULL },				/* Dv */ 
-	{ pres_er, posts_text },		/* Er */ 
+	{ pres_er, NULL },			/* Er */ 
 	{ NULL, NULL },				/* Ev */ 
 	{ pres_std, posts_std },		/* Ex */ 
 	{ NULL, NULL },				/* Fa */ 
-	{ pres_fd, posts_wtext },		/* Fd */
+	{ pres_fd, posts_text },		/* Fd */
 	{ NULL, NULL },				/* Fl */
-	{ NULL, posts_text },			/* Fn */ 
-	{ NULL, posts_wtext },			/* Ft */ 
-	{ NULL, posts_text },			/* Ic */ 
+	{ NULL, NULL },				/* Fn */ 
+	{ NULL, NULL },				/* Ft */ 
+	{ NULL, NULL },				/* Ic */ 
 	{ NULL, posts_text1 },			/* In */ 
 	{ NULL, posts_defaults },		/* Li */
 	{ NULL, posts_nd },			/* Nd */
@@ -211,7 +209,7 @@ const	struct valids mdoc_valids[MDOC_MAX
 	{ NULL, posts_st },			/* St */ 
 	{ NULL, NULL },				/* Va */
 	{ NULL, posts_vt },			/* Vt */ 
-	{ NULL, posts_wtext },			/* Xr */ 
+	{ NULL, posts_text },			/* Xr */ 
 	{ NULL, posts_text },			/* %A */
 	{ NULL, posts_text },			/* %B */ /* FIXME: can be used outside Rs/Re. */
 	{ NULL, posts_text },			/* %D */ /* FIXME: check date with mandoc_a2time(). */
@@ -242,7 +240,7 @@ const	struct valids mdoc_valids[MDOC_MAX
 	{ NULL, NULL },				/* Em */ 
 	{ NULL, NULL },				/* Eo */
 	{ NULL, NULL },				/* Fx */
-	{ NULL, posts_text },			/* Ms */ 
+	{ NULL, NULL },				/* Ms */ 
 	{ NULL, posts_notext },			/* No */
 	{ NULL, posts_notext },			/* Ns */
 	{ NULL, NULL },				/* Nx */
@@ -261,9 +259,9 @@ const	struct valids mdoc_valids[MDOC_MAX
 	{ NULL, NULL },				/* So */
 	{ NULL, NULL },				/* Sq */
 	{ NULL, posts_bool },			/* Sm */ 
-	{ NULL, posts_text },			/* Sx */
-	{ NULL, posts_text },			/* Sy */
-	{ NULL, posts_text },			/* Tn */
+	{ NULL, NULL },				/* Sx */
+	{ NULL, NULL },				/* Sy */
+	{ NULL, NULL },				/* Tn */
 	{ NULL, NULL },				/* Ux */
 	{ NULL, NULL },				/* Xc */
 	{ NULL, NULL },				/* Xo */
@@ -279,7 +277,7 @@ const	struct valids mdoc_valids[MDOC_MAX
 	{ NULL, posts_eoln },			/* Ud */
 	{ NULL, posts_lb },			/* Lb */
 	{ NULL, posts_notext },			/* Lp */ 
-	{ NULL, posts_text },			/* Lk */ 
+	{ NULL, NULL },				/* Lk */ 
 	{ NULL, posts_defaults },		/* Mt */ 
 	{ NULL, NULL },				/* Brq */ 
 	{ NULL, NULL },				/* Bro */ 
@@ -429,12 +427,6 @@ bwarn_ge1(POST_ARGS)
 }
 
 static int
-eerr_ge1(POST_ARGS)
-{
-	return(check_count(mdoc, MDOC_ELEM, CHECK_ERROR, CHECK_GT, 0));
-}
-
-static int
 ewarn_eq0(POST_ARGS)
 {
 	return(check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 0));
@@ -1074,12 +1066,11 @@ post_vt(POST_ARGS)
 	/*
 	 * The Vt macro comes in both ELEM and BLOCK form, both of which
 	 * have different syntaxes (yet more context-sensitive
-	 * behaviour).  ELEM types must have a child; BLOCK types,
+	 * behaviour).  ELEM types must have a child, which is already
+	 * guaranteed by the in_line parsing routine; BLOCK types,
 	 * specifically the BODY, should only have TEXT children.
 	 */
 
-	if (MDOC_ELEM == mdoc->last->type)
-		return(eerr_ge1(mdoc));
 	if (MDOC_BODY != mdoc->last->type)
 		return(1);
 	
@@ -1223,19 +1214,12 @@ post_an(POST_ARGS)
 	struct mdoc_node *np;
 
 	np = mdoc->last;
-	if (AUTH__NONE != np->norm->An.auth && np->child) {
+	if (AUTH__NONE == np->norm->An.auth) {
+		if (0 == np->child)
+			check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_GT, 0);
+	} else if (np->child)
 		check_count(mdoc, MDOC_ELEM, CHECK_WARN, CHECK_EQ, 0);
-		return(1);
-	}
-
-	/* 
-	 * FIXME: make this ewarn and make sure that the front-ends
-	 * don't print the arguments.
-	 */
-	if (AUTH__NONE != np->norm->An.auth || np->child)
-		return(1);
 
-	mdoc_nmsg(mdoc, np, MANDOCERR_NOARGS);
 	return(1);
 }
 
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] remaining cleanup of mdoc(7) arg count validation
  2011-01-20 20:22     ` [PATCH] remaining " Ingo Schwarze
@ 2011-01-21 18:31       ` Kristaps Dzonsons
  2011-01-22 16:08         ` better explain roff(7) macro argument quoting Ingo Schwarze
  0 siblings, 1 reply; 6+ messages in thread
From: Kristaps Dzonsons @ 2011-01-21 18:31 UTC (permalink / raw)
  To: tech; +Cc: Ingo Schwarze

> recently, we downgraded several argument count issues in mdoc(7)
> from ERROR to WARNING.  Not all cases could be handled in the first
> batch.  Now i had another look and cleaned up the rest, which turned
> out to be a subset of all mdoc macros handled by the mdoc_macro.c
> function in_line(), so i tested the argument count checking of all
> in_line() macros.
>
> The changes are:
>   * Most empty in_line() macros are already removed by the parser,
>     so there is no need to check again in mdoc_validate.c.
>   * posts_wtest[] can be consolidated into posts_text[].
>   * All uses of eerr_ge1() go away, so we can delete that function.
>   * For .An, handle both cases (-[no]split with more arguments
>     and no arguments at all) as argument count warnings.
>
> This patch leaves on single argument count ERROR in mdoc(7):
> .Nd without arguments.  That issue causes output unusable by
> makewhatis(8), so i feel calling that output "seriously garbled"
> and throwing an ERROR is warranted.

Ingo, I like this!   Please check over mdoc.7 to see if the 
documentation for the macro (i.e., number of arguments) matches the 
reality.  Less code pleases me...

(I remember some documentation is still pending from you for the 
argument handling...)

Thanks again!

Kristaps
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 6+ messages in thread

* better explain roff(7) macro argument quoting
  2011-01-21 18:31       ` Kristaps Dzonsons
@ 2011-01-22 16:08         ` Ingo Schwarze
  0 siblings, 0 replies; 6+ messages in thread
From: Ingo Schwarze @ 2011-01-22 16:08 UTC (permalink / raw)
  To: jmc; +Cc: tech

Hi Jason, hi Kristaps,

Kristaps Dzonsons wrote on Fri, Jan 21, 2011 at 07:31:53PM +0100:

[...]
> (I remember some documentation is still pending from you for the
> argument handling...)

Yes, indeed, i promised to look into that when i recently fixed the
roff(7) and man(7) argument parsing code.  Thanks for the reminder.

So, here is a suggestion.

This is really about roff(7) syntax, so i strongly feel it should be
in roff(7), not in man(7) and mdoc(7).  For now, i propose to put
a pointer into man(7), but to not touch the text in mdoc(7) yet.
As you know, the mdoc(7) code is still partially broken, so pointing
to roff(7) right now would even be a bit misleading; besides, what
mdoc(7) has so far roughly matches what mandoc really does in the
mdoc(7) case, even though that differs from groff in a few respects.

I realize that the text is getting much longer and rather technical.
But roff macro argument quoting is really tricky, so i think
documenting it with as much precision as possible is worthwhile.

Besides, i suggest to move this part of the documentation up to
its own chapter "MACRO SYNTAX" just after "REQUEST SYNTAX" because
the two are closely related, and because argument parsing logic
is somewhat misplaced below .de:  We are not talking about the
parsing of the arguments passed to the .de request after all.

OK?
  Ingo


Index: man.7
===================================================================
RCS file: /cvs/src/share/man/man7/man.7,v
retrieving revision 1.14
diff -u -r1.14 man.7
--- man.7	16 Jan 2011 02:56:47 -0000	1.14
+++ man.7	22 Jan 2011 16:04:23 -0000
@@ -373,6 +373,13 @@
 \&.\ \ \ PP
 .Ed
 .Pp
+To include space characters into macro arguments, arguments may be quoted;
+see the
+.Sq MACRO SYNTAX
+section in the
+.Xr roff 7
+manual for details.
+.Pp
 The
 .Nm
 macros are classified by scope: line scope or block scope.
Index: roff.7
===================================================================
RCS file: /cvs/src/share/man/man7/roff.7,v
retrieving revision 1.8
diff -u -r1.8 roff.7
--- roff.7	9 Jan 2011 15:24:57 -0000	1.8
+++ roff.7	22 Jan 2011 16:04:23 -0000
@@ -86,6 +86,38 @@
 \&.ig    end
 \&.   ig end
 .Ed
+.Sh MACRO SYNTAX
+Macros can be defined by the
+.Sx \&de
+request.
+When called, they follow the same syntax as requests, except that
+macro arguments may optionally be quoted by enclosing them
+in double quote characters
+.Pq Sq \(dq .
+To be recognized as the beginning of a quoted argument, the opening
+quote character must be preceded by a space character.
+.Pp
+A quoted argument may contain whitespace, and pairs of double quote
+characters
+.Pq Sq Qq
+resolve to single double quote characters.
+A quoted argument extends to the next double quote character that is not
+part of a pair, or to the end of the input line, whichever comes earlier.
+Leaving out the terminating double quote character at the end of the line
+is discouraged.
+For clarity, if more arguments follow on the same input line,
+it is recommended to follow the terminating double quote character
+by a space character; in case the next character after the terminating
+double quote character is anything else, it is regarded as the beginning
+of the next, unquoted argument.
+.Pp
+Both in quoted and unquoted arguments, pairs of backslashes
+.Pq Sq \e\e
+resolve to single backslashes.
+In unquoted arguments, space characters can alternatively be included
+by preceding them with a backslash
+.Pq Sq \e\~ ,
+but quoting is usually better for clarity.
 .Sh REQUEST REFERENCE
 The
 .Xr mandoc 1
@@ -174,12 +206,9 @@
 .Pp
 .D1 Pf . Ar name Op Ar argument Op Ar argument ...
 .Pp
-Arguments are separated by blank characters and can be quoted
-using double-quotes
-.Pq Sq \(dq
-to allow inclusion of blank characters into arguments.
-To include the double-quote character into a quoted argument,
-escape it from ending the argument by doubling it.
+Regarding argument parsing, see
+.Sx MACRO SYNTAX
+above.
 .Pp
 The line invoking the macro will be replaced
 in the input stream by the
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-01-22 16:09 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-02 21:15 [PATCH] partial cleanup of mdoc(7) arg count validation Ingo Schwarze
2011-01-02 21:35 ` Kristaps Dzonsons
2011-01-04  0:06   ` Ingo Schwarze
2011-01-20 20:22     ` [PATCH] remaining " Ingo Schwarze
2011-01-21 18:31       ` Kristaps Dzonsons
2011-01-22 16:08         ` better explain roff(7) macro argument quoting Ingo Schwarze

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).