source@mandoc.bsd.lv
* mandoc: Several improvements to escape sequence handling.
From: schwarze @ 2018-12-15 19:31 UTC
  To: source

Log Message:
-----------
Several improvements to escape sequence handling.

* Add the missing special character \_ (underscore).
* Add partial implementations of \a (leader character)
and \E (uninterpreted escape character).
* Parse and ignore \r (reverse line feed).
* Add a WARNING message about undefined escape sequences.
* Add an UNSUPP message about unsupported escape sequences.
* Mark \! and \? (transparent throughput)
and \O (suppress output) as unsupported.
* Treat the various variants of zero-width spaces as one-byte escape
sequences rather than as special characters, to avoid defining bogus
forms with square brackets.
* For special characters with one-byte names, do not define bogus
forms with square brackets, except for \[-], which is valid.
* In the form with square brackets, undefined special characters do not
fall back to printing the name verbatim, not even for one-byte names.
* Starting a special character name with a blank is an error.
* Undefined escape sequences never abort formatting of the input
string, not even in HTML output mode.
* Document the newly handled escapes, and a few that were missing.
* Regression tests for most of the above.
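
The classification rules listed above can be sketched as a small
standalone C function.  This is an illustrative approximation under
stated assumptions, not the code of mandoc_escape(): the names
esc_class, classify() and esc_class_name() are hypothetical, the
argument is assumed to point just past the leading backslash, and
escapes that take arguments (such as \f or \n) as well as \c and \p
are deliberately not modeled, so this sketch would misreport them as
undefined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for mandoc's enum mandoc_esc. */
enum esc_class {
	ESC_ERROR,	/* invalid form, e.g. \[ x], \[~] or missing ']' */
	ESC_UNSUPP,	/* recognized but unsupported: \! and \? */
	ESC_IGNORE,	/* parsed and ignored, e.g. \& \a \r */
	ESC_SPECIAL,	/* special character, e.g. \_ or \[-] */
	ESC_UNDEF	/* undefined: the character is printed literally */
};

/*
 * Classify the escape sequence starting at s, where s points to
 * the first byte after the backslash.  Only the one-byte escapes
 * and the bracket forms named in the log message are covered.
 */
enum esc_class
classify(const char *s)
{
	const char *end;

	assert(s != NULL);

	/* \E is treated just like a plain backslash. */
	if (*s == 'E')
		s++;

	switch (*s) {
	case '\0':
		/* Assumption of this sketch: nothing to classify. */
		return ESC_ERROR;
	case '!': case '?':
		return ESC_UNSUPP;
	case '%': case '&': case ')': case ',': case '/': case '^':
	case 'a': case 'd': case 'r': case 't': case 'u':
	case '{': case '|': case '}':
		return ESC_IGNORE;
	case ' ': case '\'': case '-': case '.': case '0':
	case ':': case '_': case '`': case 'e': case '~':
		return ESC_SPECIAL;
	case '[':
		s++;
		/* A name starting with a blank is an error. */
		if (*s == ' ')
			return ESC_ERROR;
		end = strchr(s, ']');
		if (end == NULL)
			return ESC_ERROR;
		/* One-byte names are not valid in brackets, except \[-]. */
		if (end - s == 1 && *s != '-')
			return ESC_ERROR;
		/* Whether the name is defined is not checked here. */
		return ESC_SPECIAL;
	default:
		/* Not modeled: escapes with arguments, \c, \p. */
		return ESC_UNDEF;
	}
}

/* Map a class to a short label resembling the new messages. */
const char *
esc_class_name(enum esc_class c)
{
	switch (c) {
	case ESC_ERROR:   return "invalid";
	case ESC_UNSUPP:  return "unsupported";
	case ESC_IGNORE:  return "ignored";
	case ESC_SPECIAL: return "special";
	case ESC_UNDEF:   return "undefined";
	}
	return "unknown";
}
```

With this sketch, classify("[~]") yields ESC_ERROR, which corresponds
to the "invalid escape sequence: \[~]" warning in the new regression
output, while classify("[-]") and classify("_") both yield
ESC_SPECIAL.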

Modified Files:
--------------
    mandoc:
        TODO
        chars.c
        html.c
        mandoc.1
        mandoc.c
        mandoc.h
        mandoc_char.7
        mandoc_msg.c
        mdoc_man.c
        mdoc_markdown.c
        roff.7
        roff.c
        term.c
    mandoc/regress/char/accent:
        Makefile
        nocombine.in
        nocombine.out_ascii
        nocombine.out_utf8
    mandoc/regress/char/space:
        Makefile
        esct-man.in
        esct-man.out_ascii
        esct-man.out_lint
    mandoc/regress/roff/esc:
        Makefile
        ignore.in
        ignore.out_ascii
        ignore.out_lint
        one.in
        one.out_ascii

Added Files:
-----------
    mandoc/regress/char/accent:
        nocombine.out_lint
    mandoc/regress/char/space:
        invalid.in
        invalid.out_ascii
        invalid.out_lint
    mandoc/regress/roff/esc:
        O.in
        O.out_ascii
        O.out_lint
        invalid.in
        invalid.out_ascii
        invalid.out_lint
        unsupp.in
        unsupp.out_ascii
        unsupp.out_lint

Revision Data
-------------
Index: nocombine.out_utf8
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/char/accent/nocombine.out_utf8,v
retrieving revision 1.1
retrieving revision 1.2
diff -Lregress/char/accent/nocombine.out_utf8 -Lregress/char/accent/nocombine.out_utf8 -u -p -r1.1 -r1.2
--- regress/char/accent/nocombine.out_utf8
+++ regress/char/accent/nocombine.out_utf8
@@ -7,10 +7,10 @@ N\bNA\bAM\bME\bE
 
 D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
        bare acute accent: e'e
-       escaped acute accent: e´e
+       escaped acute accent: e´ee
        acute accent sequence: e´e
        bare grave accent: e`e
-       escaped grave accent: e`e
+       escaped grave accent: e`ee
        acute grave sequence: e`e
        hungarian umlaut: e˝e
        circumflex: e^e
@@ -25,4 +25,4 @@ D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
 
 
 
-OpenBSD                          March 8, 2014        CHAR-ACCENT-NOCOMBINE(1)
+OpenBSD                        December 15, 2018      CHAR-ACCENT-NOCOMBINE(1)
Index: Makefile
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/char/accent/Makefile,v
retrieving revision 1.1
retrieving revision 1.2
diff -Lregress/char/accent/Makefile -Lregress/char/accent/Makefile -u -p -r1.1 -r1.2
--- regress/char/accent/Makefile
+++ regress/char/accent/Makefile
@@ -3,5 +3,6 @@
 REGRESS_TARGETS = nocombine utf8only combine
 SKIP_ASCII = utf8only combine
 UTF8_TARGETS = nocombine utf8only combine
+LINT_TARGETS = nocombine
 
 .include <bsd.regress.mk>
Index: nocombine.in
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/char/accent/nocombine.in,v
retrieving revision 1.2
retrieving revision 1.3
diff -Lregress/char/accent/nocombine.in -Lregress/char/accent/nocombine.in -u -p -r1.2 -r1.3
--- regress/char/accent/nocombine.in
+++ regress/char/accent/nocombine.in
@@ -1,17 +1,17 @@
 .\" $OpenBSD: nocombine.in,v 1.2 2017/07/04 14:53:23 schwarze Exp $
-.TH CHAR-ACCENT-NOCOMBINE 1 "March 8, 2014"
+.TH CHAR-ACCENT-NOCOMBINE 1 "December 15, 2018"
 .SH NAME
 \fBchar-accent-nocombine\fR - non-combining accents
 .SH DESCRIPTION
 bare acute accent: e'e
 .br
-escaped acute accent: e\'e
+escaped acute accent: e\'e\[']e
 .br
 acute accent sequence: e\(aae
 .br
 bare grave accent: e`e
 .br
-escaped grave accent: e\`e
+escaped grave accent: e\`e\[`]e
 .br
 acute grave sequence: e\(gae
 .br
@@ -20,15 +20,15 @@ hungarian umlaut: e\(a"e
 .\" XXX This is ridiculous.
 .\" XXX groff prints the macron as an underscore in the previous line.
 .\" macron: e\(a-e
-.br
+.\" .br
 .\" XXX groff doesn't have a dot in ASCII mode, only in UTF-8 mode.
 .\" dotted: e\(a.e
-.br
+.\" .br
 circumflex: e\(a^e
 .br
 .\" XXX groff uses a backspace for this one in ASCII mode.
 .\" breve: e\(abe
-.br
+.\" .br
 cedilla: e\(ace
 .br
 dieresis: e\(ade
--- /dev/null
+++ regress/char/accent/nocombine.out_lint
@@ -0,0 +1,2 @@
+mandoc: nocombine.in:8:27: WARNING: invalid escape sequence: \[']
+mandoc: nocombine.in:14:27: WARNING: invalid escape sequence: \[`]
Index: nocombine.out_ascii
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/char/accent/nocombine.out_ascii,v
retrieving revision 1.1
retrieving revision 1.2
diff -Lregress/char/accent/nocombine.out_ascii -Lregress/char/accent/nocombine.out_ascii -u -p -r1.1 -r1.2
--- regress/char/accent/nocombine.out_ascii
+++ regress/char/accent/nocombine.out_ascii
@@ -7,10 +7,10 @@ N\bNA\bAM\bME\bE
 
 D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
        bare acute accent: e'e
-       escaped acute accent: e'e
+       escaped acute accent: e'ee
        acute accent sequence: e'e
        bare grave accent: e`e
-       escaped grave accent: e`e
+       escaped grave accent: e`ee
        acute grave sequence: e`e
        hungarian umlaut: e"e
        circumflex: e^e
@@ -25,4 +25,4 @@ D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
 
 
 
-OpenBSD                          March 8, 2014        CHAR-ACCENT-NOCOMBINE(1)
+OpenBSD                        December 15, 2018      CHAR-ACCENT-NOCOMBINE(1)
Index: mdoc_markdown.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mdoc_markdown.c,v
retrieving revision 1.28
retrieving revision 1.29
diff -Lmdoc_markdown.c -Lmdoc_markdown.c -u -p -r1.28 -r1.29
--- mdoc_markdown.c
+++ mdoc_markdown.c
@@ -589,6 +589,9 @@ md_word(const char *s)
 			case ESCAPE_SPECIAL:
 				uc = mchars_spec2cp(seq, sz);
 				break;
+			case ESCAPE_UNDEF:
+				uc = *seq;
+				break;
 			case ESCAPE_DEVICE:
 				md_rawword("markdown");
 				continue;
Index: mdoc_man.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mdoc_man.c,v
retrieving revision 1.129
retrieving revision 1.130
diff -Lmdoc_man.c -Lmdoc_man.c -u -p -r1.129 -r1.130
--- mdoc_man.c
+++ mdoc_man.c
@@ -325,6 +325,7 @@ man_strlen(const char *cp)
 		case ESCAPE_UNICODE:
 		case ESCAPE_NUMBERED:
 		case ESCAPE_SPECIAL:
+		case ESCAPE_UNDEF:
 		case ESCAPE_OVERSTRIKE:
 			if (skip)
 				skip = 0;
Index: term.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/term.c,v
retrieving revision 1.276
retrieving revision 1.277
diff -Lterm.c -Lterm.c -u -p -r1.276 -r1.277
--- term.c
+++ term.c
@@ -477,9 +477,6 @@ term_word(struct termp *p, const char *w
 
 		word++;
 		esc = mandoc_escape(&word, &seq, &sz);
-		if (ESCAPE_ERROR == esc)
-			continue;
-
 		switch (esc) {
 		case ESCAPE_UNICODE:
 			uc = mchars_num2uc(seq + 1, sz - 1);
@@ -500,6 +497,9 @@ term_word(struct termp *p, const char *w
 					encode1(p, uc);
 			}
 			continue;
+		case ESCAPE_UNDEF:
+			uc = *seq;
+			break;
 		case ESCAPE_FONTBOLD:
 			term_fontrepl(p, TERMFONT_BOLD);
 			continue;
@@ -587,6 +587,9 @@ term_word(struct termp *p, const char *w
 				case ESCAPE_SPECIAL:
 					uc = mchars_spec2cp(cp, sz);
 					break;
+				case ESCAPE_UNDEF:
+					uc = *seq;
+					break;
 				default:
 					uc = -1;
 					break;
@@ -845,12 +848,8 @@ term_strlen(const struct termp *p, const
 		switch (*cp) {
 		case '\\':
 			cp++;
-			esc = mandoc_escape(&cp, &seq, &ssz);
-			if (ESCAPE_ERROR == esc)
-				continue;
-
 			rhs = NULL;
-
+			esc = mandoc_escape(&cp, &seq, &ssz);
 			switch (esc) {
 			case ESCAPE_UNICODE:
 				uc = mchars_num2uc(seq + 1, ssz - 1);
@@ -871,6 +870,9 @@ term_strlen(const struct termp *p, const
 						sz += cond_width(p, uc, &skip);
 				}
 				continue;
+			case ESCAPE_UNDEF:
+				uc = *seq;
+				break;
 			case ESCAPE_DEVICE:
 				if (p->type == TERMTYPE_PDF) {
 					rhs = "pdf";
Index: roff.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/roff.c,v
retrieving revision 1.351
retrieving revision 1.352
diff -Lroff.c -Lroff.c -u -p -r1.351 -r1.352
--- roff.c
+++ roff.c
@@ -1154,6 +1154,7 @@ roff_res(struct roff *r, struct buf *buf
 	struct roff_node *n;	/* used for header comments */
 	const char	*start;	/* start of the string to process */
 	char		*stesc;	/* start of an escape sequence ('\\') */
+	const char	*esct;	/* type of escape sequence */
 	char		*ep;	/* end of comment string */
 	const char	*stnam;	/* start of the name, after "[(*" */
 	const char	*cp;	/* end of the name, e.g. before ']' */
@@ -1163,7 +1164,6 @@ roff_res(struct roff *r, struct buf *buf
 	size_t		 naml;	/* actual length of the escape name */
 	size_t		 asz;	/* length of the replacement */
 	size_t		 rsz;	/* length of the rest of the string */
-	enum mandoc_esc	 esc;	/* type of the escape sequence */
 	int		 inaml;	/* length returned from mandoc_escape() */
 	int		 expand_count;	/* to avoid infinite loops */
 	int		 npos;	/* position in numeric expression */
@@ -1172,6 +1172,7 @@ roff_res(struct roff *r, struct buf *buf
 	int		 done;	/* no more input available */
 	int		 deftype; /* type of definition to paste */
 	int		 rcsid;	/* kind of RCS id seen */
+	enum mandocerr	 err;	/* for escape sequence problems */
 	char		 sign;	/* increment number register */
 	char		 term;	/* character terminating the escape */
 
@@ -1304,7 +1305,10 @@ roff_res(struct roff *r, struct buf *buf
 
 		term = '\0';
 		cp = stesc + 1;
-		switch (*cp) {
+		if (*cp == 'E')
+			cp++;
+		esct = cp;
+		switch (*esct) {
 		case '*':
 		case '$':
 			res = NULL;
@@ -1320,12 +1324,26 @@ roff_res(struct roff *r, struct buf *buf
 			res = ubuf;
 			break;
 		default:
-			esc = mandoc_escape(&cp, &stnam, &inaml);
-			if (esc == ESCAPE_ERROR ||
-			    (esc == ESCAPE_SPECIAL &&
-			     mchars_spec2cp(stnam, inaml) < 0))
-				mandoc_msg(MANDOCERR_ESC_BAD,
-				    ln, (int)(stesc - buf->buf),
+			err = MANDOCERR_OK;
+			switch(mandoc_escape(&cp, &stnam, &inaml)) {
+			case ESCAPE_SPECIAL:
+				if (mchars_spec2cp(stnam, inaml) >= 0)
+					break;
+				/* FALLTHROUGH */
+			case ESCAPE_ERROR:
+				err = MANDOCERR_ESC_BAD;
+				break;
+			case ESCAPE_UNDEF:
+				err = MANDOCERR_ESC_UNDEF;
+				break;
+			case ESCAPE_UNSUPP:
+				err = MANDOCERR_ESC_UNSUPP;
+				break;
+			default:
+				break;
+			}
+			if (err != MANDOCERR_OK)
+				mandoc_msg(err, ln, (int)(stesc - buf->buf),
 				    "%.*s", (int)(cp - stesc), stesc);
 			stesc--;
 			continue;
@@ -1382,7 +1400,7 @@ roff_res(struct roff *r, struct buf *buf
 				cp++;
 				break;
 			}
-			if (*cp++ != '\\' || stesc[1] != 'w') {
+			if (*cp++ != '\\' || *esct != 'w') {
 				naml++;
 				continue;
 			}
@@ -1390,6 +1408,7 @@ roff_res(struct roff *r, struct buf *buf
 			case ESCAPE_SPECIAL:
 			case ESCAPE_UNICODE:
 			case ESCAPE_NUMBERED:
+			case ESCAPE_UNDEF:
 			case ESCAPE_OVERSTRIKE:
 				naml++;
 				break;
@@ -1403,7 +1422,7 @@ roff_res(struct roff *r, struct buf *buf
 		 * undefined, resume searching for escapes.
 		 */
 
-		switch (stesc[1]) {
+		switch (*esct) {
 		case '*':
 			if (arg_complete) {
 				deftype = ROFFDEF_USER | ROFFDEF_PRE;
@@ -1430,15 +1449,15 @@ roff_res(struct roff *r, struct buf *buf
 				break;
 			}
 			ctx = r->mstack + r->mstackpos;
-			npos = stesc[2] - '1';
+			npos = esct[1] - '1';
 			if (npos >= 0 && npos <= 8) {
 				res = npos < ctx->argc ?
 				    ctx->argv[npos] : "";
 				break;
 			}
-			if (stesc[2] == '*')
+			if (esct[1] == '*')
 				quote_args = 0;
-			else if (stesc[2] == '@')
+			else if (esct[1] == '@')
 				quote_args = 1;
 			else {
 				mandoc_msg(MANDOCERR_ARG_NONUM, ln,
@@ -1500,7 +1519,7 @@ roff_res(struct roff *r, struct buf *buf
 		}
 
 		if (res == NULL) {
-			if (stesc[1] == '*')
+			if (*esct == '*')
 				mandoc_msg(MANDOCERR_STR_UNDEF,
 				    ln, (int)(stesc - buf->buf),
 				    "%.*s", (int)naml, stnam);
Index: mandoc.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mandoc.c,v
retrieving revision 1.110
retrieving revision 1.111
diff -Lmandoc.c -Lmandoc.c -u -p -r1.110 -r1.111
--- mandoc.c
+++ mandoc.c
@@ -56,6 +56,14 @@ mandoc_escape(const char **end, const ch
 		sz = &local_sz;
 
 	/*
+	 * Treat "\E" just like "\";
+	 * it only makes a difference in copy mode.
+	 */
+
+	if (**end == 'E')
+		++*end;
+
+	/*
 	 * Beyond the backslash, at least one input character
 	 * is part of the escape sequence.  With one exception
 	 * (see below), that character won't be returned.
@@ -77,6 +85,10 @@ mandoc_escape(const char **end, const ch
 		*sz = 2;
 		break;
 	case '[':
+		if (**start == ' ') {
+			++*end;
+			return ESCAPE_ERROR;
+		}
 		gly = ESCAPE_SPECIAL;
 		term = ']';
 		break;
@@ -91,11 +103,26 @@ mandoc_escape(const char **end, const ch
 	/*
 	 * Escapes taking no arguments at all.
 	 */
-	case 'd':
-	case 'u':
+	case '!':
+	case '?':
+		return ESCAPE_UNSUPP;
+	case '%':
+	case '&':
+	case ')':
 	case ',':
 	case '/':
+	case '^':
+	case 'a':
+	case 'd':
+	case 'r':
+	case 't':
+	case 'u':
+	case '{':
+	case '|':
+	case '}':
 		return ESCAPE_IGNORE;
+	case 'c':
+		return ESCAPE_NOSPACE;
 	case 'p':
 		return ESCAPE_BREAK;
 
@@ -113,28 +140,46 @@ mandoc_escape(const char **end, const ch
 	 * 'X' is the trigger.  These have opaque sub-strings.
 	 */
 	case 'F':
+	case 'f':
 	case 'g':
 	case 'k':
 	case 'M':
 	case 'm':
 	case 'n':
+	case 'O':
 	case 'V':
 	case 'Y':
-		gly = ESCAPE_IGNORE;
-		/* FALLTHROUGH */
-	case 'f':
-		if (ESCAPE_ERROR == gly)
-			gly = ESCAPE_FONT;
+		gly = (*start)[-1] == 'f' ? ESCAPE_FONT : ESCAPE_IGNORE;
 		switch (**start) {
 		case '(':
+			if ((*start)[-1] == 'O')
+				gly = ESCAPE_ERROR;
 			*start = ++*end;
 			*sz = 2;
 			break;
 		case '[':
+			if ((*start)[-1] == 'O')
+				gly = (*start)[1] == '5' ?
+				    ESCAPE_UNSUPP : ESCAPE_ERROR;
 			*start = ++*end;
 			term = ']';
 			break;
 		default:
+			if ((*start)[-1] == 'O') {
+				switch (**start) {
+				case '0':
+					gly = ESCAPE_UNSUPP;
+					break;
+				case '1':
+				case '2':
+				case '3':
+				case '4':
+					break;
+				default:
+					gly = ESCAPE_ERROR;
+					break;
+				}
+			}
 			*sz = 1;
 			break;
 		}
@@ -257,18 +302,29 @@ mandoc_escape(const char **end, const ch
 		break;
 
 	/*
-	 * Anything else is assumed to be a glyph.
-	 * In this case, pass back the character after the backslash.
+	 * Several special characters can be encoded as
+	 * one-byte escape sequences without using \[].
 	 */
-	default:
+	case ' ':
+	case '\'':
+	case '-':
+	case '.':
+	case '0':
+	case ':':
+	case '_':
+	case '`':
+	case 'e':
+	case '~':
 		gly = ESCAPE_SPECIAL;
+		/* FALLTHROUGH */
+	default:
+		if (gly == ESCAPE_ERROR)
+			gly = ESCAPE_UNDEF;
 		*start = --*end;
 		*sz = 1;
 		break;
 	}
 
-	assert(ESCAPE_ERROR != gly);
-
 	/*
 	 * Read up to the terminating character,
 	 * paying attention to nested escapes.
@@ -291,6 +347,15 @@ mandoc_escape(const char **end, const ch
 			}
 		}
 		*sz = (*end)++ - *start;
+
+		/*
+		 * The file chars.c only provides one common list
+		 * of character names, but \[-] == \- is the only
+		 * one of the characters with one-byte names that
+		 * allows enclosing the name in brackets.
+		 */
+		if (gly == ESCAPE_SPECIAL && *sz == 1 && **start != '-')
+			return ESCAPE_ERROR;
 	} else {
 		assert(*sz > 0);
 		if ((size_t)*sz > strlen(*start))
@@ -346,10 +411,6 @@ mandoc_escape(const char **end, const ch
 		break;
 	case ESCAPE_SPECIAL:
 		if (**start == 'c') {
-			if (*sz == 1) {
-				gly = ESCAPE_NOSPACE;
-				break;
-			}
 			if (*sz < 6 || *sz > 7 ||
 			    strncmp(*start, "char", 4) != 0 ||
 			    (int)strspn(*start + 4, "0123456789") + 4 < *sz)
@@ -431,6 +492,7 @@ mandoc_getarg(char **cpp, int ln, int *p
 			 * backslashes and backslash-t to literal tabs.
 			 */
 			switch (cp[1]) {
+			case 'a':
 			case 't':
 				cp[0] = '\t';
 				/* FALLTHROUGH */
Index: mandoc.h
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mandoc.h,v
retrieving revision 1.260
retrieving revision 1.261
diff -Lmandoc.h -Lmandoc.h -u -p -r1.260 -r1.261
--- mandoc.h
+++ mandoc.h
@@ -169,6 +169,7 @@ enum	mandocerr {
 	MANDOCERR_FI_TAB, /* tab in filled text */
 	MANDOCERR_EOS, /* new sentence, new line */
 	MANDOCERR_ESC_BAD, /* invalid escape sequence: esc */
+	MANDOCERR_ESC_UNDEF, /* undefined escape, printing literally: char */
 	MANDOCERR_STR_UNDEF, /* undefined string, using "": name */
 
 	/* related to tables */
@@ -231,6 +232,7 @@ enum	mandocerr {
 
 	MANDOCERR_TOOLARGE, /* input too large */
 	MANDOCERR_CHAR_UNSUPP, /* unsupported control character: number */
+	MANDOCERR_ESC_UNSUPP, /* unsupported escape sequence: escape */
 	MANDOCERR_REQ_UNSUPP, /* unsupported roff request: request */
 	MANDOCERR_WHILE_NEST, /* nested .while loops */
 	MANDOCERR_WHILE_OUTOF, /* end of scope with open .while loop */
@@ -245,7 +247,9 @@ enum	mandocerr {
 
 enum	mandoc_esc {
 	ESCAPE_ERROR = 0, /* bail! unparsable escape */
+	ESCAPE_UNSUPP, /* unsupported escape; ignore it */
 	ESCAPE_IGNORE, /* escape to be ignored */
+	ESCAPE_UNDEF, /* undefined escape; print literal character */
 	ESCAPE_SPECIAL, /* a regular special character */
 	ESCAPE_FONT, /* a generic font mode */
 	ESCAPE_FONTBOLD, /* bold font mode */
Index: chars.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/chars.c,v
retrieving revision 1.77
retrieving revision 1.78
diff -Lchars.c -Lchars.c -u -p -r1.77 -r1.78
--- chars.c
+++ chars.c
@@ -1,7 +1,7 @@
 /*	$Id$ */
 /*
  * Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
- * Copyright (c) 2011, 2014, 2015, 2017 Ingo Schwarze <schwarze@openbsd.org>
+ * Copyright (c) 2011,2014,2015,2017,2018 Ingo Schwarze <schwarze@openbsd.org>
  *
  * Permission to use, copy, modify, and distribute this software for any
  * purpose with or without fee is hereby granted, provided that the above
@@ -48,21 +48,13 @@ static struct ln lines[] = {
 	{ " ",			ascii_nbrsp,	0x00a0	},
 	{ "~",			ascii_nbrsp,	0x00a0	},
 	{ "0",			" ",		0x2002	},
-	{ "|",			"",		0	},
-	{ "^",			"",		0	},
-	{ "&",			"",		0	},
-	{ ")",			"",		0	},
-	{ "%",			"",		0	},
 	{ ":",			ascii_break,	0	},
-	/* XXX The following three do not really belong here. */
-	{ "t",			"",		0	},
-	{ "c",			"",		0	},
-	{ "}",			"",		0	},
 
 	/* Lines. */
 	{ "ba",			"|",		0x007c	},
 	{ "br",			"|",		0x2502	},
 	{ "ul",			"_",		0x005f	},
+	{ "_",			"_",		0x005f	},
 	{ "ru",			"_",		0x005f	},
 	{ "rn",			"-",		0x203e	},
 	{ "bb",			"|",		0x00a6	},
@@ -465,7 +457,7 @@ mchars_spec2cp(const char *p, size_t sz)
 
 	end = p + sz;
 	ln = ohash_find(&mchars, ohash_qlookupi(&mchars, p, &end));
-	return ln != NULL ? ln->unicode : sz == 1 ? (unsigned char)*p : -1;
+	return ln != NULL ? ln->unicode : -1;
 }
 
 int
@@ -495,10 +487,8 @@ mchars_spec2str(const char *p, size_t sz
 
 	end = p + sz;
 	ln = ohash_find(&mchars, ohash_qlookupi(&mchars, p, &end));
-	if (ln == NULL) {
-		*rsz = 1;
-		return sz == 1 ? p : NULL;
-	}
+	if (ln == NULL)
+		return NULL;
 
 	*rsz = strlen(ln->ascii);
 	return ln->ascii;
Index: TODO
===================================================================
RCS file: /home/cvs/mandoc/mandoc/TODO,v
retrieving revision 1.280
retrieving revision 1.281
diff -LTODO -LTODO -u -p -r1.280 -r1.281
--- TODO
+++ TODO
@@ -75,12 +75,6 @@ are mere guesses, and some may be wrong.
   Found by naddy@ in devel/cutils cobfusc(1)  Mon, 16 Feb 2015 19:10:52 +0100
   loc ***  exist ***  algo ***  size **  imp *
 
-- check for missing roff escape sequences, implement those that are
-  trivial even if not usually appearing in manual pages, gracefully
-  ignore the non-trivial ones, document what they are supposed to do
-  and what mandoc does instead
-  loc *  exist **  algo *  size *  imp *
-
 --- missing mdoc features ----------------------------------------------
 
 - .Bl -column .Xo support is missing
@@ -533,10 +527,6 @@ are mere guesses, and some may be wrong.
   of a text line, if it is likely intended to follow the preceding
   output without intervening whitespace, in particular after a
   macro line (from the mdoclint TODO)
-
-- mandoc_special does not really check the escape sequence,
-  but just the overall format
-  loc **  exist **  algo ***  size **  imp **
 
 - makewhatis -p complains about language subdirectories:
   /usr/local/man//ru: Unknown directory part
Index: mandoc_msg.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mandoc_msg.c,v
retrieving revision 1.2
retrieving revision 1.3
diff -Lmandoc_msg.c -Lmandoc_msg.c -u -p -r1.2 -r1.3
--- mandoc_msg.c
+++ mandoc_msg.c
@@ -167,6 +167,7 @@ static	const char *const type_message[MA
 	"tab in filled text",
 	"new sentence, new line",
 	"invalid escape sequence",
+	"undefined escape, printing literally",
 	"undefined string, using \"\"",
 
 	/* related to tables */
@@ -228,6 +229,7 @@ static	const char *const type_message[MA
 	"unsupported feature",
 	"input too large",
 	"unsupported control character",
+	"unsupported escape sequence",
 	"unsupported roff request",
 	"nested .while loops",
 	"end of scope with open .while loop",
Index: html.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/html.c,v
retrieving revision 1.245
retrieving revision 1.246
diff -Lhtml.c -Lhtml.c -u -p -r1.245 -r1.246
--- html.c
+++ html.c
@@ -402,9 +402,6 @@ print_encode(struct html *h, const char 
 			continue;
 
 		esc = mandoc_escape(&p, &seq, &len);
-		if (ESCAPE_ERROR == esc)
-			break;
-
 		switch (esc) {
 		case ESCAPE_FONT:
 		case ESCAPE_FONTPREV:
@@ -422,6 +419,8 @@ print_encode(struct html *h, const char 
 		case ESCAPE_SKIPCHAR:
 			h->flags |= HTML_SKIPCHAR;
 			continue;
+		case ESCAPE_ERROR:
+			continue;
 		default:
 			break;
 		}
@@ -445,6 +444,9 @@ print_encode(struct html *h, const char 
 			c = mchars_spec2cp(seq, len);
 			if (c <= 0)
 				continue;
+			break;
+		case ESCAPE_UNDEF:
+			c = *seq;
 			break;
 		case ESCAPE_DEVICE:
 			print_word(h, "html");
Index: mandoc.1
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mandoc.1,v
retrieving revision 1.231
retrieving revision 1.232
diff -Lmandoc.1 -Lmandoc.1 -u -p -r1.231 -r1.232
--- mandoc.1
+++ mandoc.1
@@ -1676,7 +1676,8 @@ Start it on a new input line to help for
 .It Sy "invalid escape sequence"
 .Pq roff
 An escape sequence has an invalid opening argument delimiter, lacks the
-closing argument delimiter, or the argument has too few characters.
+closing argument delimiter, the argument is of an invalid form, or it is
+a character escape sequence with an invalid name.
 If the argument is incomplete,
 .Ic \e*
 and
@@ -1689,6 +1690,12 @@ and
 .Ic \ew
 to the length of the incomplete argument.
 All other invalid escape sequences are ignored.
+.It Sy "undefined escape, printing literally"
+.Pq roff
+In an escape sequence, the first character
+right after the leading backslash is invalid.
+That character is printed literally,
+which is equivalent to ignoring the backslash.
 .It Sy "undefined string, using \(dq\(dq"
 .Pq roff
 If a string is used without being defined before,
@@ -2154,6 +2161,13 @@ implementations but not by
 .Nm
 was found in an input file.
 It is replaced by a question mark.
+.It Sy "unsupported escape sequence"
+.Pq roff
+An input file contains an escape sequence supported by GNU troff
+or Heirloom troff but not by
+.Nm ,
+and it is likely that this will cause information loss
+or considerable misformatting.
 .It Sy "unsupported roff request"
 .Pq roff
 An input file contains a
Index: mandoc_char.7
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mandoc_char.7,v
retrieving revision 1.74
retrieving revision 1.75
diff -Lmandoc_char.7 -Lmandoc_char.7 -u -p -r1.74 -r1.75
--- mandoc_char.7
+++ mandoc_char.7
@@ -2,7 +2,7 @@
 .\"
 .\" Copyright (c) 2003 Jason McIntyre <jmc@openbsd.org>
 .\" Copyright (c) 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
-.\" Copyright (c) 2011, 2013, 2015, 2017 Ingo Schwarze <schwarze@openbsd.org>
+.\" Copyright (c) 2011,2013,2015,2017,2018 Ingo Schwarze <schwarze@openbsd.org>
 .\"
 .\" Permission to use, copy, modify, and distribute this software for any
 .\" purpose with or without fee is hereby granted, provided that the above
@@ -266,11 +266,13 @@ Spacing:
 .It Em Input Ta Em Description
 .It Sq \e\ \& Ta unpaddable non-breaking space
 .It \e\(ti   Ta paddable non-breaking space
-.It \e0      Ta unpaddable, breaking digit-width space
+.It \e0      Ta digit-width space allowing line break
 .It \e|      Ta one-sixth \e(em narrow space, zero width in nroff mode
 .It \e^      Ta one-twelfth \e(em half-narrow space, zero width in nroff
-.It \e&      Ta zero-width space
+.It \e&      Ta zero-width non-breaking space
+.It \e)      Ta zero-width space transparent to end-of-sentence detection
 .It \e%      Ta zero-width space allowing hyphenation
+.It \e:      Ta zero-width space allowing line break
 .El
 .Pp
 Lines:
Index: roff.7
===================================================================
RCS file: /home/cvs/mandoc/mandoc/roff.7,v
retrieving revision 1.107
retrieving revision 1.108
diff -Lroff.7 -Lroff.7 -u -p -r1.107 -r1.108
--- roff.7
+++ roff.7
@@ -1844,6 +1844,11 @@ The escape sequence backslash-space
 .Pq Sq \e\ \&
 is an unpaddable space-sized non-breaking space character; see
 .Sx Whitespace .
+.It Ic \e!
+Embed text up to and including the end of the input line into the
+current diversion or into intermediate output without interpreting
+requests, macros, and escapes.
+Currently unsupported.
 .It Ic \e\(dq
 The rest of the input line is treated as
 .Sx Comments .
@@ -1870,6 +1875,10 @@ instead.
 .Sx Special Characters
 with two-letter names, see
 .Xr mandoc_char 7 .
+.It Ic \e)
+Zero-width space transparent to end-of-sentence detection;
+ignored by
+.Xr mandoc 1 .
 .It Ic \e*[ Ns Ar name Ns Ic \&]
 Interpolate the string with the
 .Ar name .
@@ -1907,6 +1916,15 @@ Special character
 .It Ic \e/
 Right italic correction (groff extension); ignored by
 .Xr mandoc 1 .
+.It Ic \e:
+Breaking the line is allowed at this point of the word
+without inserting a hyphen.
+.It Ic \e?
+Embed the text up to the next
+.Ic \e?
+into the current diversion without interpreting requests, macros,
+and escapes.
+This is a groff extension and currently unsupported.
 .It Ic \e[ Ns Ar name Ns Ic \&]
 .Sx Special Characters
 with names of arbitrary length, see
@@ -1914,6 +1932,10 @@ with names of arbitrary length, see
 .It Ic \e^
 One-twelfth em half-narrow space character, effectively zero-width in
 .Xr mandoc 1 .
+.It Ic \e_
+Underline special character; use
+.Ic \e(ul
+instead.
 .It Ic \e`
 Grave accent special character; use
 .Ic \e(ga
@@ -1934,6 +1956,9 @@ Digit width space character.
 .It Ic \eA\(aq Ns Ar string Ns Ic \(aq
 Anchor definition; ignored by
 .Xr mandoc 1 .
+.It Ic \ea
+Leader character; ignored by
+.Xr mandoc 1 .
 .It Ic \eB\(aq Ns Ar string Ns Ic \(aq
 Interpolate
 .Sq 1
@@ -1961,6 +1986,13 @@ Draw graphics function; ignored by
 .It Ic \ed
 Move down by half a line; ignored by
 .Xr mandoc 1 .
+.It Ic \eE
+Escape character intended to not be interpreted in copy mode.
+In
+.Xr mandoc 1 ,
+it does the same as
+.Ic \e
+itself for now.
 .It Ic \ee
 Backslash special character.
 .It Ic \eF[ Ns Ar name Ns Ic \&]
@@ -2042,6 +2074,14 @@ the register is first incremented or dec
 that was specified in the relevant
 .Ic \&nr
 request, and the changed value is interpolated.
+.It Ic \eO Ns Ar digit , Ic \eO[5 Ns arguments Ns Ic \&]
+Suppress output.
+This is a groff extension and currently unsupported.
+With an argument of
+.Ic 1 , 2 , 3 ,
+or
+.Ic 4 ,
+it is ignored.
 .It Ic \eo\(aq Ns Ar string Ns Ic \(aq
 Overstrike, writing all the characters contained in the
 .Ar string
@@ -2052,6 +2092,9 @@ only the last one of the characters is v
 Break the output line at the end of the current word.
 .It Ic \eR\(aq Ns Ar name Oo +|- Oc Ns Ar number Ns Ic \(aq
 Set number register; ignored by
+.Xr mandoc 1 .
+.It Ic \er
+Move up by one line; ignored by
 .Xr mandoc 1 .
 .It Ic \eS\(aq Ns Ar number Ns Ic \(aq
 Slant output; ignored by
--- /dev/null
+++ regress/char/space/invalid.out_lint
@@ -0,0 +1,9 @@
+mandoc: invalid.in:7:15: WARNING: invalid escape sequence: \[ 
+mandoc: invalid.in:8:14: WARNING: invalid escape sequence: \[%]
+mandoc: invalid.in:9:16: WARNING: invalid escape sequence: \[&]
+mandoc: invalid.in:10:12: WARNING: invalid escape sequence: \[:]
+mandoc: invalid.in:11:12: WARNING: invalid escape sequence: \[^]
+mandoc: invalid.in:12:16: WARNING: invalid escape sequence: \[_]
+mandoc: invalid.in:13:11: WARNING: invalid escape sequence: \[|]
+mandoc: invalid.in:14:12: WARNING: invalid escape sequence: \[~]
+mandoc: invalid.in:15:18: WARNING: invalid escape sequence: \[0]
Index: esct-man.out_ascii
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/char/space/esct-man.out_ascii,v
retrieving revision 1.1
retrieving revision 1.2
diff -Lregress/char/space/esct-man.out_ascii -Lregress/char/space/esct-man.out_ascii -u -p -r1.1 -r1.2
--- regress/char/space/esct-man.out_ascii
+++ regress/char/space/esct-man.out_ascii
@@ -9,16 +9,20 @@ D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
        In plain text:
        single    tab
        singleescape-t
+       singleescape-a
        double         tab
        doubleescape-t
+       doubleescape-a
        This line starts with escape-t and comes close to the right margin.
        The next line starts with escape-t as well.
 
        In a literal display:
        single    tab
        singleescape-t
+       singleescape-a
        double         tab
        doubleescape-t
+       doubleescape-a
 
        After the IP macro:
 
@@ -33,4 +37,4 @@ D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
 
 
 
-OpenBSD                           2013-06-20                 SPACE-ESCT-MAN(1)
+OpenBSD                        December 15, 2018             SPACE-ESCT-MAN(1)
Index: esct-man.out_lint
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/char/space/esct-man.out_lint,v
retrieving revision 1.4
retrieving revision 1.5
diff -Lregress/char/space/esct-man.out_lint -Lregress/char/space/esct-man.out_lint -u -p -r1.4 -r1.5
--- regress/char/space/esct-man.out_lint
+++ regress/char/space/esct-man.out_lint
@@ -1,6 +1,6 @@
 mandoc: esct-man.in:8:7: WARNING: tab in filled text
-mandoc: esct-man.in:12:7: WARNING: tab in filled text
-mandoc: esct-man.in:12:8: WARNING: tab in filled text
-mandoc: esct-man.in:28:11: WARNING: tab in filled text
-mandoc: esct-man.in:30:11: WARNING: tab in filled text
-mandoc: esct-man.in:35:10: WARNING: tab in filled text
+mandoc: esct-man.in:14:7: WARNING: tab in filled text
+mandoc: esct-man.in:14:8: WARNING: tab in filled text
+mandoc: esct-man.in:34:11: WARNING: tab in filled text
+mandoc: esct-man.in:36:11: WARNING: tab in filled text
+mandoc: esct-man.in:44:10: WARNING: tab in filled text
Index: esct-man.in
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/char/space/esct-man.in,v
retrieving revision 1.2
retrieving revision 1.3
diff -Lregress/char/space/esct-man.in -Lregress/char/space/esct-man.in -u -p -r1.2 -r1.3
--- regress/char/space/esct-man.in
+++ regress/char/space/esct-man.in
@@ -1,5 +1,5 @@
 .\" $OpenBSD: esct-man.in,v 1.2 2017/07/04 14:53:23 schwarze Exp $
-.TH SPACE-ESCT-MAN 1 2013-06-20
+.TH SPACE-ESCT-MAN 1 "December 15, 2018"
 .SH NAME
 SPACE-T-MAN \- the t escape sequence in pages with man macros
 .SH DESCRIPTION
@@ -9,10 +9,14 @@ single	tab
 .br
 single\tescape-t
 .br
+single\aescape-a
+.br
 double		tab
 .br
 double\t\tescape-t
 .br
+double\a\aescape-a
+.br
 \tThis line starts with escape-t and comes close to the right margin.
 \tThe next line starts with escape-t as well.
 .sp
@@ -20,8 +24,10 @@ In a literal display:
 .nf
 single	tab
 single\tescape-t
+single\aescape-a
 double		tab
 double\t\tescape-t
+double\a\aescape-a
 .fi
 .sp
 After the IP macro:
@@ -29,7 +35,13 @@ After the IP macro:
 text
 .IP single\tescape-t 3n
 text
+.\" XXX not implemented
+.\" .IP single\aescape-a 3n
+.\" text
 .PP
 After font macros:
 .br
 .B single\ttab
+.\" XXX not implemented
+.\" .br
+.\" .B single\aleader
--- /dev/null
+++ regress/char/space/invalid.out_ascii
@@ -0,0 +1,21 @@
+SPACE-INVALID(1)            General Commands Manual           SPACE-INVALID(1)
+
+
+
+N\bNA\bAM\bME\bE
+       SPACE-INVALID - invalid whitespace escape sequences
+
+D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
+       blank: a-bhy]c
+       percent: abc
+       ampersand: abc
+       colon: abc
+       caret: abc
+       underline: a_bc
+       pipe: abc
+       tilde: a bc
+       digit-width: a bc
+
+
+
+OpenBSD                        December 15, 2018              SPACE-INVALID(1)
Index: Makefile
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/char/space/Makefile,v
retrieving revision 1.1
retrieving revision 1.2
diff -Lregress/char/space/Makefile -Lregress/char/space/Makefile -u -p -r1.1 -r1.2
--- regress/char/space/Makefile
+++ regress/char/space/Makefile
@@ -3,11 +3,12 @@
 REGRESS_TARGETS  = leading-mdoc leading-man multiple trailing-mdoc zerowidth
 REGRESS_TARGETS += eos eos-man break nobreak
 REGRESS_TARGETS += tab tab-man esct-mdoc esct-man
+REGRESS_TARGETS += invalid
 
 UTF8_TARGETS	 = zerowidth
 
 HTML_TARGETS	 = zerowidth
 
-LINT_TARGETS	 = trailing-mdoc tab tab-man esct-mdoc esct-man
+LINT_TARGETS	 = trailing-mdoc tab tab-man esct-mdoc esct-man invalid
 
 .include <bsd.regress.mk>
--- /dev/null
+++ regress/char/space/invalid.in
@@ -0,0 +1,15 @@
+.\" $OpenBSD$
+.TH SPACE-INVALID 1 "December 15, 2018"
+.SH NAME
+SPACE-INVALID \- invalid whitespace escape sequences
+.SH DESCRIPTION
+.nf
+blank: a\[hy]b\[ hy]c
+percent: a\%b\[%]c
+ampersand: a\&b\[&]c
+colon: a\:b\[:]c
+caret: a\^b\[^]c
+underline: a\_b\[_]c
+pipe: a\|b\[|]c
+tilde: a\~b\[~]c
+digit-width: a\0b\[0]c
Index: ignore.in
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/roff/esc/ignore.in,v
retrieving revision 1.2
retrieving revision 1.3
diff -Lregress/roff/esc/ignore.in -Lregress/roff/esc/ignore.in -u -p -r1.2 -r1.3
--- regress/roff/esc/ignore.in
+++ regress/roff/esc/ignore.in
@@ -1,15 +1,13 @@
 .\" $OpenBSD: ignore.in,v 1.3 2017/07/04 14:53:27 schwarze Exp $
-.Dd $Mdocdate$
-.Dt ESC-IGNORE 1
-.Os
-.Sh NAME
-.Nm esc-ignore
-.Nd ignored roff escape sequences
-.Sh DESCRIPTION
+.TH ESC-IGNORE 1 "December 15, 2018"
+.SH NAME
+esc-ignore \- ignored roff escape sequences
+.SH DESCRIPTION
+.nf
+closing parenthesis: a\)b\[)]c
+comma: a\,b\[,]c
+slash: a\/b\[/]c
 multiform: a\kxb\k(xyc\k[xyz]d
-.br
 quoted: a\R'myreg 0'b\R'myreg \A'y'0'c
-.br
 sizes: a\s0b\s(12c\s[123]d\s'123'e\s'1\w'xy'2'f
-.br
 signed sizes: a\s-0b\s-(12c\s-[123]d\s-'123'e\s-'1\w'xy'2'f\s-
Index: one.in
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/roff/esc/one.in,v
retrieving revision 1.2
retrieving revision 1.3
diff -Lregress/roff/esc/one.in -Lregress/roff/esc/one.in -u -p -r1.2 -r1.3
--- regress/roff/esc/one.in
+++ regress/roff/esc/one.in
@@ -1,17 +1,11 @@
 .\" $OpenBSD: one.in,v 1.3 2017/07/04 14:53:27 schwarze Exp $
-.Dd $Mdocdate$
-.Dt ESC-ONE 1
-.Os
-.Sh NAME
-.Nm esc-one
-.Nd roff one-character escape sequences
-.Sh DESCRIPTION
+.TH ESC-ONE 1 "December 15, 2018"
+.SH NAME
+esc-one \- roff one-character escape sequences
+.SH DESCRIPTION
+.nf
 backslash: >\e<
-.br
-minus: >\-<
-.br
+minus: >\-|\[-]<
 acute: >\'<
-.br
 grave: >\`<
-.br
 normal character: >\q<
--- /dev/null
+++ regress/roff/esc/invalid.out_ascii
@@ -0,0 +1,34 @@
+ESC-INVALID(1)              General Commands Manual             ESC-INVALID(1)
+
+
+
+N\bNA\bAM\bME\bE
+       esc-invalid - invalid roff escape sequences
+
+D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
+       plus: a+bc
+       semicolon: a;bc
+       less than: a<bc
+       equal to: a=bc
+       greater than: a>bc
+       at: a@bc
+       square bracket: a]b
+       curly braces: abc
+       digit: a1bc
+       G: aGbc
+       I: aIbc
+       i: aibc
+       J: aJbc
+       j: ajbc
+       K: aKbc
+       P: aPbc
+       Q: aQbc
+       q: aqbc
+       T: aTbc
+       U: aUbc
+       W: aWbc
+       y: aybc
+
+
+
+OpenBSD                        December 15, 2018                ESC-INVALID(1)
--- /dev/null
+++ regress/roff/esc/invalid.out_lint
@@ -0,0 +1,43 @@
+mandoc: invalid.in:7:11: WARNING: invalid escape sequence: \[+]
+mandoc: invalid.in:7:8: WARNING: undefined escape, printing literally: \+
+mandoc: invalid.in:8:16: WARNING: invalid escape sequence: \[;]
+mandoc: invalid.in:8:13: WARNING: undefined escape, printing literally: \;
+mandoc: invalid.in:9:16: WARNING: invalid escape sequence: \[<]
+mandoc: invalid.in:9:13: WARNING: undefined escape, printing literally: \<
+mandoc: invalid.in:10:15: WARNING: invalid escape sequence: \[=]
+mandoc: invalid.in:10:12: WARNING: undefined escape, printing literally: \=
+mandoc: invalid.in:11:19: WARNING: invalid escape sequence: \[>]
+mandoc: invalid.in:11:16: WARNING: undefined escape, printing literally: \>
+mandoc: invalid.in:12:9: WARNING: invalid escape sequence: \[@]
+mandoc: invalid.in:12:6: WARNING: undefined escape, printing literally: \@
+mandoc: invalid.in:13:18: WARNING: undefined escape, printing literally: \]
+mandoc: invalid.in:14:21: WARNING: invalid escape sequence: \[}]
+mandoc: invalid.in:14:16: WARNING: invalid escape sequence: \[{]
+mandoc: invalid.in:15:12: WARNING: invalid escape sequence: \[1]
+mandoc: invalid.in:15:9: WARNING: undefined escape, printing literally: \1
+mandoc: invalid.in:16:8: WARNING: invalid escape sequence: \[G]
+mandoc: invalid.in:16:5: WARNING: undefined escape, printing literally: \G
+mandoc: invalid.in:17:8: WARNING: invalid escape sequence: \[I]
+mandoc: invalid.in:17:5: WARNING: undefined escape, printing literally: \I
+mandoc: invalid.in:18:8: WARNING: invalid escape sequence: \[i]
+mandoc: invalid.in:18:5: WARNING: undefined escape, printing literally: \i
+mandoc: invalid.in:19:8: WARNING: invalid escape sequence: \[J]
+mandoc: invalid.in:19:5: WARNING: undefined escape, printing literally: \J
+mandoc: invalid.in:20:8: WARNING: invalid escape sequence: \[j]
+mandoc: invalid.in:20:5: WARNING: undefined escape, printing literally: \j
+mandoc: invalid.in:21:8: WARNING: invalid escape sequence: \[K]
+mandoc: invalid.in:21:5: WARNING: undefined escape, printing literally: \K
+mandoc: invalid.in:22:8: WARNING: invalid escape sequence: \[P]
+mandoc: invalid.in:22:5: WARNING: undefined escape, printing literally: \P
+mandoc: invalid.in:23:8: WARNING: invalid escape sequence: \[Q]
+mandoc: invalid.in:23:5: WARNING: undefined escape, printing literally: \Q
+mandoc: invalid.in:24:8: WARNING: invalid escape sequence: \[q]
+mandoc: invalid.in:24:5: WARNING: undefined escape, printing literally: \q
+mandoc: invalid.in:25:8: WARNING: invalid escape sequence: \[T]
+mandoc: invalid.in:25:5: WARNING: undefined escape, printing literally: \T
+mandoc: invalid.in:26:8: WARNING: invalid escape sequence: \[U]
+mandoc: invalid.in:26:5: WARNING: undefined escape, printing literally: \U
+mandoc: invalid.in:27:8: WARNING: invalid escape sequence: \[W]
+mandoc: invalid.in:27:5: WARNING: undefined escape, printing literally: \W
+mandoc: invalid.in:28:8: WARNING: invalid escape sequence: \[y]
+mandoc: invalid.in:28:5: WARNING: undefined escape, printing literally: \y
Index: Makefile
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/roff/esc/Makefile,v
retrieving revision 1.3
retrieving revision 1.4
diff -Lregress/roff/esc/Makefile -Lregress/roff/esc/Makefile -u -p -r1.3 -r1.4
--- regress/roff/esc/Makefile
+++ regress/roff/esc/Makefile
@@ -1,6 +1,7 @@
 # $OpenBSD: Makefile,v 1.11 2015/04/29 18:32:57 schwarze Exp $
 
-REGRESS_TARGETS = one two multi B c c_man e f h l o p w z ignore
-LINT_TARGETS	= B h l w ignore
+REGRESS_TARGETS	 = one two multi B c c_man e f h l O o p w z
+REGRESS_TARGETS	+= ignore invalid unsupp
+LINT_TARGETS	 = B h l O w ignore invalid unsupp
 
 .include <bsd.regress.mk>
Index: one.out_ascii
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/roff/esc/one.out_ascii,v
retrieving revision 1.2
retrieving revision 1.3
diff -Lregress/roff/esc/one.out_ascii -Lregress/roff/esc/one.out_ascii -u -p -r1.2 -r1.3
--- regress/roff/esc/one.out_ascii
+++ regress/roff/esc/one.out_ascii
@@ -1,13 +1,17 @@
 ESC-ONE(1)                  General Commands Manual                 ESC-ONE(1)
 
+
+
 N\bNA\bAM\bME\bE
-     e\bes\bsc\bc-\b-o\bon\bne\be - roff one-character escape sequences
+       esc-one - roff one-character escape sequences
 
 D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
-     backslash: >\<
-     minus: >-<
-     acute: >'<
-     grave: >`<
-     normal character: >q<
+       backslash: >\<
+       minus: >-|-<
+       acute: >'<
+       grave: >`<
+       normal character: >q<
+
+
 
-OpenBSD                          July 4, 2017                          OpenBSD
+OpenBSD                        December 15, 2018                    ESC-ONE(1)
--- /dev/null
+++ regress/roff/esc/unsupp.out_lint
@@ -0,0 +1,5 @@
+mandoc: unsupp.in:7:23: WARNING: invalid escape sequence: \[!]
+mandoc: unsupp.in:7:20: UNSUPP: unsupported escape sequence: \!
+mandoc: unsupp.in:8:24: WARNING: invalid escape sequence: \[?]
+mandoc: unsupp.in:8:21: UNSUPP: unsupported escape sequence: \?
+mandoc: unsupp.in:8:17: UNSUPP: unsupported escape sequence: \?
--- /dev/null
+++ regress/roff/esc/unsupp.out_ascii
@@ -0,0 +1,14 @@
+ESC-UNSUPP(1)               General Commands Manual              ESC-UNSUPP(1)
+
+
+
+N\bNA\bAM\bME\bE
+       esc-unsupp - unsupported escape sequences
+
+D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
+       exclamation mark: abc
+       question mark: abc
+
+
+
+OpenBSD                        December 15, 2018                 ESC-UNSUPP(1)
--- /dev/null
+++ regress/roff/esc/O.out_ascii
@@ -0,0 +1,21 @@
+ESC-O(1)                    General Commands Manual                   ESC-O(1)
+
+
+
+N\bNA\bAM\bME\bE
+       esc-O - escape sequence to suppress output
+
+D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
+       O1: ab
+       O2: ab
+       O3: ab
+       O4: ab
+       O5: ab
+       O52: ab
+       O5n: ab
+       O6: ab
+       O0: ab
+
+
+
+OpenBSD                        December 15, 2018                      ESC-O(1)
Index: ignore.out_ascii
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/roff/esc/ignore.out_ascii,v
retrieving revision 1.2
retrieving revision 1.3
diff -Lregress/roff/esc/ignore.out_ascii -Lregress/roff/esc/ignore.out_ascii -u -p -r1.2 -r1.3
--- regress/roff/esc/ignore.out_ascii
+++ regress/roff/esc/ignore.out_ascii
@@ -1,12 +1,19 @@
 ESC-IGNORE(1)               General Commands Manual              ESC-IGNORE(1)
 
+
+
 N\bNA\bAM\bME\bE
-     e\bes\bsc\bc-\b-i\big\bgn\bno\bor\bre\be - ignored roff escape sequences
+       esc-ignore - ignored roff escape sequences
 
 D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
-     multiform: abcd
-     quoted: abc
-     sizes: abcdef
-     signed sizes: abcdef
+       closing parenthesis: abc
+       comma: abc
+       slash: abc
+       multiform: abcd
+       quoted: abc
+       sizes: abcdef
+       signed sizes: abcdef
+
+
 
-OpenBSD                          July 4, 2017                          OpenBSD
+OpenBSD                        December 15, 2018                 ESC-IGNORE(1)
--- /dev/null
+++ regress/roff/esc/O.out_lint
@@ -0,0 +1,5 @@
+mandoc: O.in:11:6: WARNING: invalid escape sequence: \O5
+mandoc: O.in:12:7: WARNING: invalid escape sequence: \O(52
+mandoc: O.in:13:7: UNSUPP: unsupported escape sequence: \O[5dummy]
+mandoc: O.in:14:6: WARNING: invalid escape sequence: \O6
+mandoc: O.in:15:6: UNSUPP: unsupported escape sequence: \O0
--- /dev/null
+++ regress/roff/esc/invalid.in
@@ -0,0 +1,28 @@
+.\" $OpenBSD$
+.TH ESC-INVALID 1 "December 15, 2018"
+.SH NAME
+esc-invalid \- invalid roff escape sequences
+.SH DESCRIPTION
+.nf
+plus: a\+b\[+]c
+semicolon: a\;b\[;]c
+less than: a\<b\[<]c
+equal to: a\=b\[=]c
+greater than: a\>b\[>]c
+at: a\@b\[@]c
+square bracket: a\]b
+curly braces: a\[{]b\[}]c
+digit: a\1b\[1]c
+G: a\Gb\[G]c
+I: a\Ib\[I]c
+i: a\ib\[i]c
+J: a\Jb\[J]c
+j: a\jb\[j]c
+K: a\Kb\[K]c
+P: a\Pb\[P]c
+Q: a\Qb\[Q]c
+q: a\qb\[q]c
+T: a\Tb\[T]c
+U: a\Ub\[U]c
+W: a\Wb\[W]c
+y: a\yb\[y]c
Index: ignore.out_lint
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/roff/esc/ignore.out_lint,v
retrieving revision 1.5
retrieving revision 1.6
diff -Lregress/roff/esc/ignore.out_lint -Lregress/roff/esc/ignore.out_lint -u -p -r1.5 -r1.6
--- regress/roff/esc/ignore.out_lint
+++ regress/roff/esc/ignore.out_lint
@@ -1 +1,4 @@
-mandoc: ignore.in:15:60: WARNING: invalid escape sequence: \s-
+mandoc: ignore.in:7:26: WARNING: invalid escape sequence: \[)]
+mandoc: ignore.in:8:12: WARNING: invalid escape sequence: \[,]
+mandoc: ignore.in:9:12: WARNING: invalid escape sequence: \[/]
+mandoc: ignore.in:13:60: WARNING: invalid escape sequence: \s-
--- /dev/null
+++ regress/roff/esc/O.in
@@ -0,0 +1,15 @@
+.\" $OpenBSD$
+.TH ESC-O 1 "December 15, 2018"
+.SH NAME
+esc-O \- escape sequence to suppress output
+.SH DESCRIPTION
+.nf
+O1: a\O1b
+O2: a\O2b
+O3: a\O3b
+O4: a\O4b
+O5: a\O5b
+O52: a\O(52b
+O5n: a\O[5dummy]b
+O6: a\O6b
+O0: a\O0\&\O1b
--- /dev/null
+++ regress/roff/esc/unsupp.in
@@ -0,0 +1,8 @@
+.\" $OpenBSD$
+.TH ESC-UNSUPP 1 "December 15, 2018"
+.SH NAME
+esc-unsupp \- unsupported escape sequences
+.SH DESCRIPTION
+.nf
+exclamation mark: a\!b\[!]c
+question mark: a\?\&\?b\[?]c
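
A note for readers of the *.out_lint files above: every expected message in this diff appears to follow the layout "mandoc: FILE:LINE:COL: LEVEL: MESSAGE", optionally followed by ": ARGUMENT" (the offending escape sequence). The following is a minimal sketch, not part of the commit and not taken from the mandoc sources, of how such lines could be split into fields when comparing test output. The layout is inferred only from the lines shown in this diff, and the names LINT_RE, LintMessage, parse_lint_line and count_by_level are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, NamedTuple, Optional

# Assumed layout, inferred only from the expected-output lines in this diff:
#   mandoc: FILE:LINE:COL: LEVEL: MESSAGE[: ARGUMENT]
LINT_RE = re.compile(
    r"^mandoc: (?P<file>[^:]+):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<level>[A-Z]+): (?P<text>.*)$"
)


class LintMessage(NamedTuple):
    """One diagnostic line (hypothetical helper type)."""
    file: str
    line: int
    col: int
    level: str               # e.g. WARNING or UNSUPP
    message: str             # e.g. "invalid escape sequence"
    argument: Optional[str]  # e.g. r"\[!]", or None if absent


def parse_lint_line(raw: str) -> Optional[LintMessage]:
    """Split one line into fields; return None if it does not match."""
    m = LINT_RE.match(raw.rstrip("\n"))
    if m is None:
        return None
    # Messages such as "tab in filled text" carry no argument;
    # partition() then leaves sep empty.
    message, sep, argument = m.group("text").partition(": ")
    return LintMessage(
        m.group("file"),
        int(m.group("line")),
        int(m.group("col")),
        m.group("level"),
        message,
        argument if sep else None,
    )


def count_by_level(lines: Iterable[str]) -> Counter:
    """Count parsed messages per level, skipping unparsable lines."""
    parsed = (parse_lint_line(raw) for raw in lines)
    return Counter(msg.level for msg in parsed if msg is not None)


if __name__ == "__main__":
    # The five lines of regress/roff/esc/unsupp.out_lint from this diff.
    sample = [
        r"mandoc: unsupp.in:7:23: WARNING: invalid escape sequence: \[!]",
        r"mandoc: unsupp.in:7:20: UNSUPP: unsupported escape sequence: \!",
        r"mandoc: unsupp.in:8:24: WARNING: invalid escape sequence: \[?]",
        r"mandoc: unsupp.in:8:21: UNSUPP: unsupported escape sequence: \?",
        r"mandoc: unsupp.in:8:17: UNSUPP: unsupported escape sequence: \?",
    ]
    for raw in sample:
        print(parse_lint_line(raw))
    print(count_by_level(sample))
```

Splitting on the first ": " inside the message text is a simplification: it holds for every message in this diff, but it would need revisiting if a message text itself contained a colon followed by a blank.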
