From: Ingo Schwarze <schwarze@usta.de>
To: Mike Small <smallm@panix.com>
Cc: discuss@mdocml.bsd.lv
Subject: Re: mandoc -Tps aborts on <sp>\^h\n\n
Date: Thu, 30 May 2013 06:13:38 +0200
Message-ID: <20130530041338.GI736@iris.usta.de>
In-Reply-To: <li6a9ndp6xc.fsf@panix5.panix.com>

Hi Mike,

Mike Small wrote on Wed, May 29, 2013 at 09:55:59PM -0400:

> Not sure if this is covered under "clean up escape sequence handling"
> or another TODO I missed,

No, it isn't; you have found a previously unknown issue.

> but I see an abort in mandoc 1.12.1 from
> OpenBSD current from around May 3rd if I do the following:
> 
> $ mandoc -Tps `man -w roff`
> ...
> 509.608 441.221 moveto
> (ASCII) show
> 490.358 425.832 moveto
> (characters) show
> 490.358 410.443 moveto
> (with) show
> assertion "8 != c" failed:
>   file "/usr/src/usr.bin/mandoc/term_ps.c",
>   line 997, function "ps_letter"
> Abort trap (core dumped) 

Indeed, easily reproducible.

> $ man -w roff
> /usr/local/man/cat7/roff.0
> /usr/share/man/man7/roff.7
> 
> So you also get it like this...
> 
> $ mandoc -Tps /usr/local/man/cat7/roff.0

That doesn't make a lot of sense - you are handing an already formatted
document to the parsers again.  Anyway, mandoc(1) is not supposed to
crash even when given garbage input.

[...]
> A minimal test file causing the same assertion...
> 
> $ od -cb small_test_file.1  
> 0000000        \  \b   x  \n  \n                                        
>          040 134 010 170 012 012                                        
> 0000006

Indeed, that triggers the assertion as well.
It still does when you put a proper mdoc(7) or man(7) header
in front of the exploit code.
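
For anyone who wants to recreate the six-byte test file from the od
dump quoted above, here is a minimal sketch in C.  The helper name
write_test_file() is made up for illustration and is not part of
mandoc; the byte values are copied from the octal row of the dump.

```c
#include <stdio.h>

/*
 * Bytes from the od dump: space, backslash, backspace, 'x',
 * newline, newline.
 */
static const unsigned char test_bytes[] = {
	040, 0134, 010, 0170, 012, 012
};

/*
 * Hypothetical helper, not part of mandoc:
 * write the test bytes to path, return 0 on success, -1 on failure.
 */
static int
write_test_file(const char *path)
{
	FILE	*fp;
	size_t	 n;

	if ((fp = fopen(path, "wb")) == NULL)
		return -1;
	n = fwrite(test_bytes, 1, sizeof(test_bytes), fp);
	if (fclose(fp) != 0 || n != sizeof(test_bytes))
		return -1;
	return 0;
}
```

Calling write_test_file("small_test_file.1") and then running
"mandoc -Tps small_test_file.1" should correspond to the report above;
whether the assertion fires depends on the mandoc version in use.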

I have just committed a proper fix to both OpenBSD and bsd.lv,
see below for the patch.

Thanks for the perfect report!
  Ingo

 ----- 8< ----- schnipp ----- >8 ----- 8< ----- schnapp ----- >8 -----

Log Message:
-----------
Reject non-printable characters found in the input stream even when
preceded by a backslash; otherwise, the escape sequence would later 
be identified as invalid and the non-printable character would be
passed through to the output backends, sometimes triggering assertions.

Reported by Mike Small <smallm at panix dot com> on the mdocml discuss list.
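
The rule described in the log message can be sketched outside of
mandoc roughly as follows.  This is a simplified illustration under
stated assumptions, not the code in read.c: the names is_acceptable()
and sanitize_line() are invented here, the input is a single line
without its trailing newline, the caller provides a large enough
output buffer, and "c <= 0x7f" stands in for the isascii() call used
in the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper: nonzero if c is an ASCII character
 * that is either graphic or blank (space or tab).
 */
static int
is_acceptable(unsigned char c)
{
	return c <= 0x7f && (isgraph(c) || isblank(c));
}

/*
 * Hypothetical helper, not part of mandoc: copy len bytes of one
 * input line from src to dst, replacing each unacceptable
 * character, escaped or not, with a single '?'.
 * dst needs room for len + 1 bytes.
 * Returns the number of bytes written, not counting the NUL.
 */
static size_t
sanitize_line(const char *src, size_t len, char *dst)
{
	size_t	 i, pos;

	assert(src != NULL && dst != NULL);
	i = pos = 0;
	while (i < len) {
		/* Unescaped bogus character. */
		if ( ! is_acceptable((unsigned char)src[i])) {
			dst[pos++] = '?';
			i++;
			continue;
		}
		/* Plain character or trailing backslash. */
		if ('\\' != src[i] || i + 1 == len) {
			dst[pos++] = src[i++];
			continue;
		}
		/* Escaped bogus character: drop both bytes. */
		if ( ! is_acceptable((unsigned char)src[i + 1])) {
			dst[pos++] = '?';
			i += 2;
			continue;
		}
		/* Some other escape sequence, copy both bytes. */
		dst[pos++] = src[i++];
		dst[pos++] = src[i++];
	}
	dst[pos] = '\0';
	return pos;
}
```

With the four bytes space, backslash, backspace, 'x' from the report,
this sketch produces " ?x", so the backspace never reaches an output
backend.  An ordinary escape such as \fB is copied unchanged.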

Modified Files:
--------------
    mdocml:
        read.c

Revision Data
-------------
Index: read.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/read.c,v
retrieving revision 1.34
retrieving revision 1.35
diff -Lread.c -Lread.c -u -p -r1.34 -r1.35
--- read.c
+++ read.c
@@ -1,7 +1,7 @@
 /*	$Id$ */
 /*
  * Copyright (c) 2008, 2009, 2010, 2011 Kristaps Dzonsons <kristaps@bsd.lv>
- * Copyright (c) 2010, 2011, 2012 Ingo Schwarze <schwarze@openbsd.org>
+ * Copyright (c) 2010, 2011, 2012, 2013 Ingo Schwarze <schwarze@openbsd.org>
  *
  * Permission to use, copy, modify, and distribute this software for any
  * purpose with or without fee is hereby granted, provided that the above
@@ -328,6 +328,15 @@ mparse_buf_r(struct mparse *curp, struct
 				break;
 			}
 
+			/*
+			 * Make sure we have space for at least
+			 * one backslash and one other character
+			 * and the trailing NUL byte.
+			 */
+
+			if (pos + 2 >= (int)ln.sz)
+				resize_buf(&ln, 256);
+
 			/* 
 			 * Warn about bogus characters.  If you're using
 			 * non-ASCII encoding, you're screwing your
@@ -344,8 +353,6 @@ mparse_buf_r(struct mparse *curp, struct
 				mandoc_msg(MANDOCERR_BADCHAR, curp,
 						curp->line, pos, NULL);
 				i++;
-				if (pos >= (int)ln.sz)
-					resize_buf(&ln, 256);
 				ln.buf[pos++] = '?';
 				continue;
 			}
@@ -353,8 +360,6 @@ mparse_buf_r(struct mparse *curp, struct
 			/* Trailing backslash = a plain char. */
 
 			if ('\\' != blk.buf[i] || i + 1 == (int)blk.sz) {
-				if (pos >= (int)ln.sz)
-					resize_buf(&ln, 256);
 				ln.buf[pos++] = blk.buf[i++];
 				continue;
 			}
@@ -396,10 +401,20 @@ mparse_buf_r(struct mparse *curp, struct
 				break;
 			}
 
-			/* Some other escape sequence, copy & cont. */
+			/* Catch escaped bogus characters. */
 
-			if (pos + 1 >= (int)ln.sz)
-				resize_buf(&ln, 256);
+			c = (unsigned char) blk.buf[i+1];
+
+			if ( ! (isascii(c) && 
+					(isgraph(c) || isblank(c)))) {
+				mandoc_msg(MANDOCERR_BADCHAR, curp,
+						curp->line, pos, NULL);
+				i += 2;
+				ln.buf[pos++] = '?';
+				continue;
+			}
+
+			/* Some other escape sequence, copy & cont. */
 
 			ln.buf[pos++] = blk.buf[i++];
 			ln.buf[pos++] = blk.buf[i++];
--
 To unsubscribe send an email to source+unsubscribe@mdocml.bsd.lv

--
 To unsubscribe send an email to discuss+unsubscribe@mdocml.bsd.lv
