tech@mandoc.bsd.lv
 help / color / mirror / Atom feed
* Non-ASCII check fails in main.c on OpenBSD
@ 2010-08-08 12:47 Ingo Schwarze
  2010-08-08 14:27 ` Kristaps Dzonsons
  2010-08-08 15:39 ` Joerg Sonnenberger
  0 siblings, 2 replies; 5+ messages in thread
From: Ingo Schwarze @ 2010-08-08 12:47 UTC (permalink / raw)
  To: tech; +Cc: jmc

Hi,

i just noticed that on OpenBSD, the change main.c bsd.lv-rev. 1.99
pathetically fails.  Non-ASCII characters are just passed through.

Now maybe that's a bug in OpenBSD isgraph(3), maybe it is not and
isgraph(3) behaves like it does on purpose - actually, i don't really
care that much either way, and i doubt that it will be easy to get
OpenBSD isgraph(3) changed.

For maximum portability, i think when we put a comment

  /* 
   * Warn about bogus characters.  If you're using
   * non-ASCII encoding, you're screwing your
   * readers.  Since I'd rather this not happen,
   * I'll be helpful and drop these characters so
   * we don't display gibberish.  Note to manual
   * writers: use special characters.
   */

which i fully agree with, we should also be explicit in the code,
or we risk that on some systems our code behaves in another way than
the comments make you think and than we intend, which is always bad.

So, as we want to drop non-ASCII, lets explicitely use isascii(3).

OK to commit to bsd.lv?

Of course, this will not be merged to OpenBSD until after unlock.

Yours,
  Ingo



Index: main.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/main.c,v
retrieving revision 1.43
diff -u -p -r1.43 main.c
--- main.c	25 Jul 2010 18:05:54 -0000	1.43
+++ main.c	8 Aug 2010 12:15:17 -0000
@@ -451,6 +451,7 @@ fdesc(struct curparse *curp)
 	struct buf	 ln, blk;
 	int		 i, pos, lnn, lnn_start, with_mmap, of;
 	enum rofferr	 re;
+	unsigned char	 c;
 	struct man	*man;
 	struct mdoc	*mdoc;
 	struct roff	*roff;
@@ -493,8 +494,8 @@ fdesc(struct curparse *curp)
 			 * writers: use special characters.
 			 */
 
-			if ( ! isgraph((u_char)blk.buf[i]) &&
-					! isblank((u_char)blk.buf[i])) {
+			c = (unsigned char) blk.buf[i];
+			if ( ! (isascii(c) && (isgraph(c) || isblank(c)))) {
 				if ( ! mmsg(MANDOCERR_BADCHAR, curp, 
 						lnn_start, pos, 
 						"ignoring byte"))
Index: man.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/man.c,v
retrieving revision 1.38
diff -u -p -r1.38 man.c
--- man.c	25 Jul 2010 18:05:54 -0000	1.38
+++ man.c	8 Aug 2010 12:15:17 -0000
@@ -477,23 +477,14 @@ man_pmacro(struct man *m, int ln, char *
 
 	ppos = i;
 
-	/* Copy the first word into a nil-terminated buffer. */
-
-	for (j = 0; j < 4; j++, i++) {
-		if ('\0' == (mac[j] = buf[i]))
-			break;
-		else if (' ' == buf[i])
-			break;
-
-		/* Check for invalid characters. */
-
-		if (isgraph((u_char)buf[i]))
-			continue;
-		if ( ! man_pmsg(m, ln, i, MANDOCERR_BADCHAR))
-			return(0);
-		i--;
-	}
+	/*
+	 * Copy the first word into a nil-terminated buffer.
+	 * Stop copying when a tab, space, or eoln is encountered.
+	 */
 
+	j = 0;
+	while (j < 4 && '\0' != buf[i] && ' ' != buf[i] && '\t' != buf[i])
+		mac[j++] = buf[i++];
 	mac[j] = '\0';
 
 	if (j == 4 || j < 1) {
Index: mdoc.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mdoc.c,v
retrieving revision 1.63
diff -u -p -r1.63 mdoc.c
--- mdoc.c	7 Aug 2010 18:06:45 -0000	1.63
+++ mdoc.c	8 Aug 2010 12:15:17 -0000
@@ -773,26 +773,13 @@ mdoc_pmacro(struct mdoc *m, int ln, char
 	sv = i;
 
 	/* 
-	 * Copy the first word into a nil-terminated buffer.  Stop
-	 * copying when a tab, space, or eoln is encountered.
+	 * Copy the first word into a nil-terminated buffer.
+	 * Stop copying when a tab, space, or eoln is encountered.
 	 */
 
-	for (j = 0; j < 4; j++, i++) {
-		if ('\0' == (mac[j] = buf[i]))
-			break;
-		else if (' ' == buf[i] || '\t' == buf[i])
-			break;
-
-		/* Check for invalid characters. */
-		/* TODO: remove me, already done in main.c. */
-
-		if (isgraph((u_char)buf[i]))
-			continue;
-		if ( ! mdoc_pmsg(m, ln, i, MANDOCERR_BADCHAR))
-			return(0);
-		i--;
-	}
-
+	j = 0;
+	while (j < 4 && '\0' != buf[i] && ' ' != buf[i] && '\t' != buf[i])
+		mac[j++] = buf[i++];
 	mac[j] = '\0';
 
 	if (j == 4 || j < 2) {
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2010-08-11 16:34 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-08-08 12:47 Non-ASCII check fails in main.c on OpenBSD Ingo Schwarze
2010-08-08 14:27 ` Kristaps Dzonsons
2010-08-08 15:39 ` Joerg Sonnenberger
2010-08-08 18:52   ` Ingo Schwarze
2010-08-11 16:34     ` Ulrich Spörlein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).