tech@mandoc.bsd.lv
* Fwd: mandoc rendering of perl pod C<< <foo> >>
@ 2010-12-26 23:10 Kristaps Dzonsons
  2010-12-27  3:04 ` Ingo Schwarze
  0 siblings, 1 reply; 4+ messages in thread
From: Kristaps Dzonsons @ 2010-12-26 23:10 UTC (permalink / raw)
  To: Andreas Vögele, tech

Hi,

Ingo, any insight into this?  Seems to be a matter of quoting; maybe 
you've some insight into where the issue lies...

-------- Original Message --------
Subject: mandoc rendering of perl pod C<< <foo> >>
Date: Sun, 26 Dec 2010 14:50:16 +0100
From: Andreas Vögele <ports@andreasvoegele.com>
To: Kristaps Dzonsons <kristaps@openbsd.org>

Dear Kristaps,

here's a minor difference between mandoc and nroff.  Mandoc outputs two
double quotes before and after the mail address when rendering the
following pod file with "pod2man | mandoc".  Nroff only outputs one
double quote.

--------8<--------8<--------8<--------8<--------
=head1 CONTRIBUTORS

=over 4

=item Andrew Moore - C<< <amoore@cpan.org> >>
--------8<--------8<--------8<--------8<--------

Kind regards,
Andreas
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv


* Re: Fwd: mandoc rendering of perl pod C<< <foo> >>
  2010-12-26 23:10 Fwd: mandoc rendering of perl pod C<< <foo> >> Kristaps Dzonsons
@ 2010-12-27  3:04 ` Ingo Schwarze
  2011-01-01 16:52   ` Ingo Schwarze
  0 siblings, 1 reply; 4+ messages in thread
From: Ingo Schwarze @ 2010-12-27  3:04 UTC (permalink / raw)
  To: tech; +Cc: Andreas Voegele

Hi Kristaps, hi Andreas,

> Ingo, any insight into this?  Seems to be a matter of quoting;
> maybe you've some insight into where the issue lies...

Sure, this is a known issue, unrelated to pod2man.

The problem is that man_args() simply doesn't implement
double-quote double escaping, so no wonder it doesn't work.

I have recently implemented double-quote double escaping for roff(7)
in roff_userdef().  The context is a bit different, and i'm not sure
how disrupting it would be to the mdocml library concept to have
a common utility function to be called both from libman and libroff;
in the long run, we should definitely do that, but for now, i have
just copied the code from roff.c to man_argv.c and adapted it to the
slightly different needs over there.

The consequence of the full rewrite is that it now does a bit more
and is two lines shorter.

It now handles Andreas' example correctly, but apart from that,
it is untested, so there will probably be other bugs.  I need to do
more testing later, this is just a heads-up that i'm working on it.

Apart from the quote escaping issue, there is another issue with
this code.  The blank escaping is wrong as well.  It is not
sufficient to check that the blank is not preceded by a backslash.
If there is an even number of backslashes before the blank,
the blank is not escaped; it is escaped only by an odd number of
backslashes.  But i won't continue with this tonight...

Oh, and i have a feeling libmdoc will probably suffer from the same
bugs as well.

Yours,
  Ingo


> -------- Original Message --------
> Subject: mandoc rendering of perl pod C<< <foo> >>
> Date: Sun, 26 Dec 2010 14:50:16 +0100
> From: Andreas Vögele <ports@andreasvoegele.com>
> To: Kristaps Dzonsons <kristaps@openbsd.org>
> 
> Dear Kristaps,
> 
> here's a minor difference between mandoc and nroff.  Mandoc outputs two
> double quotes before and after the mail address when rendering the
> following pod file with "pod2man | mandoc".  Nroff only outputs one
> double quote.
> 
> --------8<--------8<--------8<--------8<--------
> =head1 CONTRIBUTORS
> 
> =over 4
> 
> =item Andrew Moore - C<< <amoore@cpan.org> >>
> --------8<--------8<--------8<--------8<--------
> 
> Kind regards,
> Andreas


Index: man_argv.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/man_argv.c,v
retrieving revision 1.3
diff -u -p -r1.3 man_argv.c
--- man_argv.c	31 Jul 2010 23:42:04 -0000	1.3
+++ man_argv.c	27 Dec 2010 02:37:11 -0000
@@ -1,6 +1,6 @@
 /*	$Id: man_argv.c,v 1.3 2010/07/31 23:42:04 schwarze Exp $ */
 /*
- * Copyright (c) 2008, 2009 Kristaps Dzonsons <kristaps@bsd.lv>
+ * Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
  *
  * Permission to use, copy, modify, and distribute this software for any
  * purpose with or without fee is hereby granted, provided that the above
@@ -27,74 +27,57 @@
 int
 man_args(struct man *m, int line, int *pos, char *buf, char **v)
 {
+	char	 *cp;
+	int	  quoted, pairs, white;
 
 	assert(*pos);
-	assert(' ' != buf[*pos]);
+	*v = cp = buf + *pos;
+	assert(' ' != *cp);
 
-	if (0 == buf[*pos])
+	if ('\0' == *cp)
 		return(ARGS_EOLN);
 
-	*v = &buf[*pos];
-
-	/* 
-	 * Process a quoted literal.  A quote begins with a double-quote
-	 * and ends with a double-quote NOT preceded by a double-quote.
-	 * Whitespace is NOT involved in literal termination.
-	 */
-
-	if ('\"' == buf[*pos]) {
-		*v = &buf[++(*pos)];
-
-		for ( ; buf[*pos]; (*pos)++) {
-			if ('\"' != buf[*pos])
-				continue;
-			if ('\"' != buf[*pos + 1])
+	if ('"' == *cp) {
+		quoted = 1;
+		*v = ++cp;
+	} else
+		quoted = 0;
+
+	for (pairs = 0; '\0' != *cp; cp++) {
+		/* Unquoted arguments end at unescaped blanks. */
+		if (0 == quoted) {
+			if (' ' == cp[0] && '\\' != cp[-1])
 				break;
-			(*pos)++;
+			continue;
 		}
-
-		if (0 == buf[*pos]) {
-			if ( ! man_pmsg(m, line, *pos, MANDOCERR_BADQUOTE))
-				return(ARGS_ERROR);
-			return(ARGS_QWORD);
+		/* After pairs of quotes, move left. */
+		if (pairs)
+			cp[-pairs] = cp[0];
+		if ('"' != cp[0])
+			continue;
+		/* Solitary quotes end words, ... */
+		if ('"' != cp[1]) {
+			if (pairs)
+				cp[-pairs] = '\0';
+			break;
 		}
-
-		buf[(*pos)++] = 0;
-
-		if (0 == buf[*pos])
-			return(ARGS_QWORD);
-
-		while (' ' == buf[*pos])
-			(*pos)++;
-
-		if (0 == buf[*pos])
-			if ( ! man_pmsg(m, line, *pos, MANDOCERR_EOLNSPACE))
-				return(ARGS_ERROR);
-
-		return(ARGS_QWORD);
+		/* ... but pairs of quotes do not. */
+		pairs++;
+		cp++;
 	}
 
-	/* 
-	 * A non-quoted term progresses until either the end of line or
-	 * a non-escaped whitespace.
-	 */
-
-	for ( ; buf[*pos]; (*pos)++)
-		if (' ' == buf[*pos] && '\\' != buf[*pos - 1])
-			break;
-
-	if (0 == buf[*pos])
-		return(ARGS_WORD);
-
-	buf[(*pos)++] = 0;
-
-	while (' ' == buf[*pos])
-		(*pos)++;
-
-	if (0 == buf[*pos])
-		if ( ! man_pmsg(m, line, *pos, MANDOCERR_EOLNSPACE))
-			return(ARGS_ERROR);
-
-	return(ARGS_WORD);
+	if ('\0' == *cp) {
+		*pos = cp - buf;
+		if (quoted)
+			man_pmsg(m, line, *pos, MANDOCERR_BADQUOTE);
+	} else {
+		white = ' ' == cp[0] || ' ' == cp[1];
+		*(cp++) = '\0';
+		while (' ' == *cp)
+			cp++;
+		*pos = cp - buf;
+		if (white && '\0' == *cp)
+			man_pmsg(m, line, *pos, MANDOCERR_EOLNSPACE);
+	}
+	return(quoted ? ARGS_QWORD : ARGS_WORD);
 }
-

* Re: Fwd: mandoc rendering of perl pod C<< <foo> >>
  2010-12-27  3:04 ` Ingo Schwarze
@ 2011-01-01 16:52   ` Ingo Schwarze
  2011-01-02 21:42     ` Kristaps Dzonsons
  0 siblings, 1 reply; 4+ messages in thread
From: Ingo Schwarze @ 2011-01-01 16:52 UTC (permalink / raw)
  To: tech; +Cc: Andreas Voegele

Ingo Schwarze wrote on Mon, Dec 27, 2010 at 04:04:15AM +0100:
> Andreas Voegele wrote:

>> =head1 CONTRIBUTORS
>> 
>> =over 4
>> 
>> =item Andrew Moore - C<< <amoore@cpan.org> >>

> The problem is that man_args() simply doesn't implement
> double-quote double escaping, so no wonder it doesn't work.
> 
> I have recently implemented double-quote double escaping for roff(7)
> in roff_userdef().  The context is a bit different, and i'm not sure
> how disrupting it would be to the mdocml library concept to have
> a common utility function to be called both from libman and libroff;
> in the long run, we should definitely do that, but for now, i have
> just copied the code from roff.c to man_argv.c and adapted it to the
> slightly different needs over there.

Kristaps liked my plan to move this to libmandoc,
so here we go.

This patch (against OpenBSD) fixes the following bugs as well:

 * Escaped blanks (i.e. those preceded by an odd number of backslashes)
   were mishandled as argument separators in unquoted arguments to
   user-defined roff macros.

 * Escaped backslashes (i.e. pairs of backslashes) were not reduced
   to single backslashes, in unquoted and quoted arguments alike,
   for both user-defined roff macros and man macros.

OK?

Yours,
  Ingo

P.S.
> Oh, and i have a feeling libmdoc will probably suffer from the same
> bugs as well.

I don't attempt to fix mdoc(7) argument handling right now
because in mdoc_argv.c, basic argument handling (which probably
needs the same fixes applied here to roff(7) and man(7)) is
intricately entangled with lots of unrelated high-level mdoc(7)
functionality, like delimiter handling and column list phrase
handling, to name just two examples.


Index: libmandoc.h
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/libmandoc.h,v
retrieving revision 1.7
diff -u -p -r1.7 libmandoc.h
--- libmandoc.h	16 Jul 2010 00:34:33 -0000	1.7
+++ libmandoc.h	1 Jan 2011 16:22:24 -0000
@@ -24,6 +24,7 @@ void		*mandoc_calloc(size_t, size_t);
 char		*mandoc_strdup(const char *);
 void		*mandoc_malloc(size_t);
 void		*mandoc_realloc(void *, size_t);
+char		*mandoc_getarg(char **, mandocmsg, void *, int, int *);
 time_t		 mandoc_a2time(int, const char *);
 #define		 MTIME_CANONICAL	(1 << 0)
 #define		 MTIME_REDUCED		(1 << 1)
Index: man_argv.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/man_argv.c,v
retrieving revision 1.3
diff -u -p -r1.3 man_argv.c
--- man_argv.c	31 Jul 2010 23:42:04 -0000	1.3
+++ man_argv.c	1 Jan 2011 16:22:24 -0000
@@ -1,6 +1,6 @@
 /*	$Id: man_argv.c,v 1.3 2010/07/31 23:42:04 schwarze Exp $ */
 /*
- * Copyright (c) 2008, 2009 Kristaps Dzonsons <kristaps@bsd.lv>
+ * Copyright (c) 2011 Ingo Schwarze <schwarze@openbsd.org>
  *
  * Permission to use, copy, modify, and distribute this software for any
  * purpose with or without fee is hereby granted, provided that the above
@@ -17,84 +17,24 @@
 #include <sys/types.h>
 
 #include <assert.h>
-#include <stdlib.h>
-#include <string.h>
 
 #include "mandoc.h"
 #include "libman.h"
+#include "libmandoc.h"
 
 
 int
 man_args(struct man *m, int line, int *pos, char *buf, char **v)
 {
+	char	 *start;
 
 	assert(*pos);
-	assert(' ' != buf[*pos]);
+	*v = start = buf + *pos;
+	assert(' ' != *start);
 
-	if (0 == buf[*pos])
+	if ('\0' == *start)
 		return(ARGS_EOLN);
 
-	*v = &buf[*pos];
-
-	/* 
-	 * Process a quoted literal.  A quote begins with a double-quote
-	 * and ends with a double-quote NOT preceded by a double-quote.
-	 * Whitespace is NOT involved in literal termination.
-	 */
-
-	if ('\"' == buf[*pos]) {
-		*v = &buf[++(*pos)];
-
-		for ( ; buf[*pos]; (*pos)++) {
-			if ('\"' != buf[*pos])
-				continue;
-			if ('\"' != buf[*pos + 1])
-				break;
-			(*pos)++;
-		}
-
-		if (0 == buf[*pos]) {
-			if ( ! man_pmsg(m, line, *pos, MANDOCERR_BADQUOTE))
-				return(ARGS_ERROR);
-			return(ARGS_QWORD);
-		}
-
-		buf[(*pos)++] = 0;
-
-		if (0 == buf[*pos])
-			return(ARGS_QWORD);
-
-		while (' ' == buf[*pos])
-			(*pos)++;
-
-		if (0 == buf[*pos])
-			if ( ! man_pmsg(m, line, *pos, MANDOCERR_EOLNSPACE))
-				return(ARGS_ERROR);
-
-		return(ARGS_QWORD);
-	}
-
-	/* 
-	 * A non-quoted term progresses until either the end of line or
-	 * a non-escaped whitespace.
-	 */
-
-	for ( ; buf[*pos]; (*pos)++)
-		if (' ' == buf[*pos] && '\\' != buf[*pos - 1])
-			break;
-
-	if (0 == buf[*pos])
-		return(ARGS_WORD);
-
-	buf[(*pos)++] = 0;
-
-	while (' ' == buf[*pos])
-		(*pos)++;
-
-	if (0 == buf[*pos])
-		if ( ! man_pmsg(m, line, *pos, MANDOCERR_EOLNSPACE))
-			return(ARGS_ERROR);
-
-	return(ARGS_WORD);
+	*v = mandoc_getarg(v, m->msg, m->data, line, pos);
+	return('"' == *start ? ARGS_QWORD : ARGS_WORD);
 }
-
Index: mandoc.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/mandoc.c,v
retrieving revision 1.20
diff -u -p -r1.20 mandoc.c
--- mandoc.c	27 Sep 2010 21:25:28 -0000	1.20
+++ mandoc.c	1 Jan 2011 16:22:24 -0000
@@ -1,14 +1,15 @@
 /*	$Id: mandoc.c,v 1.20 2010/09/27 21:25:28 schwarze Exp $ */
 /*
  * Copyright (c) 2008, 2009, 2010 Kristaps Dzonsons <kristaps@bsd.lv>
+ * Copyright (c) 2011 Ingo Schwarze <schwarze@openbsd.org>
  *
  * Permission to use, copy, modify, and distribute this software for any
  * purpose with or without fee is hereby granted, provided that the above
  * copyright notice and this permission notice appear in all copies.
  *
- * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHORS DISCLAIM ALL WARRANTIES
  * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
- * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR
  * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
  * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
@@ -277,6 +278,83 @@ mandoc_strdup(const char *ptr)
 	}
 
 	return(p);
+}
+
+/*
+ * Parse a quoted or unquoted roff-style request or macro argument.
+ * Return a pointer to the parsed argument, which is either the original
+ * pointer or advanced by one byte in case the argument is quoted.
+ * Null-terminate the argument in place.
+ * Collapse pairs of quotes inside quoted arguments.
+ * Advance the argument pointer to the next argument,
+ * or to the null byte terminating the argument line.
+ */
+char *
+mandoc_getarg(char **cpp, mandocmsg msg, void *data, int ln, int *pos)
+{
+	char	 *start, *cp;
+	int	  quoted, pairs, white;
+
+	/* Quoting can only start with a new word. */
+	start = *cpp;
+	if ('"' == *start) {
+		quoted = 1;
+		start++;
+	} else
+		quoted = 0;
+
+	pairs = 0;
+	white = 0;
+	for (cp = start; '\0' != *cp; cp++) {
+		/* Move left after quoted quotes and escaped backslashes. */
+		if (pairs)
+			cp[-pairs] = cp[0];
+		if ('\\' == cp[0]) {
+			if ('\\' == cp[1]) {
+				/* Poor man's copy mode. */
+				pairs++;
+				cp++;
+			} else if (0 == quoted && ' ' == cp[1])
+				/* Skip escaped blanks. */
+				cp++;
+		} else if (0 == quoted) {
+			if (' ' == cp[0]) {
+				/* Unescaped blanks end unquoted args. */
+				white = 1;
+				break;
+			}
+		} else if ('"' == cp[0]) {
+			if ('"' == cp[1]) {
+				/* Quoted quotes collapse. */
+				pairs++;
+				cp++;
+			} else {
+				/* Unquoted quotes end quoted args. */
+				quoted = 2;
+				break;
+			}
+		}
+	}
+
+	/* Quoted argument without a closing quote. */
+	if (1 == quoted && msg)
+		(*msg)(MANDOCERR_BADQUOTE, data, ln, *pos, NULL);
+
+	/* Null-terminate this argument and move to the next one. */
+	if (pairs)
+		cp[-pairs] = '\0';
+	if ('\0' != *cp) {
+		*cp++ = '\0';
+		while (' ' == *cp)
+			cp++;
+	}
+	*pos += (cp - start) + (quoted ? 1 : 0);
+	*cpp = cp;
+
+	if ('\0' == *cp && msg && (white || ' ' == cp[-1]))
+		(*msg)(MANDOCERR_EOLNSPACE, data, ln, *pos, NULL);
+
+	return(start);
 }
 
 
Index: roff.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/roff.c,v
retrieving revision 1.24
diff -u -p -r1.24 roff.c
--- roff.c	21 Dec 2010 01:30:58 -0000	1.24
+++ roff.c	1 Jan 2011 16:22:25 -0000
@@ -1,7 +1,7 @@
 /*	$Id: roff.c,v 1.24 2010/12/21 01:30:58 schwarze Exp $ */
 /*
  * Copyright (c) 2010 Kristaps Dzonsons <kristaps@bsd.lv>
- * Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
+ * Copyright (c) 2010, 2011 Ingo Schwarze <schwarze@openbsd.org>
  *
  * Permission to use, copy, modify, and distribute this software for any
  * purpose with or without fee is hereby granted, provided that the above
@@ -1124,53 +1124,16 @@ roff_userdef(ROFF_ARGS)
 {
 	const char	 *arg[9];
 	char		 *cp, *n1, *n2;
-	int		  i, quoted, pairs;
+	int		  i;
 
 	/*
 	 * Collect pointers to macro argument strings
 	 * and null-terminate them.
 	 */
 	cp = *bufp + pos;
-	for (i = 0; i < 9; i++) {
-		/* Quoting can only start with a new word. */
-		if ('"' == *cp) {
-			quoted = 1;
-			cp++;
-		} else
-			quoted = 0;
-		arg[i] = cp;
-		for (pairs = 0; '\0' != *cp; cp++) {
-			/* Unquoted arguments end at blanks. */
-			if (0 == quoted) {
-				if (' ' == *cp)
-					break;
-				continue;
-			}
-			/* After pairs of quotes, move left. */
-			if (pairs)
-				cp[-pairs] = cp[0];
-			/* Pairs of quotes do not end words, ... */
-			if ('"' == cp[0] && '"' == cp[1]) {
-				pairs++;
-				cp++;
-				continue;
-			}
-			/* ... but solitary quotes do. */
-			if ('"' != *cp)
-				continue;
-			if (pairs)
-				cp[-pairs] = '\0';
-			*cp = ' ';
-			break;
-		}
-		/* Last argument; the remaining ones are empty strings. */
-		if ('\0' == *cp)
-			continue;
-		/* Null-terminate argument and move to the next one. */
-		*cp++ = '\0';
-		while (' ' == *cp)
-			cp++;
-	}
+	for (i = 0; i < 9; i++)
+		arg[i] = '\0' == *cp ? NULL :
+		    mandoc_getarg(&cp, r->msg, r->data, ln, &pos);
 
 	/*
 	 * Expand macro arguments.


 ----- 8< ----- schnipp ----- >8 ----- 8< ----- schnapp ----- >8 -----

REGRESSION TESTS:

Index: Makefile
===================================================================
RCS file: Makefile
diff -N Makefile
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ Makefile	1 Jan 2011 16:49:06 -0000
@@ -0,0 +1,6 @@
+# $OpenBSD: Makefile,v 1.1 2010/04/25 17:35:31 schwarze Exp $
+
+REGRESS_TARGETS=roff man
+GROFF_TARGETS=roff man
+
+.include <bsd.regress.mk>
Index: man.in
===================================================================
RCS file: man.in
diff -N man.in
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ man.in	1 Jan 2011 16:49:06 -0000
@@ -0,0 +1,113 @@
+.TH ARGS-MAN 1 "January 1, 2011"
+.SH NAME
+args-man - arguments to man macros
+.SH DESCRIPTION
+standard unquoted:
+.IB one two
+text
+.br
+escaped blanks:
+.IB one\ one two\ two
+text
+.br
+escaped 'e' character:
+.IB one\eone two
+text
+.br
+.\"escaped backslash before blank:
+.\"IB one\\ two
+.\"text
+.\"br
+escaped backslash before 'e' character:
+.IB one\\e two
+text
+.br
+double inter-argument space:
+.IB one  two
+text
+.br
+triple inter-argument space:
+.IB one   two
+text
+.br
+single eol blank:
+.IB one two 
+text
+.br
+double eol blank:
+.IB one two  
+text
+.br
+triple eol blank:
+.IB one two   
+text
+.br
+standard quoted:
+.IB "one" "two"
+text
+.br
+quoted quotes:
+.IB "one""one" """two"""
+text
+.br
+quoted whitespace:
+.IB "one one" "two two"
+text
+.br
+escaped 'e' characters:
+.IB "one \e one" "\e"
+text
+.br
+escaped backslash before blank:
+.IB "one\\ one" "\\ "
+text
+.br
+escaped backslash before 'e' character:
+.IB "one\\eone" "\\e"
+text
+.br
+double inter-argument space:
+.IB "one one"  "two two"
+text
+.br
+triple inter-argument space:
+.IB "one one"   "two two"
+text
+.br
+missing inter-argument space:
+.IB "one one"two\ two
+text
+.br
+single eol blank:
+.IB "one one" "two two" 
+text
+.br
+double eol blank:
+.IB "one one" "two two"  
+text
+.br
+triple eol blank:
+.IB "one one" "two two"   
+text
+.br
+.\"trailing blanks in arguments:
+.\"IB "one " "two "
+.\"text
+.\"br
+unterminated quotes:
+.IB "one
+.IB one "two
+text
+.br
+.\"single trailing blank in unterminated quotes:
+.\"IB "one 
+.\"IB one "two 
+.\"text
+.\"br
+.\"double trailing blank in unterminated quotes:
+.\"IB "one  
+.\"IB one "two  
+.\"text
+.\"br
+backslash at eol:
+.IB one two\
Index: man.out_ascii
===================================================================
RCS file: man.out_ascii
diff -N man.out_ascii
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ man.out_ascii	1 Jan 2011 16:49:06 -0000
@@ -0,0 +1,34 @@
+ARGS-MAN(1)                                                        ARGS-MAN(1)
+
+
+
+N\bNA\bAM\bME\bE
+       args-man - arguments to man macros
+
+D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
+       standard unquoted: _\bo_\bn_\bet\btw\bwo\bo text
+       escaped blanks: _\bo_\bn_\be _\bo_\bn_\bet\btw\bwo\bo t\btw\bwo\bo text
+       escaped 'e' character: _\bo_\bn_\be_\b\_\bo_\bn_\bet\btw\bwo\bo text
+       escaped backslash before 'e' character: _\bo_\bn_\be_\b\t\btw\bwo\bo text
+       double inter-argument space: _\bo_\bn_\bet\btw\bwo\bo text
+       triple inter-argument space: _\bo_\bn_\bet\btw\bwo\bo text
+       single eol blank: _\bo_\bn_\bet\btw\bwo\bo text
+       double eol blank: _\bo_\bn_\bet\btw\bwo\bo text
+       triple eol blank: _\bo_\bn_\bet\btw\bwo\bo text
+       standard quoted: _\bo_\bn_\bet\btw\bwo\bo text
+       quoted quotes: _\bo_\bn_\be_\b"_\bo_\bn_\be"\b"t\btw\bwo\bo"\b" text
+       quoted whitespace: _\bo_\bn_\be _\bo_\bn_\bet\btw\bwo\bo t\btw\bwo\bo text
+       escaped 'e' characters: _\bo_\bn_\be _\b\ _\bo_\bn_\be\\b\ text
+       escaped backslash before blank: _\bo_\bn_\be _\bo_\bn_\be  text
+       escaped backslash before 'e' character: _\bo_\bn_\be_\b\_\bo_\bn_\be\\b\ text
+       double inter-argument space: _\bo_\bn_\be _\bo_\bn_\bet\btw\bwo\bo t\btw\bwo\bo text
+       triple inter-argument space: _\bo_\bn_\be _\bo_\bn_\bet\btw\bwo\bo t\btw\bwo\bo text
+       missing inter-argument space: _\bo_\bn_\be _\bo_\bn_\bet\btw\bwo\bo t\btw\bwo\bo text
+       single eol blank: _\bo_\bn_\be _\bo_\bn_\bet\btw\bwo\bo t\btw\bwo\bo text
+       double eol blank: _\bo_\bn_\be _\bo_\bn_\bet\btw\bwo\bo t\btw\bwo\bo text
+       triple eol blank: _\bo_\bn_\be _\bo_\bn_\bet\btw\bwo\bo t\btw\bwo\bo text
+       unterminated quotes: _\bo_\bn_\be _\bo_\bn_\bet\btw\bwo\bo text
+       backslash at eol: _\bo_\bn_\bet\btw\bwo\bo
+
+
+
Index: roff.in
===================================================================
RCS file: roff.in
diff -N roff.in
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ roff.in	1 Jan 2011 16:49:06 -0000
@@ -0,0 +1,65 @@
+.TH ARGS-ROFF 1 "January 1, 2011"
+.SH NAME
+args-roff - arguments to roff macros
+.SH DESCRIPTION
+.de test
+(\\$1) (\\$2)
+.br
+..
+standard unquoted:
+.test one two
+escaped blanks:
+.test one\ one two\ two
+escaped 'e' character:
+.test one\eone two
+escaped backslash before blank:
+.test one\\ two
+escaped backslash before 'e' character:
+.test one\\e two
+double inter-argument space:
+.test one  two
+triple inter-argument space:
+.test one   two
+single eol blank:
+.test one two 
+double eol blank:
+.test one two  
+triple eol blank:
+.test one two   
+standard quoted:
+.test "one" "two"
+quoted quotes:
+.test "one""one" """two"""
+quoted whitespace:
+.test "one one" "two two"
+escaped 'e' characters:
+.test "one \e one" "\e"
+escaped backslash before blank:
+.test "one\\ one" "\\ "
+escaped backslash before 'e' character:
+.test "one\\eone" "\\e"
+double inter-argument space:
+.test "one one"  "two two"
+triple inter-argument space:
+.test "one one"   "two two"
+missing inter-argument space:
+.test "one one"two\ two
+single eol blank:
+.test "one one" "two two" 
+double eol blank:
+.test "one one" "two two"  
+triple eol blank:
+.test "one one" "two two"   
+trailing blanks in arguments:
+.test "one " "two "
+unterminated quotes:
+.\"test "one
+.test one "two
+single trailing blank in unterminated quotes:
+.\"test "one 
+.test one "two 
+double trailing blank in unterminated quotes:
+.\"test "one  
+.test one "two  
+backslash at eol:
+.test one two\
Index: roff.out_ascii
===================================================================
RCS file: roff.out_ascii
diff -N roff.out_ascii
--- /dev/null	1 Jan 1970 00:00:00 -0000
+++ roff.out_ascii	1 Jan 2011 16:49:06 -0000
@@ -0,0 +1,38 @@
+ARGS-ROFF(1)                                                      ARGS-ROFF(1)
+
+
+
+N\bNA\bAM\bME\bE
+       args-roff - arguments to roff macros
+
+D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
+       standard unquoted: (one) (two)
+       escaped blanks: (one one) (two two)
+       escaped 'e' character: (one\one) (two)
+       escaped backslash before blank: (one) (two)
+       escaped backslash before 'e' character: (one\) (two)
+       double inter-argument space: (one) (two)
+       triple inter-argument space: (one) (two)
+       single eol blank: (one) (two)
+       double eol blank: (one) (two)
+       triple eol blank: (one) (two)
+       standard quoted: (one) (two)
+       quoted quotes: (one"one) ("two")
+       quoted whitespace: (one one) (two two)
+       escaped 'e' characters: (one \ one) (\)
+       escaped backslash before blank: (one one) ( )
+       escaped backslash before 'e' character: (one\one) (\)
+       double inter-argument space: (one one) (two two)
+       triple inter-argument space: (one one) (two two)
+       missing inter-argument space: (one one) (two two)
+       single eol blank: (one one) (two two)
+       double eol blank: (one one) (two two)
+       triple eol blank: (one one) (two two)
+       trailing blanks in arguments: (one ) (two )
+       unterminated quotes: (one) (two)
+       single trailing blank in unterminated quotes: (one) (two )
+       double trailing blank in unterminated quotes: (one) (two  )
+       backslash at eol: (one) (two)
+
+
+

* Re: Fwd: mandoc rendering of perl pod C<< <foo> >>
  2011-01-01 16:52   ` Ingo Schwarze
@ 2011-01-02 21:42     ` Kristaps Dzonsons
  0 siblings, 0 replies; 4+ messages in thread
From: Kristaps Dzonsons @ 2011-01-02 21:42 UTC (permalink / raw)
  To: tech

>> Andreas Voegele wrote:
>
>>> =head1 CONTRIBUTORS
>>>
>>> =over 4
>>>
>>> =item Andrew Moore - C<<  <amoore@cpan.org>  >>
>
>> The problem is that man_args() simply doesn't implement
>> double-quote double escaping, so no wonder it doesn't work.
>>
>> I have recently implemented double-quote double escaping for roff(7)
>> in roff_userdef().  The context is a bit different, and i'm not sure
>> how disrupting it would be to the mdocml library concept to have
>> a common utility function to be called both from libman and libroff;
>> in the long run, we should definitely do that, but for now, i have
>> just copied the code from roff.c to man_argv.c and adapted it to the
>> slightly different needs over there.
>
> Kristaps liked my plan to move this to libmandoc,
> so here we go.
>
> This patch (against OpenBSD) fixes the following bugs as well:
>
>   * Escaped blanks (i.e. those preceded by an odd number of backslashes)
>     were mishandled as argument separators in unquoted arguments to
>     user-defined roff macros.
>
>   * Escaped backslashes (i.e. pairs of backslashes) were not reduced
>     to single backslashes, in unquoted and quoted arguments alike,
>     for both user-defined roff macros and man macros.
>
> OK?

Ok by me.  Can you please add some documentation bits, if they don't 
already exist, in {mdoc,man}.7 that describe the quoting?

> P.S.
>> Oh, and i have a feeling libmdoc will probably suffer from the same
>> bugs as well.

Meh...

> I don't attempt to fix mdoc(7) argument handling right now
> because in mdoc_argv.c, basic argument handling (which probably
> needs the same fixes applied here to roff(7) and man(7)) is
> intricately entangled with lots of unrelated high-level mdoc(7)
> functionality, like delimiter handling and column list phrase
> handling, to name just two examples.

Ok, let's keep this for the future.

Thanks,

Kristaps