From: Ingo Schwarze <schwarze@usta.de>
To: tech@mdocml.bsd.lv
Cc: Andreas Voegele <ports@andreasvoegele.com>
Subject: Re: Fwd: mandoc rendering of perl pod C<< <foo> >>
Date: Mon, 27 Dec 2010 04:04:15 +0100
Message-ID: <20101227030415.GP23914@iris.usta.de>
In-Reply-To: <4D17CB61.3080106@bsd.lv>

Hi Kristaps, hi Andreas,

> Ingo, any insight into this?  Seems to be a matter of quoting;
> maybe you've some insight into where the issue lies...

Sure, this is a known issue, unrelated to pod2man.

The problem is that man_args() simply doesn't implement
double-quote double escaping, so no wonder it doesn't work.
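
For illustration only, here is a minimal sketch of what double-quote
double escaping means inside a quoted argument: a pair of double quotes
stands for one literal double quote, while a solitary double quote ends
the argument.  The helper name unquote() and its interface are
hypothetical and not part of mandoc; unlike the patch below, it copies
into a separate buffer instead of rewriting the input in place.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of mandoc.  src points just after
 * the opening double quote.  Copy the argument into dst, turning
 * each pair of double quotes into one literal double quote, and
 * stop at the first solitary double quote or at the end of the
 * line.  Return the number of bytes consumed from src, including
 * the closing quote if there is one.
 */
static size_t
unquote(const char *src, char *dst, size_t dstsz)
{
	const char	*cp;

	/* The result is never longer than the input. */
	assert(strlen(src) < dstsz);

	for (cp = src; '\0' != *cp; cp++) {
		if ('"' == cp[0]) {
			if ('"' != cp[1]) {
				cp++;	/* Solitary quote: end of argument. */
				break;
			}
			cp++;		/* Pair: skip one, copy the other. */
		}
		*dst++ = *cp;
	}
	*dst = '\0';
	return (size_t)(cp - src);
}
```

Assuming pod2man emits an argument like
"Andrew Moore - ""<amoore@cpan.org>""" for Andreas' example, this
sketch leaves Andrew Moore - "<amoore@cpan.org>" in dst, with one
double quote on each side of the address, which is what nroff prints.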

I have recently implemented double-quote double escaping for roff(7)
in roff_userdef().  The context is a bit different, and i'm not sure
how disruptive it would be to the mdocml library concept to have a
common utility function called from both libman and libroff.  In the
long run, we should definitely do that, but for now, i have just
copied the code from roff.c to man_argv.c and adapted it to the
slightly different needs over there.

The consequence of the full rewrite is that it now does a bit more
and is two lines shorter.

It now handles Andreas' example correctly, but apart from that,
it is untested, so there will probably be other bugs.  I need to do
more testing later; this is just a heads-up that i'm working on it.

Apart from the quote escaping issue, there is another issue with
this code.  The blank escaping is wrong as well.  It is not
sufficient to check that the blank is not preceded by a backslash.
If there is an even number of backslashes before the blank,
the blank is not escaped; it is escaped only by an odd number of
backslashes.  But i won't continue with this tonight...
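
To make the odd/even rule concrete, here is a rough sketch of such a
check.  The helper is_escaped() is hypothetical and is not what the
patch below does; the patch still only looks at cp[-1].

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from mandoc.  Report whether the
 * character at buf[pos] is escaped, that is, whether it is
 * immediately preceded by an odd number of backslashes.
 */
static int
is_escaped(const char *buf, size_t pos)
{
	size_t	 n;

	assert(NULL != buf);

	/* Count the run of backslashes ending just before pos. */
	for (n = 0; n < pos && '\\' == buf[pos - 1 - n]; n++)
		continue;

	return (int)(n & 1);
}
```

With this, the blank in "a\ b" counts as escaped, while the blank in
"a\\ b" does not, because there the two backslashes form an escaped
backslash and the blank ends the argument.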

Oh, and i have a feeling libmdoc will probably suffer from the same
bugs as well.

Yours,
  Ingo


> -------- Original Message --------
> Subject: mandoc rendering of perl pod C<< <foo> >>
> Date: Sun, 26 Dec 2010 14:50:16 +0100
> From: Andreas Vögele <ports@andreasvoegele.com>
> To: Kristaps Dzonsons <kristaps@openbsd.org>
> 
> Dear Kristaps,
> 
> here's a minor difference between mandoc and nroff.  Mandoc outputs two
> double quotes before and after the mail address when rendering the
> following pod file with "pod2man | mandoc".  Nroff only outputs one
> double quote.
> 
> --------8<--------8<--------8<--------8<--------
> =head1 CONTRIBUTORS
> 
> =over 4
> 
> =item Andrew Moore - C<< <amoore@cpan.org> >>
> --------8<--------8<--------8<--------8<--------
> 
> Kind regards,
> Andreas


Index: man_argv.c
===================================================================
RCS file: /cvs/src/usr.bin/mandoc/man_argv.c,v
retrieving revision 1.3
diff -u -p -r1.3 man_argv.c
--- man_argv.c	31 Jul 2010 23:42:04 -0000	1.3
+++ man_argv.c	27 Dec 2010 02:37:11 -0000
@@ -1,6 +1,6 @@
 /*	$Id: man_argv.c,v 1.3 2010/07/31 23:42:04 schwarze Exp $ */
 /*
- * Copyright (c) 2008, 2009 Kristaps Dzonsons <kristaps@bsd.lv>
+ * Copyright (c) 2010 Ingo Schwarze <schwarze@openbsd.org>
  *
  * Permission to use, copy, modify, and distribute this software for any
  * purpose with or without fee is hereby granted, provided that the above
@@ -27,74 +27,57 @@
 int
 man_args(struct man *m, int line, int *pos, char *buf, char **v)
 {
+	char	 *cp;
+	int	  quoted, pairs, white;
 
 	assert(*pos);
-	assert(' ' != buf[*pos]);
+	*v = cp = buf + *pos;
+	assert(' ' != *cp);
 
-	if (0 == buf[*pos])
+	if ('\0' == *cp)
 		return(ARGS_EOLN);
 
-	*v = &buf[*pos];
-
-	/* 
-	 * Process a quoted literal.  A quote begins with a double-quote
-	 * and ends with a double-quote NOT preceded by a double-quote.
-	 * Whitespace is NOT involved in literal termination.
-	 */
-
-	if ('\"' == buf[*pos]) {
-		*v = &buf[++(*pos)];
-
-		for ( ; buf[*pos]; (*pos)++) {
-			if ('\"' != buf[*pos])
-				continue;
-			if ('\"' != buf[*pos + 1])
+	if ('"' == *cp) {
+		quoted = 1;
+		*v = ++cp;
+	} else
+		quoted = 0;
+
+	for (pairs = 0; '\0' != *cp; cp++) {
+		/* Unquoted arguments end at unescaped blanks. */
+		if (0 == quoted) {
+			if (' ' == cp[0] && '\\' != cp[-1])
 				break;
-			(*pos)++;
+			continue;
 		}
-
-		if (0 == buf[*pos]) {
-			if ( ! man_pmsg(m, line, *pos, MANDOCERR_BADQUOTE))
-				return(ARGS_ERROR);
-			return(ARGS_QWORD);
+		/* After pairs of quotes, move left. */
+		if (pairs)
+			cp[-pairs] = cp[0];
+		if ('"' != cp[0])
+			continue;
+		/* Solitary quotes end words, ... */
+		if ('"' != cp[1]) {
+			if (pairs)
+				cp[-pairs] = '\0';
+			break;
 		}
-
-		buf[(*pos)++] = 0;
-
-		if (0 == buf[*pos])
-			return(ARGS_QWORD);
-
-		while (' ' == buf[*pos])
-			(*pos)++;
-
-		if (0 == buf[*pos])
-			if ( ! man_pmsg(m, line, *pos, MANDOCERR_EOLNSPACE))
-				return(ARGS_ERROR);
-
-		return(ARGS_QWORD);
+		/* ... but pairs of quotes do not. */
+		pairs++;
+		cp++;
 	}
 
-	/* 
-	 * A non-quoted term progresses until either the end of line or
-	 * a non-escaped whitespace.
-	 */
-
-	for ( ; buf[*pos]; (*pos)++)
-		if (' ' == buf[*pos] && '\\' != buf[*pos - 1])
-			break;
-
-	if (0 == buf[*pos])
-		return(ARGS_WORD);
-
-	buf[(*pos)++] = 0;
-
-	while (' ' == buf[*pos])
-		(*pos)++;
-
-	if (0 == buf[*pos])
-		if ( ! man_pmsg(m, line, *pos, MANDOCERR_EOLNSPACE))
-			return(ARGS_ERROR);
-
-	return(ARGS_WORD);
+	if ('\0' == *cp) {
+		*pos = cp - buf;
+		if (quoted)
+			man_pmsg(m, line, *pos, MANDOCERR_BADQUOTE);
+	} else {
+		white = ' ' == cp[0] || ' ' == cp[1];
+		*(cp++) = '\0';
+		while (' ' == *cp)
+			cp++;
+		*pos = cp - buf;
+		if (white && '\0' == *cp)
+			man_pmsg(m, line, *pos, MANDOCERR_EOLNSPACE);
+	}
+	return(quoted ? ARGS_QWORD : ARGS_WORD);
 }
-
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv

