tech@mandoc.bsd.lv
* numbered character escape regression fix
@ 2011-10-22 23:46 Ingo Schwarze
  2011-10-24 16:23 ` Kristaps Dzonsons
  0 siblings, 1 reply; 3+ messages in thread
From: Ingo Schwarze @ 2011-10-22 23:46 UTC (permalink / raw)
  To: tech

Hi,

here is another fix for a regression that crept into one of the
recent releases.

In groff, the handling of \N numbered character escapes is as follows:

 (1) If \N is followed by a digit,
     the \N and one single following digit are ignored.
     That is, "x\N123x" produces "x23x".

 (2) If \N is followed by a non-digit,
     the next non-digit delimits the character number.
     That is, "x\Nu65vx" produces "xAx".
     The delimiting characters (here, u and v) need not match.
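
The two rules above can be sketched as a small standalone C function.
This is only an illustration under assumptions, not the mandoc code:
the names parse_numbered, n_escape and N_* are hypothetical, and cp is
assumed to point at the first character after "\N".

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result codes, loosely modelled on the ESCAPE_* values. */
enum n_escape { N_ERROR, N_IGNORE, N_NUMBERED };

/*
 * Parse what follows "\N".  cp points at the first character after
 * the 'N'.  On return, *end points past everything the escape consumed.
 * For N_NUMBERED, *start and *sz delimit the digits of the number.
 */
static enum n_escape
parse_numbered(const char *cp, const char **end,
    const char **start, size_t *sz)
{
	*end = cp;
	*start = NULL;
	*sz = 0;

	/* Nothing follows the \N at all. */
	if ('\0' == *cp)
		return N_ERROR;

	/* Consume exactly one character after the \N. */
	*end = cp + 1;

	/* Rule (1): a digit right after \N is swallowed and ignored. */
	if (isdigit((unsigned char)*cp))
		return N_IGNORE;

	/* Rule (2): *cp is the opening delimiter; collect the digits. */
	*start = *end;
	while (isdigit((unsigned char)**end))
		(*end)++;
	*sz = (size_t)(*end - *start);

	/* The next non-digit closes the escape; it need not match *cp. */
	if ('\0' != **end)
		(*end)++;
	return N_NUMBERED;
}
```

Called on "123x", this returns N_IGNORE and leaves *end at "23x";
called on "u65vx", it returns N_NUMBERED with the two digits "65"
delimited by *start and *sz, and leaves *end at the trailing "x".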

Whatever mandoc currently does is different.

I have no idea whether this applies to any of the other escapes
as well, but they are not nearly as important as \N.

While here, drop two useless checks of gly;
it has been set to ESCAPE_ERROR right before the checks.

OK?
  Ingo


--- mandoc.c.orig
+++ mandoc.c
@@ -157,8 +157,7 @@ mandoc_escape(const char **end, const char **start, int *sz)
 	case ('V'):
 		/* FALLTHROUGH */
 	case ('Y'):
-		if (ESCAPE_ERROR == gly)
-			gly = ESCAPE_IGNORE;
+		gly = ESCAPE_IGNORE;
 		/* FALLTHROUGH */
 	case ('f'):
 		if (ESCAPE_ERROR == gly)
@@ -218,10 +217,7 @@ mandoc_escape(const char **end, const char **start, int *sz)
 	case ('L'):
 		/* FALLTHROUGH */
 	case ('l'):
-		/* FALLTHROUGH */
-	case ('N'):
-		if (ESCAPE_ERROR == gly)
-			gly = ESCAPE_NUMBERED;
+		gly = ESCAPE_NUMBERED;
 		/* FALLTHROUGH */
 	case ('S'):
 		/* FALLTHROUGH */
@@ -237,6 +233,26 @@ mandoc_escape(const char **end, const char **start, int *sz)
 		term = numeric = '\'';
 		break;
 
+	/*
+	 * Special handling for the numbered character escape.
+	 * XXX Do any other escapes need similar handling?
+	 */
+	case ('N'):
+		if ('\0' == cp[i])
+			return(ESCAPE_ERROR);
+		*end = &cp[++i];
+		if (isdigit(cp[i-1]))
+			return(ESCAPE_IGNORE);
+		while (isdigit(**end))
+			(*end)++;
+		if (start)
+			*start = &cp[i];
+		if (sz)
+			*sz = *end - &cp[i];
+		if ('\0' != **end)
+			(*end)++;
+		return(ESCAPE_NUMBERED);
+
 	/* 
 	 * Sizes get a special category of their own.
 	 */
--
 To unsubscribe send an email to tech+unsubscribe@mdocml.bsd.lv


* Re: numbered character escape regression fix
  2011-10-22 23:46 numbered character escape regression fix Ingo Schwarze
@ 2011-10-24 16:23 ` Kristaps Dzonsons
  2011-10-24 21:26   ` Ingo Schwarze
  0 siblings, 1 reply; 3+ messages in thread
From: Kristaps Dzonsons @ 2011-10-24 16:23 UTC (permalink / raw)
  To: tech; +Cc: Ingo Schwarze

> here is another fix for a regression that crept into one of the
> recent releases.
>
> In groff, the handling of \N numbered character escapes is as follows:
>
>   (1) If \N is followed by a digit,
>       the \N and one single following digit are ignored.
>       That is, "x\N123x" produces "x23x".
>
>   (2) If \N is followed by a non-digit,
>       the next non-digit delimits the character number.
>       That is, "x\Nu65vx" produces "xAx".
>       The delimiting characters (here, u and v) need not match.
>
> Whatever mandoc currently does is different.
>
> I have no idea whether this applies to any of the other escapes
> as well, but they are not nearly as important as \N.

Ingo,

This is fine---well, it's gross, but that's not our fault!  The groff(7)
manual doesn't document this, but mandoc_char(7) should note it, as
it's not intuitive to me at all and may result in confusion.

> +	/*
> +	 * Special handling for the numbered character escape.
> +	 * XXX Do any other escapes need similar handling?
> +	 */
> +	case ('N'):
> +		if ('\0' == cp[i])
> +			return(ESCAPE_ERROR);
> +		*end = &cp[++i];
> +		if (isdigit(cp[i-1]))
> +			return(ESCAPE_IGNORE);
> +		while (isdigit(**end))
> +			(*end)++;

You need a cast here to satisfy lint: isdigit() takes an int, and
casting the argument to (unsigned char) is the common way, of course.
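
For illustration, a minimal sketch of that idiom follows.  The helper
name leading_digits is hypothetical and not part of mandoc.  The cast
matters beyond lint: where plain char is signed, a byte above 0x7f
becomes a negative int, and the C standard leaves isdigit() undefined
for negative arguments other than EOF.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Count the decimal digits at the start of s.
 * The (unsigned char) cast keeps the argument of isdigit()
 * within the range the C standard requires, even for bytes
 * above 0x7f on platforms where plain char is signed.
 */
size_t
leading_digits(const char *s)
{
	size_t n = 0;

	while (isdigit((unsigned char)s[n]))
		n++;
	return n;
}
```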

Thanks!

Kristaps


* Re: numbered character escape regression fix
  2011-10-24 16:23 ` Kristaps Dzonsons
@ 2011-10-24 21:26   ` Ingo Schwarze
  0 siblings, 0 replies; 3+ messages in thread
From: Ingo Schwarze @ 2011-10-24 21:26 UTC (permalink / raw)
  To: tech

Hi Kristaps,

Kristaps Dzonsons wrote on Mon, Oct 24, 2011 at 06:23:04PM +0200:

> This is fine---well, it's gross, but that's not our fault!  The
> groff(7) manual doesn't document this, but mandoc_char(7) should
> note it, as it's not intuitive to me at all and may result in
> confusion.

I don't think so.  What we say is fine:

  NUMBERED CHARACTERS
    For backward compatibility with existing manuals, mandoc(1) also supports
    the

          \N'number'

    escape sequence, inserting the character number from the current
    character set into the output.  Of course, this is inherently non-
    portable and is already marked as deprecated in the Heirloom roff manual.
    For example, do not use \N'34', use \(dq, or even the plain `"' character
    where possible.

The rest is undefined behaviour and implementation details -
of a deprecated, non-portable feature.  Documenting such stuff
would be beyond useless, i'd call that harmful.

However, i still see value in keeping even the undocumented,
undefined implementation details in sync between groff and mandoc,
and to document that by maintaining unit tests (as opposed to
man pages).

>>+	/*
>>+	 * Special handling for the numbered character escape.
>>+	 * XXX Do any other escapes need similar handling?
>>+	 */
>>+	case ('N'):
>>+		if ('\0' == cp[i])
>>+			return(ESCAPE_ERROR);
>>+		*end = &cp[++i];
>>+		if (isdigit(cp[i-1]))
>>+			return(ESCAPE_IGNORE);
>>+		while (isdigit(**end))
>>+			(*end)++;

> You need a cast here to satisfy lint: isdigit() takes an int, and
> casting the argument to (unsigned char) is the common way, of course.

Oh well, i added (unsigned char).

I can't seem to remember how lint works.  It has dozens of options,
the default options don't report this issue, your Makefile uses
LINTFLAGS (but i can't find a definition), and when i run it,
it seems to evaluate to -hx, which doesn't report this issue
either.  I dimly remember you mentioned useful options at some
point, but i forgot them and am confused.

Yours,
  Ingo
