source@mandoc.bsd.lv
* mandoc: handle the non-portable GNU-style \[charNN], \[charNNN]
From: schwarze @ 2018-08-10 22:13 UTC (permalink / raw)
  To: source

Log Message:
-----------
handle the non-portable GNU-style \[charNN], \[charNNN] character
escape sequences, used for example in the groff_char(7) manual page

Modified Files:
--------------
    mandoc:
        TODO
        mandoc.c
        mandoc_char.7

Revision Data
-------------
Index: TODO
===================================================================
RCS file: /home/cvs/mandoc/mandoc/TODO,v
retrieving revision 1.260
retrieving revision 1.261
diff -LTODO -LTODO -u -p -r1.260 -r1.261
--- TODO
+++ TODO
@@ -40,9 +40,10 @@ are mere guesses, and some may be wrong.
 
 - \*(.T prints the device being used,
   see groff_char(7) for an example
-
-- \[charNN], \[charNNN] prints a single-byte codepoint
-  see groff_char(7) for examples
+  This is slightly hard because -Tlocale only decides to use ascii or
+  utf8 when initializing the formatter, so the information is not
+  yet available to the preprocessor at the parsing stage.
+  loc **  exist **  algo *  size *  imp *
 
 - .ad (adjust margins)
   .ad l -- adjust left margin only (flush left)
Index: mandoc.c
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mandoc.c,v
retrieving revision 1.104
retrieving revision 1.105
diff -Lmandoc.c -Lmandoc.c -u -p -r1.104 -r1.105
--- mandoc.c
+++ mandoc.c
@@ -41,7 +41,7 @@ enum mandoc_esc
 mandoc_escape(const char **end, const char **start, int *sz)
 {
 	const char	*local_start;
-	int		 local_sz;
+	int		 local_sz, c, i;
 	char		 term;
 	enum mandoc_esc	 gly;
 
@@ -330,8 +330,26 @@ mandoc_escape(const char **end, const ch
 		}
 		break;
 	case ESCAPE_SPECIAL:
-		if (1 == *sz && 'c' == **start)
-			gly = ESCAPE_NOSPACE;
+		if (**start == 'c') {
+			if (*sz == 1) {
+				gly = ESCAPE_NOSPACE;
+				break;
+			}
+			if (*sz < 6 || *sz > 7 ||
+			    strncmp(*start, "char", 4) != 0 ||
+			    (int)strspn(*start + 4, "0123456789") + 4 < *sz)
+				break;
+			c = 0;
+			for (i = 4; i < *sz; i++)
+				c = 10 * c + ((*start)[i] - '0');
+			if (c < 0x21 || (c > 0x7e && c < 0xa0) || c > 0xff)
+				break;
+			*start += 4;
+			*sz -= 4;
+			gly = ESCAPE_NUMBERED;
+			break;
+		}
+
 		/*
 		 * Unicode escapes are defined in groff as \[u0000]
 		 * to \[u10FFFF], where the contained value must be
Index: mandoc_char.7
===================================================================
RCS file: /home/cvs/mandoc/mandoc/mandoc_char.7,v
retrieving revision 1.72
retrieving revision 1.73
diff -Lmandoc_char.7 -Lmandoc_char.7 -u -p -r1.72 -r1.73
--- mandoc_char.7
+++ mandoc_char.7
@@ -761,14 +761,16 @@ For backward compatibility with existing
 .Xr mandoc 1
 also supports the
 .Pp
-.Dl \eN\(aq Ns Ar number Ns \(aq
+.Dl \eN\(aq Ns Ar number Ns \(aq and \e[ Ns Cm char Ns Ar number ]
 .Pp
-escape sequence, inserting the character
+escape sequences, inserting the character
 .Ar number
 from the current character set into the output.
 Of course, this is inherently non-portable and is already marked
-as deprecated in the Heirloom roff manual.
-For example, do not use \eN\(aq34\(aq, use \e(dq, or even the plain
+as deprecated in the Heirloom roff manual;
+on top of that, the second form is a GNU extension.
+For example, do not use \eN\(aq34\(aq or \e[char34], use \e(dq,
+or even the plain
 .Sq \(dq
 character where possible.
 .Sh COMPATIBILITY
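
The validation added to mandoc_escape() in the mandoc.c hunk above can be
read in isolation as the following minimal sketch. The function name
parse_char_escape and its return convention (-1 on rejection) are
assumptions made for illustration only; the committed code instead advances
*start, shrinks *sz, and sets gly to ESCAPE_NUMBERED. As in the committed
code, the input is assumed to be NUL-terminated.

```c
#include <string.h>

/*
 * Hypothetical helper, not part of mandoc: return the code point named
 * by a "charNN" or "charNNN" escape name of length sz, or -1 if the
 * name does not have that form or the number falls outside the ranges
 * accepted by the commit (0x21-0x7e and 0xa0-0xff).
 */
int
parse_char_escape(const char *name, int sz)
{
	int	 c, i;

	/* "char" plus two or three digits: 6 or 7 bytes in total. */
	if (sz < 6 || sz > 7 || strncmp(name, "char", 4) != 0)
		return -1;

	/* Every byte after "char", up to sz, must be a decimal digit. */
	if ((int)strspn(name + 4, "0123456789") + 4 < sz)
		return -1;

	/* Accumulate the decimal value of the digits. */
	c = 0;
	for (i = 4; i < sz; i++)
		c = 10 * c + (name[i] - '0');

	/* Reject control characters, space, DEL, the C1 range, and > 0xff. */
	if (c < 0x21 || (c > 0x7e && c < 0xa0) || c > 0xff)
		return -1;

	return c;
}
```

With this sketch, "char34" yields 34 (the double quote also available as
\(dq), while "char32" and "char127" are rejected because space and DEL lie
outside the accepted ranges.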
