source@mandoc.bsd.lv
From: kristaps@mdocml.bsd.lv
To: source@mdocml.bsd.lv
Subject: mdocml: Support groff's escape for Unicode input.
Date: Sun, 15 May 2011 11:30:33 -0400 (EDT)
Message-ID: <201105151530.p4FFUX2R006350@krisdoz.my.domain>

Log Message:
-----------
Support groff's escape for Unicode input.  See

  http://mdocml.bsd.lv/archives/tech/0368.html

For the time being, we just throw it away.

Modified Files:
--------------
    mdocml:
        mandoc.c
        mandoc.h
        mandoc_char.7

Revision Data
-------------
Index: mandoc.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mandoc.c,v
retrieving revision 1.51
retrieving revision 1.52
diff -Lmandoc.c -Lmandoc.c -u -p -r1.51 -r1.52
--- mandoc.c
+++ mandoc.c
@@ -125,6 +125,14 @@ mandoc_escape(const char **end, const ch
 		break;
 	case ('['):
 		gly = ESCAPE_SPECIAL;
+		/*
+		 * Unicode escapes are defined in groff as \[uXXXX] to
+		 * \[u10FFFF], where the contained value must be a valid
+		 * Unicode codepoint.  Here, however, only check whether
+		 * it's not a zero-width escape.
+		 */
+		if ('u' == cp[i] && ']' != cp[i + 1])
+			gly = ESCAPE_UNICODE;
 		term = ']';
 		break;
 	case ('C'):
Index: mandoc_char.7
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mandoc_char.7,v
retrieving revision 1.44
retrieving revision 1.45
diff -Lmandoc_char.7 -Lmandoc_char.7 -u -p -r1.44 -r1.45
--- mandoc_char.7
+++ mandoc_char.7
@@ -520,6 +520,20 @@ portable.
 .It \e*(Px   Ta \*(Px       Ta POSIX standard name
 .It \e*(Ai   Ta \*(Ai       Ta ANSI standard name
 .El
+.Sh UNICODE CHARACTERS
+The escape sequence
+.Pp
+.Dl \e[uXXXX]
+.Pp
+is interpreted as a Unicode codepoint.
+The codepoint must be in the range above U+0080 and less than U+10FFFF.
+For compatibility, points must be zero-padded to four characters; if
+greater than four characters, no zero padding is allowed.
+Unicode surrogates are not allowed.
+.\" .Pp
+.\" Unicode glyphs attenuate to the
+.\" .Sq \&?
+.\" character if invalid or not rendered by current output media.
 .Sh NUMBERED CHARACTERS
 For backward compatibility with existing manuals,
 .Xr mandoc 1
Index: mandoc.h
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mandoc.h,v
retrieving revision 1.74
retrieving revision 1.75
diff -Lmandoc.h -Lmandoc.h -u -p -r1.74 -r1.75
--- mandoc.h
+++ mandoc.h
@@ -299,6 +299,7 @@ enum	mandoc_esc {
 	ESCAPE_FONTROMAN, /* roman font mode */
 	ESCAPE_FONTPREV, /* previous font mode */
 	ESCAPE_NUMBERED, /* a numbered glyph */
+	ESCAPE_UNICODE, /* a unicode codepoint */
 	ESCAPE_NOSPACE /* suppress space if the last on a line */
 };
 
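The check added to mandoc.c only asks whether the name starts with
'u' and is not empty after it, so it would also match a
special-character name that merely begins with that letter, such as
"ul".  The stricter rules written down in mandoc_char.7 could be
sketched roughly as below.  This is not code from the commit:
unicode_escape_value() is a hypothetical helper, the inclusive bounds
0x80 and 0x10FFFF are one reading of the manual's wording, and
accepting only uppercase hexadecimal digits is an assumption.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of mandoc.  "body" is the text found
 * between "\[" and "]", e.g. "u00E9".  Returns the codepoint, or -1
 * if the body does not satisfy the rules stated in mandoc_char.7.
 */
static long
unicode_escape_value(const char *body)
{
	size_t	 len, i;
	long	 cp;

	if ('u' != body[0])
		return -1;
	body++;
	len = strlen(body);

	/* At least four digits; U+10FFFF needs at most six. */
	if (len < 4 || len > 6)
		return -1;
	/* More than four digits: no zero padding is allowed. */
	if (len > 4 && '0' == body[0])
		return -1;

	cp = 0;
	for (i = 0; i < len; i++) {
		/* Assumption: uppercase hexadecimal digits only. */
		if (body[i] >= '0' && body[i] <= '9')
			cp = cp * 16 + (body[i] - '0');
		else if (body[i] >= 'A' && body[i] <= 'F')
			cp = cp * 16 + (body[i] - 'A' + 10);
		else
			return -1;
	}

	/* Assumption: both bounds are treated as inclusive. */
	if (cp < 0x80 || cp > 0x10FFFF)
		return -1;
	/* Unicode surrogates are not allowed. */
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return -1;
	return cp;
}
```

With these assumptions, "u00E9" and "u1F600" are accepted, while "ul",
"u01F600" (padded beyond four digits) and "uD800" (a surrogate) are
rejected.
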
--
 To unsubscribe send an email to source+unsubscribe@mdocml.bsd.lv
