From: kristaps@mdocml.bsd.lv
To: source@mdocml.bsd.lv
Subject: mdocml: Support groff's escape for Unicode input.
Date: Sun, 15 May 2011 11:30:33 -0400 (EDT) [thread overview]
Message-ID: <201105151530.p4FFUX2R006350@krisdoz.my.domain> (raw)
Log Message:
-----------
Support groff's escape for Unicode input. See
http://mdocml.bsd.lv/archives/tech/0368.html
For the time being, we just throw it away.
Modified Files:
--------------
mdocml:
mandoc.c
mandoc.h
mandoc_char.7
Revision Data
-------------
Index: mandoc.c
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mandoc.c,v
retrieving revision 1.51
retrieving revision 1.52
diff -Lmandoc.c -Lmandoc.c -u -p -r1.51 -r1.52
--- mandoc.c
+++ mandoc.c
@@ -125,6 +125,14 @@ mandoc_escape(const char **end, const ch
break;
case ('['):
gly = ESCAPE_SPECIAL;
+ /*
+ * Unicode escapes are defined in groff as \[uXXXX] to
+ * \[u10FFFF], where the contained value must be a valid
+ * Unicode codepoint. Here, however, only check whether
+ * it's not a zero-width escape.
+ */
+ if ('u' == cp[i] && ']' != cp[i + 1])
+ gly = ESCAPE_UNICODE;
term = ']';
break;
case ('C'):
Index: mandoc_char.7
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mandoc_char.7,v
retrieving revision 1.44
retrieving revision 1.45
diff -Lmandoc_char.7 -Lmandoc_char.7 -u -p -r1.44 -r1.45
--- mandoc_char.7
+++ mandoc_char.7
@@ -520,6 +520,20 @@ portable.
.It \e*(Px Ta \*(Px Ta POSIX standard name
.It \e*(Ai Ta \*(Ai Ta ANSI standard name
.El
+.Sh UNICODE CHARACTERS
+The escape sequence
+.Pp
+.Dl \e[uXXXX]
+.Pp
+is interpreted as a Unicode codepoint.
+The codepoint must be in the range above U+0080 and less than U+10FFFF.
+For compatibility, points must be zero-padded to four characters; if
+greater than four characters, no zero padding is allowed.
+Unicode surrogates are not allowed.
+.\" .Pp
+.\" Unicode glyphs attenuate to the
+.\" .Sq \&?
+.\" character if invalid or not rendered by current output media.
.Sh NUMBERED CHARACTERS
For backward compatibility with existing manuals,
.Xr mandoc 1
Index: mandoc.h
===================================================================
RCS file: /usr/vhosts/mdocml.bsd.lv/cvs/mdocml/mandoc.h,v
retrieving revision 1.74
retrieving revision 1.75
diff -Lmandoc.h -Lmandoc.h -u -p -r1.74 -r1.75
--- mandoc.h
+++ mandoc.h
@@ -299,6 +299,7 @@ enum mandoc_esc {
ESCAPE_FONTROMAN, /* roman font mode */
ESCAPE_FONTPREV, /* previous font mode */
ESCAPE_NUMBERED, /* a numbered glyph */
+ ESCAPE_UNICODE, /* a unicode codepoint */
ESCAPE_NOSPACE /* suppress space if the last on a line */
};
--
To unsubscribe send an email to source+unsubscribe@mdocml.bsd.lv
reply other threads:[~2011-05-15 15:30 UTC|newest]
Thread overview: [no followups] expand[flat|nested] mbox.gz Atom feed
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=201105151530.p4FFUX2R006350@krisdoz.my.domain \
--to=kristaps@mdocml.bsd.lv \
--cc=source@mdocml.bsd.lv \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).