* mandoc: Improve coverage of edge cases for 3-byte UTF-8 sequences.
@ 2024-05-16 20:37 schwarze
From: schwarze @ 2024-05-16 20:37 UTC
To: source
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=utf-8, Size: 11391 bytes --]
Log Message:
-----------
Improve coverage of edge cases for 3-byte UTF-8 sequences.
Coverage for 2-byte and 4-byte sequences was already reasonable.
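For readers checking the new table rows by hand, the hex column can be reproduced from the code point with the 3-byte UTF-8 bit layout (1110xxxx 10xxxxxx 10xxxxxx). The following is only a minimal sketch in Python, not code from mandoc; the names encode_three_byte, NEW_CASES and check_cases are made up for illustration.

```python
def encode_three_byte(cp: int) -> bytes:
    """Encode a code point in U+0800..U+FFFF with the 3-byte UTF-8 layout.

    Layout: 1110xxxx 10xxxxxx 10xxxxxx. No validity check is applied,
    so surrogates (U+D800..U+DFFF) are encoded as well, which matches
    the byte values listed for them in the test table.
    """
    if not 0x0800 <= cp <= 0xFFFF:
        raise ValueError(f"U+{cp:04X} is outside the 3-byte range")
    return bytes([
        0xE0 | (cp >> 12),           # top 4 bits go into the lead byte
        0x80 | ((cp >> 6) & 0x3F),   # middle 6 bits
        0x80 | (cp & 0x3F),          # low 6 bits
    ])


# Code points added by this commit, paired with the hex column
# of the corresponding rows in input.out_utf8 and input.out_ascii.
NEW_CASES = {
    0xD7FB: "ed9fbb",
    0xFEFF: "efbbbf",
    0xFFFC: "efbfbc",
    0xFFFD: "efbfbd",
    0xFFFE: "efbfbe",
}


def check_cases() -> dict:
    """Return {code point: True/False} for whether the encoding matches."""
    return {cp: encode_three_byte(cp).hex() == expected
            for cp, expected in NEW_CASES.items()}


if __name__ == "__main__":
    for cp, ok in check_cases().items():
        print(f"U+{cp:04X} {encode_three_byte(cp).hex()} {'match' if ok else 'MISMATCH'}")
```

Calling check_cases() compares each computed encoding with the hex string taken from the table; the sketch deliberately skips validation so that the surrogate rows can be reproduced too.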
Modified Files:
--------------
mandoc/regress/char/unicode:
input.in
input.out_ascii
input.out_lint
input.out_utf8
Revision Data
-------------
Index: input.out_lint
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/char/unicode/input.out_lint,v
diff -Lregress/char/unicode/input.out_lint -Lregress/char/unicode/input.out_lint -u -p -r1.7 -r1.8
--- regress/char/unicode/input.out_lint
+++ regress/char/unicode/input.out_lint
@@ -21,61 +21,61 @@ mandoc: input.in:34:19: ERROR: skipping
mandoc: input.in:35:17: ERROR: skipping bad character: 0xe0
mandoc: input.in:35:18: ERROR: skipping bad character: 0x9f
mandoc: input.in:35:19: ERROR: skipping bad character: 0xbf
-mandoc: input.in:42:25: ERROR: skipping bad character: 0xed
-mandoc: input.in:42:26: ERROR: skipping bad character: 0xa0
-mandoc: input.in:42:27: ERROR: skipping bad character: 0x80
-mandoc: input.in:42:17: ERROR: invalid special character: \[uD800]
mandoc: input.in:43:25: ERROR: skipping bad character: 0xed
-mandoc: input.in:43:26: ERROR: skipping bad character: 0xbf
-mandoc: input.in:43:27: ERROR: skipping bad character: 0xbf
-mandoc: input.in:43:17: ERROR: invalid special character: \[uDFFF]
-mandoc: input.in:53:19: ERROR: skipping bad character: 0xf0
-mandoc: input.in:53:20: ERROR: skipping bad character: 0x80
-mandoc: input.in:53:21: ERROR: skipping bad character: 0x80
-mandoc: input.in:53:22: ERROR: skipping bad character: 0x80
-mandoc: input.in:54:19: ERROR: skipping bad character: 0xf0
-mandoc: input.in:54:20: ERROR: skipping bad character: 0x80
-mandoc: input.in:54:21: ERROR: skipping bad character: 0x81
-mandoc: input.in:54:22: ERROR: skipping bad character: 0xbf
-mandoc: input.in:55:19: ERROR: skipping bad character: 0xf0
-mandoc: input.in:55:20: ERROR: skipping bad character: 0x80
-mandoc: input.in:55:21: ERROR: skipping bad character: 0x82
-mandoc: input.in:55:22: ERROR: skipping bad character: 0x80
-mandoc: input.in:56:19: ERROR: skipping bad character: 0xf0
-mandoc: input.in:56:20: ERROR: skipping bad character: 0x80
-mandoc: input.in:56:21: ERROR: skipping bad character: 0x9f
-mandoc: input.in:56:22: ERROR: skipping bad character: 0xbf
-mandoc: input.in:57:19: ERROR: skipping bad character: 0xf0
-mandoc: input.in:57:20: ERROR: skipping bad character: 0x80
-mandoc: input.in:57:21: ERROR: skipping bad character: 0xa0
-mandoc: input.in:57:22: ERROR: skipping bad character: 0x80
+mandoc: input.in:43:26: ERROR: skipping bad character: 0xa0
+mandoc: input.in:43:27: ERROR: skipping bad character: 0x80
+mandoc: input.in:43:17: ERROR: invalid special character: \[uD800]
+mandoc: input.in:44:25: ERROR: skipping bad character: 0xed
+mandoc: input.in:44:26: ERROR: skipping bad character: 0xbf
+mandoc: input.in:44:27: ERROR: skipping bad character: 0xbf
+mandoc: input.in:44:17: ERROR: invalid special character: \[uDFFF]
mandoc: input.in:58:19: ERROR: skipping bad character: 0xf0
-mandoc: input.in:58:20: ERROR: skipping bad character: 0x8f
-mandoc: input.in:58:21: ERROR: skipping bad character: 0xbf
-mandoc: input.in:58:22: ERROR: skipping bad character: 0xbf
-mandoc: input.in:67:31: ERROR: skipping bad character: 0xf4
-mandoc: input.in:67:32: ERROR: skipping bad character: 0x90
-mandoc: input.in:67:33: ERROR: skipping bad character: 0x80
-mandoc: input.in:67:34: ERROR: skipping bad character: 0x80
-mandoc: input.in:67:21: ERROR: invalid special character: \[u110000]
-mandoc: input.in:68:31: ERROR: skipping bad character: 0xf4
-mandoc: input.in:68:32: ERROR: skipping bad character: 0xbf
-mandoc: input.in:68:33: ERROR: skipping bad character: 0xbf
-mandoc: input.in:68:34: ERROR: skipping bad character: 0xbf
-mandoc: input.in:68:21: ERROR: invalid special character: \[u13FFFF]
-mandoc: input.in:69:31: ERROR: skipping bad character: 0xf5
-mandoc: input.in:69:32: ERROR: skipping bad character: 0x80
-mandoc: input.in:69:33: ERROR: skipping bad character: 0x80
-mandoc: input.in:69:34: ERROR: skipping bad character: 0x80
-mandoc: input.in:69:21: ERROR: invalid special character: \[u140000]
-mandoc: input.in:70:31: ERROR: skipping bad character: 0xf7
-mandoc: input.in:70:32: ERROR: skipping bad character: 0xbf
-mandoc: input.in:70:33: ERROR: skipping bad character: 0xbf
-mandoc: input.in:70:34: ERROR: skipping bad character: 0xbf
-mandoc: input.in:70:21: ERROR: invalid special character: \[u1FFFFF]
-mandoc: input.in:71:33: ERROR: skipping bad character: 0xf8
-mandoc: input.in:71:34: ERROR: skipping bad character: 0x88
-mandoc: input.in:71:35: ERROR: skipping bad character: 0x80
-mandoc: input.in:71:36: ERROR: skipping bad character: 0x80
-mandoc: input.in:71:37: ERROR: skipping bad character: 0x80
-mandoc: input.in:71:23: ERROR: invalid special character: \[u200000]
+mandoc: input.in:58:20: ERROR: skipping bad character: 0x80
+mandoc: input.in:58:21: ERROR: skipping bad character: 0x80
+mandoc: input.in:58:22: ERROR: skipping bad character: 0x80
+mandoc: input.in:59:19: ERROR: skipping bad character: 0xf0
+mandoc: input.in:59:20: ERROR: skipping bad character: 0x80
+mandoc: input.in:59:21: ERROR: skipping bad character: 0x81
+mandoc: input.in:59:22: ERROR: skipping bad character: 0xbf
+mandoc: input.in:60:19: ERROR: skipping bad character: 0xf0
+mandoc: input.in:60:20: ERROR: skipping bad character: 0x80
+mandoc: input.in:60:21: ERROR: skipping bad character: 0x82
+mandoc: input.in:60:22: ERROR: skipping bad character: 0x80
+mandoc: input.in:61:19: ERROR: skipping bad character: 0xf0
+mandoc: input.in:61:20: ERROR: skipping bad character: 0x80
+mandoc: input.in:61:21: ERROR: skipping bad character: 0x9f
+mandoc: input.in:61:22: ERROR: skipping bad character: 0xbf
+mandoc: input.in:62:19: ERROR: skipping bad character: 0xf0
+mandoc: input.in:62:20: ERROR: skipping bad character: 0x80
+mandoc: input.in:62:21: ERROR: skipping bad character: 0xa0
+mandoc: input.in:62:22: ERROR: skipping bad character: 0x80
+mandoc: input.in:63:19: ERROR: skipping bad character: 0xf0
+mandoc: input.in:63:20: ERROR: skipping bad character: 0x8f
+mandoc: input.in:63:21: ERROR: skipping bad character: 0xbf
+mandoc: input.in:63:22: ERROR: skipping bad character: 0xbf
+mandoc: input.in:72:31: ERROR: skipping bad character: 0xf4
+mandoc: input.in:72:32: ERROR: skipping bad character: 0x90
+mandoc: input.in:72:33: ERROR: skipping bad character: 0x80
+mandoc: input.in:72:34: ERROR: skipping bad character: 0x80
+mandoc: input.in:72:21: ERROR: invalid special character: \[u110000]
+mandoc: input.in:73:31: ERROR: skipping bad character: 0xf4
+mandoc: input.in:73:32: ERROR: skipping bad character: 0xbf
+mandoc: input.in:73:33: ERROR: skipping bad character: 0xbf
+mandoc: input.in:73:34: ERROR: skipping bad character: 0xbf
+mandoc: input.in:73:21: ERROR: invalid special character: \[u13FFFF]
+mandoc: input.in:74:31: ERROR: skipping bad character: 0xf5
+mandoc: input.in:74:32: ERROR: skipping bad character: 0x80
+mandoc: input.in:74:33: ERROR: skipping bad character: 0x80
+mandoc: input.in:74:34: ERROR: skipping bad character: 0x80
+mandoc: input.in:74:21: ERROR: invalid special character: \[u140000]
+mandoc: input.in:75:31: ERROR: skipping bad character: 0xf7
+mandoc: input.in:75:32: ERROR: skipping bad character: 0xbf
+mandoc: input.in:75:33: ERROR: skipping bad character: 0xbf
+mandoc: input.in:75:34: ERROR: skipping bad character: 0xbf
+mandoc: input.in:75:21: ERROR: invalid special character: \[u1FFFFF]
+mandoc: input.in:76:33: ERROR: skipping bad character: 0xf8
+mandoc: input.in:76:34: ERROR: skipping bad character: 0x88
+mandoc: input.in:76:35: ERROR: skipping bad character: 0x80
+mandoc: input.in:76:36: ERROR: skipping bad character: 0x80
+mandoc: input.in:76:37: ERROR: skipping bad character: 0x80
+mandoc: input.in:76:23: ERROR: invalid special character: \[u200000]
Index: input.out_utf8
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/char/unicode/input.out_utf8,v
diff -Lregress/char/unicode/input.out_utf8 -Lregress/char/unicode/input.out_utf8 -u -p -r1.8 -r1.9
--- regress/char/unicode/input.out_utf8
+++ regress/char/unicode/input.out_utf8
@@ -31,12 +31,17 @@ D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
U+1000 0xe18080 áá begin of second start byte
U+CFFF 0xecbfbf ì¿¿ì¿¿ end of last normal start byte
U+D000 0xed8080 íí begin of last start byte
+ U+D7FB 0xed9fbb í»í» highest valid public three-byte
U+D7FF 0xed9fbf í¿í¿ highest public three-byte
U+D800 0xeda080 ??? lowest surrogate
U+DFFF 0xedbfbf ??? highest surrogate
U+E000 0xee8080 îî lowest private use
U+F8FF 0xefa3bf  highest private use
U+F900 0xefa480 ï¤ï¤ lowest post-private
+ U+FEFF 0xefbbbf  byte-order mark
+ U+FFFC 0xefbfbc  object replacement character
+ U+FFFD 0xefbfbd �� replacement character
+ U+FFFE 0xefbfbe ￾￾ reversed byte-order mark
U+FFFF 0xefbfbf ï¿¿ï¿¿ highest three-byte
F\bFo\bou\bur\br-\b-b\bby\byt\bte\be r\bra\ban\bng\bge\be
@@ -60,4 +65,4 @@ D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
U+1FFFFF 0xf7bfbfbf ???? highest invalid four-byte
U+200000 0xf888808080 ????? lowest five-byte
-OpenBSD June 2, 2021 CHAR-UNICODE-INPUT(1)
+OpenBSD May 16, 2024 CHAR-UNICODE-INPUT(1)
Index: input.out_ascii
===================================================================
RCS file: /home/cvs/mandoc/mandoc/regress/char/unicode/input.out_ascii,v
diff -Lregress/char/unicode/input.out_ascii -Lregress/char/unicode/input.out_ascii -u -p -r1.7 -r1.8
--- regress/char/unicode/input.out_ascii
+++ regress/char/unicode/input.out_ascii
@@ -31,12 +31,17 @@ D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
U+1000 0xe18080 <?><?> begin of second start byte
U+CFFF 0xecbfbf <?><?> end of last normal start byte
U+D000 0xed8080 <?><?> begin of last start byte
+ U+D7FB 0xed9fbb <?><?> highest valid public three-byte
U+D7FF 0xed9fbf <?><?> highest public three-byte
U+D800 0xeda080 ??? lowest surrogate
U+DFFF 0xedbfbf ??? highest surrogate
U+E000 0xee8080 <?><?> lowest private use
U+F8FF 0xefa3bf <?><?> highest private use
U+F900 0xefa480 <?><?> lowest post-private
+ U+FEFF 0xefbbbf <?><?> byte-order mark
+ U+FFFC 0xefbfbc <?><?> object replacement character
+ U+FFFD 0xefbfbd <?><?> replacement character
+ U+FFFE 0xefbfbe <?><?> reversed byte-order mark
U+FFFF 0xefbfbf <?><?> highest three-byte
F\bFo\bou\bur\br-\b-b\bby\byt\bte\be r\bra\ban\bng\bge\be
@@ -60,4 +65,4 @@ D\bDE\bES\bSC\bCR\bRI\bIP\bPT\bTI\bIO\bON\bN
U+1FFFFF 0xf7bfbfbf ???? highest invalid four-byte
U+200000 0xf888808080 ????? lowest five-byte
-OpenBSD June 2, 2021 CHAR-UNICODE-INPUT(1)
+OpenBSD May 16, 2024 CHAR-UNICODE-INPUT(1)
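The input.out_lint expectations above report every byte of an ill-formed sequence as a separate "skipping bad character" error. As a rough aid for seeing why those particular byte strings are ill-formed, here is a minimal Python sketch of the usual UTF-8 well-formedness rules (overlong forms, surrogates, values beyond U+10FFFF, lead bytes 0xf8 and above). It is an assumption-laden illustration, not mandoc's parser; the function name classify and its labels are hypothetical.

```python
def classify(seq: bytes) -> str:
    """Classify one candidate UTF-8 sequence and return a short label."""
    if not seq:
        return "empty"
    lead = seq[0]
    if lead < 0x80:
        length, cp, minimum = 1, lead, 0x00
    elif 0xC0 <= lead <= 0xDF:
        length, cp, minimum = 2, lead & 0x1F, 0x80
    elif 0xE0 <= lead <= 0xEF:
        length, cp, minimum = 3, lead & 0x0F, 0x800
    elif 0xF0 <= lead <= 0xF7:
        length, cp, minimum = 4, lead & 0x07, 0x10000
    else:
        # 0x80..0xBF is a stray continuation byte; 0xF8..0xFF would
        # start a five-byte or longer form, which UTF-8 does not allow.
        return "invalid lead byte"
    if len(seq) != length or any((b & 0xC0) != 0x80 for b in seq[1:]):
        return "bad continuation"
    for b in seq[1:]:
        cp = (cp << 6) | (b & 0x3F)   # append 6 payload bits per byte
    if cp < minimum:
        return "overlong"             # a shorter encoding exists
    if 0xD800 <= cp <= 0xDFFF:
        return "surrogate"
    if cp > 0x10FFFF:
        return "beyond U+10FFFF"
    return "ok"


if __name__ == "__main__":
    # Byte strings taken from the error messages and tables above.
    for hexstr in ("e09fbf", "ed9fbb", "eda080", "edbfbf", "efbfbd",
                   "f0808080", "f08fbfbf", "f4908080", "f7bfbfbf",
                   "f888808080"):
        print(hexstr, classify(bytes.fromhex(hexstr)))
```

Under these rules the new rows U+D7FB, U+FEFF, U+FFFC, U+FFFD and U+FFFE are all well-formed, which is consistent with none of them appearing in input.out_lint.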