Date: Fri, 19 Aug 2005 10:51:07 -0400
From: Dimitry Golubovsky
To: 9fans@cse.psu.edu
Subject: [9fans] plan9 and the Unicode Consortium definitions

I am wondering whether Plan 9 has any API for accessing a more complete
set of the character properties defined by Unicode.org.

So far I have seen only library functions such as isalpharune(2), defined
in runetype.c, but they do not cover all the character categories defined
by the Unicode Consortium. Shouldn't there be something about this in
section 7 of the man pages?

BTW, I have some code that I wrote earlier for Hugs and the Glasgow
Haskell Compiler; it is autogenerated from UnicodeData.txt. (runetype.c
seems to be hardcoded by hand, or at least nothing in the mkfile shows
how it was generated.) If there is any interest, I can send a link. My
code is based on the same principle I see in runetype.c: binary search
over sorted lists of character ranges.

Another question: is the (historical) 16-bitness of runes a limitation of
the C runtime library only, or is the kernel rune-size-aware, too? What
Unicode.org defines is wider than 16 bits, as everybody knows. Or is
there an intentional divergence from the Unicode.org definitions?

PS: I looked at the sources mirror at 9grid.de and at the man pages on
the Bell Labs website. Are they outdated?

-- 
Dimitry Golubovsky

Anywhere on the Web
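
For illustration, the lookup technique mentioned above (binary search over a sorted list of character ranges) can be sketched roughly as follows. This is a minimal sketch in Python, not the code from runetype.c and not the Haskell generator: the names RANGES, STARTS and category are made up for the example, and the table is a small hand-picked sample rather than anything generated from UnicodeData.txt.

```python
from bisect import bisect_right

# Hypothetical sample of (first, last, category) ranges, sorted by first code
# point and non-overlapping. A real table would be generated from
# UnicodeData.txt; these few entries are only for illustration.
RANGES = [
    (0x0030, 0x0039, "Nd"),    # DIGIT ZERO..DIGIT NINE
    (0x0041, 0x005A, "Lu"),    # LATIN CAPITAL LETTER A..Z
    (0x0061, 0x007A, "Ll"),    # LATIN SMALL LETTER A..Z
    (0x0391, 0x03A1, "Lu"),    # GREEK CAPITAL LETTER ALPHA..RHO
    (0x03B1, 0x03C9, "Ll"),    # GREEK SMALL LETTER ALPHA..OMEGA
    (0x10400, 0x10427, "Lu"),  # DESERET capital letters (beyond 16 bits)
]

# Range starts only, so that bisect can search them directly.
STARTS = [first for first, _, _ in RANGES]


def category(cp):
    """Return the category of code point cp, or None if no range covers it."""
    # Index of the last range whose first code point is <= cp.
    i = bisect_right(STARTS, cp) - 1
    if i < 0:
        return None
    first, last, cat = RANGES[i]
    # cp >= first is guaranteed by the search; check the upper bound.
    return cat if cp <= last else None
```

With this sample table, category(ord("A")) gives "Lu", while category(0x03A2), which falls in the gap between two ranges, gives None. The last entry lies outside the 16-bit range, which is where a 16-bit rune type would stop being sufficient.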