From: andrey mirtchovski <mirtchovski@gmail.com>
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] Woes of New Language Support
Date: Sun, 26 Jul 2009 01:41:16 -0600
Message-ID: <14ec7b180907260041h18f63c64x871a7059cc9244bb@mail.gmail.com>
In-Reply-To: <8318421630e9613cfbdf14c1eae5f080@quanstro.net>
diacritics (combining characters) are a real mess in Unicode. with so
much space in the format why did they have to go this route, i wonder?
erik mentioned cyrillic. i did have an old church slavonic bible text
i was attempting to display correctly on Plan 9 sometime in 2003-4.
top is x11 with correctly (i presume) combined characters, below is
the Plan 9 rendering:
http://mirtchovski.com/screenshots/x-p9-diacritics.jpg
there's a pattern there, as you can see: the combining char always
follows the char it's combined with, so you can try simply not
advancing forward as a first draft of implementing char combinations
in Plan 9. there doesn't seem to be a default list of "combining"
characters in Unicode so you'll have to pick out all glyphs described as
"combining" and check for them on input. fun and slow :)
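
a rough sketch of the "don't advance" idea, in python rather than Plan 9 C, and only as an illustration under assumptions: the names is_nonspacing and pen_positions are made up, every glyph is pretended to be one unit wide, and the check leans on python's unicodedata module (general categories Mn and Me) instead of a hand-collected list. spacing marks (Mc) are left alone here because they do take up room.

```python
import unicodedata

def is_nonspacing(ch):
    # Mn = nonspacing mark, Me = enclosing mark; both sit on the previous base.
    return unicodedata.category(ch) in ("Mn", "Me")

def pen_positions(text, advance=1):
    """Return (char, x) pairs; a nonspacing mark reuses the x of its base char.

    advance is a hypothetical fixed glyph width, not a real font metric.
    """
    x = 0        # where the next base character will be drawn
    last_x = 0   # where the most recent base character was drawn
    out = []
    for ch in text:
        if is_nonspacing(ch) and out:
            # do not advance: draw the mark over the preceding character
            out.append((ch, last_x))
        else:
            out.append((ch, x))
            last_x = x
            x += advance
    return out

# e followed by U+0308 COMBINING DIAERESIS, then x
for ch, x in pen_positions("e\u0308x"):
    print(f"U+{ord(ch):04X} at x={x}")
```

a mark with nothing before it is treated as an ordinary character in this sketch; a real renderer would also need per-glyph metrics and vertical placement.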
the real problem isn't in viewing them however, but comes when you
start searching for them: it's easy to search for ë (e-umlaut) for
example, but what if it's described as e+"U+0308 COMBINING DIAERESIS"?
the answer is the UTS#18 Regular Expressions technical standard which
probably contributes at least half of the slowness of gnu grep
discussed in another thread. http://www.unicode.org/reports/tr18/
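
the search mismatch can be shown in a few lines. this is only a sketch using canonical normalization (NFC) from python's unicodedata module, which is a much smaller thing than a UTS#18 regular expression engine; contains_canonical is a made-up helper name.

```python
import unicodedata

def contains_canonical(haystack, needle):
    """Substring test after normalizing both sides to NFC.

    Precomposed and decomposed spellings of the same character
    become the same code point sequence before comparing.
    """
    h = unicodedata.normalize("NFC", haystack)
    n = unicodedata.normalize("NFC", needle)
    return n in h

precomposed = "\u00eb"   # LATIN SMALL LETTER E WITH DIAERESIS
decomposed = "e\u0308"   # e + COMBINING DIAERESIS

# the raw strings differ, so a byte-for-byte search misses one form
print(precomposed == decomposed)
# after normalization either spelling finds the other
print(contains_canonical("no" + decomposed + "l", precomposed))
print(contains_canonical("no" + precomposed + "l", decomposed))
```

normalizing the whole input on every search is the simple approach and costs a pass over the text each time.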