9fans - fans of the OS Plan 9 from Bell Labs
From: Jack Johnson <knapjack@gmail.com>
To: Fans of the OS Plan 9 from Bell Labs <9fans@9fans.net>
Subject: Re: [9fans] Woes of New Language Support
Date: Sun, 26 Jul 2009 11:39:56 -0700
Message-ID: <6e35c0620907261139u610c0431rbc3ecff6b16def29@mail.gmail.com>
In-Reply-To: <7cfc9061f18bd9aba567124d64be1ff5@quanstro.net>

If I'm reading you right, you're saying it might be easier if
everything were encoded as combining (or maybe more aptly
non-combining) codes, regardless of language?

So, we might encode 'Waffles' as w+upper a f f l e s and let the
renderer (if there is one) handle the presentation of the case shift
and the potential ligature, but things like grep get noticeably easier
with no overlap of ö and o+umlaut.
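
To make the grep point concrete, here is a minimal sketch in Python, assuming the standard unicodedata module and Unicode's NFD normalization as the "everything is combining" form. The helper name decompose is made up for illustration; it is not something from this thread or from Plan 9.

```python
import unicodedata

def decompose(text: str) -> str:
    """Return text in NFD: base letters followed by their combining marks."""
    return unicodedata.normalize("NFD", text)

precomposed = "\u00f6"   # ö stored as a single code point
combining = "o\u0308"    # o followed by COMBINING DIAERESIS

# The raw strings differ, so a naive code-point comparison misses the match.
print(precomposed == combining)                        # False
# After decomposition both spellings are the same sequence.
print(decompose(precomposed) == decompose(combining))  # True
```

A search tool that decomposed both the pattern and the input this way would only ever see one spelling of each accented letter.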

Again, oversimplified, with no real understanding on my part of the
depth or breadth of the problem space.

If this is the case, could it be handled by pushing everything into a
subset of unicode rather than use the unallocated space to create a
superset?
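
One rough way to picture the "subset" idea, again as a sketch rather than a proposal: map every string onto a reduced form before comparing. This assumes Python's unicodedata module, compatibility decomposition (NFKD) to split ligatures and accents, and str.casefold to drop case and the final sigma mentioned in the quoted message. The name search_key is hypothetical.

```python
import unicodedata

def search_key(text: str) -> str:
    """Reduce text to a decomposed, case-folded form for comparison only."""
    # NFKD splits ligatures and precomposed letters into simpler code points.
    decomposed = unicodedata.normalize("NFKD", text)
    # casefold removes case distinctions and maps final sigma to sigma;
    # normalizing again keeps the result in decomposed form.
    return unicodedata.normalize("NFKD", decomposed.casefold())

# "Wa" + the ffl ligature (U+FB04) + "es" matches plain upper-case text.
print(search_key("Wa\ufb04es") == search_key("WAFFLES"))   # True
# Final sigma (U+03C2) and ordinary sigma (U+03C3) share one key.
print(search_key("\u03c2") == search_key("\u03c3"))        # True
```

The key is lossy, so it suits searching and comparison, not storage: the case, ligature and final-form information a renderer would need is gone.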

-J

On 7/26/09, erik quanstrom <quanstro@quanstro.net> wrote:
>> to be fair to the unicode people, this decoupling of glyphs and codepoints
>> is (i think) the most straightforward way to implement some languages like
>> arabic, where the glyphs for characters depend on their position within a
>> word.  that is, a letter at the beginning of a word looks different from
>> what it would look like if it was in the middle.
>
> my opinion (not that i'm entitled to one here) is
> that the unicode guys screwed up.  unicode is not
> consistent.  explain why there are two code points for sigma.
> 03c3	greek small letter sigma
> 03c2	greek small letter final sigma
> why does german get ä, ö, ü?  if you want to take
> this further, why are there capital forms of latin letters?
> can't that also be inferred by the font?
>
> what's called a ligature in one language is a character
> in another.  i see no consistency.  it seems like the
> unicode committee had a problem with too much
> knowledge of the specific problems and few actual
> unifying (sorry) concepts.
>
> i think it would make much more sense to put this logic
> in editors.  this would also allow the freedom to use a
> capital, ligature, final form in the wrong place.
> like say studlyCaps.  i can't imagine english is the only
> language in the world that gets abused.
>
> - erik
>
>
