9fans - fans of the OS Plan 9 from Bell Labs
From: quanstro@quanstro.net
To: 9fans@cse.psu.edu
Subject: Re: [9fans] combining characters
Date: Fri, 19 May 2006 19:13:33 -0500
Message-ID: <f1fcda236000fd7d3278cdc9b33a275c@quanstro.net>
In-Reply-To: <20060520001201.GF14448@submarine>

On Fri May 19 19:13:39 CDT 2006, rvs@sun.com wrote:
> > no.  the unicode sequences (e.g. U+0069 U+0361) are correct.
> > i checked this and several other examples with the actual books.
> 
>   How did you check it ? Visual inspection ? 

since these were actual books, i know of no other way. ;-)

>   Since I'm no expert
>   in UNICODE I'm quite curious to know how one is supposed to
>   tell between a real character and a combination of a diacritic
>   and some other character when they are visually indistinguishable ?

say i have a random accented letter.  suppose that U+x is the codepoint (cp)
for the base letter.  suppose U+y is the cp for the accent.  suppose that we're
lucky and there exists a precomposed U+w ≡ U+xU+y.  then U+w should render as
the same glyph as U+xU+y.

canonical composition would yield
	compose(U+xU+y)	U+w
	compose(U+w)		U+w
while canonical decomposition would yield
	decompose(U+xU+y)	U+xU+y
	decompose(U+w)		U+xU+y
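
a small sketch of the two tables above, assuming python's standard
unicodedata module as a stand-in for compose and decompose (NFC and NFD are
the normalization forms for canonical composition and decomposition).  the
helper names and the concrete letter, e with acute, are chosen only for
illustration and do not come from this thread.

```python
import unicodedata

def compose(s):
    # canonical composition: normalization form NFC
    return unicodedata.normalize("NFC", s)

def decompose(s):
    # canonical decomposition: normalization form NFD
    return unicodedata.normalize("NFD", s)

def cps(s):
    # render a string as its codepoints, e.g. "U+0065 U+0301"
    return " ".join(f"U+{ord(c):04X}" for c in s)

xy = "\u0065\u0301"  # U+x U+y: e followed by combining acute accent
w = "\u00e9"         # U+w: precomposed e with acute

print("compose(U+xU+y)  ", cps(compose(xy)))
print("compose(U+w)     ", cps(compose(w)))
print("decompose(U+xU+y)", cps(decompose(xy)))
print("decompose(U+w)   ", cps(decompose(w)))

# unlucky case: no precomposed cp exists for U+0069 U+0361,
# so composition leaves the two-cp sequence as it is
ij = "\u0069\u0361"
print("compose(U+0069 U+0361)", cps(compose(ij)))
```

both spellings normalize to the same sequence, so comparing normalized
strings is one way to treat them as the same character even though the
stored cps differ.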


>   I would expect unicode to always favor single glyphs from a particular 
>   page over anything else.

it's always a single glyph.  don't confuse letters, codepoints, and glyphs.

> 
>   btw, could you send me a .png with the actual title ?

i'll send you a png of the character.  i don't have the books.

what language rule are you trying to get at?

- erik

> 
> > i think you misunderstand how unicode works.  
> 
>   That could very well be the case ;-) But I know how Russian language
>   works regardless of what committee members think.
> 
> > a base cp like U+0069 followed by a combining cp like U+0361 
> > make a single character.  this identification is called "composition".
> > unicode contains some precomposed cps, but not U+0069 U+0361.
> 
>   That's ok. My only point is -- I would expect anybody who enters 
>   titles into a database to adhere to the rules of the language the
>   title is written in. Maybe it's too much to expect, though.
> 
> Thanks,
> Roman.
> 

