9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] The cost of Runes in Lexi
@ 2004-02-24 23:07 dbailey27
  2004-02-26 10:41 ` Douglas A. Gwyn
  0 siblings, 1 reply; 4+ messages in thread
From: dbailey27 @ 2004-02-24 23:07 UTC (permalink / raw)
  To: 9fans

Hey, all,
	I'm redesigning my tokenizer and preprocessor to use Runes,
and it got me to thinking... is it worth the time to interpret UTF
patterns besides Latin '0' -> '9' as integers? If one performs a
grep on /lib/unicode:
	grep '(digit|number)' /lib/unicode

it is obvious that more than a few languages will have distinct
UTF patterns for digits. Is it desirable to interpret these values as
digits? I'm especially interested in the opinion of our Asian friends.

	In Japanese (and others?), integers are depicted by multiple
glyphs when written as Hiragana. However, are there single-glyph
Kanji that can be interpreted as an Integer when seen isolated from
other NAME class glyphs?

	Basically, this would allow Unicode characters known to equate to
digits to be interpreted as integers in expressions and assignments
throughout an ASM or C source file. Every Unicode character *not* known
to be a digit would be interpreted as a NAME class character. Thus, one
could still properly program C source in Thai, Lao, etc., as long as
each reserved word was still in English.
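
	The scheme described above could be sketched roughly as follows.
This is only an illustration, not code from the tokenizer under
discussion: the Rune typedef is a portable stand-in for the Plan 9
type, the names digitvalue and runeclass are hypothetical, and the
table lists only a handful of decimal-digit blocks rather than
everything the grep would turn up.

```c
#include <stddef.h>

typedef unsigned int Rune;	/* assumption: stand-in for Plan 9's Rune */

/*
 * Code point of digit zero in a few Unicode decimal-digit blocks.
 * Illustrative only; a real table would be generated from /lib/unicode.
 */
static const Rune zeros[] = {
	0x0030,	/* Latin */
	0x0660,	/* Arabic-Indic */
	0x06F0,	/* Extended Arabic-Indic */
	0x0966,	/* Devanagari */
	0x09E6,	/* Bengali */
	0x0E50,	/* Thai */
	0x0ED0,	/* Lao */
	0xFF10,	/* fullwidth */
};

/* Return 0..9 if r is a decimal digit in one of the listed blocks, else -1. */
int
digitvalue(Rune r)
{
	size_t i;

	for(i = 0; i < sizeof zeros / sizeof zeros[0]; i++)
		if(r >= zeros[i] && r <= zeros[i] + 9)
			return (int)(r - zeros[i]);
	return -1;
}

enum { Digit, Name, Other };

/*
 * Classify a rune for the lexer: known digits are Digit; ASCII letters,
 * underscore and every other non-ASCII rune are Name; the rest is Other.
 */
int
runeclass(Rune r)
{
	if(digitvalue(r) >= 0)
		return Digit;
	if(r == '_' || (r >= 'A' && r <= 'Z') || (r >= 'a' && r <= 'z') || r >= 0x80)
		return Name;
	return Other;
}
```

With this table, Thai U+0E53 yields the value 3, while a numeral
such as U+4E00 is not in any decimal-digit block and so falls into
the NAME class.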

	What are your thoughts?

Don (north_)





* Re: [9fans] The cost of Runes in Lexi
  2004-02-24 23:07 [9fans] The cost of Runes in Lexi dbailey27
@ 2004-02-26 10:41 ` Douglas A. Gwyn
  2004-02-26 13:47   ` dbailey27
  0 siblings, 1 reply; 4+ messages in thread
From: Douglas A. Gwyn @ 2004-02-26 10:41 UTC (permalink / raw)
  To: 9fans

dbailey27@ameritech.net wrote:
> ... is it worth the time to interpret UTF
> patterns besides Latin '0' -> '9' as integers? ...

Not really, at least not in the context of technical uses,
where everybody learned to use Arabic numerals.  Other
numeric glyphs can be thought of as ideographs; you have
to deal with them (unchanged) as part of text strings, but
you don't need to bog down parsers using an overly generous
implementation of isdigit().
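
For anyone who does want the wider test, one way to keep the cost
away from the common case is to answer ASCII runes inline and call
the general lookup only for runes at or above 0x80. A minimal
sketch under assumptions: the Rune typedef is a portable stand-in,
and lexdigit and widedigit are hypothetical names, with widedigit
standing in for whatever table lookup the lexer really uses.

```c
typedef unsigned int Rune;	/* assumption: stand-in for Plan 9's Rune */

/*
 * Hypothetical slow path: checks two non-Latin digit blocks as a
 * stand-in for a full table lookup.
 */
static int
widedigit(Rune r)
{
	if(r >= 0x0660 && r <= 0x0669)	/* Arabic-Indic */
		return (int)(r - 0x0660);
	if(r >= 0xFF10 && r <= 0xFF19)	/* fullwidth */
		return (int)(r - 0xFF10);
	return -1;
}

/* Return 0..9 for a digit, -1 otherwise; ASCII never reaches the slow path. */
int
lexdigit(Rune r)
{
	if(r < 0x80)
		return (r >= '0' && r <= '9') ? (int)(r - '0') : -1;
	return widedigit(r);
}
```

Source text that is plain ASCII then pays only one extra comparison
per character.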



* Re: [9fans] The cost of Runes in Lexi
  2004-02-26 10:41 ` Douglas A. Gwyn
@ 2004-02-26 13:47   ` dbailey27
  2004-02-26 13:52     ` dbailey27
  0 siblings, 1 reply; 4+ messages in thread
From: dbailey27 @ 2004-02-26 13:47 UTC (permalink / raw)
  To: DAGwyn, 9fans

> you have
> to deal with them (unchanged) as part of text strings, but
> you don't need to bog down parsers using an overly generous
> implementation of isdigit().

How very cute.




* Re: [9fans] The cost of Runes in Lexi
  2004-02-26 13:47   ` dbailey27
@ 2004-02-26 13:52     ` dbailey27
  0 siblings, 0 replies; 4+ messages in thread
From: dbailey27 @ 2004-02-26 13:52 UTC (permalink / raw)
  To: dbailey27, DAGwyn, 9fans

> How very cute.

I'm doing it anyway.



