9fans - fans of the OS Plan 9 from Bell Labs
* Re: [9fans] charon fixes for utf8
@ 2001-02-06 16:14 rob pike
  0 siblings, 0 replies; 9+ messages in thread
From: rob pike @ 2001-02-06 16:14 UTC (permalink / raw)
  To: 9fans


> By the way, I just found the /sys/src/cmd/tcs/font directory.   Yes, this is the
> first time I have seen this directory.  I suppose it includes tools for making many
> 9 subfonts from existing ones.  I suppose we didn't see this before (2ed), did we?

I didn't even know it existed. It slipped out when I wasn't looking, I guess.
It might help you but it's written for a very old version of the system.
It'll take some work to get it running, whatever it is. I'm not exactly sure.
It might be what was used to import the original Chinese fonts.

-rob




* Re: [9fans] charon fixes for utf8
@ 2001-02-07  1:12 okamoto
  0 siblings, 0 replies; 9+ messages in thread
From: okamoto @ 2001-02-07  1:12 UTC (permalink / raw)
  To: 9fans

Those sources have a time stamp of Jun 13, 2000, and are for release 2 of Plan 9.
I started trying to compile them last night.   I expect we will end up with JIS
fonts in various sizes, such as 7, 9, 12, 14, 20, 28, 32, and 48 dots (all are free
fonts made by various authors in Japan).  I'll ask you if I have trouble with it.
Please wait a while...

Kenji



* Re: [9fans] charon fixes for utf8
@ 2001-02-06  5:11 okamoto
  0 siblings, 0 replies; 9+ messages in thread
From: okamoto @ 2001-02-06  5:11 UTC (permalink / raw)
  To: 9fans

By the way, I just found the /sys/src/cmd/tcs/font directory.   Yes, this is the
first time I have seen this directory.  I suppose it includes tools for making many
9 subfonts from existing ones.  I suppose we didn't see this before (2ed), did we?

Kenji




* Re: [9fans] charon fixes for utf8
@ 2001-02-06  1:01 okamoto
  0 siblings, 0 replies; 9+ messages in thread
From: okamoto @ 2001-02-06  1:01 UTC (permalink / raw)
  To: 9fans

>and there was still one question mark in a rectangle

Yes, I see the same thing.  Is it only the one rune right after "問い合わせがあれ"?
Could charon be inserting a line feed or something similar into the rune "ば"?

Kenji




* Re: [9fans] charon fixes for utf8
@ 2001-02-05 18:18 Russ Cox
  0 siblings, 0 replies; 9+ messages in thread
From: Russ Cox @ 2001-02-05 18:18 UTC (permalink / raw)
  To: 9fans

In addition to the lack of jis, I really was having
problems with the lexer -- I installed a new font with JIS
and there was still one question mark in a rectangle
(not a peter face).  It looks like it might be dependent
on the timing of the connection as I haven't seen it again.

Russ
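
One possible explanation for a timing-dependent stray glyph, offered here only as an assumption and not confirmed anywhere in this thread, is a multi-byte UTF-8 sequence that gets split across two network reads. The Python sketch below (the function names are hypothetical and unrelated to charon's lex.b) decodes the same bytes chunk by chunk, first naively and then with an incremental decoder that carries a partial sequence over to the next chunk.

```python
import codecs


def decode_naive(chunks):
    """Decode each chunk on its own; a rune cut by a chunk boundary is lost."""
    return "".join(chunk.decode("utf-8", errors="replace") for chunk in chunks)


def decode_streaming(chunks):
    """Decode with an incremental decoder that buffers incomplete sequences."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    parts = [decoder.decode(chunk) for chunk in chunks]
    parts.append(decoder.decode(b"", final=True))  # flush anything left over
    return "".join(parts)


# The string discussed in the thread, followed by the rune that went missing.
text = "問い合わせがあれば"
data = text.encode("utf-8")

# Simulate two reads whose boundary falls inside the last rune:
# "ば" is three bytes long, so one byte is left for the second read.
chunks = [data[:-1], data[-1:]]

print(decode_naive(chunks) == text)      # False: U+FFFD appears in place of the rune
print(decode_streaming(chunks) == text)  # True
```

Whether a given read boundary lands inside a rune depends on how the bytes arrive, which would be consistent with a symptom that comes and goes between page loads.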



* Re: [9fans] charon fixes for utf8
  2001-02-05  4:42 okamoto
@ 2001-02-05 10:26 ` Boyd Roberts
  0 siblings, 0 replies; 9+ messages in thread
From: Boyd Roberts @ 2001-02-05 10:26 UTC (permalink / raw)
  To: 9fans

> an ill-translated Kanji just after "問い合わせがあれ".

can be read by IE and nearly by me.





* Re: [9fans] charon fixes for utf8
@ 2001-02-05  4:42 okamoto
  2001-02-05 10:26 ` Boyd Roberts
  0 siblings, 1 reply; 9+ messages in thread
From: okamoto @ 2001-02-05  4:42 UTC (permalink / raw)
  To: 9fans

I just visited wiki/22 with charon and found
an ill-translated Kanji right after "問い合わせがあれ".
I don't know why, because the page is displayed correctly by
Wiki, Netscape, and MS's browser (IE?).

Kenji



* Re: [9fans] charon fixes for utf8
@ 2001-02-05  4:26 okamoto
  0 siblings, 0 replies; 9+ messages in thread
From: okamoto @ 2001-02-05  4:26 UTC (permalink / raw)
  To: 9fans

>It looks like there is still a bug in the lexer, as
>viewing http://plan9.bell-labs.com/wiki/plan9/22

This is a font problem in charon.
In charon, the fonts are hard-coded to /fonts/lucidasans in charon_gui.b, and
no Kanji font is defined there.  If you want to see that page, you can edit the
definition of fonts=array[NumFnt] of and change,
say, "/fonts/lucidasans/unicode.8.font" to "/fonts/pelm/unicode.9.font".
However, this leads back to the complaint Rob made earlier ^_^, i.e., we only
have a very limited set of fonts there.  :-)

Kenji
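
To illustrate why a font with no Kanji coverage ends up showing a placeholder glyph, here is a rough Python sketch. It assumes the font-file layout described in font(6): a first line giving height and ascent, followed by lines that each give a rune range and a subfont file. The subfont names and ranges in the sample data are invented for the example and are not taken from /fonts/lucidasans or /fonts/pelm.

```python
def parse_font_file(text):
    """Return (height, ascent, ranges) parsed from font-file text.

    Each range is (first_rune, last_rune, subfont_name). Numbers are read
    with int(s, 0), so decimal and 0x-prefixed hex work; C-style octal
    such as 0777 is not handled by this sketch.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    height, ascent = (int(n) for n in lines[0].split()[:2])
    ranges = []
    for ln in lines[1:]:
        fields = ln.split()
        lo, hi = int(fields[0], 0), int(fields[1], 0)
        # The subfont file is the last field; an optional offset may precede it.
        ranges.append((lo, hi, fields[-1]))
    return height, ascent, ranges


def subfont_for(ranges, ch):
    """Return the subfont covering ch, or None if no range contains it."""
    code = ord(ch)
    for lo, hi, name in ranges:
        if lo <= code <= hi:
            return name
    return None


# Hypothetical font files: one Latin-only, one that also covers kana.
LATIN_ONLY = "13 10\n0x0000 0x00FF latin1.8\n"
WITH_KANA = LATIN_ONLY + "0x3000 0x30FF kana.8\n"

_, _, latin_ranges = parse_font_file(LATIN_ONLY)
_, _, kana_ranges = parse_font_file(WITH_KANA)

print(subfont_for(latin_ranges, "A"))   # latin1.8
print(subfont_for(latin_ranges, "ば"))  # None: nothing to draw the rune with
print(subfont_for(kana_ranges, "ば"))   # kana.8
```

A rune that falls outside every listed range has no subfont to draw it, which is the situation described above for a font set with no Kanji defined.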




* [9fans] charon fixes for utf8
@ 2001-02-02 21:04 Russ Cox
  0 siblings, 0 replies; 9+ messages in thread
From: Russ Cox @ 2001-02-02 21:04 UTC (permalink / raw)
  To: 9fans, help

The attached diffs (to the 3rd edition free Inferno
release) fix charon to correctly handle
UTF8 documents when the only indication of being UTF8
is in the HTML header (rather than the HTTP header),
as is the case for most UTF8 documents on the web
(including the wiki ones, now).

It looks like there is still a bug in the lexer, as
viewing http://plan9.bell-labs.com/wiki/plan9/22
misparses one of the UTF8 sequences, but I don't think
I did that.

http://www.columbia.edu/kermit/utf8.html displays nicely too.

Russ

diff -n build.b old.build.b
build.b:148,189 d old.build.b:147
< # must track Charsets in chutils.m
< metacharsetnames := array[] of {
< 	"unknown",
< 	"us-ascii",
< 	"iso-8859-1",
< 	"utf-8"
< };
<
< # Return document's media type and chset (if found).
< # If can't find either type, return old ones.
< parsecontent(mtype, chset : int, s: string) : (int, int)
< {
< 	if(s == "")
< 		return (mtype, chset);
<
< 	(ty, parms) := S->splitl(S->tolower(s), ";");
< 	mediatable := CU->makestrinttab(CU->mnames);
< 	(fnd, val) := T->lookup(mediatable, trim_white(ty));
< 	if(fnd) {
< 		mtype = val;
< 		(n, l) := sys->tokenize(trim_white(parms[1:]), " \t");
< 		for(; l != nil; l = tl l) {
< 			t := hd l;
< 			if(len t > 8 && t[0:8] == "charset=") {
< 				cval := -1;
< 				for(i:=0; i<len metacharsetnames; i++)
< 					if(t[8:] == metacharsetnames[i])
< 						cval = i;
< 				if(cval >= 0)
< 					chset = cval;
< 				else if(warn)
< 					sys->print("warning: unknown character set in %s\n", s);
< 			}
< 		}
< 	}
< 	else {
< 		if(warn)
< 			sys->print("warning: unknown media type in %s\n", s);
< 	}
< 	return (mtype, chset);
< }
<
build.b:204,205 d old.build.b:161
< 	chset := di.chset;
< 	mtype := is.ts.mtype;
build.b:734,736 d old.build.b:689
< 			# change character set if specified in html header
< 			is.ts.chset = di.chset = chset;
< 			is.ts.mtype = mtype;
build.b:974,975 d old.build.b:926
< 				"content-type" =>
< 					(mtype, chset) = parsecontent(mtype, chset, v);
diff -n chutils.m old.chutils.m
chutils.m:40 c old.chutils.m:40
< 	# Charsets  (must track chsetnames in chutils.b, metacharsetnames in build.b)
---
> 	# Charsets  (must track chsetnames in chutils.b)
diff -n lex.b old.lex.b
lex.b:475,476 d old.lex.b:474
< 					if (tok.tag == Thead+RBRA)
< 						break;
lex.b:1130,1133 c old.lex.b:1128,1131
< 	if(unicodechar!=-1) {
< 		ts.i=index;
< 		return unicodechar;
< 	}
---
>         if(unicodechar!=-1) {
>                         ts.i=index;
>                         return unicodechar;
>         }
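
For readers who do not read Limbo, the idea behind parsecontent in the diff above can be approximated in Python. This is a loose sketch and not a translation: the function name, the defaults, and the choice to split parameters on ";" are assumptions of the example, and it skips the media-type table lookup that the Limbo code performs.

```python
# Charset names mirrored from metacharsetnames in the diff (minus "unknown").
KNOWN_CHARSETS = ("us-ascii", "iso-8859-1", "utf-8")


def parse_content_type(value, old_type="text/html", old_charset=None):
    """Split a Content-Type value into (media type, charset).

    As in the diff, values that cannot be determined fall back to the
    ones passed in, so an HTTP-level setting survives a vague meta tag.
    """
    if not value.strip():
        return old_type, old_charset

    media, _, params = value.lower().partition(";")
    media = media.strip() or old_type

    charset = old_charset
    for param in params.split(";"):
        name, _, val = param.partition("=")
        if name.strip() == "charset":
            val = val.strip().strip("\"'")
            if val in KNOWN_CHARSETS:
                charset = val  # unknown names keep the previous charset
    return media, charset


# The kind of value found in <meta http-equiv="content-type" content="...">.
print(parse_content_type("text/html; charset=UTF-8"))
print(parse_content_type("text/html", old_charset="iso-8859-1"))
```

The first call yields utf-8 from the document itself; the second has no charset parameter, so the previously known charset is kept.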



