9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] Acme Edit/tcs and different character sets
@ 2007-09-07 14:15 Noah Evans
  2007-09-07 15:32 ` Steve Simon
  2007-09-07 16:30 ` Rob Pike
  0 siblings, 2 replies; 4+ messages in thread
From: Noah Evans @ 2007-09-07 14:15 UTC
  To: Fans of the OS Plan 9 from Bell Labs

Hey,

Today I was trying to deal with some Japanese text data in acme and
tried to run Edit ,|tcs -f ms-kanji on the text. It ended up as gibberish.
However, when I ran a regular tcs -f ms-kanji on the file outside of
acme, it worked. Can anybody who understands Edit and the way that acme
deals with non-Unicode text explain to me what is going wrong here? Is
it fixable? If so, what do I need to do to make it work?

Best Regards,

Noah



* Re: [9fans] Acme Edit/tcs and different character sets
  2007-09-07 14:15 [9fans] Acme Edit/tcs and different character sets Noah Evans
@ 2007-09-07 15:32 ` Steve Simon
  2007-09-07 16:30 ` Rob Pike
  1 sibling, 0 replies; 4+ messages in thread
From: Steve Simon @ 2007-09-07 15:32 UTC
  To: 9fans

I suspect that your file gets munged when acme reads it, as acme
expects valid Unicode and ms-kanji is not valid Unicode. I think
you have no option but to translate to/from ms-kanji outside acme.

If this is a common problem for you, then you could write a little file server
which invokes tcs to translate to and from ms-kanji transparently and run it
behind acme (so acme inherits its namespace).

-Steve
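
The "translate to/from ms-kanji outside the editor" workflow suggested above can be sketched roughly as follows. This is only an illustration, not the file server Steve describes: Python's shift_jis codec stands in for tcs -f ms-kanji (an assumption; the two tables may differ for some vendor-specific characters), and the function and file names are hypothetical.

```python
import tempfile
from pathlib import Path


def to_utf8(src, dst, encoding="shift_jis"):
    """Convert a legacy-encoded file to UTF-8 before it is opened in the editor."""
    # Strict decoding: bad input raises instead of being silently replaced.
    text = src.read_bytes().decode(encoding)
    dst.write_bytes(text.encode("utf-8"))


def from_utf8(src, dst, encoding="shift_jis"):
    """Convert the edited UTF-8 file back to the legacy encoding."""
    text = src.read_bytes().decode("utf-8")
    dst.write_bytes(text.encode(encoding))


with tempfile.TemporaryDirectory() as d:
    legacy = Path(d) / "data.sjis"    # hypothetical file names
    editable = Path(d) / "data.utf8"
    restored = Path(d) / "data.out"

    legacy.write_bytes("日本語のテキスト\n".encode("shift_jis"))

    to_utf8(legacy, editable)       # step 1: convert the raw bytes
    # step 2: the UTF-8 copy is what the editor would open
    from_utf8(editable, restored)   # step 3: convert back when done

    round_trip_ok = restored.read_bytes() == legacy.read_bytes()
    editable_text = editable.read_bytes().decode("utf-8")
```

The point of the sketch is that both conversions operate on the raw bytes of the file, never on text that has already been interpreted as UTF-8.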



* Re: [9fans] Acme Edit/tcs and different character sets
  2007-09-07 14:15 [9fans] Acme Edit/tcs and different character sets Noah Evans
  2007-09-07 15:32 ` Steve Simon
@ 2007-09-07 16:30 ` Rob Pike
  2007-09-07 17:04   ` erik quanstrom
  1 sibling, 1 reply; 4+ messages in thread
From: Rob Pike @ 2007-09-07 16:30 UTC
  To: Fans of the OS Plan 9 from Bell Labs

Acme treats all text as UTF-8. If the input text was ms-kanji, it won't
be UTF-8 and when acme reads it, it will end up full of encoding errors
- represented in UTF-8. Running that UTF-8 text back through
tcs -f ms-kanji will produce gibberish.

You need to use tcs on the raw files before putting them into the editor
(or almost any other Plan 9 tool).

-rob
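
A small sketch of the failure described above, under stated assumptions: Python's errors="replace" (which substitutes U+FFFD for invalid sequences) stands in for whatever acme does with bytes that are not valid UTF-8, and the shift_jis codec stands in for tcs -f ms-kanji. The variable names are illustrative only.

```python
original = "日本語"
raw = original.encode("shift_jis")   # the bytes stored in the ms-kanji file

# Converting the raw bytes directly works, as tcs does outside the editor.
converted = raw.decode("shift_jis")

# An editor that assumes UTF-8 cannot interpret these bytes; here the
# invalid sequences are replaced with U+FFFD, so the original bytes are lost.
as_read = raw.decode("utf-8", errors="replace")

# Piping the buffer to the converter sends the UTF-8 encoding of that
# already-damaged text, not the original bytes.
piped = as_read.encode("utf-8")

# The converter then treats UTF-8 bytes as Shift-JIS, which yields gibberish.
garbled = piped.decode("shift_jis", errors="replace")

print(converted == original)   # conversion of the raw bytes succeeds
print(piped == raw)            # the editor's buffer no longer matches the file
print(garbled == original)     # so converting it cannot recover the text
```

Once the invalid bytes have been replaced, no later conversion can recover them, which is why the translation has to happen before the file reaches the editor.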



* Re: [9fans] Acme Edit/tcs and different character sets
  2007-09-07 16:30 ` Rob Pike
@ 2007-09-07 17:04   ` erik quanstrom
  0 siblings, 0 replies; 4+ messages in thread
From: erik quanstrom @ 2007-09-07 17:04 UTC
  To: 9fans

> Acme treats all text as UTF-8. If the input text was ms-kanji, it won't
> be UTF-8 and when acme reads it, it will end up full of encoding errors
> - represented in UTF-8. Running that UTF-8 text back through
> tcs -f ms-kanji will produce gibberish.
>
> You need to use tcs on the raw files before putting them into the editor
> (or almost any other Plan 9 tool).
>
> -rob

one of the best decisions made in plan 9 is to have
one character set.  there are a few downsides, but
plan 9 doesn't need locales and the tools may be ignorant
of other character sets.

gnu grep is a good example of why locales are a bad idea.

- erik


