* Problem inputting Japanese using XIM
@ 2002-01-21 7:01 Daiki Ueno
2002-01-21 23:24 ` Bart Schaefer
From: Daiki Ueno @ 2002-01-21 7:01 UTC
To: zsh-workers
Hello,
I'm wondering why zsh considers characters in the [0x80, 0xa0] range
to be control characters. Without the patch below, I couldn't input any
UTF-8 text via XIM. For example, the byte sequence 0xe3 0x81 0x82,
which encodes the Japanese "a" (U+3042), is translated into the
following byte sequence in the typescript:
0xe3 0x5e(= ^) 0xc1(= 0x81 | '@') 0x5e 0xc2(= 0x82 | '@')
Index: utils.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/utils.c,v
retrieving revision 1.39
diff -u -F^( -r1.39 utils.c
--- utils.c 2002/01/06 01:07:23 1.39
+++ utils.c 2002/01/21 06:08:19
@@ -2160,7 +2160,7 @@
     for (t0 = 0; t0 != 256; t0++)
         typtab[t0] = 0;
     for (t0 = 0; t0 != 32; t0++)
-        typtab[t0] = typtab[t0 + 128] = ICNTRL;
+        typtab[t0] = ICNTRL;
     typtab[127] = ICNTRL;
     for (t0 = '0'; t0 <= '9'; t0++)
         typtab[t0] = IDIGIT | IALNUM | IWORD | IIDENT | IUSER;
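To make the effect of the hunk concrete, here is a minimal sketch (not zsh code) that fills a 256-entry table both ways and counts how many bytes of the UTF-8 sequence 0xe3 0x81 0x82 end up flagged as control. The names ICNTRL_SKETCH, init_table_sketch, and count_control_sketch are hypothetical stand-ins invented for this illustration; only the loop structure is taken from the diff.

```c
/* Hypothetical stand-in for zsh's ICNTRL flag; the real value is not
 * shown in the thread. */
#define ICNTRL_SKETCH 1

/* Fill a 256-entry table the way the hunk does.
 * mark_c1 != 0 mimics the unpatched line (0-31 and 128-159 flagged),
 * mark_c1 == 0 mimics the patched line (only 0-31 flagged). */
void init_table_sketch(unsigned char tab[256], int mark_c1)
{
    int t0;

    for (t0 = 0; t0 != 256; t0++)
        tab[t0] = 0;
    for (t0 = 0; t0 != 32; t0++) {
        tab[t0] = ICNTRL_SKETCH;
        if (mark_c1)
            tab[t0 + 128] = ICNTRL_SKETCH;
    }
    tab[127] = ICNTRL_SKETCH;
}

/* Count how many of the n bytes in s are flagged as control. */
int count_control_sketch(const unsigned char tab[256],
                         const unsigned char *s, int n)
{
    int i, count = 0;

    for (i = 0; i < n; i++)
        if (tab[s[i]] & ICNTRL_SKETCH)
            count++;
    return count;
}
```

With the unpatched initialization, the two continuation bytes 0x81 and 0x82 are flagged and the lead byte 0xe3 is not; with the patched initialization, none of the three bytes is flagged.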
Regards,
--
Daiki Ueno
* Re: Problem inputting Japanese using XIM
From: Bart Schaefer @ 2002-01-21 23:24 UTC
To: Daiki Ueno, zsh-workers
On Jan 21, 4:01pm, Daiki Ueno wrote:
> Subject: Problem inputting Japanese using XIM
>
> I'm wondering why zsh considers characters in the [0x80, 0xa0] range
> as control characters?
Zsh does not handle multibyte character sets, in general. Internally it
always treats a single byte as a single character. Bytes with values
of 128 and above are meta-characters, and those in the range 128-159
are control-meta-characters (just as 0-31 are control without meta).
In short, anything you're able to do with UTF-8 works by accident rather
than by design. We haven't yet undertaken the massive rewrite that will
be necessary to convert to using double-byte or variable-width characters.
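As an illustration of the byte-per-character view described above, the following minimal sketch (not zsh code) classifies single bytes as meta or control-meta and contrasts the byte length of a UTF-8 string with its number of code points. All four function names are hypothetical, and the code-point counter assumes well-formed UTF-8 input.

```c
#include <stddef.h>

/* A byte with the high bit set is a "meta" character in this model. */
int is_meta_sketch(unsigned char c)
{
    return c >= 0x80;
}

/* Bytes 128-159 are "control-meta": the low seven bits are in 0-31. */
int is_control_meta_sketch(unsigned char c)
{
    return c >= 0x80 && (c & 0x7f) < 0x20;
}

/* Length as seen by a program that treats one byte as one character. */
size_t byte_count_sketch(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}

/* Number of UTF-8 code points: continuation bytes (10xxxxxx) are
 * skipped. Assumes the input is well-formed UTF-8. */
size_t codepoint_count_sketch(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xc0) != 0x80)
            n++;
    return n;
}
```

For the string "\xe3\x81\x82", the byte count is 3 while the code point count is 1. Of its bytes, 0xe3 is meta only, whereas 0x81 and 0x82 are control-meta.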
* Re: Problem inputting Japanese using XIM
From: Daiki Ueno @ 2002-01-22 2:35 UTC
To: Bart Schaefer; +Cc: zsh-workers
>>>>> In <020121152432.ZM27716@candle.brasslantern.com>
>>>>> "Bart Schaefer" <schaefer@brasslantern.com> wrote:
> > I'm wondering why zsh considers characters in the [0x80, 0xa0] range
> > as control characters?
> Zsh does not handle multibyte character sets, in general. Internally it
> always treats a single byte as a single character. Bytes with values
> of 128 and above are meta-characters, and those in the range 128-159
> are control-meta-characters (just as 0-31 are control without meta).
Thank you for the response; I understand now.
By the way, the control-character formatting routines are scattered
throughout Src/Zle/*.c. They convert 0x0a into the form "^J" and so
on, but this seems useful only for displaying C0 characters, not C1
characters. Is this expected?
The actual code is as follows:
(excerpt from Src/Zle/zle_refresh.c:1122 ...)
        } else if (line[t0] == 0x7f) {
            *vp++ = '^';
            *vp++ = '?';
        } else if (icntrl(line[t0])) {
            *vp++ = '^';
            *vp++ = line[t0] | '@';
        } else
            *vp++ = line[t0];
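The behaviour in question can be reproduced with a small self-contained sketch built around the same branch structure. This is not the zsh source: icntrl_sketch is a hypothetical stand-in that assumes the unpatched table (0-31, 127, and 128-159 flagged as control), and caret_format_sketch is an invented wrapper around the excerpt's three branches.

```c
#include <stddef.h>

/* Hypothetical stand-in for icntrl() under the unpatched table:
 * C0 (0x00-0x1f), DEL (0x7f), and C1 (0x80-0x9f) count as control. */
int icntrl_sketch(unsigned char c)
{
    return c < 0x20 || c == 0x7f || (c >= 0x80 && c < 0xa0);
}

/* Format n bytes of line into out using the excerpt's branches.
 * out must have room for 2 * n bytes. Returns the bytes written. */
size_t caret_format_sketch(const unsigned char *line, size_t n,
                           unsigned char *out)
{
    unsigned char *vp = out;
    size_t t0;

    for (t0 = 0; t0 < n; t0++) {
        if (line[t0] == 0x7f) {
            *vp++ = '^';
            *vp++ = '?';
        } else if (icntrl_sketch(line[t0])) {
            *vp++ = '^';
            /* OR-ing with '@' (0x40) maps 0x0a to 'J', but a C1 byte
             * such as 0x81 becomes 0xc1, which is not ASCII. */
            *vp++ = (unsigned char)(line[t0] | '@');
        } else
            *vp++ = line[t0];
    }
    return (size_t)(vp - out);
}
```

Given the single byte 0x0a, the sketch writes "^J". Given 0xe3 0x81 0x82, it writes the five bytes 0xe3 0x5e 0xc1 0x5e 0xc2, so the caret form is readable for C0 input but not for C1 input.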
Regards,
--
Daiki Ueno