From: Kenichi Handa <handa@m17n.org>
Cc: emacs-devel@gnu.org
Subject: Re: MML charset tag regression
Date: Mon, 28 Apr 2003 20:58:34 +0900 (JST)
Message-ID: <200304281158.UAA10974@etlken.m17n.org>
In-Reply-To: <m3wuhhwn0l.fsf@lugabout.jhcloos.org>
In article <m3wuhhwn0l.fsf@lugabout.jhcloos.org>, "James H. Cloos Jr." <cloos@jhcloos.com> writes:
>>>>>> "Simon" == Simon Josefsson <jas@extundo.com> writes:
Simon> For me, when I yanked the string into emacs from galeon it
Simon> becomes double-width. It is single-width in galeon though.
> I also see that; any pasting of cyrillic text via pasting X's
> primary or from the clipboard. The wide cyrillic is from the
> japanese-jisx0208 charset.
[...]
In article <iluu1clxymd.fsf@latte.josefsson.org>, Simon Josefsson <jas@extundo.com> writes:
> That may be interesting by itself. Go to
> http://www.nns.ru/persons/gorbach.html using galeon (or mozilla, I
> think). Cut'n'paste the first word and yank it in Emacs. It looks as
> single-width in galeon, but when yanked into emacs it becomes double
> width. Yanking it into xterm or gnome-terminal doesn't change the
> string, it looks like single-width. Save the HTML file and open it in
> emacs as a koi8 file (note that emacs doesn't auto detect it as koi8
> so you have to do that manually), then it is single-width too.
> I guess it is the emacs X cut'n'paste code that somehow makes the
> string into double width japanese characters.
I don't think so. Emacs has no code that does such a
conversion.
I think galeon sends those Cyrillic characters to Emacs as
COMPOUND_TEXT, encoded in the JISX0208 charset.
Please try this:
First, select some Cyrillic text in galeon. Then type this
in Emacs: C-x RET X raw-text RET C-y. You'll see something
like this: "ESC $ ( B ...".
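To make the "ESC $ ( B" case concrete, here is a minimal sketch
that builds such a byte sequence by hand and decodes it as
COMPOUND_TEXT. The names my-ctext-sample and my-ctext-decode are
hypothetical, and the two data bytes are assumed to be the
JIS X 0208 code point for the Cyrillic capital letter GHE. The
charset reported for the decoded character is likely to differ
between Emacs versions, so no particular result is claimed here.

```elisp
;; ESC $ ( B designates JIS X 0208 to G0.  The two bytes after it
;; (#x27 #x24) are assumed to be a row-7 Cyrillic letter.
(defvar my-ctext-sample
  (unibyte-string #x1b ?$ ?\( ?B #x27 #x24)
  "Hypothetical raw COMPOUND_TEXT bytes for one Cyrillic letter.")

(defun my-ctext-decode (bytes)
  "Decode BYTES as COMPOUND_TEXT.
Return a cons (TEXT . CHARSET), where CHARSET is the charset
Emacs reports for the first character of TEXT.  BYTES is
assumed to decode to a non-empty string."
  (let ((text (decode-coding-string bytes 'compound-text)))
    (cons text (char-charset (aref text 0)))))
```

Evaluating (my-ctext-decode my-ctext-sample) shows both the
decoded character and the charset it was assigned to.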
Next, try this:
First, select some Cyrillic text in galeon. Then evaluate
this in Emacs:
(decode-coding-string (x-get-selection 'PRIMARY 'UTF8_STRING) 'utf-8)
I think you'll see single-width Cyrillic chars (you need
an iso10646-1 font containing Cyrillic glyphs).
The selection problem is very deep. :-(
Ideally, the requester should be able to request the type
'TEXT instead of the specific 'COMPOUND_TEXT or
'UTF8_STRING, and the requestee should return the text in
the first of these types that can encode it: STRING,
COMPOUND_TEXT, or UTF8_STRING (in this priority order).
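The priority rule for a well-behaved requestee can be sketched as
below. This is only an illustration: my-choose-text-target is a
hypothetical helper, and the coding system paired with each target
(iso-latin-1 for STRING, compound-text, utf-8) is an assumption of
the sketch, not something taken from any client's source.

```elisp
(defun my-choose-text-target (text)
  "Return the first target type that can encode all of TEXT.
Targets are tried in priority order: STRING, COMPOUND_TEXT,
UTF8_STRING.  Return nil if none of them can encode TEXT.
Hypothetical helper; the coding system used for each target
is an assumption of this sketch."
  (catch 'found
    (dolist (candidate '((STRING . iso-latin-1)
                         (COMPOUND_TEXT . compound-text)
                         (UTF8_STRING . utf-8)))
      ;; `unencodable-char-position' returns nil when every
      ;; character of TEXT can be encoded by the coding system.
      (unless (unencodable-char-position
               0 (length text) (cdr candidate) nil text)
        (throw 'found (car candidate))))))
```

With this rule, plain ASCII text is answered as STRING, and a
richer type is chosen only when the simpler one cannot carry the
text.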
But, unfortunately, many X clients (requestees) don't
behave like that. If 'TEXT is requested, many return just
"?????" even if the text can be correctly encoded as
COMPOUND_TEXT or UTF8_STRING.
So Emacs has to request the specific type 'COMPOUND_TEXT
('UTF8_STRING was introduced only recently in XFree86, and
many clients still don't support it).
Recently, many gtk clients have started supporting
UTF8_STRING without improving their COMPOUND_TEXT support.
That may cause no problem between gtk clients, because they
request only the type UTF8_STRING. But it's too
shortsighted. :-(
The new encoding method using the "Non-Standard Character
Set Encodings" of COMPOUND_TEXT makes the Cyrillic case much
more complicated. In some cases (perhaps only in a KOI8
locale), X clients have recently started to encode Cyrillic
characters as "ESC % / 0 ...". They don't consider the
situation where the requester is running in a different
locale. :-(
Perhaps we should make Emacs request UTF8_STRING first if
the locale is UTF8, and request COMPOUND_TEXT if that
request fails.
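A minimal sketch of that fallback order, not the code Emacs
actually uses: my-selection-text is a hypothetical helper, the
locale check is left out, and it assumes (as in the experiment
above) that x-get-selection hands back undecoded bytes. The
optional FETCH argument is also an invention of this sketch, so
that the logic can be tried without a running X server.

```elisp
(defun my-selection-text (&optional fetch)
  "Return the PRIMARY selection as decoded text, or nil.
Try UTF8_STRING first, then fall back to COMPOUND_TEXT.
FETCH, if given, is a function of one argument (the target
symbol) that returns raw bytes or nil; by default the
selection is requested with `x-get-selection'.
Hypothetical helper, not Emacs's own code."
  (let ((fetch (or fetch
                   (lambda (target)
                     ;; Treat a failed request as \"no data\".
                     (condition-case nil
                         (x-get-selection 'PRIMARY target)
                       (error nil))))))
    (catch 'done
      (dolist (try '((UTF8_STRING . utf-8)
                     (COMPOUND_TEXT . compound-text)))
        (let ((raw (funcall fetch (car try))))
          (when raw
            ;; Decode with the coding system matching the target.
            (throw 'done (decode-coding-string raw (cdr try)))))))))
```

Called with no argument it queries the X selection; passing a
stub function for FETCH lets the fallback order be checked on its
own.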
---
Ken'ichi HANDA
handa@m17n.org