Gnus development mailing list
From: Kenichi Handa <handa@m17n.org>
Cc: emacs-devel@gnu.org
Subject: Re: MML charset tag regression
Date: Mon, 28 Apr 2003 20:58:34 +0900 (JST)
Message-ID: <200304281158.UAA10974@etlken.m17n.org>
In-Reply-To: <m3wuhhwn0l.fsf@lugabout.jhcloos.org>

In article <m3wuhhwn0l.fsf@lugabout.jhcloos.org>, "James H. Cloos Jr." <cloos@jhcloos.com> writes:
>>>>>>  "Simon" == Simon Josefsson <jas@extundo.com> writes:
Simon>  For me, when I yanked the string into emacs from galeon it
Simon>  becomes double-width.  It is single-width in galeon though.

> I also see that; any pasting of cyrillic text via pasting X's
> primary or from the clipboard.  The wide cyrillic is from the
> japanese-jisx0208 charset.
[...]

In article <iluu1clxymd.fsf@latte.josefsson.org>, Simon Josefsson <jas@extundo.com> writes:
> That may be interesting by itself.  Go to
> http://www.nns.ru/persons/gorbach.html using galeon (or mozilla, I
> think).  Cut'n'paste the first word and yank it in Emacs.  It looks as
> single-width in galeon, but when yanked into emacs it becomes double
> width. Yanking it into xterm or gnome-terminal doesn't change the
> string, it looks like single-width.  Save the HTML file and open it in
> emacs as a koi8 file (note that emacs doesn't auto detect it as koi8
> so you to do that manually), then it is single-width too.

> I guess it is the emacs X cut'n'paste code that somehow makes the
> string into double width japanese characters.

I don't think so.  There's no such code in Emacs that does
such a conversion.

I think galeon sends those Cyrillic characters to Emacs as
COMPOUND_TEXT, encoded in the JISX0208 charset.

Please try this:

First, select some Cyrillic text in galeon.  Then type this
in Emacs: C-x RET X raw-text RET C-y.  You'll see something
like this: "ESC $ ( B ...".
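
To make the check above less dependent on eyeballing the
buffer, here is a minimal sketch that classifies the escape
sequence at the start of the raw selection bytes.  The
function name is hypothetical and not part of Emacs.  The
"ESC $ ( B" and "ESC % / ..." patterns are the ones
discussed in this message; the "ESC - L" case (ISO 8859-5)
is an extra one added for comparison.

```elisp
;; Hypothetical helper for illustration; not part of Emacs.
(defun my-ctext-leading-designation (raw)
  "Return a symbol naming the escape sequence that starts RAW, or nil.
RAW is assumed to be the undecoded selection data as a string."
  (let ((case-fold-search nil))         ; final bytes are case-sensitive
    (cond
     ;; ESC $ ( B or ESC $ ) B: JIS X 0208 designated to G0 or G1.
     ((string-match "\\`\e\\$[()]B" raw) 'jisx0208)
     ;; ESC % / 0..4: extended segment ("Non-Standard Character
     ;; Set Encodings").
     ((string-match "\\`\e%/[0-4]" raw) 'extended-segment)
     ;; ESC - L: right half of ISO 8859-5 (added for comparison).
     ((string-match "\\`\e-L" raw) 'iso-8859-5)
     (t nil))))
```

If the result for text pasted from galeon is `jisx0208',
the double-width glyphs come from the data the client sent,
not from any conversion inside Emacs.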

Next, try this:

First, select some Cyrillic text in galeon.  Then evaluate
this in Emacs:
   (decode-coding-string (x-get-selection 'PRIMARY 'UTF8_STRING) 'utf-8)
I think you'll see single-width Cyrillic chars (you need
an iso10646-1 font containing Cyrillic glyphs).

The selection problem is very deep.  :-(

Ideally, the requester should be able to request the type
'TEXT instead of the specific 'COMPOUND_TEXT or
'UTF8_STRING, and the requestee should return the text in
the first of these types that can encode it: STRING,
COMPOUND_TEXT, or UTF8_STRING (in this priority order).

But, unfortunately, many X clients (requestees) don't behave
like that.  If 'TEXT is requested, many return just "?????"
even if the text can be correctly encoded by COMPOUND_TEXT
or UTF8_STRING.

So, Emacs has to request the specific type
'COMPOUND_TEXT ('UTF8_STRING was only recently introduced in
XFree86, and many clients still don't support it).

Recently, many gtk clients have started supporting
UTF8_STRING without improving their COMPOUND_TEXT support.
That may cause no problem between gtk clients, because they
request only the type UTF8_STRING.  But it is too
shortsighted an approach.  :-(

The new encoding method using "Non-Standard Character Set
Encodings" of COMPOUND_TEXT makes the Cyrillic case much
more complicated.  In some cases (perhaps only in a KOI8
locale), X clients have recently started to encode Cyrillic
characters as "ESC % / 0 ...".  They don't consider the
case where the requester is running in a different
locale.  :-(

Perhaps we should make Emacs request UTF8_STRING first
if the locale is UTF-8, and request COMPOUND_TEXT if that
request fails.
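
The fallback proposed above could look roughly like the
following sketch.  All three function names are
hypothetical, and the locale test (matching "UTF-8" in a
locale string such as the value of LANG) is an assumption
made for illustration; it is not how Emacs actually decides.
The fetch function is passed in as an argument so that the
ordering logic can be tried without an X connection.

```elisp
;; Hypothetical helpers for illustration; not part of Emacs.
(defun my-utf8-locale-p (locale)
  "Return t if the string LOCALE names a UTF-8 locale (assumed test)."
  (let ((case-fold-search t))
    (and (stringp locale)
         (string-match "utf-?8" locale)
         t)))

(defun my-selection-request-order (locale)
  "Return the list of selection types to request under LOCALE."
  (if (my-utf8-locale-p locale)
      '(UTF8_STRING COMPOUND_TEXT)
    '(COMPOUND_TEXT)))

(defun my-fetch-selection (fetch types)
  "Call FETCH with each type in TYPES until one yields a non-empty string.
Return (TYPE . DATA) for the first success, or nil if all fail.
An error signaled by FETCH counts as a failed request."
  (catch 'found
    (dolist (type types)
      (let ((data (condition-case nil
                      (funcall fetch type)
                    (error nil))))
        (when (and (stringp data) (> (length data) 0))
          (throw 'found (cons type data)))))
    nil))

;; Possible use under X (whether the returned data still needs
;; decoding depends on the Emacs version):
;;   (my-fetch-selection
;;    (lambda (type) (x-get-selection 'PRIMARY type))
;;    (my-selection-request-order (getenv "LANG")))
```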

---
Ken'ichi HANDA
handa@m17n.org
