From: Simon Josefsson
Newsgroups: gmane.emacs.gnus.general
Subject: Re: charset=macintosh
Date: Sun, 09 Mar 2003 12:48:30 +0100
Original-To: ding@gnus.org
References: <843clxud7u.fsf@lucy.is.informatik.uni-duisburg.de>
In-Reply-To: (Jesper Harder's message of "Sun, 09 Mar 2003 04:56:44 +0100")
User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 (gnu/linux)

Jesper Harder writes:

> Simon Josefsson writes:
>
>> But if what you are saying about UTF-8 clients being MIME capable is
>> true, and since UTF-8 is typically never preferred by current emacsen,
>> doesn't emacs' current guessing work as well as we can hope for?
>> Doesn't it detect among ISO-8859-X, ISO-2022 and Big5 properly?
>
> No.  I was hoping we could do something like this (for headers):
>
>   (let ((coding-systems (detect-coding-string string)))
>     (if (memq default coding-systems)
>         (decode-coding-string string default)
>       (decode-coding-string string (car coding-systems))))
>
> i.e. if the default coding system is valid for the string, then use
> that; otherwise use whatever Emacs thinks is the most likely coding
> system.  I think this would be ideal.

Yes.  (Except for the current UTF-8 bug, of course.)

> But unfortunately `detect-coding-string' _doesn't_ return a complete
> list of possible coding systems.  Consider this scenario:
>
> I'm using Emacs in a Latin-1 locale.  dk.* newsgroups work fine
> because latin-1 is the default.  But I also subscribe to, say, a few
> Korean newsgroups.  The entry in `gnus-group-charset-alist':
>
>   ("\\(^\\|:\\)han\\>" euc-kr)
>
> should take care of selecting the proper default charset.
> But *oops*, `detect-coding-string' doesn't think that euc-kr is a
> possible charset for a Korean string encoded in euc-kr:
>
>   (detect-coding-string (encode-coding-string "안녕" 'euc-kr))
>   => (iso-latin-1 iso-latin-1 raw-text japanese-shift-jis
>       chinese-big5 no-conversion)
>
> So the above approach would fail.

I wonder if this isn't another bug.  Maybe you could report it as a
bug?

>> 2) Users with emacs in UTF-8 prefer UTF-8 too often, even when the
>>    data is invalid UTF-8 and another encoding should be selected.
>>
>> The second situation is a bug, and I hope we can fix this.
>
> Yep, 2) is the most serious problem.  Especially because more and more
> people are (often unknowingly) using a UTF-8 locale because Redhat 8
> switched to UTF-8 by default.  Those people would experience Gnus as
> broken when reading hierarchies like dk.* or de.*.

Fortunately we can blame it on Emacs and point to the PROBLEMS entry.
Not so for Emacs 21.3 though, which has removed the entry although the
behaviour is the same.  But this should be fixed.
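
The "use the default if it is valid, otherwise fall back" idea from the
thread can be illustrated outside Emacs.  What follows is only a rough
sketch in Python, not Gnus or Emacs code: the function name
`decode_with_default' and its `fallbacks' parameter are made up for
this example, and Python's strict decoding stands in for whatever
validity check Emacs would apply.  The byte string is the euc-kr
example quoted in the message.

```python
def decode_with_default(data, default, fallbacks=("utf-8", "iso-8859-1")):
    """Return (text, charset), trying `default` first, then each fallback.

    Every candidate is decoded strictly, so a charset is accepted only
    when the bytes are actually valid in it.
    """
    for charset in (default, *fallbacks):
        try:
            return data.decode(charset, errors="strict"), charset
        except UnicodeDecodeError:
            continue
    # No candidate fits: show replacement characters rather than guess.
    return data.decode("ascii", errors="replace"), "ascii"


# The euc-kr bytes from the example in the message (=BE=C8=B3=E7).
korean = b"\xbe\xc8\xb3\xe7"

# Group default is euc-kr and the bytes are valid euc-kr, so it is used.
print(decode_with_default(korean, "euc-kr"))

# With a UTF-8 default the bytes are rejected as invalid UTF-8 and the
# next candidate is tried.  Latin-1 accepts every byte sequence, so it
# "succeeds" with the wrong text, which is why the per-group default
# has to be consulted before any generic guess.
print(decode_with_default(korean, "utf-8", fallbacks=("iso-8859-1",)))
```

The second call shows both problems discussed above in miniature: a
UTF-8 default must give way when the data is invalid UTF-8, and a
catch-all such as Latin-1 can never be ruled out by validity alone.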