From: Jesper Harder
Newsgroups: gmane.emacs.gnus.general
Subject: Re: charset=macintosh
Date: Sun, 09 Mar 2003 04:56:44 +0100
Organization: http://purl.org/harder/
References: <843clxud7u.fsf@lucy.is.informatik.uni-duisburg.de>

> But what if you are saying about UTF-8 clients being MIME capable is
> true, and since UTF-8 is typically never preferred by current emacsen,
> doesn't emacs' current guessing works the best we can hope for?
> Doesn't it detect among ISO-8859-X, ISO-2022 and Big5 properly?

No.  I was hoping we could do something like this (for headers):

    (let ((coding-systems (detect-coding-string string)))
      (if (memq default coding-systems)
          (decode-coding-string string default)
        (decode-coding-string string (car coding-systems))))

i.e. if the default coding system is valid for the string, then use
that; otherwise use whatever Emacs thinks is the most likely coding
system.

I think this would be ideal.  But unfortunately `detect-coding-string'
_doesn't_ return a complete list of possible coding systems.

Consider this scenario:

I'm using Emacs in a Latin-1 locale.  dk.* newsgroups work fine
because latin-1 is the default.

But I also subscribe to, say, a few Korean newsgroups.  The entry in
`gnus-group-charset-alist':

    ("\\(^\\|:\\)han\\>" euc-kr)

should take care of selecting the proper default charset.

But *oops*, `detect-coding-string' doesn't think that euc-kr is a
possible charset for a Korean string encoded in euc-kr:

    (detect-coding-string (encode-coding-string "안녕" 'euc-kr))
    => (iso-latin-1 iso-latin-1 raw-text japanese-shift-jis chinese-big5
        no-conversion)

So the above approach would fail.

> 2) Users with emacs in UTF-8 prefers UTF-8 too often, even when the
>    data is invalid UTF-8 and another encoding should be selected.
>
> The second situation is a bug, and I hope we can fix this.

Yep, 2) is the most serious problem.
Especially since more and more people are (often unknowingly) using a
UTF-8 locale now that Red Hat 8 has switched to UTF-8 by default.
Those people would experience Gnus as broken when reading hierarchies
like dk.* or de.*.
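
One way around the incomplete list from `detect-coding-string' might be to test the default coding system directly: decode with it and see whether any bytes were left undecoded. What follows is only a minimal sketch of that idea, not what Gnus does. It assumes a Unicode-based Emacs (23 or later), where bytes that are invalid in the given coding system come back as characters in the `eight-bit' charset; the Emacs 21 of this thread behaves differently. The `my-' function names are hypothetical.

```elisp
;; Hypothetical helpers; the `my-' names are not part of Emacs or Gnus.
;; Assumption: Emacs 23 or later, where undecodable bytes are kept as
;; raw-byte characters belonging to the `eight-bit' charset.

(defun my-valid-in-coding-system-p (bytes coding-system)
  "Return non-nil if unibyte string BYTES decodes cleanly as CODING-SYSTEM.
\"Cleanly\" means the decoded text contains no raw `eight-bit' characters."
  (not (memq 'eight-bit
             (find-charset-string
              (decode-coding-string bytes coding-system)))))

(defun my-decode-header (bytes default)
  "Decode BYTES with DEFAULT if valid, else with Emacs's most likely guess."
  (if (my-valid-in-coding-system-p bytes default)
      ;; The default (e.g. from the group's charset entry) fits: use it.
      (decode-coding-string bytes default)
    ;; Otherwise fall back to the first coding system Emacs detects.
    (decode-coding-string bytes (car (detect-coding-string bytes)))))
```

With this, the Korean example would be decoded as euc-kr when euc-kr is the group's default, and the same bytes would be rejected as invalid under a utf-8 default, which is the case 2) above. Note that the check cannot discriminate for a single-byte default such as latin-1, where every byte sequence is valid, so there the default always wins.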