From mboxrd@z Thu Jan 1 00:00:00 1970
From: Simon Josefsson
Newsgroups: gmane.emacs.gnus.general
Subject: Re: charset=macintosh
Date: Sat, 08 Mar 2003 21:17:12 +0100
References: <843clxud7u.fsf@lucy.is.informatik.uni-duisburg.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Original-To: ding@gnus.org
Mail-Copies-To: nobody
In-Reply-To: (Jesper Harder's message of "Sat, 08 Mar 2003 20:47:34 +0100")
User-Agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/21.3.50 (gnu/linux)

Jesper Harder writes:

> kai.grossjohann@uni-duisburg.de (Kai Großjohann) writes:
>
>> Jesper Harder writes:
>>
>>> Just because you're using UTF-8, mac-roman, EBCDIC or whatever on
>>> your local system doesn't mean that it's a good guess that you'll
>>> receive mail and news using this charset.
>>
>> I think this guess is better than no guess at all.
>
> Well, I think experience shows that UTF-8 isn't a good guess.  It's a
> bad guess for dk.*, and as Karl says also for de.*.
>
> Previously `gnus-group-charset-alist' had (".*" iso-8859-1) as the last
> entry -- this was probably better for the majority of users.
>
> I haven't got any statistics to back it up, but I suspect that clients
> which send UTF-8 are probably newer, and thus more likely to include a
> proper MIME charset declaration.

Seems reasonable.

> So, I think it's much more likely that the charset of a message with an
> undeclared charset is iso-8859-x rather than UTF-8.

Or ISO-2022.  Or Big5.  The problem with choosing ISO-8859-X is that
it is Western-centric, and would probably rightly be considered
offensive.

But if what you are saying about UTF-8 clients being MIME capable is
true, and since UTF-8 is typically never preferred by current emacsen,
doesn't emacs' current guessing work as well as we can hope for?
Doesn't it detect among ISO-8859-X, ISO-2022 and Big5 properly?  I
think it does.

So there is only a problem where:

1) UTF-8 is used without MIME tagging, since emacs never guesses this
   correctly, but you argue (and I agree) that this case is unlikely.
2) For users running emacs in UTF-8, emacs prefers UTF-8 too often,
   even when the data is invalid UTF-8 and another encoding should be
   selected.

The second situation is a bug, and I hope we can fix this.  Maybe I
missed something.
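
To make situation 2 concrete, here is a rough sketch in Python (not
Emacs Lisp, and not how emacs or Gnus actually does its detection) of
what I would expect for a message with no declared charset: accept
UTF-8 only if the bytes are strictly valid UTF-8, otherwise move on to
another candidate.  The function name `guess_decode' and the fallback
list are made up for illustration, and it makes no attempt to detect
ISO-2022 or Big5.

```python
def guess_decode(raw, fallbacks=("iso-8859-1",)):
    """Decode bytes that carry no declared charset.

    Hypothetical helper: UTF-8 is accepted only when the data is
    strictly valid UTF-8; otherwise each fallback charset is tried
    in order.  Returns a (text, charset_name) pair.
    """
    try:
        # Strict decoding raises on any invalid UTF-8 byte sequence.
        return raw.decode("utf-8", errors="strict"), "utf-8"
    except UnicodeDecodeError:
        pass

    for name in fallbacks:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue

    # Nothing matched cleanly: decode lossily rather than fail.
    return raw.decode("utf-8", errors="replace"), "utf-8 (lossy)"


if __name__ == "__main__":
    latin1_bytes = "Großjohann".encode("iso-8859-1")
    utf8_bytes = "Großjohann".encode("utf-8")
    print(guess_decode(latin1_bytes))
    print(guess_decode(utf8_bytes))
```

Note that iso-8859-1 maps every byte value, so as a fallback it never
fails; it only makes sense as the last entry, much like the old
(".*" iso-8859-1) catch-all.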