From: Simon Josefsson
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Gnus: UTF-8 and compatibility with other MUAs
Date: Fri, 15 Aug 2003 01:05:04 +0200
To: Xavier Maillard
Cc: ding
In-Reply-To: (Xavier Maillard's message of "Thu, 14 Aug 2003 17:48:40 +0200")
User-Agent: Gnus/5.1003 (Gnus v5.10.3) Emacs/21.3.50 (gnu/linux)

Xavier Maillard writes:

> Hi,
>
> I know Emacs is able to use utf-8 encoding so Gnus is.
>
> My question is more a question of compliance with other MUAs.
> Would you recommend your users to use utf-8 as a default encoding
> system ? AFAIK, I can't see many MUAs aware of it and worst almost
> nobody is using utf-8 which was presented as the future. So what is the
> problem with utf in general that prevent users in general to use it
> defaultly ?

IMHO:

Users should use the oldest charset widely deployed, or preferred, in
their own geographic region that is able to encode what they write.

This means that if a user writes only ASCII, the message is tagged as
ASCII (or rather not tagged at all).

And if a (northern?) European user writes å, the message should use
iso-8859-1.

And if a European user writes Ελληνικά, it should use iso-8859-7.

And if a European user writes €, it should use iso-8859-15. (One could
argue that iso-8859-15 is too recent and that it may make sense to go
directly to UTF-8, but my experience, as a northern European user, is
that iso-8859-15 is more appropriate, since the almost-compatibility
with iso-8859-1 is friendlier for people with old software.)

And if a European user writes both € and ά, it should use UTF-8. (I'm
assuming no 8859-* can encode both € and ά.)
This also means that it is wrong for European users to use
ISO-2022-JP-2, even though it may technically be able to encode some
strings containing characters from 8859-* that aren't all available in
any single 8859-*. Instead they should go to UTF-8.

I think this is how Gnus works though, unless you are in a UTF-8 locale
and use an old Emacs (then I think it will skip the 8859-* step, but I
might be wrong).

This logic might be flawed if the receiver is in another geographic
region, or if a user mostly communicates internationally. Still, I'd
probably use the above logic even if I sent something to a Japanese
user, and expect them to use ISO-2022-JP-2 (or whatever) in return.

Perhaps some day we can try ASCII first, then fall back to UTF-8. But
that will take a long time. Even moving to ISO-8859-1 in northern
Europe took a long time, and still isn't finished. I still use IBMPC2
(CP437?) in some regional communication channels.
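
The rule described above (take the first charset in a regional
preference list that can encode the whole text, with UTF-8 as the last
resort) can be sketched roughly as follows. This is only an
illustration in Python, not how Gnus implements it: the function name
`pick_charset` and the list `EUROPEAN_PREFERENCE` are made up for the
example, and the relative order of iso-8859-15 and iso-8859-7 in the
list is an assumption.

```python
# Hypothetical preference list for a (northern) European sender.
# The names and the exact order are assumptions for illustration only;
# this is not Gnus's actual configuration.
EUROPEAN_PREFERENCE = (
    "us-ascii",
    "iso-8859-1",
    "iso-8859-15",
    "iso-8859-7",
    "utf-8",
)


def pick_charset(text, preference=EUROPEAN_PREFERENCE):
    """Return the first charset in `preference` that can encode `text`."""
    for charset in preference:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            # This charset lacks at least one character; try the next one.
            continue
        return charset
    raise ValueError("no charset in the preference list can encode the text")
```

With this list, `pick_charset("hello")` gives `us-ascii`,
`pick_charset("å")` gives `iso-8859-1`, and text mixing scripts that no
single 8859-* entry covers falls through to `utf-8`. The "ASCII first,
then UTF-8" future mentioned above would simply be
`pick_charset(text, ("us-ascii", "utf-8"))`.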