From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/53717
Path: main.gmane.org!not-for-mail
From: Simon Josefsson <jas@extundo.com>
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Gnus: UTF-8 and compatibility with other MUAs
Date: Fri, 15 Aug 2003 01:05:04 +0200
Sender: ding-owner@lists.math.uh.edu
Message-ID: <iluptj7g7rz.fsf@latte.josefsson.org>
References: <plop87brus6y07.fsf@gnu-rox.org>
NNTP-Posting-Host: deer.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
X-Trace: sea.gmane.org 1060902433 6869 80.91.224.253 (14 Aug 2003 23:07:13 GMT)
X-Complaints-To: usenet@sea.gmane.org
NNTP-Posting-Date: Thu, 14 Aug 2003 23:07:13 +0000 (UTC)
Cc: ding <ding@gnus.org>
Original-X-From: ding-owner+M2261@lists.math.uh.edu Fri Aug 15 01:07:11 2003
Return-path: <ding-owner+M2261@lists.math.uh.edu>
Original-Received: from malifon.math.uh.edu ([129.7.128.13])
	by deer.gmane.org with esmtp (Exim 3.35 #1 (Debian))
	id 19nRBX-0008A7-00
	for <ding-account@gmane.org>; Fri, 15 Aug 2003 01:07:11 +0200
Original-Received: from localhost ([127.0.0.1] helo=lists.math.uh.edu)
	by malifon.math.uh.edu with smtp (Exim 3.20 #1)
	id 19nR9f-0007T1-00; Thu, 14 Aug 2003 18:05:15 -0500
Original-Received: from sclp3.sclp.com ([64.157.176.121])
	by malifon.math.uh.edu with smtp (Exim 3.20 #1)
	id 19nR9b-0007Sw-00
	for ding@lists.math.uh.edu; Thu, 14 Aug 2003 18:05:11 -0500
Original-Received: (qmail 5395 invoked by alias); 14 Aug 2003 23:05:11 -0000
Original-Received: (qmail 5390 invoked from network); 14 Aug 2003 23:05:10 -0000
Original-Received: from 178.230.13.217.in-addr.dgcsystems.net (HELO yxa.extundo.com) (217.13.230.178)
  by sclp3.sclp.com with SMTP; 14 Aug 2003 23:05:10 -0000
Original-Received: from latte.josefsson.org (yxa.extundo.com [217.13.230.178])
	(authenticated bits=0)
	by yxa.extundo.com (8.12.9/8.12.9) with ESMTP id h7EN55dk021139
	(version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK);
	Fri, 15 Aug 2003 01:05:05 +0200
Original-To: Xavier Maillard <zedek@gnu-rox.org>
Mail-Copies-To: nobody
X-Payment: hashcash 1.2 0:030814:zedek@gnu-rox.org:4ebe844920be1a20
X-Hashcash: 0:030814:zedek@gnu-rox.org:4ebe844920be1a20
X-Payment: hashcash 1.2 0:030814:ding@gnus.org:a22101d1bfc36945
X-Hashcash: 0:030814:ding@gnus.org:a22101d1bfc36945
In-Reply-To: <plop87brus6y07.fsf@gnu-rox.org> (Xavier Maillard's message of
 "Thu, 14 Aug 2003 17:48:40 +0200")
User-Agent: Gnus/5.1003 (Gnus v5.10.3) Emacs/21.3.50 (gnu/linux)
Precedence: bulk
Xref: main.gmane.org gmane.emacs.gnus.general:53717
X-Report-Spam: http://spam.gmane.org/gmane.emacs.gnus.general:53717

Xavier Maillard <zedek@gnu-rox.org> writes:

> Hi,
>
> I know Emacs is able to use utf-8 encoding, and so is Gnus.
>
> My question is more about compatibility with other MUAs.
> Would you recommend that your users use utf-8 as their default
> encoding? AFAIK, few MUAs are aware of it, and worse, almost
> nobody is using utf-8, which was presented as the future. So what is
> the problem with UTF in general that prevents users from using it
> by default?

IMHO:

Users should use the oldest charset widely deployed, or preferred, in
their own geographic region that is able to encode what they write.

This means that if a user writes only ASCII, the message is tagged as
ASCII (or rather not tagged at all).

And if a (northern?) European user writes å, they should use iso-8859-1.

And if a European user writes Ελληνικά, they should use iso-8859-7.

And if a European user writes €, they should use iso-8859-15.  (One could
argue that iso-8859-15 is too recent and that it may make sense to go
directly to UTF-8, but my experience, as a northern European user, is
that iso-8859-15 is more appropriate, since its near-compatibility
with iso-8859-1 is friendlier for people with old software.)

And if a European user writes both € and ά, they should use UTF-8.  (I'm
assuming no 8859-* can encode both € and ά.)

This also means that it is wrong for European users to use
ISO-2022-JP-2, even though it may technically be able to encode some
strings containing characters that aren't available in any single
8859-*.  Instead they should go to UTF-8.
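
To make the fallback order above concrete, here is a rough sketch in
Python (not Gnus code, and not how Gnus implements it).  The candidate
list and the name pick_charset are my own assumptions for illustration;
a real list would depend on the sender's region.  It simply tries each
charset in order and returns the first one that can encode the text.

```python
# Hypothetical preference order for a European sender (an assumption,
# not taken from Gnus); older and regionally common charsets come first.
CANDIDATES = ["us-ascii", "iso-8859-1", "iso-8859-7", "iso-8859-15", "utf-8"]


def pick_charset(text, candidates=CANDIDATES):
    """Return the first charset in `candidates` that can encode `text`."""
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            # This charset lacks at least one character; try the next one.
            continue
        return charset
    raise ValueError("no candidate charset can encode the text")


if __name__ == "__main__":
    for sample in ["plain text", "å", "Ελληνικά", "å and €", "å and ά"]:
        print(repr(sample), "->", pick_charset(sample))
```

Because utf-8 is last in the list, it is only chosen when no single
older charset covers every character in the message.  Note that the
result depends on the codec tables of whatever implementation is used,
which may not match what a recipient's old software supports.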

I think this is how Gnus works, though, unless you are in a UTF-8
locale and use an old Emacs (then I think it will skip the 8859-*
step, but I might be wrong).

This logic might be flawed if the receiver is in another geographic
region, or if a user mostly communicates internationally.  Still, I'd
probably use the above logic even if I sent something to a Japanese
user, and expect them to use ISO-2022-JP-2 (or whatever) in return.

Perhaps some day we can try ASCII first, then fall back to UTF-8.  But
that will take a long time.  Even moving to ISO-8859-1 in northern
Europe took a long time, and still isn't finished.  I still use IBMPC2
(CP437?) in some regional communication channels.