From mboxrd@z Thu Jan  1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/67666
Path: news.gmane.org!not-for-mail
From: "Stephen J. Turnbull" <stephen@xemacs.org>
Newsgroups: gmane.emacs.devel,gmane.emacs.gnus.general
Subject: Re: gnus should accept UTF8 even if UTF-8 is standard
Date: Wed, 22 Oct 2008 11:34:17 +0900
Message-ID: <87fxmptl1y.fsf@xemacs.org>
References: <jwvabd5s1qc.fsf-monnier+emacs@gnu.org>
	<E1KqTpJ-0001iY-1J@fencepost.gnu.org>
	<jwvvdvsk0od.fsf-monnier+emacs@gnu.org>
	<E1KqvTk-0006Yw-8l@fencepost.gnu.org>
	<jwvd4hxpw32.fsf-monnier+emacs@gnu.org>
	<E1KrjLV-0001mU-Mm@fencepost.gnu.org>
	<jwvwsg4vtrn.fsf-monnier+emacs@gnu.org>
	<E1KryB0-0002AM-2x@fencepost.gnu.org> <87wsg2tvcn.fsf@xemacs.org>
	<buoljwicyhg.fsf@dhapc248.dev.necel.com>
	<20081021062510.GB22593@tomas>
	<buo4p36cvtm.fsf@dhapc248.dev.necel.com> <uprlugy8g.fsf@gnu.org>
	<87prlutizh.fsf@xemacs.org> <uhc76gscp.fsf@gnu.org>
	<87k5c2tan9.fsf@xemacs.org> <uabcyglzw.fsf@gnu.org>
NNTP-Posting-Host: lo.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: ger.gmane.org 1224642572 20930 80.91.229.12 (22 Oct 2008 02:29:32 GMT)
X-Complaints-To: usenet@ger.gmane.org
NNTP-Posting-Date: Wed, 22 Oct 2008 02:29:32 +0000 (UTC)
Cc: rms@gnu.org, ding@gnus.org, emacs-devel@gnu.org, tomas@tuxteam.de,
	monnier@iro.umontreal.ca, miles@gnu.org
To: Eli Zaretskii <eliz@gnu.org>
Original-X-From: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org Wed Oct 22 04:30:32 2008
Return-path: <emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org>
Envelope-to: ged-emacs-devel@m.gmane.org
Original-Received: from lists.gnu.org ([199.232.76.165])
	by lo.gmane.org with esmtp (Exim 4.50)
	id 1KsTUV-0008JR-1h
	for ged-emacs-devel@m.gmane.org; Wed, 22 Oct 2008 04:30:31 +0200
Original-Received: from localhost ([127.0.0.1]:51622 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43)
	id 1KsTTO-0001zE-MP
	for ged-emacs-devel@m.gmane.org; Tue, 21 Oct 2008 22:29:22 -0400
Original-Received: from mailman by lists.gnu.org with tmda-scanned (Exim 4.43)
	id 1KsTTL-0001z7-Fh
	for emacs-devel@gnu.org; Tue, 21 Oct 2008 22:29:19 -0400
Original-Received: from exim by lists.gnu.org with spam-scanned (Exim 4.43)
	id 1KsTTJ-0001yn-2l
	for emacs-devel@gnu.org; Tue, 21 Oct 2008 22:29:18 -0400
Original-Received: from [199.232.76.173] (port=37992 helo=monty-python.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.43) id 1KsTTJ-0001yk-0j
	for emacs-devel@gnu.org; Tue, 21 Oct 2008 22:29:17 -0400
Original-Received: from mtps02.sk.tsukuba.ac.jp ([130.158.97.224]:41656)
	by monty-python.gnu.org with esmtp (Exim 4.60)
	(envelope-from <stephen@xemacs.org>)
	id 1KsTT6-000743-VQ; Tue, 21 Oct 2008 22:29:06 -0400
Original-Received: from uwakimon.sk.tsukuba.ac.jp (uwakimon.sk.tsukuba.ac.jp
	[130.158.99.156])
	by mtps02.sk.tsukuba.ac.jp (Postfix) with ESMTP id 519A1800E;
	Wed, 22 Oct 2008 11:29:01 +0900 (JST)
Original-Received: by uwakimon.sk.tsukuba.ac.jp (Postfix, from userid 1000)
	id C528C1A26AE; Wed, 22 Oct 2008 11:34:17 +0900 (JST)
In-Reply-To: <uabcyglzw.fsf@gnu.org>
X-Mailer: VM 8.0.12-devo-585 under 21.5 (beta28) "fuki" 83e35df20028+ XEmacs
	Lucid (x86_64-unknown-linux)
X-detected-operating-system: by monty-python.gnu.org: GNU/Linux 2.6,
	seldom 2.4 (older, 4)
X-BeenThere: emacs-devel@gnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Emacs development discussions." <emacs-devel.gnu.org>
List-Unsubscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/pipermail/emacs-devel>
List-Post: <mailto:emacs-devel@gnu.org>
List-Help: <mailto:emacs-devel-request@gnu.org?subject=help>
List-Subscribe: <http://lists.gnu.org/mailman/listinfo/emacs-devel>,
	<mailto:emacs-devel-request@gnu.org?subject=subscribe>
Original-Sender: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Errors-To: emacs-devel-bounces+ged-emacs-devel=m.gmane.org@gnu.org
Xref: news.gmane.org gmane.emacs.devel:104807 gmane.emacs.gnus.general:67666
Archived-At: <http://permalink.gmane.org/gmane.emacs.gnus.general/67666>

Eli Zaretskii writes:

 > > > Perhaps something like `canonicalize-coding-system-name' would be good.
 > > 
 > > That implies that the return value would be a string, not the coding
 > > system itself.  I suggest we return the coding system (or nil), not
 > > just the name.
 > 
 > What I meant is that, instead of returning a _string_, which is the
 > name of a coding system, it is better to return a _symbol_ of that
 > coding system.

Of course.  My point is that the symbol is the name, and therefore
"canonicalize-coding-system-name" is a reasonable name for this
function.

If it weren't for the conflict with XEmacs, which still needs
`get-coding-system' to return a coding system object, I'd be perfectly
happy using that.

 > > AIUI, the point of the function is to guess what people who don't
 > > know what they're doing are trying to express (and to provide some
 > > interactive convenience to people who do know what they're doing).
 > 
 > Agreed, but in most cases the argument will be a valid MIME charset.

Except when Richard<wink> is typing, and surely we all consider that
an important use case?  Aside from Richard's expressed preference for
a harmless convenience, the presence or absence of one or more hyphens
is something the various standards disagree about:

 > The case of "UTF8" is an exception.

Well, no, I think it is not.  AFAIK only one of "iso-8859-1" and
"iso8859-1" is registered, but Emacs uses the former exclusively, and
X11 only the latter (in XLFDs).  Both are acceptable to iconv.  (And
the ISO standards actually use "ISO 8859/1" which isn't even
acceptable to glibc iconv!)
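
As a rough sketch of the kind of hyphen-insensitive lookup under
discussion (Python purely for illustration; the name table and the
`squash' helper are hypothetical and are not anything in Emacs,
XEmacs, or iconv), one could compare names after discarding case and
punctuation:

```python
import re

# Hypothetical table of canonical names; a real one would be far larger.
CANONICAL_NAMES = ["utf-8", "iso-8859-1", "koi8-r"]


def squash(name):
    """Lower-case NAME and drop everything that is not a letter or digit."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


# Index the canonical names by their squashed form.
_BY_SQUASHED = {squash(name): name for name in CANONICAL_NAMES}


def canonicalize_coding_system_name(name):
    """Return the canonical spelling of NAME, or None if it is unknown."""
    return _BY_SQUASHED.get(squash(name))
```

With this table, "UTF8", "iso8859-1", "ISO 8859/1" and "KOI-8R" all
map to their canonical spellings, and an unknown name gives None.  A
real implementation would have to check that no two distinct coding
systems squash to the same key.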

 > And even in this exceptional case, I understand that "UTF8" came
 > from some charset= header.  That is why I suggested
 > coding-system-for-charset.

Well, the MIME nomenclature is seriously broken.  A substantial
minority of the things it denotes "charsets" are not "character sets"
in any sense.

 > I don't mind coding-system-for-mime-charset, either, if that was
 > your point.

That's the worst of several suggestions, as this mapping is not
limited to MIME charsets, but is useful for coding systems in general:
the usage of hyphens in their names has neither rhyme nor reason.  Is
it "KOI8-R" or "KOI-8R"?  That one confused me, at least, for a while.

 > (In Emacs 23+, the original Mule meaning of "charset" will fade
 > out.)

That would be sad.  While I agree that UTF-8 will fairly quickly
become universal for current text documents, I don't expect the vast
amount of legacy archives to be converted any time soon (some will be
converted at the time of converting to new media, but, human beings
being what they are, I expect that for a couple of centuries some
bureaucrats will just make bit-level copies ;-).  Emacs should be the
premier application for reading those!