From: Oliver Scholz
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Gnus: UTF-8 and compatibility with other MUAs
Date: Fri, 15 Aug 2003 15:50:17 +0200

Jesper Harder writes:

[...]
> To use UTF-8 by default would also be against RFC 2046:
>
> ,----[ RFC 2046, Section 4.1.2. ]
> |
> | In general, composition software should always use the "lowest common
> | denominator" character set possible. For example, if a body contains
> | only US-ASCII characters, it SHOULD be marked as being in the US-
> | ASCII character set, not ISO-8859-1, which, like all the ISO-8859
> | family of character sets, is a superset of US-ASCII. More generally,
> | if a widely-used character set is a subset of another character set,
> | and a body contains only characters in the widely-used subset, it
> | should be labelled as being in that subset. This will increase the
> | chances that the recipient will be able to view the resulting entity
> | correctly.
> `----
[...]

That's not how I read the section you quoted. In my reading this means
that you should not declare the message to be in UTF-8, when it
contains only ASCII characters.

For characters from the right hand part of ISO 8859-1 this is not so
simple: Latin-1 (as a coded character set) may be a subset of UCS. But
Latin-1 (as a character encoding scheme) is _not_ a subset of UTF-8.

The lowest common denominator for most German text is ISO 646-DE. For
most Danish text (I presume) ISO 646-DK. Virtually nobody uses those
coding systems anymore, and IMNSHO nobody should use them.

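To make the distinction between Latin-1 as a coded character set and Latin-1 as a character encoding scheme concrete, here is a minimal sketch in Python (not part of the original message; the helper name `is_valid_utf8` is made up for illustration). It compares the code point of one character with its byte representation under both encodings.

```python
# Illustration only: compare the code point (coded character set) with
# the encoded bytes (character encoding scheme) for one character.
text = "ü"  # U+00FC, from the right-hand part of ISO 8859-1

code_point = ord(text)
latin1_bytes = text.encode("iso-8859-1")
utf8_bytes = text.encode("utf-8")

# As a coded character set, Latin-1 is a subset of UCS: the Latin-1
# byte value and the UCS code point are the same number.
same_code_point = (code_point == latin1_bytes[0])


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# As an encoding scheme, Latin-1 is not a subset of UTF-8: the byte
# sequences differ, and the single Latin-1 byte is not valid UTF-8.
print(hex(code_point), latin1_bytes, utf8_bytes)
print(same_code_point, is_valid_utf8(latin1_bytes))

# For pure ASCII text the two encodings do agree byte for byte.
print("abc".encode("iso-8859-1") == "abc".encode("utf-8"))
```

So a recipient that only understands UTF-8 cannot read the Latin-1 bytes, even though every Latin-1 character exists in UCS.
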
(I have implemented ISO 646-DE for GNU Emacs in a way that it could be
easily extended to other national variants of ISO 646, in case you are
interested ...)

Sure, one could say that the national variants of ISO 646 are excluded
by the phrase “widely-used character sets”, but that is a bit too
fuzzy for my taste. Taken literally nobody should use ISO 8859-15
then, unless the message really contains an € (or one of the other 7
characters). Maybe this is what this section wants to say, but then I
dare say that it doesn't make much sense as a technical rule and I am
glad that it is not stated in a way that makes it mandatory.

    Oliver
-- 
28 Thermidor an 211 de la Révolution
Liberté, Egalité, Fraternité!
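
For illustration, a literal reading of the "lowest common denominator" rule could be sketched in Python as follows. This is not how Gnus or any particular MUA is claimed to work; the function name `pick_charset` and the candidate order are assumptions made up for this example, and RFC 2046 does not prescribe any such list.

```python
# Hypothetical preference order, from most restricted to most general.
# The RFC only speaks of "widely-used" character sets; this list is an
# assumption for the sake of the example.
CANDIDATE_CHARSETS = ("us-ascii", "iso-8859-1", "iso-8859-15", "utf-8")


def pick_charset(body: str, candidates=CANDIDATE_CHARSETS) -> str:
    """Return the first candidate charset that can encode the whole body."""
    for charset in candidates:
        try:
            body.encode(charset)
        except UnicodeEncodeError:
            continue  # body contains a character outside this charset
        return charset
    raise ValueError("no candidate charset can encode this body")


print(pick_charset("plain text"))   # only ASCII characters
print(pick_charset("Grüße"))        # needs the right-hand part of Latin-1
print(pick_charset("Preis: 5 €"))   # the euro sign is not in ISO 8859-1
print(pick_charset("Ελληνικά"))     # outside all the 8-bit candidates
```

Under this reading, ISO 8859-15 is chosen only when the body actually contains the € or one of the other characters missing from ISO 8859-1, which is exactly the consequence questioned above.
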