From: Steinar Bang
Newsgroups: gmane.emacs.gnus.general
Subject: UTF-8 (Was: More charset things)
Date: 14 Feb 1999 19:10:48 +0100
References: <87d83qkyjf.fsf@pc-hrvoje.srce.hr> <87ognahyoh.fsf@pc-hrvoje.srce.hr>
Original-To: ding@gnus.org
In-Reply-To: Steinar Bang's message of "09 Feb 1999 09:05:56 +0100"
User-Agent: Gnus/5.070065 (Pterodactyl Gnus v0.65) XEmacs/20.4 (Emerald)

>>>>> Steinar Bang :
>>>>> François Pinard :

>> That is why Lars could well decide, one of these days, to support
>> UTF-8 as an encoding (which it really is) on the same level as
>> Base64, and moreover, rather fun to implement. It might be
>> convenient that Gnus do so as a contribution to the Unicode effort,
>> without really waiting for Emacs to do it.

One reason to support UTF-8 decoding and encoding is that
son-of-son-of-1036 (or whatchamacallit)
	http://www.ietf.org/internet-drafts/draft-ietf-usefor-article-01.txt
seems to recommend UTF-8 for both the headers and bodies of news
messages.

Hm... the way this would work is probably to always attempt UTF-8
decoding of a news message first, and then fall back to a locale- or
newsgroup-specific setting if the UTF-8 decoding breaks down (for
example, the iso-8859-1 charset in the case of the no.* hierarchy).

UTF-8 encoding should probably not be made the default for a while
yet. At least it should be made dependent on the newsgroup hierarchy.

> But isn't UTF-8 support something that really should be done at the C
> level (like base64 is done in newer emacsen)? Or am I thinking of
> UTF-7 here...? (does anyone have some handy online references?)

UTF-8 is defined in RFC 2279
	ftp://ftp.ntnu.no/pub/rfc/rfc2279.txt
UTF-7 is defined in RFC 2152
	ftp://ftp.ntnu.no/pub/rfc/rfc2152.txt

Both would probably be best off with decoding done in C.
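
For illustration only, the "try UTF-8 first, then fall back per
hierarchy" idea above could be sketched roughly as follows. This is
Python rather than Emacs Lisp or C, and it is not how Gnus does (or
would do) it. The names FALLBACK_CHARSETS, DEFAULT_FALLBACK,
fallback_charset and decode_article are hypothetical; only the
no.* -> iso-8859-1 pairing comes from the message, and the default
fallback is an assumption.

```python
# Hypothetical table: newsgroup hierarchy prefix -> fallback charset.
# Only the "no." entry is taken from the message above.
FALLBACK_CHARSETS = {"no.": "iso-8859-1"}

# Assumption: charset used when no hierarchy-specific entry matches.
DEFAULT_FALLBACK = "iso-8859-1"


def fallback_charset(newsgroup):
    """Return the fallback charset configured for a newsgroup's hierarchy."""
    for prefix, charset in FALLBACK_CHARSETS.items():
        if newsgroup.startswith(prefix):
            return charset
    return DEFAULT_FALLBACK


def decode_article(raw, newsgroup):
    """Decode raw article bytes; return (text, charset actually used).

    Strict UTF-8 decoding is attempted first. If the bytes are not
    well-formed UTF-8, the hierarchy-specific charset is used instead.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        charset = fallback_charset(newsgroup)
        return raw.decode(charset), charset
```

Calling decode_article(raw_bytes, "no.test") returns the decoded text
together with the charset that was actually used. The approach leans
on the fact that typical 8-bit iso-8859-1 text is rarely well-formed
UTF-8, so a strict decode fails and triggers the fallback. Pure ASCII
articles decode identically either way.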