From: Benjamin Riefenstahl
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Gnus: UTF-8 and compatibility with other MUAs
Date: Mon, 18 Aug 2003 17:58:13 +0200
In-Reply-To: (Oliver Scholz's message of "Sun, 17 Aug 2003 18:40:17 +0200")

Hi Oliver,

>> Simon Josefsson writes:
>>> 'cmp' still says the files are different.

> Benjamin Riefenstahl writes:
>> Actually UTF-8 still has that problem with composed vs. decomposed
>> characters.  There is no perfect system AFAIK.

Oliver Scholz writes:
> Do you refer to the fact here that a character like, say, U+00E9
> (LATIN SMALL LETTER E WITH ACUTE) is equivalent to U+0065 followed
> by U+0301 (LATIN SMALL LETTER E followed by COMBINING ACUTE ACCENT)?

Yes.

> So AFAIK UTF-16 is meant as a space-efficient format for East Asian
> text.

That and compatibility.  The first Unicode versions talked much about
the 16-bit representation and the most wide-spread users (Windows NT,
COM, VFAT, HFS+) implemented it like that.

benny
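
For readers who want to see the two points above concretely, here is a minimal Python sketch (not from the original thread). It uses only the standard `unicodedata` module. The helper names `same_bytes`, `canonically_equal` and `encoded_sizes` are made up for illustration. It shows that the composed and decomposed spellings of "e with acute" encode to different UTF-8 bytes, which is why a byte comparison like 'cmp' reports a difference, and that they compare equal after normalization. It also compares UTF-8 and UTF-16 sizes for a short East Asian string chosen as an example.

```python
import unicodedata

# The two spellings discussed in the thread.
composed = "\u00e9"          # LATIN SMALL LETTER E WITH ACUTE
decomposed = "\u0065\u0301"  # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT


def same_bytes(a, b, encoding="utf-8"):
    """Byte-for-byte comparison of the encoded text, roughly what 'cmp' does."""
    return a.encode(encoding) == b.encode(encoding)


def canonically_equal(a, b):
    """Compare after normalizing both strings to NFC (composed form)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def encoded_sizes(text):
    """Size in bytes of the text in UTF-8 and in UTF-16 (no byte order mark)."""
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le")}


if __name__ == "__main__":
    print(composed.encode("utf-8"), decomposed.encode("utf-8"))
    print("same bytes:", same_bytes(composed, decomposed))
    print("canonically equal:", canonically_equal(composed, decomposed))
    # Hypothetical sample strings, only to compare encoded sizes.
    print(encoded_sizes("\u65e5\u672c\u8a9e"))  # three CJK characters
    print(encoded_sizes("abc"))
```

The sketch normalizes to NFC, but NFD would serve equally well for the comparison. What matters is that both sides use the same form before their bytes are compared.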