From: "James H. Cloos Jr."
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Gnus: UTF-8 and compatibility with other MUAs
Date: 17 Aug 2003 22:16:34 -0400
Sender: ding-owner@lists.math.uh.edu
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Original-To: ding@gnus.org
User-Agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.3.50
Xref: main.gmane.org gmane.emacs.gnus.general:53755

>>>>> "Simon" == Simon Josefsson writes:

Simon> The disadvantage with UTF-8 is that you don't know where a code
Simon> value ends within the encoded data without knowledge of UTF-8,

[ed's note: this should be taken as an extension of Simon's point, not
a counter-argument.  It seemed ambiguous w/o a disclaimer....  -JimC]

That isn't really a disadvantage, since you need knowledge of Unicode
itself anyway: not every user-perceived character fits in a single
code point.

Combining characters, variation selectors, et al. all mean that even
with UTF-32 there is no guarantee that you can split at any given
int32; hence the fact that UTF-8 cannot be split at any given int8 is
irrelevant.

-JimC
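
To make the point above concrete, here is a minimal Python sketch. It is an illustration added for this archive copy, not code from the original thread. The helper names `utf8_boundaries` and `rough_cluster_boundaries` are hypothetical, and the "skip anything in a mark category" rule is only a rough stand-in for real grapheme segmentation, which is more involved.

```python
import unicodedata

def utf8_boundaries(data: bytes) -> list:
    # A byte starts a code point unless it is a continuation byte (10xxxxxx).
    return [i for i, b in enumerate(data) if (b & 0xC0) != 0x80]

def rough_cluster_boundaries(text: str) -> list:
    # Code point indices that do not begin with a mark (general category M*).
    # Assumption: this approximates "safe" split points; it is not a full
    # grapheme cluster algorithm.
    return [i for i, ch in enumerate(text)
            if not unicodedata.category(ch).startswith("M")]

text = "e\u0301x"                  # 'e' + COMBINING ACUTE ACCENT + 'x'
utf8 = text.encode("utf-8")        # 4 bytes: 65 CC 81 78
utf32 = text.encode("utf-32-le")   # 3 code units of 4 bytes each

# Splitting UTF-32 after the first int32 yields two well-formed strings,
# but the accent is now detached from its base letter.
head = utf32[:4].decode("utf-32-le")
tail = utf32[4:].decode("utf-32-le")

print(utf8_boundaries(utf8))           # code point starts, as byte offsets
print(rough_cluster_boundaries(text))  # approximate safe split points
print(repr(head), repr(tail))
```

Finding code point starts in UTF-8 takes one bit test per byte, so it is the cheap part. Deciding whether a code point boundary is a sensible place to cut needs Unicode character data in either encoding.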