From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 7447 invoked from network); 4 Oct 2004 16:21:45 -0000
Received: from news.dotsrc.org (HELO a.mx.sunsite.dk) (130.225.247.88) by ns1.primenet.com.au with SMTP; 4 Oct 2004 16:21:45 -0000
Received: (qmail 66575 invoked from network); 4 Oct 2004 16:21:39 -0000
Received: from sunsite.dk (130.225.247.90) by a.mx.sunsite.dk with SMTP; 4 Oct 2004 16:21:39 -0000
Received: (qmail 9882 invoked by alias); 4 Oct 2004 16:21:17 -0000
Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
Precedence: bulk
X-No-Archive: yes
X-Seq: 20451
Received: (qmail 9847 invoked from network); 4 Oct 2004 16:21:15 -0000
Received: from unknown (HELO a.mx.sunsite.dk) (130.225.247.88) by sunsite.dk with SMTP; 4 Oct 2004 16:21:15 -0000
Received: (qmail 64632 invoked from network); 4 Oct 2004 16:20:31 -0000
Received: from lhuumrelay3.lnd.ops.eu.uu.net (62.189.58.19) by a.mx.sunsite.dk with SMTP; 4 Oct 2004 16:20:30 -0000
Received: from MAILSWEEPER01.csr.com (mailhost1.csr.com [62.189.183.235]) by lhuumrelay3.lnd.ops.eu.uu.net (8.11.0/8.11.0) with ESMTP id i94GKRv16586 for ; Mon, 4 Oct 2004 16:20:28 GMT
Received: from EXCHANGE02.csr.com (unverified [192.168.137.45]) by MAILSWEEPER01.csr.com (Content Technologies SMTPRS 4.3.12) with ESMTP id for ; Mon, 4 Oct 2004 17:19:25 +0100
Received: from news01.csr.com ([192.168.143.38]) by EXCHANGE02.csr.com with Microsoft SMTPSVC(5.0.2195.6713); Mon, 4 Oct 2004 17:22:31 +0100
Received: from news01.csr.com (localhost.localdomain [127.0.0.1]) by news01.csr.com (8.12.11/8.12.11) with ESMTP id i94GKNoH006003 for ; Mon, 4 Oct 2004 17:20:23 +0100
Received: from csr.com (pws@localhost) by news01.csr.com (8.12.11/8.12.11/Submit) with ESMTP id i94GKNro006000 for ; Mon, 4 Oct 2004 17:20:23 +0100
Message-Id: <200410041620.i94GKNro006000@news01.csr.com>
X-Authentication-Warning: news01.csr.com: pws owned process doing -bs
To: Zsh-workers
Subject: Re: UTF-8 support
In-reply-to:
 <23473.1096659965@trentino.logica.co.uk>
References: <20041001184122.GA9094@fargo> <23473.1096659965@trentino.logica.co.uk>
Date: Mon, 04 Oct 2004 17:20:23 +0100
From: Peter Stephenson
X-OriginalArrivalTime: 04 Oct 2004 16:22:31.0249 (UTC) FILETIME=[590D7C10:01C4AA2E]
X-Spam-Checker-Version: SpamAssassin 2.63 on a.mx.sunsite.dk
X-Spam-Level:
X-Spam-Status: No, hits=0.0 required=6.0 tests=none autolearn=no version=2.63
X-Spam-Hits: 0.0

Oliver Kiddle wrote:
> In my opinion it would be sensible to support multibyte encodings in
> general and not just UTF-8. Doing this isn't much effort beyond handling
> UTF-8 if we assume basic ASCII compatibility and don't worry about
> stateful encodings.

I came to the conclusion that was going to be very time consuming --- it
means unmetafying potentially a long string (we don't know where the
characters end) and calling a function every time we want to compare
multibyte characters.

Doing it only for UTF-8 can be optimised to work with extensions to the
current tests; it's simple to test for the length of a UTF-8 character
(although some error checking is also necessary).

Given that the whole point of Unicode is to replace all other schemes,
I'm not so keen about supporting other schemes if it's that much less
efficient.

-- 
Peter Stephenson                  Software Engineer
CSR Ltd., Science Park, Milton Road,
Cambridge, CB4 0WH, UK                    Tel: +44 (0)1223 692070

**********************************************************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. If you have received this email in error please notify
the system manager.

This footnote also confirms that this email message has been swept by
MIMEsweeper for the presence of computer viruses.

www.mimesweeper.com
**********************************************************************
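
To illustrate the point about testing the length of a UTF-8 character, here is a minimal sketch in C. It is not zsh code: the function name `utf8_seq_len` and its interface are hypothetical, and it works on a plain (unmetafied) byte buffer. It reads the length from the lead byte and then does the error checking that turns out to be necessary: truncated input, bad continuation bytes, overlong forms, surrogates, and values above U+10FFFF.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of zsh): return the length in bytes
 * (1 to 4) of the UTF-8 sequence starting at s, or 0 if the bytes do
 * not form a valid sequence.  n is the number of bytes available.
 */
size_t utf8_seq_len(const unsigned char *s, size_t n)
{
    size_t len, i;

    if (n == 0)
        return 0;

    /* The lead byte alone determines the expected length. */
    if (s[0] < 0x80)
        len = 1;                        /* plain ASCII */
    else if (s[0] >= 0xC2 && s[0] <= 0xDF)
        len = 2;                        /* 0xC0 and 0xC1 would be overlong */
    else if ((s[0] & 0xF0) == 0xE0)
        len = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF4)
        len = 4;                        /* above 0xF4 exceeds U+10FFFF */
    else
        return 0;                       /* stray continuation or bad lead */

    if (len > n)
        return 0;                       /* sequence is truncated */

    /* Every following byte must look like 10xxxxxx. */
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80)
            return 0;

    /* Range checks on the second byte for the special lead bytes. */
    if (s[0] == 0xE0 && s[1] < 0xA0)
        return 0;                       /* overlong 3-byte form */
    if (s[0] == 0xED && s[1] > 0x9F)
        return 0;                       /* UTF-16 surrogate range */
    if (s[0] == 0xF0 && s[1] < 0x90)
        return 0;                       /* overlong 4-byte form */
    if (s[0] == 0xF4 && s[1] > 0x8F)
        return 0;                       /* beyond U+10FFFF */

    return len;
}
```

The length test itself is a handful of comparisons on one byte, with no function call per character beyond this one and no locale machinery, which is the efficiency argument above. A general multibyte scheme would instead need something like `mbrtowc()` on each character.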