X-Seq: 22539
Message-Id: <200607051214.k65CEHQ2011360@news01.csr.com>
To: zsh-workers@sunsite.dk (Zsh hackers list)
Subject: Questions about character types
Date: Wed, 05 Jul 2006 13:14:17 +0100
From: Peter Stephenson

I'm just looking at handling character types consistently for multibyte
characters.

The most important issue is what to allow in identifiers (which usually
means parameter names). The traditional zsh behaviour is to allow all
non-ASCII characters. Now that we can do it properly, it makes sense to
limit these to alphanumerics; this isn't portable, but allowing all
8-bit characters as before seems too gross to continue.

However, according to POSIX, only characters from the "portable
character set", which in our case means the ASCII subset, are allowed.
The only way round this I can see is to add an option POSIX_IDENTIFIERS
to limit the behaviour. I would use an existing option if one seemed
relevant, but none does.

The test for module names currently approximates that for identifiers,
only allowing / as well, so I think using the same logic as above would
be the obvious thing to do.

Another question is what to do with user names. Currently these are
just the ASCII identifier characters plus "-". Is it useful to extend
these to include alphanumeric characters from the local character set?

Finally, I failed to interpret this code from math.c:

    if (*ptr == '+' && (unary || !ialnum(*ptr))) {
        ptr++;
        return (unary) ? PREPLUS : POSTPLUS;
    }

Since *ptr has just been tested against '+', I don't see how
ialnum(*ptr) could succeed. What does this mean?

-- 
Peter Stephenson
Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK
Tel: +44 (0)1223 692070
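
A minimal sketch of the identifier test discussed in the message, assuming wide characters and the standard iswalnum() classification. The names ident_char_ok and is_identifier and the posix_identifiers flag are hypothetical: the flag stands in for the proposed POSIX_IDENTIFIERS option, and none of this is taken from the zsh source. Rejecting a leading digit follows the POSIX definition of a name and is likewise an assumption about how the check would be used.

```c
#include <assert.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: may wc appear inside an identifier?
 * posix_identifiers stands in for the proposed POSIX_IDENTIFIERS option. */
static int ident_char_ok(wint_t wc, int posix_identifiers)
{
    if (wc == L'_')
        return 1;
    if (wc < 0x80) {
        /* ASCII subset: letters and digits only, independent of locale. */
        return (wc >= L'0' && wc <= L'9') ||
               (wc >= L'a' && wc <= L'z') ||
               (wc >= L'A' && wc <= L'Z');
    }
    /* Beyond ASCII: rejected outright when restricted to POSIX names. */
    if (posix_identifiers)
        return 0;
    /* Otherwise accept whatever the current locale calls alphanumeric. */
    return iswalnum(wc) != 0;
}

/* Hypothetical helper: is the whole string a valid identifier? */
static int is_identifier(const wchar_t *name, int posix_identifiers)
{
    assert(name != NULL);
    /* Empty names and names starting with an ASCII digit are rejected. */
    if (*name == L'\0' || (*name >= L'0' && *name <= L'9'))
        return 0;
    for (; *name != L'\0'; name++) {
        if (!ident_char_ok((wint_t)*name, posix_identifiers))
            return 0;
    }
    return 1;
}
```

With the flag set, is_identifier(L"foo_1", 1) is accepted and any name containing a character outside ASCII is refused. With the flag clear, the result for non-ASCII characters depends on the locale in effect, so a caller would normally run setlocale(LC_CTYPE, "") first.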