From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 4 Feb 2008 14:30:56 +0000
From: Peter Stephenson
To: "Rajesh Jangam", zsh-workers@sunsite.dk
Subject: Re: Problem using some Japanese characters on Windows
Message-ID: <20080204143056.2cd4e138@news01>
In-Reply-To: <4cbc0dcc0802040600h30eb4100ia481deef705c3395@mail.gmail.com>
References: <4cbc0dcc0802040600h30eb4100ia481deef705c3395@mail.gmail.com>
Organization: CSR
Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
X-Seq: 24531

On Mon, 4 Feb 2008 19:30:59 +0530
"Rajesh Jangam" wrote:
> I am a zsh user and using zsh-4.3.4 on a Japanese windows box.
> I compiled it with multi-byte support enabled.
>
> However, for some Japanese strings, it gives a "bad pattern" error.
> Attached is a screen shot with an example.

I presume you're using Cygwin. I haven't investigated the multibyte
support there in any detail, but my impression last time I looked was
that it was quite limited at the level of system libraries etc. (Windows
has its own ways of doing this that are of course much more complete but
aren't directly accessible.)

> After debugging a bit, I found that we have used some characters in the
> extended ascii range
> for marking special characters like: Star(*), Quest(?) etc.

This (probably) isn't the problem: the special characters are
distinguished from native characters by appropriate internal quoting
(which should be transparent to the user). I say "probably" because
there may be local bugs, but you'll have to give details of what you're
doing. (Try the new version 4.3.5 first.)

> However these seem to clash with the CP932 character set which is the
> default on Windows
> CP932 defines characters in the extended Ascii character range and beyond.
> (> 0x81)

You certainly need to make sure the shell has been told about the
character set, typically by setting the environment variable LANG to the
appropriate locale, otherwise the shell won't know what to do with
characters in that range. The point of multi-byte support is
essentially that the shell does rely on the system, instead of guessing.

This is where I'm unsure how good the support currently is in Cygwin.
If it doesn't make your locale available then you're out of luck and you
might be better off configuring with --disable-multibyte. You might get
more information on that score from Cygwin people.

-- 
Peter Stephenson
Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK
Tel: +44 (0)1223 692070
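
As a rough illustration of the point about LANG above, here is a minimal sketch of checking whether the system offers a suitable locale before starting the shell. The helper name have_locale and the locale name ja_JP.SJIS are assumptions made for the example, not something confirmed for Cygwin; the name to use is whatever `locale -a` actually lists for the CP932/Shift_JIS encoding, and the `locale` utility itself may not be present on every installation.

```shell
# Sketch only. The locale name below is an assumption; substitute the
# name that `locale -a` reports for your CP932/Shift_JIS locale.
wanted=ja_JP.SJIS

# Hypothetical helper: succeed if the named locale is listed by the system.
# If the `locale` utility is missing, this simply reports failure.
have_locale() {
    locale -a 2>/dev/null | grep -qxF -- "$1"
}

if have_locale "$wanted"; then
    # Export LANG before starting zsh so the shell sees it at startup.
    export LANG="$wanted"
    echo "LANG set to $LANG"
else
    # Without a usable locale the shell cannot interpret these bytes;
    # this is the case where --disable-multibyte may be the better option.
    echo "locale $wanted not available"
fi
```

If the first branch is taken, starting zsh from that environment should let it ask the system how to interpret bytes in the upper range; if the second is taken, the locale is simply not there for the shell to use.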