Date: Sun, 11 Sep 2005 15:13:45 
+0300
From: "Zvi Har'El"
To: Zsh hackers list
Cc: "Nadav Har'El"
Subject: problem in prompt in utf-8
Message-ID: <20050911121345.GA14384@fermat.math.technion.ac.il>

Hello,

I have started using zsh-4.3.0 from CVS in a UTF-8 locale, and I enjoy it very much. However, I have a problem with prompting. The problem is not new, but since completion now works nicely, I thought I would mention it, since it is not solved yet.

I have the setting

    PS1=%/$\ 

(the backslash escapes a trailing space). I expect 'print -P $PS1' and 'pwd' to give the same output, which should also be the zsh prompt (except for the final $ and space, of course).

However, this breaks if the current directory name contains Hebrew letters. These are in the range U+05D0 to U+05EA, i.e., their UTF-8 sequences have two bytes, where the first is always 0xD7 (M-W) and the second is in the range 0x90 (M-^P) to 0xAA (M-*).

I mkdir'ed a directory whose name has all the letters in this range:

    /home/rl$ mkdir אבגדהוזחטיךכלםמןנסעףפץצקרשת

cd'ed to that directory:

    /home/rl$ cd אבגדהוזחטיךכלםמןנסעףפץצקרשת

and got a correct result as an echo:

    ~/אבגדהוזחטיךכלםמןנסעףפץצקרשת

The next prompt had invalid UTF-8 sequences:

    /home/rl/������������לםמןנסעףפץצקרשת$

To be more specific, the whole range U+05D0 to U+05DB (second byte 0x90 to 0x9B) became invalid. I don't know exactly what is wrong.
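The byte values and the cat -v notation quoted above (M-W, M-^P, M-*) can be checked with a small shell sketch. This is only an illustration: the helper names show_catv and show_hex are hypothetical, the bytes are written as octal escapes for printf, and LC_ALL=C is an assumption made here so that cat treats its input as raw bytes rather than as characters.

```shell
#!/bin/sh
# Illustrative sketch only; show_catv and show_hex are hypothetical helper
# names, not part of zsh and not taken from the original message.

# Print the bytes described by a printf format string in cat -v notation.
# LC_ALL=C is an assumption so that cat handles the input byte by byte.
show_catv() {
    printf "$1" | LC_ALL=C cat -v
    echo
}

# Print the same bytes as lowercase hex digits with no separators.
show_hex() {
    printf "$1" | od -An -tx1 | tr -d ' \n'
    echo
}

show_catv '\327\220'   # U+05D0, bytes D7 90: prints M-WM-^P
show_catv '\327\233'   # U+05DB, bytes D7 9B: prints M-WM-^[
show_catv '\327\252'   # U+05EA, bytes D7 AA: prints M-WM-*
show_catv '\327'       # lone lead byte D7:   prints M-W
show_hex  '\327\233'   # prints d79b
```

The second and fourth calls show what to look for in cat -v output: an intact U+05DB appears as M-WM-^[, while a sequence that has lost its second byte appears as a bare M-W. On a live zsh, piping 'print -P %/' through 'od -An -tx1' would presumably expose the raw bytes of the expanded prompt in the same way.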
Notice that 'pwd' produces

    /home/rl/אבגדהוזחטיךכלםמןנסעףפץצקרשת

i.e., all the letters are correct, while 'print -P $PS1' produces

    /home/rl/אבגדהוזחטיך�לםמןנסעףפץצקרשת$

with exactly one invalid UTF-8 sequence: U+05DB (second byte 0x9B), the last one in the previous range, is bad.

    print -P $PS1 | cat -v

produces

    /home/rl/M-WM-^PM-WM-^QM-WM-^RM-WM-^SM-WM-^TM-WM-^UM-WM-^VM-WM-^WM-WM-^XM-WM-^YM-WM-^ZM-WM-WM-^\M-WM-^]M-WM-^^M-WM-^_M-WM- M-WM-!M-WM-"M-WM-#M-WM-$M-WM-%M-WM-&M-WM-'M-WM-(M-WM-)M-WM-*$

while

    pwd | cat -v

produces

    /home/rl/M-WM-^PM-WM-^QM-WM-^RM-WM-^SM-WM-^TM-WM-^UM-WM-^VM-WM-^WM-WM-^XM-WM-^YM-WM-^ZM-WM-^[M-WM-^\M-WM-^]M-WM-^^M-WM-^_M-WM- M-WM-!M-WM-"M-WM-#M-WM-$M-WM-%M-WM-&M-WM-'M-WM-(M-WM-)M-WM-*

It is perhaps hard to see the difference, but a close inspection shows that the first string contains a solitary M-W between the M-WM-^Z and the M-WM-^\ sequences, while the second one contains the sequence M-WM-^[ there, i.e., an M-^[, or Meta-Escape, was dropped from the string.

Unfortunately, I didn't find an easy way to put the real prompt in a file, so I can't tell what the exact sequences in it are.

I hope this makes some sense.

-- 
Dr. Zvi Har'El     mailto:rl@math.technion.ac.il     Department of Mathematics
tel:+972-54-4227607     icq:179294841     Technion - Israel Institute of Technology
fax:+972-4-8293388     http://www.math.technion.ac.il/~rl/     Haifa 32000, ISRAEL
"If you can't say somethin' nice, don't say nothin' at all." -- Thumper (1942)
Sunday, 7 Elul 5765, 11 September 2005, 1:54PM