From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 15758 invoked from network); 17 Sep 2005 18:15:48 -0000 Received: from news.dotsrc.org (HELO a.mx.sunsite.dk) (130.225.247.88) by ns1.primenet.com.au with SMTP; 17 Sep 2005 18:15:48 -0000 Received: (qmail 89730 invoked from network); 17 Sep 2005 18:15:37 -0000 Received: from sunsite.dk (130.225.247.90) by a.mx.sunsite.dk with SMTP; 17 Sep 2005 18:15:37 -0000 Received: (qmail 19372 invoked by alias); 17 Sep 2005 18:15:34 -0000 Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm Precedence: bulk X-No-Archive: yes X-Seq: 21730 Received: (qmail 19362 invoked from network); 17 Sep 2005 18:15:33 -0000 Received: from news.dotsrc.org (HELO a.mx.sunsite.dk) (130.225.247.88) by sunsite.dk with SMTP; 17 Sep 2005 18:15:33 -0000 Received: (qmail 89291 invoked from network); 17 Sep 2005 18:15:33 -0000 Received: from cmailm2.svr.pol.co.uk (195.92.193.210) by a.mx.sunsite.dk with SMTP; 17 Sep 2005 18:15:20 -0000 Received: from modem-4050.putangitangi.dialup.pol.co.uk ([81.78.207.210] helo=pwstephenson.fsnet.co.uk) by cmailm2.svr.pol.co.uk with esmtp (Exim 4.41) id 1EGhDZ-0003nk-R5 for zsh-workers@sunsite.dk; Sat, 17 Sep 2005 19:15:19 +0100 Received: by pwstephenson.fsnet.co.uk (Postfix, from userid 501) id D63308638; Sat, 17 Sep 2005 14:15:08 -0400 (EDT) Received: from pwstephenson.fsnet.co.uk (localhost [127.0.0.1]) by pwstephenson.fsnet.co.uk (Postfix) with ESMTP id AB0B0862C for ; Sat, 17 Sep 2005 19:15:08 +0100 (BST) To: Zsh hackers list Subject: Re: problem in prompt in utf-8 In-reply-to: <20050911121345.GA14384@fermat.math.technion.ac.il> References: <20050911121345.GA14384@fermat.math.technion.ac.il> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Date: Sat, 17 Sep 2005 19:15:07 +0100 From: Peter Stephenson Message-Id: <20050917181508.D63308638@pwstephenson.fsnet.co.uk> Content-Transfer-Encoding: quoted-printable X-Spam-Checker-Version: SpamAssassin 3.0.4 
(2005-06-05) on f.primenet.com.au X-Spam-Level: X-Spam-Status: No, score=-2.6 required=5.0 tests=BAYES_00 autolearn=ham version=3.0.4 "Zvi Har'El" wrote: > I have started using zsh-4.3.0 from the CVS, in a uft-8 locale. I enjoy= it > very much. However, I have a problem with the prompting. This is not ne= w, but > since the completion now works nicely, I thought I'll mention it, since= it is > not solved yet. > /home/rl$ cd =D7=90=D7=91=D7=92=D7=93=D7=94=D7=95=D7=96=D7=97=D7=98=D7=99= =D7=9A=D7=9B=D7=9C=D7=9D=D7=9E=D7=9F=D7=A0=D7=A1=D7=A2=D7=A3=D7=A4=D7=A5=D7= =A6=D7=A7=D7=A8=D7=A9=D7=AA=20 >=20 > The next prompt had invalid utf-8 sequences: >=20 >=20 > /home/rl/=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD= =EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=EF=BF=BD=D7=9C=D7=9D=D7=9E=D7=9F=D7=A0= =D7=A1=D7=A2=D7=A3=D7=A4=D7=A5=D7=A6=D7=A7=D7=A8=D7=A9=D7=AA$=20 [This message uses raw 8-bit UTF-8, as the original did; hope this came through OK, since I hacked the headers by hand. MH in Emacs is a bit antiquated. I'm only surprised my system managed to display Hebrew characters OK... It doesn't actually matter apart from the quoted text above.] There was an inconsistency when formatting a string that contained a character in the range reserved for tokens: conversion to the zsh internal form (metafication) wasn't done correctly. This particular problem wasn't actually within zle, it was in the main shell and (as you sort of indicated) wasn't directly related to multibyte characters. This should fix the immediate problem, but note that the width of the prompt isn't calculated correctly yet: we don't scan prompts for multibyte characters. Hence you might see oddities with the display since the shell doesn't know the position of the cursor after the prompt. This is another thing on the list of fixes needed in zle. (It should come under the "not rocket science" heading, unlike the completion code, so I hope it will be fixed relatively soon.) 

Please do report any more of these inconsistencies; users who regularly
encounter character sets other than Latin-based ones are valuable for
this.

I hope I haven't caused any new problems... I think I caught all the
uses of nicechar() and made sure they expected metafied strings.

The first hunk is tangential to the rest: on the way in, I noticed that
the variable pwd was metafied and so needed to be unmetafied on output.

Index: Src/builtin.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/builtin.c,v
retrieving revision 1.148
diff -u -r1.148 builtin.c
--- Src/builtin.c	9 Sep 2005 16:06:48 -0000	1.148
+++ Src/builtin.c	17 Sep 2005 18:09:24 -0000
@@ -699,7 +699,7 @@
     else
 	fmt = " ";
     if (OPT_ISSET(ops,'l'))
-	fputs(pwd, stdout);
+	zputs(pwd, stdout);
     else
 	fprintdir(pwd, stdout);
     for (node = firstnode(dirstack); node; incnode(node)) {
Index: Src/utils.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/utils.c,v
retrieving revision 1.89
diff -u -r1.89 utils.c
--- Src/utils.c	9 Sep 2005 20:34:42 -0000	1.89
+++ Src/utils.c	17 Sep 2005 18:10:08 -0000
@@ -146,7 +146,7 @@
 		putc('%', stderr);
 		break;
 	    case 'c':
-		fputs(nicechar(num), stderr);
+		zputs(nicechar(num), stderr);
 		break;
 	    case 'e':
 		/* print the corresponding message for this errno */
@@ -195,15 +195,21 @@
     return 0;
 }
 
-/* Turn a character into a visible representation thereof.  The visible *
- * string is put together in a static buffer, and this function returns *
- * a pointer to it.  Printable characters stand for themselves, DEL is  *
- * represented as "^?", newline and tab are represented as "\n" and     *
- * "\t", and normal control characters are represented in "^C" form.    *
- * Characters with bit 7 set, if unprintable, are represented as "\M-"  *
- * followed by the visible representation of the character with bit 7   *
- * stripped off.  Tokens are interpreted, rather than being treated as  *
- * literal characters.                                                  */
+/*
+ * Turn a character into a visible representation thereof.  The visible
+ * string is put together in a static buffer, and this function returns
+ * a pointer to it.  Printable characters stand for themselves, DEL is
+ * represented as "^?", newline and tab are represented as "\n" and
+ * "\t", and normal control characters are represented in "^C" form.
+ * Characters with bit 7 set, if unprintable, are represented as "\M-"
+ * followed by the visible representation of the character with bit 7
+ * stripped off.  Tokens are interpreted, rather than being treated as
+ * literal characters.
+ *
+ * Note that the returned string is metafied, so that it must be
+ * treated like any other zsh internal string (and not, for example,
+ * output directly).
+ */
 
 /**/
 mod_export char *
@@ -238,7 +244,17 @@
 	c += 0x40;
     }
   done:
-    *s++ = c;
+    /*
+     * The resulting string is still metafied, so check if
+     * we are returning a character in the range that needs metafication.
+     * This can't happen if the character is printed "nicely", so
+     * this results in a maximum of two bytes total (plus the null).
+     */
+    if (itok(c)) {
+	*s++ = Meta;
+	*s++ = c ^ 32;
+    } else
+	*s++ = c;
     *s = 0;
     return buf;
 }
@@ -292,7 +308,7 @@
 nicefputs(char *s, FILE *f)
 {
     for (; *s; s++)
-	fputs(nicechar(STOUC(*s)), f);
+	zputs(nicechar(STOUC(*s)), f);
 }
 #endif
 
@@ -3177,7 +3193,7 @@
 static char *
 nicedup(char const *s, int heap)
 {
-    int c, len = strlen(s) * 5;
+    int c, len = strlen(s) * 5 + 1;
     VARARR(char, buf, len);
     char *p = buf, *n;
 
@@ -3190,11 +3206,13 @@
 	}
 	if (c == Meta)
 	    c = *s++ ^ 32;
+	/* The result here is metafied */
 	n = nicechar(c);
 	while(*n)
 	    *p++ = *n++;
     }
-    return metafy(buf, p - buf, (heap ? META_HEAPDUP : META_DUP));
+    *p = '\0';
+    return heap ? dupstring(buf) : ztrdup(buf);
 }
 
 /**/
@@ -3228,7 +3246,7 @@
 	}
 	if (c == Meta)
 	    c = *s++ ^ 32;
-	if(fputs(nicechar(c), stream) < 0)
+	if(zputs(nicechar(c), stream) < 0)
 	    return EOF;
     }
     return 0;

-- 
Peter Stephenson
Work: pws@csr.com
Web: http://www.pwstephenson.fsnet.co.uk