From: Peter Stephenson <pws@pwstephenson.fsnet.co.uk>
To: Zsh hackers list <zsh-workers@sunsite.dk>
Subject: Re: problem in prompt in utf-8
Date: Sat, 17 Sep 2005 19:15:07 +0100
Message-ID: <20050917181508.D63308638@pwstephenson.fsnet.co.uk>
In-Reply-To: <20050911121345.GA14384@fermat.math.technion.ac.il>

"Zvi Har'El" wrote:
> I have started using zsh-4.3.0 from the CVS, in a utf-8 locale. I enjoy it
> very much. However, I have a problem with the prompting. This is not new, but
> since the completion now works nicely, I thought I'll mention it, since it is
> not solved yet.

> /home/rl$ cd אבגדהוזחטיךכלםמןנסעףפץצקרשת 
> 
> The next prompt had invalid utf-8 sequences:
> 
> 
> /home/rl/������������לםמןנסעףפץצקרשת$ 

[This message uses raw 8-bit UTF-8, as the original did; hope this
came through OK, since I hacked the headers by hand.  MH in Emacs is a
bit antiquated.  I'm only surprised my system managed to display Hebrew
characters OK...  It doesn't actually matter apart from the quoted text
above.]

There was an inconsistency when formatting a string that contained a
character in the range reserved for tokens: conversion to the zsh
internal form (metafication) wasn't done correctly.  This particular
problem wasn't actually within zle, it was in the main shell and (as you
sort of indicated) wasn't directly related to multibyte characters.
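
As a rough illustration of what metafication does (a simplified sketch,
not the code in Src/utils.c): a byte in the reserved range is replaced by
a Meta byte followed by the original byte XORed with 32, which is the same
transformation as the `*s++ = Meta; *s++ = c ^ 32;` lines in the patch.
The value 0x83 for Meta and the range 0x84 to 0x9b for tokens are
assumptions made for this example, and the names needs_meta_sketch() and
metafy_sketch() are hypothetical.

```c
#include <stddef.h>

#define META_SKETCH 0x83   /* assumed value of the Meta escape byte */
#define TOK_FIRST   0x84   /* assumed first byte of the token range */
#define TOK_LAST    0x9b   /* assumed last byte of the token range  */

/* Non-zero if the byte cannot appear literally in an internal string. */
static int needs_meta_sketch(unsigned char c)
{
    return c == META_SKETCH || (c >= TOK_FIRST && c <= TOK_LAST);
}

/*
 * Copy len bytes from in to out, escaping reserved bytes as
 * Meta followed by (byte ^ 32).  out must have room for
 * 2 * len + 1 bytes.  Returns the length of the result,
 * not counting the terminating null.
 */
static size_t metafy_sketch(const unsigned char *in, size_t len,
                            unsigned char *out)
{
    size_t i, n = 0;

    for (i = 0; i < len; i++) {
        if (needs_meta_sketch(in[i])) {
            out[n++] = META_SKETCH;
            out[n++] = in[i] ^ 32;
        } else
            out[n++] = in[i];
    }
    out[n] = 0;
    return n;
}
```

With these assumed ranges, the second byte of a UTF-8 encoded Hebrew
letter such as 0xd7 0x90 falls in the reserved range, so it has to be
escaped before the string is handled as a zsh internal string.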

This should fix the immediate problem, but note that the width of the
prompt isn't calculated correctly yet: we don't scan prompts for
multibyte characters.  Hence you might see oddities with the display
since the shell doesn't know the position of the cursor after the
prompt.  This is another thing on the list of fixes needed in zle.  (It
should come under the "not rocket science" heading, unlike the
completion code, so I hope it will be fixed relatively soon.)
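
For illustration only, one way a prompt's width could be measured is to
decode it with mbrtowc() and add up wcwidth() for each character.  This
is a sketch under assumptions, not zsh code: display_width_sketch() is a
hypothetical name, the string is assumed to be unmetafied and in the
current locale's encoding, and prompt escapes such as %{...%} are not
handled.

```c
#define _XOPEN_SOURCE 700   /* asks <wchar.h> to declare wcwidth() */
#include <string.h>
#include <wchar.h>

/* Harmless redeclaration in case the feature-test macro above is
 * defined too late to take effect. */
int wcwidth(wchar_t wc);

/*
 * Return the number of screen columns the multibyte string s
 * occupies in the current locale.  Invalid or incomplete
 * sequences are counted as one column per byte; characters
 * for which wcwidth() reports no width add nothing.
 */
static int display_width_sketch(const char *s)
{
    mbstate_t st;
    size_t left = strlen(s);
    int cols = 0;

    memset(&st, 0, sizeof st);
    while (left > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, left, &st);
        int w;

        if (n == (size_t)-1 || n == (size_t)-2) {
            /* Bad sequence: reset the state and skip one byte. */
            memset(&st, 0, sizeof st);
            cols++;
            s++;
            left--;
            continue;
        }
        if (n == 0)
            break;          /* null character: end of string */
        w = wcwidth(wc);
        if (w > 0)
            cols += w;
        s += n;
        left -= n;
    }
    return cols;
}
```

The caller would need to have called setlocale(LC_CTYPE, "") beforehand
for non-ASCII characters to be decoded according to the user's locale.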

Please do report any more of these inconsistencies; users who regularly
encounter character sets other than Latin-based ones are valuable for
this.

I hope I haven't caused any new problems... I think I caught all the
uses of nicechar() and made sure they expected metafied strings.  The
first hunk is tangential to the rest: on the way in, I noticed that the
variable pwd was metafied and so needed to be unmetafied on output.
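
The reason fputs() is wrong for a metafied string can be sketched as
follows.  This is a simplified stand-in for an output routine in the
style of zputs(), not its actual source: the name put_metafied_sketch()
is hypothetical and the Meta value 0x83 is an assumption.  Each Meta
byte is dropped and the byte after it is XORed with 32 to recover the
original character before it is written.

```c
#include <stdio.h>

#define META_BYTE_SKETCH 0x83   /* assumed value of the Meta escape byte */

/*
 * Write the metafied string s to stream in unmetafied form.
 * Returns 0 on success, EOF on a write error.
 */
static int put_metafied_sketch(const char *s, FILE *stream)
{
    while (*s) {
        int c = (unsigned char)*s++;

        /* Meta marks the next byte as escaped: undo the XOR. */
        if (c == META_BYTE_SKETCH && *s)
            c = (unsigned char)*s++ ^ 32;
        if (fputc(c, stream) == EOF)
            return EOF;
    }
    return 0;
}
```

Passing the same string to fputs() would instead emit the Meta byte and
the XORed byte literally, which is how invalid sequences can end up on
the terminal.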

Index: Src/builtin.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/builtin.c,v
retrieving revision 1.148
diff -u -r1.148 builtin.c
--- Src/builtin.c	9 Sep 2005 16:06:48 -0000	1.148
+++ Src/builtin.c	17 Sep 2005 18:09:24 -0000
@@ -699,7 +699,7 @@
 	else
 	    fmt = " ";
 	if (OPT_ISSET(ops,'l'))
-	    fputs(pwd, stdout);
+	    zputs(pwd, stdout);
 	else
 	    fprintdir(pwd, stdout);
 	for (node = firstnode(dirstack); node; incnode(node)) {
Index: Src/utils.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/utils.c,v
retrieving revision 1.89
diff -u -r1.89 utils.c
--- Src/utils.c	9 Sep 2005 20:34:42 -0000	1.89
+++ Src/utils.c	17 Sep 2005 18:10:08 -0000
@@ -146,7 +146,7 @@
 		putc('%', stderr);
 		break;
 	    case 'c':
-		fputs(nicechar(num), stderr);
+		zputs(nicechar(num), stderr);
 		break;
 	    case 'e':
 		/* print the corresponding message for this errno */
@@ -195,15 +195,21 @@
     return 0;
 }
 
-/* Turn a character into a visible representation thereof.  The visible *
- * string is put together in a static buffer, and this function returns *
- * a pointer to it.  Printable characters stand for themselves, DEL is  *
- * represented as "^?", newline and tab are represented as "\n" and     *
- * "\t", and normal control characters are represented in "^C" form.    *
- * Characters with bit 7 set, if unprintable, are represented as "\M-"  *
- * followed by the visible representation of the character with bit 7   *
- * stripped off.  Tokens are interpreted, rather than being treated as  *
- * literal characters.                                                  */
+/*
+ * Turn a character into a visible representation thereof.  The visible
+ * string is put together in a static buffer, and this function returns
+ * a pointer to it.  Printable characters stand for themselves, DEL is
+ * represented as "^?", newline and tab are represented as "\n" and
+ * "\t", and normal control characters are represented in "^C" form.
+ * Characters with bit 7 set, if unprintable, are represented as "\M-"
+ * followed by the visible representation of the character with bit 7
+ * stripped off.  Tokens are interpreted, rather than being treated as
+ * literal characters.
+ *
+ * Note that the returned string is metafied, so that it must be
+ * treated like any other zsh internal string (and not, for example,
+ * output directly).
+ */
 
 /**/
 mod_export char *
@@ -238,7 +244,17 @@
 	c += 0x40;
     }
     done:
-    *s++ = c;
+    /*
+     * The resulting string is still metafied, so check if
+     * we are returning a character in the range that needs metafication.
+     * This can't happen if the character is printed "nicely", so
+     * this results in a maximum of two bytes total (plus the null).
+     */
+    if (itok(c)) {
+	*s++ = Meta;
+	*s++ = c ^ 32;
+    } else
+	*s++ = c;
     *s = 0;
     return buf;
 }
@@ -292,7 +308,7 @@
 nicefputs(char *s, FILE *f)
 {
     for (; *s; s++)
-	fputs(nicechar(STOUC(*s)), f);
+	zputs(nicechar(STOUC(*s)), f);
 }
 #endif
 
@@ -3177,7 +3193,7 @@
 static char *
 nicedup(char const *s, int heap)
 {
-    int c, len = strlen(s) * 5;
+    int c, len = strlen(s) * 5 + 1;
     VARARR(char, buf, len);
     char *p = buf, *n;
 
@@ -3190,11 +3206,13 @@
 	}
 	if (c == Meta)
 	    c = *s++ ^ 32;
+	/* The result here is metafied */
 	n = nicechar(c);
 	while(*n)
 	    *p++ = *n++;
     }
-    return metafy(buf, p - buf, (heap ? META_HEAPDUP : META_DUP));
+    *p = '\0';
+    return heap ? dupstring(buf) : ztrdup(buf);
 }
 
 /**/
@@ -3228,7 +3246,7 @@
 	}
 	if (c == Meta)
 	    c = *s++ ^ 32;
-	if(fputs(nicechar(c), stream) < 0)
+	if(zputs(nicechar(c), stream) < 0)
 	    return EOF;
     }
     return 0;

-- 
Peter Stephenson <pws@pwstephenson.fsnet.co.uk>
Work: pws@csr.com
Web: http://www.pwstephenson.fsnet.co.uk

