zsh-workers
* Fw: Phil's prompt is not working when LANG is set to UTF-8
@ 2008-02-15 23:52 Peter Stephenson
From: Peter Stephenson @ 2008-02-15 23:52 UTC
  To: Zsh Hackers' List

On Fri, 15 Feb 2008 22:55:58 +0300
Andrey Borzenkov <arvidjaar@newmail.ru> wrote:
> On Friday 15 February 2008, Andrey Borzenkov wrote:  
> > The actual prompt lengths are (see screenshot)
> > 
> > lpromptw = 13
> > rpromptw = 16 (it has one space in it)
> > 
> > this perfectly corresponds to something (zsh?) ignoring invalid characters
> > with high bit set.  
> 
> For sure.
> 
> Src/prompt.c:countprompt()
> 
>             case MB_INVALID:
>                 memset(&mbs, 0, sizeof mbs);
>                 /* FALL THROUGH */
>             case 0:
>                 /* Invalid character or null: assume no output. */
>                 multi = 0;
>                 break;
> 
> Oops.
> 
> I do not actually see how we can fix it except by introducing prompt
> expansion syntax for ACS (or maybe for any terminfo sequence in general)
> and simply assuming characters in any of them are of width 1.  

Thanks for looking.  I think I've now roughly caught up; tell me if I'm
mistaken.

- Both terminal and shell start correctly in UTF-8 mode.
- However, Phil's prompt (http://aperiodic.net/phil/prompt/) uses
  the Alternate Character Set (ACS) by appropriate terminfo trickery.
- The ACS is an old-fashioned grungy VT100 thing from the days
  when nobody had heard of multibyte character sets.
- Hence it falls foul of the multibyte tests.  In principle it
  might clash with a UTF-8 character anyway and have the wrong
  width, so assuming a width 1 for an unknown character is not
  necessarily better than assuming width 0.
- Anyway, assumptions are best avoided if possible.
- Nobody is worrying about editing the ACS, only using it in prompts,
  so a prompt-specific fix is fine.  (Editing with ACS would be
  stupid since the glyphs on the screen wouldn't actually reflect what
  the bytes meant to any programme to which they got fed, right?)

How about the following tweak to prompts to support this?  The upshot is
that you include any funny characters in %{...%G%}, where each %G (for
`glitch'; it may be repeated or take a numeric argument) indicates a
screen cell taken up by the sequence.  I like this because it uses
facilities that have been present in the shell for a long time, and hence
it was trivial to implement and might work.
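
To illustrate, here is a minimal sketch of how a prompt might use this.
It assumes a VT100-style terminal where ESC ( 0 selects the line-drawing
set and ESC ( B switches back to ASCII; a real prompt such as Phil's
would take these sequences from terminfo instead.  The names acs_on,
acs_off, hbar and hbar3 are made up for the example.

```shell
# Assumption: hard-coded VT100 sequences instead of terminfo lookups.
acs_on=$(printf '\033(0')    # switch to the line-drawing character set
acs_off=$(printf '\033(B')   # switch back to ordinary ASCII

# In the line-drawing set, 'q' is displayed as a horizontal bar.
# Everything inside %{...%} is otherwise counted as zero width, so a
# single %G declares that this sequence fills one screen cell.
hbar="%{${acs_on}q%G${acs_off}%}"

# Three bars in one group: %3G declares three cells at once,
# equivalent to writing %G three times.
hbar3="%{${acs_on}qqq%3G${acs_off}%}"

# Use the bars in a prompt like any other prompt text.
PROMPT="${hbar3} %~ %# "
```

Without the %G (or %3G), the shell would treat the whole %{...%} group
as taking no space, and anything positioned relative to the prompt
width would end up in the wrong column.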

I played with this in simple cases, but would anybody like to confirm
this works in the cases that matter (and maybe produce an updated Phil's
Prompt)?  To put it another way:  I am happy to support this fix but
have no interest in doing anything with it myself.

I think this is clean and useful enough that I will commit it anyway.

Index: Doc/Zsh/prompt.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/prompt.yo,v
retrieving revision 1.9
diff -u -r1.9 prompt.yo
--- Doc/Zsh/prompt.yo	29 Jan 2008 17:51:02 -0000	1.9
+++ Doc/Zsh/prompt.yo	15 Feb 2008 23:34:06 -0000
@@ -188,6 +188,18 @@
 The string within the braces should not change the cursor
 position.  Brace pairs can nest.
 )
+item(tt(%G))(
+Within a tt(%{)...tt(%}) sequence, include a `glitch': that is, assume
+that a single character width will be output.  This is useful when
+outputting characters that otherwise cannot be correctly handled by the
+shell, such as the alternate character set on some terminals.
+The characters in question can be included within a tt(%{)...tt(%})
+sequence together with the appropriate number of tt(%G) sequences to
+indicate the correct width.  An integer between the `tt(%)' and `tt(G)'
+indicates a character width other than one.  Hence tt(%{)var(seq)tt(%2G%})
+outputs var(seq) and assumes it takes up the width of two standard
+characters.
+)
 enditem()
 
 sect(Conditional Substrings in Prompts)
Index: Src/prompt.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/prompt.c,v
retrieving revision 1.44
diff -u -r1.44 prompt.c
--- Src/prompt.c	20 Nov 2007 09:55:10 -0000	1.44
+++ Src/prompt.c	15 Feb 2008 23:34:06 -0000
@@ -473,6 +473,16 @@
 		    *bp++ = Inpar;
 		}
 		break;
+	    case 'G':
+		if (arg > 0) {
+		    addbufspc(arg);
+		    while (arg--)
+			*bp++ = Nularg;
+		} else {
+		    addbufspc(1);
+		    *bp++ = Nularg;
+		}
+		break;
 	    case /*{*/ '}':
 		if (trunccount && trunccount >= dontcount)
 		    return *fm;


-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/

