zsh-workers
* Unicode problem
@ 2008-01-17 10:39 Jörg Sommer
  2008-01-17 12:09 ` Peter Stephenson
  0 siblings, 1 reply; 17+ messages in thread
From: Jörg Sommer @ 2008-01-17 10:39 UTC (permalink / raw)
  To: zsh-workers

Hi,

I'm running zsh in a UTF‐8 environment. Today I wanted to know what this
“f” character is and found that zsh can't handle it.

% unicode -s f
U+0066 LATIN SMALL LETTER F
UTF-8: 66  UTF-16BE: 0066  Decimal: &#102;
f (F)
Uppercase: U+0046
Category: Ll (Letter, Lowercase)
Bidi: L (Left-to-Right)

U+FEFF ZERO WIDTH NO-BREAK SPACE
UTF-8: ef bb bf  UTF-16BE: feff  Decimal: &#65279;

Category: Cf (Other, Format)
Bidi: BN (Boundary Neutral)

The problem seems to be the zero width character. This explains why I can
go back before the U of “unicode.”

I'm using zsh-beta (4.3.4-dev-7) from Debian.

Bye, Jörg.
-- 
He who dies sooner is dead longer.
    	 	    	       			(Un B. Kant)



* Re: Unicode problem
  2008-01-17 10:39 Unicode problem Jörg Sommer
@ 2008-01-17 12:09 ` Peter Stephenson
  2008-01-21 13:33   ` Jörg Sommer
  0 siblings, 1 reply; 17+ messages in thread
From: Peter Stephenson @ 2008-01-17 12:09 UTC (permalink / raw)
  To: zsh-workers

On Thu, 17 Jan 2008 10:39:55 +0000 (UTC)
Jörg Sommer <joerg@alea.gnuu.de> wrote:
> I'm running zsh in a UTF‐8 environment. Today I wanted to know what this
> “f” character is and found that zsh can't handle it.

Please, when sending bug reports, can people say explicitly what's actually
going wrong and what they're doing to provoke it.  I think you mean that
when you move the cursor backward it miscounts the character it's on,
but I had to guess that from the throwaway remark at the end.

> The problem seems to be the zero width character. This explains why I can
> go back before the U of “unicode.”

So that actually displays as zero width on your terminal?  (I'm having to
guess this, too.)  Yes, that's bound to cause the shell problems.  In fact,
it's going to be quite impossible to edit: how do you even know it's there?
On gnome-terminal, which I'm using, it shows up as a thick underscore
which is counted as width 1, and this works.  However, I can certainly see
problems in Konsole.  So this is apparently terminal-specific.

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070



* Re: Unicode problem
  2008-01-17 12:09 ` Peter Stephenson
@ 2008-01-21 13:33   ` Jörg Sommer
  2008-01-21 14:15     ` Peter Stephenson
  0 siblings, 1 reply; 17+ messages in thread
From: Jörg Sommer @ 2008-01-21 13:33 UTC (permalink / raw)
  To: zsh-workers

Hello Peter,

Peter Stephenson <pws@csr.com> wrote:
> On Thu, 17 Jan 2008 10:39:55 +0000 (UTC)
> Jörg Sommer <joerg@alea.gnuu.de> wrote:
>> I'm running zsh in a UTF‐8 environment. Today I wanted to know what this
>> “f” character is and found that zsh can't handle it.
>
> Please, when sending bug reports, can people say explicitly what's actually
> going wrong and what they're doing to provoke it.

Sorry, I'll improve this in my future reports.

> I think you mean that when you move the cursor backward it miscounts
> the character it's on,

Yes. I insert the characters. Pressing ^A moves the cursor before the U
of unicode.

% unicode -s f<^A>
% unicode -s f
 ^ The cursor is here.

The same happens with “s̶”, “u̲” and “o̅”.

>> The problem seems to be the zero width character. This explains why I can
>> go back before the U of “unicode.”
>
> So that actually displays as zero width on your terminal?

Yes, the zero width character gets no cell. It's drawn as a small
dashed box above the previous character “f”.

> (I'm having to guess this, too.) Yes, that's bound to cause the shell
> problems.  In fact, it's going to be quite impossible to edit: how do
> you even know it's there? On gnome-terminal, which I'm using, it shows
> up as a thick underscore which is counted as width 1,

The zero width character is drawn with width one?

Bye, Jörg.
-- 
> I hardly know my way around OpenBSD; what are its
> advantages over Linux and iptables?
The foxtail effect is greater. :->
Message-ID: <slrnb11064.54g.hschlen@humbert.ddns.org>



* Re: Unicode problem
  2008-01-21 13:33   ` Jörg Sommer
@ 2008-01-21 14:15     ` Peter Stephenson
  2008-01-21 14:29       ` Mikael Magnusson
  2008-01-21 20:28       ` Jörg Sommer
  0 siblings, 2 replies; 17+ messages in thread
From: Peter Stephenson @ 2008-01-21 14:15 UTC (permalink / raw)
  To: zsh-workers

> The zero width character is drawn with width one?

Yes, that's what gnome-terminal is doing.  It seems to agree with the
screen width the library code is reporting (I didn't check explicitly
but the behaviour is consistent with that).

I think there are actually two widths involved here: the screen width
used with a fixed-size font for editing (applicable to zsh), and the
logical width of the character that would appear in a document.  The
latter is typically different to the former (in particular when a
variable width font is in use), and it's the latter case where it needs
to be zero width.  There's a kind of hybrid case for a WYSIWYG word
processor where it needs to flag up the space for editing even though
pretending it's behaving as zero-width, but this is a special case of
variable-width fonts.  The information in the library as returned by
wcwidth() is only applicable to fixed width fonts, where "zero width" is
essentially meaningless; a character width must be 1, 2, ...  So I think
gnome-terminal is doing the right thing here, although I can understand
the confusion.

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070



* Re: Unicode problem
  2008-01-21 14:15     ` Peter Stephenson
@ 2008-01-21 14:29       ` Mikael Magnusson
  2008-01-21 14:45         ` Peter Stephenson
  2008-01-22  1:49         ` Clint Adams
  2008-01-21 20:28       ` Jörg Sommer
  1 sibling, 2 replies; 17+ messages in thread
From: Mikael Magnusson @ 2008-01-21 14:29 UTC (permalink / raw)
  To: zsh-workers

On 21/01/2008, Peter Stephenson <pws@csr.com> wrote:
> > The zero width character is drawn with width one?
>
> Yes, that's what gnome-terminal is doing.  It seems to agree with the
> screen width the library code is reporting (I didn't check explicitly
> but the behaviour is consistent with that).
>
> I think there are actually two widths involved here: the screen width
> used with a fixed-size font for editing (applicable to zsh), and the
> logical width of the character that would appear in a document.  The
> latter is typically different to the former (in particular when a
> variable width font is in use), and it's the latter case where it needs
> to be zero width.  There's a kind of hybrid case for a WYSIWYG word
> processor where it needs to flag up the space for editing even though
> pretending it's behaving as zero-width, but this is a special case of
> variable-width fonts.  The information in the library as returned by
> wcwidth() is only applicable to fixed width fonts, where "zero width" is
> essentially meaningless; a character width must be 1, 2, ...  So I think
> gnome-terminal is doing the right thing here, although I can understand
> the confusion.

Combining characters combine with the character(s) that come before
them. So if you write an feff or 336 or whatever, they will combine.
The terminal doesn't know where the prompt ends and the input line
starts, so when you give it a space and a combining char, it will draw
the combining char on top of the space before it. The combining char
always has width 0 because nothing has moved from drawing it. The
space with the combining char on top is of course width 1. If you
write a double width space and then a combining char, it will
similarly appear to have width 2.

I guess what you want to do in zsh is just count them as width 0, and
have the cursor skip over them, so if you are on the right of a<336>
and press left you should end up to the left of a. If the user types
one at the start of the line, I think you should _draw_ an extra space
before it, but not insert the space into the actual command line
(because everything should work if a binary is named with a combining
character first, I think). Alternatively you could disallow it and have
the user type out \u0336 instead (if that's the right syntax), or
maybe autoconvert it?

I don't know what to do with the zero width space though. :)

-- 
Mikael Magnusson



* Re: Unicode problem
  2008-01-21 14:29       ` Mikael Magnusson
@ 2008-01-21 14:45         ` Peter Stephenson
  2008-01-21 18:16           ` Bart Schaefer
                             ` (2 more replies)
  2008-01-22  1:49         ` Clint Adams
  1 sibling, 3 replies; 17+ messages in thread
From: Peter Stephenson @ 2008-01-21 14:45 UTC (permalink / raw)
  To: zsh-workers

"Mikael Magnusson" wrote:
> Combining characters combine with the character(s) that come before
> them.

Combining characters are definitely broken at the moment:  that looks
like a significant chunk of work (and what you say is already more than
what I know about the subject).

It looks like I was wrong: wcwidth() *is* returning zero for the
zero-width-space character, but the refresh code is written so that that
works the same way as width 1, so there is a fundamental problem here.
We could do a nasty hack so that characters whose width is zero are
treated the same way as non-printable characters, I suppose.

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070



* Re: Unicode problem
  2008-01-21 14:45         ` Peter Stephenson
@ 2008-01-21 18:16           ` Bart Schaefer
  2008-01-22  9:57             ` Peter Stephenson
  2008-01-22  0:12           ` Vincent Lefevre
  2008-01-22  1:09           ` Mikael Magnusson
  2 siblings, 1 reply; 17+ messages in thread
From: Bart Schaefer @ 2008-01-21 18:16 UTC (permalink / raw)
  To: zsh-workers

On Jan 21,  2:45pm, Peter Stephenson wrote:
}
} We could do a nasty hack so that characters whose width is zero are
} treated the same way as non-printable characters, I suppose.

In prompts, wouldn't it be sufficient for a zero-width character to
behave as if it had been written with %{ %} around it?  (Of course
that could still get all sorts of messy with PROMPT_SUBST if the
value of a variable contains a zero-width character.)

In the line editor I'm not so sure.  Treating it like a non-printable
character seems like a good first step.



* Re: Unicode problem
  2008-01-21 14:15     ` Peter Stephenson
  2008-01-21 14:29       ` Mikael Magnusson
@ 2008-01-21 20:28       ` Jörg Sommer
  1 sibling, 0 replies; 17+ messages in thread
From: Jörg Sommer @ 2008-01-21 20:28 UTC (permalink / raw)
  To: zsh-workers

Hello Peter,

Peter Stephenson <pws@csr.com> wrote:
>> The zero width character is drawn with width one?
>
> Yes, that's what gnome-terminal is doing.

I checked XTerm and the Linux console and both behave differently. They
display the character with width zero. XTerm draws a box above the
previous character and the console does nothing special.

Bye, Jörg.
-- 
> Define 'democracy' …
… a majority proves to a minority that resistance is futile.



* Re: Unicode problem
  2008-01-21 14:45         ` Peter Stephenson
  2008-01-21 18:16           ` Bart Schaefer
@ 2008-01-22  0:12           ` Vincent Lefevre
  2008-01-22  1:09           ` Mikael Magnusson
  2 siblings, 0 replies; 17+ messages in thread
From: Vincent Lefevre @ 2008-01-22  0:12 UTC (permalink / raw)
  To: zsh-workers

On 2008-01-21 14:45:21 +0000, Peter Stephenson wrote:
> It looks like I was wrong: wcwidth() *is* returning zero for the
> zero-width-space character,

This depends on the OS. On Mac OS X, for instance, U+FEFF is regarded
as a control character and wcwidth() returns -1. And the wcwidth of
the combining character U+0300 is 1 (this is also the case on HP-UX).

-- 
Vincent Lefèvre <vincent@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)



* Re: Unicode problem
  2008-01-21 14:45         ` Peter Stephenson
  2008-01-21 18:16           ` Bart Schaefer
  2008-01-22  0:12           ` Vincent Lefevre
@ 2008-01-22  1:09           ` Mikael Magnusson
  2008-01-22  1:25             ` Vincent Lefevre
  2008-01-22 16:19             ` Clint Adams
  2 siblings, 2 replies; 17+ messages in thread
From: Mikael Magnusson @ 2008-01-22  1:09 UTC (permalink / raw)
  To: zsh-workers

On 21/01/2008, Peter Stephenson <pws@csr.com> wrote:
> "Mikael Magnusson" wrote:
> > Combining characters combine with the character(s) that come before
> > them.
>
> Combining characters are definitely broken at the moment:  that looks
> like a significant chunk of work (and what you say is already more than
> what I know about the subject).
>
> It looks like I was wrong: wcwidth() *is* returning zero for the
> zero-width-space character, but the refresh code is written so that that
> works the same way as width 1, so there is a fundamental problem here.
> We could do a nasty hack so that characters whose width is zero are
> treated the same way as non-printable characters, I suppose.

I don't know if this is at all helpful and everyone probably tried it
already, but all the chars Jörg posted work fine in bash (readline).
Cursoring and backspacing over a char that has combining chars on it
skips over / deletes both of them together, and it gets cursor
movement right as far as I can tell.

-- 
Mikael Magnusson


* Re: Unicode problem
  2008-01-22  1:09           ` Mikael Magnusson
@ 2008-01-22  1:25             ` Vincent Lefevre
  2008-01-22 16:19             ` Clint Adams
  1 sibling, 0 replies; 17+ messages in thread
From: Vincent Lefevre @ 2008-01-22  1:25 UTC (permalink / raw)
  To: zsh-workers

On 2008-01-22 02:09:03 +0100, Mikael Magnusson wrote:
> I don't know if this is at all helpful and everyone probably tried it
> already, but all the chars Jörg posted work fine in bash (readline).
> Cursoring and backspacing over a char that has combining chars on it
> skips over / deletes both of them together, and it gets cursor
> movement right as far as I can tell.

Under some conditions, bash has problems with combining characters.
See http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=397086 for
instance (this bug is still present in Debian/unstable).

-- 
Vincent Lefèvre <vincent@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)



* Re: Unicode problem
  2008-01-21 14:29       ` Mikael Magnusson
  2008-01-21 14:45         ` Peter Stephenson
@ 2008-01-22  1:49         ` Clint Adams
  2008-01-22  2:07           ` Mikael Magnusson
  1 sibling, 1 reply; 17+ messages in thread
From: Clint Adams @ 2008-01-22  1:49 UTC (permalink / raw)
  To: Mikael Magnusson; +Cc: zsh-workers

On Mon, Jan 21, 2008 at 03:29:48PM +0100, Mikael Magnusson wrote:
> I guess what you want to do in zsh is just count them as width 0, and
> have the cursor skip over them, so if you are on the right of a<336>
> and press left you should end up to the left of a. If the user types

Hmm, it would seem less surprising to me if character-based zle widgets
did not act upon multiple wide characters at a time.



* Re: Unicode problem
  2008-01-22  1:49         ` Clint Adams
@ 2008-01-22  2:07           ` Mikael Magnusson
  2008-01-22 16:26             ` Clint Adams
  0 siblings, 1 reply; 17+ messages in thread
From: Mikael Magnusson @ 2008-01-22  2:07 UTC (permalink / raw)
  To: zsh-workers

On 22/01/2008, Clint Adams <clint@zsh.org> wrote:
> On Mon, Jan 21, 2008 at 03:29:48PM +0100, Mikael Magnusson wrote:
> > I guess what you want to do in zsh is just count them as width 0, and
> > have the cursor skip over them, so if you are on the right of a<336>
> > and press left you should end up to the left of a. If the user types
>
> Hmm, it would seem less surprising to me if character-based zle widgets
> did not act upon multiple wide characters at a time.

(these questions are not rhetorical :)
How do you visually represent that the cursor is between them? Why would
you want to be between them in the first place? It probably makes
sense to have backspace only delete the last combining char, not the
whole thing though.

-- 
Mikael Magnusson



* Re: Unicode problem
  2008-01-21 18:16           ` Bart Schaefer
@ 2008-01-22  9:57             ` Peter Stephenson
  0 siblings, 0 replies; 17+ messages in thread
From: Peter Stephenson @ 2008-01-22  9:57 UTC (permalink / raw)
  To: Zsh Hackers' List

On Mon, 21 Jan 2008 10:16:49 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> In the line editor I'm not so sure.  Treating it like a non-printable
> character seems like a good first step.

OK, here is a first step.  It turns out we haven't done very well with
unprintable wide characters anyway:  the only special handling is for
control characters, and the code is the same as for ASCII control
characters, which doesn't really work.

So this covers any zero-width or unprintable characters not in the range 0
to 255 when multibyte support is enabled.  Note it uses the native wide
character type, not necessarily Unicode---I don't think it's appropriate at
this level to assume Unicode.  The character shows up as hex digits in
angle brackets.  Suggest improvements if you like, but it needs to be
short.

Play with this and see if it works:  you can use insert-unicode-char to
insert character 0xfeff.

A possible way forward for the future is that I'd quite like to add
functionality for highlighting parts of the command line after 4.3.5.  (To
be more accurate, I'd quite like someone else to add it, but I don't think
that's going to happen.)  Doing this within zle_refresh.c is the easy (or
easiest) bit.  Then the non-printable character could be reverse video,
which is clearer.

This obviously doesn't preclude adding combining character support but
that's not going to happen today.

Index: Src/Zle/zle_refresh.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/Zle/zle_refresh.c,v
retrieving revision 1.52
diff -u -r1.52 zle_refresh.c
--- Src/Zle/zle_refresh.c	8 Jan 2008 15:07:02 -0000	1.52
+++ Src/Zle/zle_refresh.c	22 Jan 2008 09:54:30 -0000
@@ -447,6 +447,10 @@
     int tmpalloced;		/* flag to free tmpline when finished        */
     int remetafy;		/* flag that zle line is metafied            */
     struct rparams rpms;
+#ifdef MULTIBYTE_SUPPORT
+    int width;                  /* width of wide character                   */
+#endif
+
     
     /* If this is called from listmatches() (indirectly via trashzle()), and *
      * that was called from the end of zrefresh(), then we don't need to do  *
@@ -633,8 +637,7 @@
 		while ((++t0) & 7);
 	}
 #ifdef MULTIBYTE_SUPPORT
-	else if (iswprint(*t)) {
-	    int width = wcwidth(*t);
+	else if (iswprint(*t) && (width = wcwidth(*t)) > 0) {
 	    if (width > rpms.sen - rpms.s) {
 		/*
 		 * Too wide to fit.  Insert spaces to end of current line.
@@ -649,7 +652,7 @@
 		    rpms.nvcs = rpms.s - nbuf[rpms.nvln = rpms.ln];
 		}
 	    }
-	    if (width > rpms.sen - rpms.s) {
+	    if (width > rpms.sen - rpms.s || width == 0) {
 		/*
 		 * The screen width is too small to fit even one
 		 * occurrence.
@@ -663,7 +666,11 @@
 	    }
 	}
 #endif
-	else if (ZC_icntrl(*t)) {	/* other control character */
+	else if (ZC_icntrl(*t)
+#ifdef MULTIBYTE_SUPPORT
+		 && (unsigned)*t <= 0xffU
+#endif
+	    ) {	/* other control character */
 	    *rpms.s++ = ZWC('^');
 	    if (rpms.s == rpms.sen) {
 		/* text wrapped */
@@ -671,9 +678,42 @@
 		    break;
 	    }
 	    *rpms.s++ = (((unsigned int)*t & ~0x80u) > 31) ? ZWC('?') : (*t | ZWC('@'));
-	} else {			/* normal character */
+	}
+#ifdef MULTIBYTE_SUPPORT
+	else {
+	    /*
+	     * Not printable or zero width.
+	     * Resort to hackery.
+	     */
+	    char dispchars[11];
+	    char *dispptr = dispchars;
+	    wchar_t wc;
+
+	    if ((unsigned)*t > 0xffffU) {
+		sprintf(dispchars, "<%.08x>", (unsigned)*t);
+	    } else {
+		sprintf(dispchars, "<%.04x>", (unsigned)*t);
+	    }
+	    while (*dispptr) {
+		if (mbtowc(&wc, dispptr, 1) == 1 /* paranoia */)
+		{
+		    *rpms.s++ = wc;
+		    if (rpms.s == rpms.sen) {
+			/* text wrapped */
+			if (nextline(&rpms, 1))
+			    break;
+		    }
+		}
+		dispptr++;
+	    }
+	    if (*dispptr) /* nextline said stop processing */
+		break;
+	}
+#else
+	else {			/* normal character */
 	    *rpms.s++ = *t;
 	}
+#endif
 	if (rpms.s == rpms.sen) {
 	    /* text wrapped */
 	    if (nextline(&rpms, 1))
-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070



* Re: Unicode problem
  2008-01-22  1:09           ` Mikael Magnusson
  2008-01-22  1:25             ` Vincent Lefevre
@ 2008-01-22 16:19             ` Clint Adams
  2008-01-22 16:25               ` Mikael Magnusson
  1 sibling, 1 reply; 17+ messages in thread
From: Clint Adams @ 2008-01-22 16:19 UTC (permalink / raw)
  To: Mikael Magnusson; +Cc: zsh-workers

On Tue, Jan 22, 2008 at 02:09:03AM +0100, Mikael Magnusson wrote:
> I don't know if this is at all helpful and everyone probably tried it
> already, but all the chars Jörg posted work fine in bash (readline).
> Cursoring and backspacing over a char that has combining chars on it
> skips over / deletes both of them together, and it gets cursor
> movement right as far as I can tell.

Hmm, so if you add 4 combining characters to a letter, then make a
mistake, you need to backspace and start from scratch? Is readline
doing any kind of Unicode normalization?



* Re: Unicode problem
  2008-01-22 16:19             ` Clint Adams
@ 2008-01-22 16:25               ` Mikael Magnusson
  0 siblings, 0 replies; 17+ messages in thread
From: Mikael Magnusson @ 2008-01-22 16:25 UTC (permalink / raw)
  To: zsh-workers

On 22/01/2008, Clint Adams <clint@zsh.org> wrote:
> On Tue, Jan 22, 2008 at 02:09:03AM +0100, Mikael Magnusson wrote:
> > I don't know if this is at all helpful and everyone probably tried it
> > already, but all the chars Jörg posted work fine in bash (readline).
> > Cursoring and backspacing over a char that has combining chars on it
> > skips over / deletes both of them together, and it gets cursor
> > movement right as far as I can tell.
>
> Hmm, so if you add 4 combining characters to a letter, then make a
> mistake, you need to backspace and start from scratch?

So it appears.

> Is readline doing any kind of Unicode normalization?

$ echo é|cat -v
M-CM-)
$ echo é|cat -v
eM-LM-^A

btw I noticed just now that the text boxes in Firefox let you position the
cursor between the char and the combining char, but don't indicate
it in any other way than lack of cursor movement.

-- 
Mikael Magnusson


* Re: Unicode problem
  2008-01-22  2:07           ` Mikael Magnusson
@ 2008-01-22 16:26             ` Clint Adams
  0 siblings, 0 replies; 17+ messages in thread
From: Clint Adams @ 2008-01-22 16:26 UTC (permalink / raw)
  To: Mikael Magnusson; +Cc: zsh-workers

On Tue, Jan 22, 2008 at 03:07:50AM +0100, Mikael Magnusson wrote:
> (these questions are not rhetorical :)
> How do you visually represent that the cursor is between them? Why would

You wouldn't.

> you want to be between them in the first place? It probably makes
> sense to have backspace only delete the last combining char, not the
> whole thing though.

That seems fair.


