From: Peter Stephenson <p.w.stephenson@ntlworld.com>
To: zsh-workers <zsh-workers@sunsite.dk>
Subject: Re: crash with multibyte and matcher-list
Date: Sun, 2 Nov 2008 17:20:41 +0000
Message-ID: <20081102172041.68824e1d@pws-pc>
In-Reply-To: <237967ef0811011053u4b8751f4l7ab511d552ab1e39@mail.gmail.com>

On Sat, 1 Nov 2008 18:53:52 +0100
"Mikael Magnusson" <mikachu@gmail.com> wrote:
> GNU gdb 6.8
> This GDB was configured as "i686-pc-linux-gnu"...
> (gdb) run
> Starting program: /usr/local/bin/zsh -f
> % autoload compinit;compinit
> % zstyle ':completion:*' matcher-list 'm:{a-zA-Z}={A-Za-z}' +'r:|[._-]=* r:|=*' +'l:|=* r:|=*'
> % touch $'\u51fa'{a,b}
> % ls <tab>出<tab>
> Program received signal SIGSEGV, Segmentation fault.

This is going to be a complete hotch-potch until I take a week out of my
life to convert the whole thing to handle multibyte characters properly,
but the following ought to be safer and more future-proof.  The characters
stored in the matcher are still not handled as multibyte characters, so
this is unlikely to make completion of multibyte characters work any
better yet.
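
For anyone reading along, here is a minimal standalone sketch of the
idea behind the change: ask once how many bytes the next character
occupies, then use that length for counting, copying and advancing,
instead of special-casing a two-byte form at every site.  It is not the
zsh code.  It assumes plain UTF-8 input with no Meta encoding, it does
not reject overlong encodings, and the names char_len_conv, star_prefix
and CHAR_INVALID are hypothetical stand-ins (roughly for
MB_METACHARLENCONV, the right-anchored branch, and WEOF).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sentinel for "not a valid character", loosely analogous to WEOF. */
#define CHAR_INVALID ((long)-1)

/*
 * Hypothetical helper: return the byte length of the character at s
 * (s must not point at the terminating NUL) and store its code point
 * in *cp.  On a malformed sequence, return 1 and store CHAR_INVALID so
 * the caller can fall back to treating the byte as one character.
 */
static size_t char_len_conv(const char *s, long *cp)
{
    const unsigned char *u = (const unsigned char *)s;
    size_t len, i;
    long c;

    if (u[0] < 0x80) {
        *cp = u[0];
        return 1;
    } else if ((u[0] & 0xE0) == 0xC0) {
        len = 2; c = u[0] & 0x1F;
    } else if ((u[0] & 0xF0) == 0xE0) {
        len = 3; c = u[0] & 0x0F;
    } else if ((u[0] & 0xF8) == 0xF0) {
        len = 4; c = u[0] & 0x07;
    } else {
        *cp = CHAR_INVALID;
        return 1;
    }
    for (i = 1; i < len; i++) {
        /* A NUL or any non-continuation byte ends the scan here. */
        if ((u[i] & 0xC0) != 0x80) {
            *cp = CHAR_INVALID;
            return 1;
        }
        c = (c << 6) | (u[i] & 0x3F);
    }
    *cp = c;
    return len;
}

/*
 * Two-pass pattern: with out == NULL only count the bytes needed
 * (excluding the terminating NUL); otherwise also write the result.
 * Every character of add is emitted with a '*' in front of it.
 */
static size_t star_prefix(const char *add, char *out)
{
    size_t total = 0;

    assert(add != NULL);
    while (*add) {
        long c;  /* a real matcher would look this character up */
        size_t addlen = char_len_conv(add, &c);

        if (out) {
            *out++ = '*';
            memcpy(out, add, addlen);  /* copy the whole character */
            out += addlen;
        }
        total += addlen + 1;
        add += addlen;                 /* advance by the same length */
    }
    if (out)
        *out = '\0';
    return total;
}
```

Call star_prefix(s, NULL) first to get the size, allocate that plus one
byte, then call it again with the buffer.  Both passes share the single
length computation, so the counting pass and the copying pass cannot
disagree about how wide a character is.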

Index: Src/Zle/computil.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/Zle/computil.c,v
retrieving revision 1.112
diff -u -r1.112 computil.c
--- Src/Zle/computil.c	29 Oct 2008 01:33:23 -0000	1.112
+++ Src/Zle/computil.c	2 Nov 2008 17:18:27 -0000
@@ -4024,7 +4024,17 @@
      * management is difficult.
      */
     for (;;) {
+	MB_METACHARINIT();
 	for (mp = ms; *add; ) {
+	    convchar_t addc;
+	    int addlen;
+
+	    addlen = MB_METACHARLENCONV(add, &addc);
+#ifdef MULTIBYTE_SUPPORT
+	    if (addc == WEOF)
+		addc = (wchar_t)(*add == Meta ? add[1] ^ 32 : *add);
+#endif
+
 	    if (!(m = *mp)) {
 		/*
 		 * No matcher, so just match the character
@@ -4034,13 +4044,10 @@
 		 * metacharacter?
 		 */
 		if (ret) {
-		    if (*add == Meta) {
-			*p++ = Meta;
-			*p++ = add[1];
-		    } else
-			*p++ = *add;
+		    memcpy(p, add, addlen);
+		    p += addlen;
 		} else
-		    len += (*add == Meta) ? 2 : 1;
+		    len += addlen;
 	    } else if (m->flags & CMF_RIGHT) {
 		/*
 		 * Right-anchored:  match anything followed
@@ -4049,16 +4056,12 @@
 		if (ret) {
 		    *p++ = '*';
 		    /* TODO: quote again? */
-		    if (*add == Meta) {
-			*p++ = Meta;
-			*p++ = add[1];
-		    } else
-			*p++ = *add;
+		    memcpy(p, add, addlen);
+		    p += addlen;
 		} else
-		    len += (*add == Meta) ? 3 : 2;
+		    len += addlen + 1;
 	    } else {
 		/* The usual set of matcher possibilities. */
-		int chr = (*add == Meta) ? add[1] ^ 32 : *add;
 		int ind;
 		if (m->line->tp == CPAT_EQUIV &&
 		    m->word->tp == CPAT_EQUIV) {
@@ -4073,21 +4076,17 @@
 		     */
 		    if (ret) {
 			*p++ = '[';
-			if (*add == Meta) {
-			    *p++ = Meta;
-			    *p++ = add[1];
-			} else
-			    *p++ = *add;
+			memcpy(p, add, addlen);
+			p += addlen;
 		    } else
-			len += (*add == Meta) ? 3 : 2;
-		    if (PATMATCHRANGE(m->line->u.str, CONVCAST(chr),
-				      &ind, &mt)) {
+			len += addlen + 1;
+		    if (PATMATCHRANGE(m->line->u.str, addc, &ind, &mt)) {
 			/*
 			 * Find the equivalent match for ind in the
 			 * word pattern.
 			 */
 			if ((ind = pattern_match_equivalence
-			     (m->word, ind, mt, CONVCAST(chr))) != -1) {
+			     (m->word, ind, mt, addc)) != -1) {
 			    if (ret) {
 				if (imeta(ind)) {
 				    *p++ = Meta;
@@ -4159,7 +4158,7 @@
 			 * if *add is ] and ] is also the first
 			 * character in the range.
 			 */
-			addadd = !pattern_match1(m->word, CONVCAST(chr), &mt);
+			addadd = !pattern_match1(m->word, addc, &mt);
 			if (addadd && *add == ']') {
 			    if (ret)
 				*p++ = *add;
@@ -4196,13 +4195,10 @@
 			}
 			if (addadd && *add != ']') {
 			    if (ret) {
-				if (imeta(*add)) {
-				    *p++ = Meta;
-				    *p++ = *add ^ 32;
-				} else
-				    *p++ = *add;
+				memcpy(p, add, addlen);
+				p += addlen;
 			    } else
-				len += imeta(*add) ? 2 : 1;
+				len += addlen;
 			}
 			if (ret)
 			    *p++ = ']';
@@ -4219,13 +4215,8 @@
 		    }
 		}
 	    }
-	    if (*add == Meta) {
-		add += 2;
-		mp += 2;
-	    } else {
-		add++;
-		mp++;
-	    }
+	    add += addlen;
+	    mp++;
 	}
 	if (ret) {
 	    *p = '\0';


-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/

