zsh-workers
 help / color / mirror / code / Atom feed
* Broken completion with UTF-8 description
@ 2006-09-17 14:53 Andrey Borzenkov
  2006-09-17 19:16 ` Peter Stephenson
  0 siblings, 1 reply; 9+ messages in thread
From: Andrey Borzenkov @ 2006-09-17 14:53 UTC (permalink / raw)
  To: zsh-workers


[-- Attachment #1.1: Type: text/plain, Size: 990 bytes --]

Writing completion for kcmshell (attached); it outputs list of KDE 
configuration modules together with descriptions and I have not found any way 
(confirmed by KDE developers) to force descriptions in English. So far I get 
the following:

{pts/1}% kcmshell --list
Доступны следующие модули:
kwalletconfig    - Настройка бумажника KDE
privacy          - Модуль kcontrol, очищающий нежелательные следы, оставленные 
пользователем в операционной системе
...

results in following:

...
printers                       -- Настройки системы печа�[0m
privacy                        -- Модуль kcontrol, очищающ�[0m
profilechooser                 -- Mandriva KDE profilechooser
proxy                          -- Настройка серверов прок�[0m
...

(I can add screenshot)

Is it supposed to work at all? Using current CVS.

[-- Attachment #1.2: _kcmshell --]
[-- Type: text/plain, Size: 2278 bytes --]

#compdef kcmshell

# kcmshell is a command line interface to KDE configuration modules

_kcmshell_list_modules() {
  local -a x

  x=( ${${(f)"$(kcmshell --list 2> /dev/null)"}[2,-1]/[[:space:]]##-[[:space:]]/:})
  (( $#x )) || return 1

  _describe -t modules "KDE configuration module" x
}

_kcmshell() {
  local expl
  local -a context state line
  typeset -A opt_args

  _arguments \
    '--help[show help message]' \
    '--help-qt[show Qt specific options]' \
    '--help-kde[show KDE specific options]' \
    '--help-all[show all options]' \
    '--author[show author information]' \
    '-v[show version information]' \
    '--version[show version information]' \
    '--license[show license information]' \
    '(*)--list[show all available modules]' \
    '--display=:X display:_x_display' \
    '--session=:session id for restoring application: ' \
    '--cmap[use private colormap (8-bit display)]' \
    '--ncols=:limit on number of colors (8-bit display): ' \
    '--nograb[never grab mouse or keyboard]' \
    '--dograb[override nograb in debugger]' \
    '--sync[switch to synchronous mode when debugging]' \
    '--fn=:font name:_x_font' \
    '--font=:font name:_x_font' \
    '--bg=:background color:_x_color' \
    '--background=:background color:_x_color' \
    '--fg=:foreground color:_x_color' \
    '--foreround=:foreground color:_x_color' \
    '--btn=:button color:_x_color' \
    '--button=:button color:_x_color' \
    '--name=:application name: ' \
    '--title=:application title (caption): ' \
    '--visual=:specify visual:_x_visual' \
    '--inputstyle:X input method:(onthespot overthespot offthespot root)' \
    '--im:X Input Method server: ' \
    '--noxim[disable X Input Method]' \
    '--reverse[reverse widget layout]' \
    '--caption=:name in titlebar: ' \
    '--icon=:application icon: ' \
    '--miniicon=:icon in titlebar: ' \
    '--config=:configuration  file:_files' \
    '--dcopserver=:DCOP server: ' \
    '--nocrashhandler[disable crash handler, allow core dumps]' \
    '--waitforwm[wait for a WM_NET compatible window manager]' \
    '--style=:GUI style for application: ' \
    '--geometry=:client window geometry:_x_geometry' \
    '(-)1:configuration module:_kcmshell_list_modules' \
    &&  return 0
}

_kcmshell "$@"

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Broken completion with UTF-8 description
  2006-09-17 14:53 Broken completion with UTF-8 description Andrey Borzenkov
@ 2006-09-17 19:16 ` Peter Stephenson
  2006-09-21 17:04   ` PATCH: " Andrey Borzenkov
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Stephenson @ 2006-09-17 19:16 UTC (permalink / raw)
  To: Andrey Borzenkov; +Cc: zsh-workers

On Sun, 17 Sep 2006 18:53:50 +0400
Andrey Borzenkov <arvidjaar@newmail.ru> wrote:
> Writing completion for kcmshell (attached); it outputs list of KDE 
> configuration modules together with descriptions and I have not found
> any way (confirmed by KDE developers) to force descriptions in
> English. So far I get the following:
> 
> ...
> printers                       -- Настройки системы печа�[0m
> privacy                        -- Модуль kcontrol, очищающ�[0m
> profilechooser                 -- Mandriva KDE profilechooser
> proxy                          -- Настройка серверов прок�[0m
> ...

That's another evening of my life gone.

Something in computil was using strlen() to decide whether to truncate
at the end of the screen line.  Apparently descriptions never wrap.

There are lots of other strlen()s in computil, but they don't seem to be
used for anything like this.

Index: Src/Zle/computil.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/Zle/computil.c,v
retrieving revision 1.94
diff -u -r1.94 computil.c
--- Src/Zle/computil.c	30 May 2006 22:35:04 -0000	1.94
+++ Src/Zle/computil.c	17 Sep 2006 19:11:04 -0000
@@ -616,8 +616,22 @@
                     memset(buf, ' ', cd_state.pre);
                     memcpy(buf, str->str, str->len);
                     strcpy(sufp, str->desc);
-                    if (strlen(buf) >= columns - 1)
-                        buf[columns - 1] = '\0';
+                    if (MB_METASTRWIDTH(buf) >= columns - 1) {
+			char *termptr = buf;
+			int w;
+			MB_METACHARINIT();
+			for (w = columns - 1; *termptr && w > 0; ) {
+			    convchar_t cchar;
+			    int cw;
+			    termptr += MB_METACHARLENCONV(termptr, &cchar);
+			    cw = WCWIDTH(cchar);
+			    if (cw >= 0)
+				w -= cw;
+			    else
+				w--;
+			}
+                        *termptr = '\0';
+		    }
                     *dp++ = ztrdup(buf);
                 }
                 *mp = *dp = NULL;

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/


^ permalink raw reply	[flat|nested] 9+ messages in thread

* PATCH: Re: Broken completion with UTF-8 description
  2006-09-17 19:16 ` Peter Stephenson
@ 2006-09-21 17:04   ` Andrey Borzenkov
  2006-09-21 17:22     ` Peter Stephenson
  2006-09-21 20:02     ` Peter Stephenson
  0 siblings, 2 replies; 9+ messages in thread
From: Andrey Borzenkov @ 2006-09-21 17:04 UTC (permalink / raw)
  To: zsh-workers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sunday 17 September 2006 23:16, Peter Stephenson wrote:
> That's another evening of my life gone.
>

Sorry about that esp. as this did not fixed this particular case due to it 
taking different code path.

Patch below seems to correctly compute display length in all cases (where it 
is computed in compdesrcibe at all); all my tests using different combination 
of ASCII and Russian descriptions and matches passed, including the original 
example that started this thread.

How are matches sorted currently? Do we have another function doing collate 
sort or completion simply sorts in numerical order? I am not sure if using 
strpcmp is the right thing but this is the first I caught.

Half of the code is redundant. It behaves differently when we have to group 
matches (with identical descriptions) and when not. Observing that no 
grouping case is equal to group of size 1 allows eliminate this duplicated 
part. It has one visible effect - now during menu selection in case of 
grouping only match itself is highlighted while without grouping the whole 
line (match + description) is highlighted. Anyone will miss current behavior?

Code silently depends on another code somewhere inside of completion that I 
have never found :/

Please review as I am not sure is usage of ztrdup/zfree is right in this case.

And it became noticeably slow (for Russian strings) on my PIII 733 now ...

- -andrey

Index: Src/subst.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/subst.c,v
retrieving revision 1.61
diff -u -p -r1.61 subst.c
- --- Src/subst.c	20 Sep 2006 09:22:34 -0000	1.61
+++ Src/subst.c	21 Sep 2006 16:51:38 -0000
@@ -581,7 +581,7 @@ strcatsub(char **d, char *pb, char *pe, 
 typedef int (*CompareFn) _((const void *, const void *));
 
 /**/
- -int
+mod_export int
 strpcmp(const void *a, const void *b)
 {
 #ifdef HAVE_STRCOLL
Index: Src/Zle/computil.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/Zle/computil.c,v
retrieving revision 1.95
diff -u -p -r1.95 computil.c
- --- Src/Zle/computil.c	17 Sep 2006 19:23:39 -0000	1.95
+++ Src/Zle/computil.c	21 Sep 2006 16:51:39 -0000
@@ -33,6 +33,13 @@
 
 /* Help for `_describe'. */
 
+/*
+ * FIXME this should be defined globally. I have to find other places
+ * where it is used
+ * */
+
+#define INTERMATCH_GAP 2
+
 typedef struct cdset *Cdset;
 typedef struct cdstr *Cdstr;
 typedef struct cdrun *Cdrun;
@@ -40,16 +47,18 @@ typedef struct cdrun *Cdrun;
 struct cdstate {
     int showd;			/* != 0 if descriptions should be shown */
     char *sep;			/* the separator string */
- -    int slen;			/* its length */
+    int slen;			/* its metafied length */
+    int swidth;			/* its screen width */
     int maxmlen;                /* maximum length to allow for the matches */
     Cdset sets;			/* the sets of matches */
- -    int pre;                    /* longest prefix (before description) */
+    int pre;                    /* longest prefix length (before description) 
*/
+    int premaxw;		/* ... and its screen width */
     int suf;                    /* longest suffix (description) */
     int maxg;                   /* size of largest group */
     int maxglen;                /* columns for matches of largest group */
     int groups;                 /* number of groups */
     int descs;                  /* number of non-group matches with desc */
- -    int gpre;                   /* prefix length for group display */
+    int gprew;                   /* prefix screen width for group display */
     Cdrun runs;                 /* runs to report to shell code */
 };
 
@@ -59,6 +68,7 @@ struct cdstr {
     char *desc;                 /* the description or NULL */
     char *match;                /* the match to add */
     int len;                    /* length of str or match */
+    int width;			/* ... and its screen width */
     Cdstr other;                /* next string with the same description */
     int kind;                   /* 0: not in a group, 1: the first, 2: other 
*/
     Cdset set;                  /* the set this string is in */
@@ -123,7 +133,7 @@ cd_group(int maxg)
 {
     Cdset set1, set2;
     Cdstr str1, str2, *strp;
- -    int num, len;
+    int num, width;
 
     cd_state.groups = cd_state.descs = cd_state.maxglen = 0;
     cd_state.maxg = 0;
@@ -140,20 +150,20 @@ cd_group(int maxg)
                 continue;
 
             num = 1;
- -            len = str1->len + cd_state.slen;
- -            if (len > cd_state.maxglen)
- -                cd_state.maxglen = len;
+            width = str1->width + cd_state.swidth;
+            if (width > cd_state.maxglen)
+                cd_state.maxglen = width;
             strp = &(str1->other);
 
             for (set2 = set1; set2; set2 = set2->next) {
                 for (str2 = (set2 == set1 ? str1->next : set2->strs);
                      str2; str2 = str2->next)
                     if (str2->desc && !strcmp(str1->desc, str2->desc)) {
- -                        len += 2 + str2->len;
- -                        if (len > cd_state.maxmlen || num == maxg)
+                        width += INTERMATCH_GAP + str2->width;
+                        if (width > cd_state.maxmlen || num == maxg)
                             break;
- -                        if (len > cd_state.maxglen)
- -                            cd_state.maxglen = len;
+                        if (width > cd_state.maxglen)
+                            cd_state.maxglen = width;
                         str1->kind = 1;
                         str2->kind = 2;
                         num++;
@@ -194,6 +204,8 @@ cd_calc()
             set->count++;
             if ((l = strlen(str->str)) > cd_state.pre)
                 cd_state.pre = l;
+            if ((l = MB_METASTRWIDTH(str->str)) > cd_state.premaxw)
+                cd_state.premaxw = l;
             if (str->desc) {
                 set->desc++;
                 if ((l = strlen(str->desc)) > cd_state.suf)
@@ -206,7 +218,18 @@ cd_calc()
 static int
 cd_sort(const void *a, const void *b)
 {
- -    return strcmp((*((Cdstr *) a))->str, (*((Cdstr *) b))->str);
+    char *as = ztrdup((*((Cdstr *) a))->str);
+    int aslen = strlen(as);
+    char *bs = ztrdup((*((Cdstr *) b))->str);
+    int bslen = strlen(bs);
+    int ret;
+
+    unmetafy(as, &ret);
+    unmetafy(bs, &ret);
+    ret = strpcmp(&as, &bs);
+    zfree(as, aslen);
+    zfree(bs, bslen);
+    return ret;
 }
 
 static int
@@ -234,8 +257,8 @@ cd_prep()
             for (str = set->strs; str; str = str->next) {
                 if (str->kind != 1) {
                     if (!str->kind && str->desc) {
- -                        if (str->len > wids[0])
- -                            wids[0] = str->len;
+                        if (str->width > wids[0])
+                            wids[0] = str->width;
                         str->other = NULL;
                         *strp++ = str;
                     }
@@ -248,23 +271,24 @@ cd_prep()
                 for (; gp; gp = gn) {
                     gn = gp->other;
                     gp->other = NULL;
- -                    for (gpp = &gs; *gpp && (*gpp)->len > gp->len;
+                    for (gpp = &gs; *gpp && (*gpp)->width > gp->width;
                          gpp = &((*gpp)->other));
                     gp->other = *gpp;
                     *gpp = gp;
                 }
                 for (gp = gs, i = 0; gp; gp = gp->other, i++)
- -                    if (gp->len > wids[i])
- -                        wids[i] = gp->len;
+                    if (gp->width > wids[i])
+                        wids[i] = gp->width;
 
                 *strp++ = gs;
             }
 
- -        cd_state.gpre = 0;
- -        for (i = 0; i < cd_state.maxg; i++)
- -            cd_state.gpre += wids[i] + 2;
+        cd_state.gprew = 0;
+        for (i = 0; i < cd_state.maxg; i++) {
+            cd_state.gprew += wids[i] + INTERMATCH_GAP;
+	}
 
- -        if (cd_state.gpre > cd_state.maxmlen && cd_state.maxglen > 1)
+        if (cd_state.gprew > cd_state.maxmlen && cd_state.maxglen > 1)
             return 1;
 
         qsort(grps, lines, sizeof(Cdstr), cd_sort);
@@ -443,12 +467,13 @@ cd_init(char *nam, char *hide, char *mle
     }
     setp = &(cd_state.sets);
     cd_state.sep = ztrdup(sep);
- -    cd_state.slen = ztrlen(sep);
+    cd_state.slen = strlen(sep);
+    cd_state.swidth = MB_METASTRWIDTH(sep);
     cd_state.sets = NULL;
     cd_state.showd = disp;
     cd_state.maxg = cd_state.groups = cd_state.descs = 0;
     cd_state.maxmlen = atoi(mlen);
- -    itmp = columns - cd_state.slen - 4;
+    itmp = columns - cd_state.swidth - 4;
     if (cd_state.maxmlen > itmp)
         cd_state.maxmlen = itmp;
     if (cd_state.maxmlen < 4)
@@ -489,6 +514,7 @@ cd_init(char *nam, char *hide, char *mle
             *tmp = '\0';
             str->str = str->match = ztrdup(rembslash(*ap));
             str->len = strlen(str->str);
+            str->width = MB_METASTRWIDTH(str->str);
         }
         if (str)
             str->next = NULL;
@@ -600,38 +626,57 @@ cd_get(char **params)
 
         case CRT_DESC:
             {
+		/*
+		 * The buffer size:
+		 *     max prefix length (cd_state.pre) +
+		 *     max padding (cd_state.premaxw generously :) +
+		 *     separator length (cd_state.slen) +
+		 *     inter matches gap (INTERMATCH_GAP) +
+		 *     max description length (cd_state.suf) +
+		 *     trailing \0
+		 */
                 VARARR(char, buf,
- -                       cd_state.pre + cd_state.suf + cd_state.slen + 3);
- -                char *sufp = NULL;
- -
- -                memcpy(buf + cd_state.pre + 2, cd_state.sep, cd_state.slen);
- -                buf[cd_state.pre] = buf[cd_state.pre + 1] = ' ';
- -                sufp = buf + cd_state.pre + cd_state.slen + 2;
- -
+                       cd_state.pre + cd_state.suf +
+		       cd_state.premaxw + cd_state.slen + 3);
                 mats = mp = (char **) zalloc((run->count + 1) * sizeof(char 
*));
                 dpys = dp = (char **) zalloc((run->count + 1) * sizeof(char 
*));
 
                 for (str = run->strs; str; str = str->run) {
+		    char *p = buf, *pp, *d;
+		    int l, remw;
+
                     *mp++ = ztrdup(str->match);
- -                    memset(buf, ' ', cd_state.pre);
- -                    memcpy(buf, str->str, str->len);
- -                    strcpy(sufp, str->desc);
- -                    if (MB_METASTRWIDTH(buf) >= columns - 1) {
- -			char *termptr = buf;
+		    strcpy(p, str->str);
+		    p += str->len;
+                    memset(p, ' ', (l = (cd_state.premaxw - str->width + 
INTERMATCH_GAP)));
+		    p += l;
+		    strcpy(p, cd_state.sep);
+		    p += cd_state.slen;
+
+		    /*
+		     * copy a character at once until no more screen width
+		     * is available. Leave 1 character at the end of screen
+		     * as safety margin
+		     */
+		    remw = columns - cd_state.premaxw - cd_state.swidth - 3;
+		    pp = p;
+		    d = str->desc;
+		    while (*d) {
 			int w;
- -			MB_METACHARINIT();
- -			for (w = columns - 1; *termptr && w > 0; ) {
- -			    convchar_t cchar;
- -			    int cw;
- -			    termptr += MB_METACHARLENCONV(termptr, &cchar);
- -			    cw = WCWIDTH(cchar);
- -			    if (cw >= 0)
- -				w -= cw;
- -			    else
- -				w--;
+
+			l = MB_METACHARLEN(d);
+			memcpy(pp, d, l);
+			pp[l] = '\0';
+			w = MB_METASTRWIDTH(p);
+			if (w > remw) {
+			    *pp = '\0';
+			    break;
 			}
- -                        *termptr = '\0';
+
+			pp += l;
+			d += l;
 		    }
+
                     *dp++ = ztrdup(buf);
                 }
                 *mp = *dp = NULL;
@@ -684,10 +729,10 @@ cd_get(char **params)
 	default: /* This silences the "might be used uninitialized" warnings */
         case CRT_EXPL:
             {
- -                int dlen = columns - cd_state.gpre - cd_state.slen;
- -                VARARR(char, dbuf, dlen + cd_state.slen);
- -                char buf[20];
- -                int i = run->count;
+		/* add columns as safety margin */
+                VARARR(char, dbuf, cd_state.suf + cd_state.slen + columns);
+                char buf[20], *p, *pp, *d;
+                int i = run->count, remw, w, l;
 
                 sprintf(buf, "-E%d", i);
 
@@ -699,13 +744,29 @@ cd_get(char **params)
                         *dp++ = ztrdup("");
                         continue;
                     }
- -                    memset(dbuf + cd_state.slen, ' ', dlen - 1);
- -                    dbuf[dlen + cd_state.slen - 1] = '\0';
+
                     strcpy(dbuf, cd_state.sep);
- -                    memcpy(dbuf + cd_state.slen,
- -                           str->desc,
- -                           ((int)strlen(str->desc) >= dlen ? dlen - 1 :
- -                            (int)strlen(str->desc)));
+		    remw = columns - cd_state.gprew - cd_state.swidth - INTERMATCH_GAP;
+		    p = pp = dbuf + cd_state.slen;
+		    d = str->desc;
+		    while (*d) {
+			l = MB_METACHARLEN(d);
+			memcpy(pp, d, l);
+			pp[l] = '\0';
+			w = MB_METASTRWIDTH(p);
+			if (w > remw) {
+			    *pp = '\0';
+			    break;
+			}
+
+			pp += l;
+			d += l;
+		    }
+
+		    while (w++ < remw)
+			*pp++ = ' ';
+		    *pp = '\0';
+
                     *dp++ = ztrdup(dbuf);
                 }
                 mats[0] = *dp = NULL;


- -andrey
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFFEsYXR6LMutpd94wRAqqWAJ9++5aNYVpYL2sbkkWi2jo/Ty/EIwCgy0SP
fNR74jf1tEXziM2Z8L1CJak=
=Df6S
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: PATCH: Re: Broken completion with UTF-8 description
  2006-09-21 17:04   ` PATCH: " Andrey Borzenkov
@ 2006-09-21 17:22     ` Peter Stephenson
  2006-09-21 17:44       ` Andrey Borzenkov
  2006-10-07  7:58       ` Andrey Borzenkov
  2006-09-21 20:02     ` Peter Stephenson
  1 sibling, 2 replies; 9+ messages in thread
From: Peter Stephenson @ 2006-09-21 17:22 UTC (permalink / raw)
  To: zsh-workers

Andrey Borzenkov wrote:
> How are matches sorted currently? Do we have another function doing collate 
> sort or completion simply sorts in numerical order? I am not sure if using 
> strpcmp is the right thing but this is the first I caught.

Did you take a look at strbpcmp() in zle_tricky.c?

> Please review as I am not sure is usage of ztrdup/zfree is right in this case
> .

It'll work, but the standard completion system trick seems to be to
dupstring() and not worry about the temporary usage.

You'll appreciate that it's not that easy to understand the completion
code simply by looking at it, but I didn't see anything that looked
obviously stupid.

> +/*
> + * FIXME this should be defined globally. I have to find other places
> + * where it is used
> + * */
> +
> +#define INTERMATCH_GAP 2

Ah, I see you've found another undocumented magic number used without
any indication.  I found a few of those.

-- 
Peter Stephenson <pws@csr.com>                  Software Engineer
CSR PLC, Churchill House, Cambridge Business Park, Cowley Road
Cambridge, CB4 0WZ, UK                          Tel: +44 (0)1223 692070


To access the latest news from CSR copy this link into a web browser:  http://www.csr.com/email_sig.php


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: PATCH: Re: Broken completion with UTF-8 description
  2006-09-21 17:22     ` Peter Stephenson
@ 2006-09-21 17:44       ` Andrey Borzenkov
  2006-10-07  7:58       ` Andrey Borzenkov
  1 sibling, 0 replies; 9+ messages in thread
From: Andrey Borzenkov @ 2006-09-21 17:44 UTC (permalink / raw)
  To: zsh-workers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thursday 21 September 2006 21:22, Peter Stephenson wrote:
> Andrey Borzenkov wrote:
> > How are matches sorted currently? Do we have another function doing
> > collate sort or completion simply sorts in numerical order? I am not sure
> > if using strpcmp is the right thing but this is the first I caught.
>
> Did you take a look at strbpcmp() in zle_tricky.c?
>

It does not work for metafied strings which we apparently have here? 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFFEs9tR6LMutpd94wRAhK3AKDIlGRsmKv7e1aU/m2YZ9cCOIc6KQCfZZWO
BVKFcDhtx9xmYiRfdp4WklI=
=IoMi
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: PATCH: Re: Broken completion with UTF-8 description
  2006-09-21 17:04   ` PATCH: " Andrey Borzenkov
  2006-09-21 17:22     ` Peter Stephenson
@ 2006-09-21 20:02     ` Peter Stephenson
  2006-09-23  7:05       ` Andrey Borzenkov
  1 sibling, 1 reply; 9+ messages in thread
From: Peter Stephenson @ 2006-09-21 20:02 UTC (permalink / raw)
  To: zsh-workers

On Thu, 21 Sep 2006 21:04:17 +0400
Andrey Borzenkov <arvidjaar@newmail.ru> wrote:
>  static int
>  cd_sort(const void *a, const void *b)
>  {
> - -    return strcmp((*((Cdstr *) a))->str, (*((Cdstr *) b))->str);
> +    char *as = ztrdup((*((Cdstr *) a))->str);
> +    int aslen = strlen(as);
> +    char *bs = ztrdup((*((Cdstr *) b))->str);
> +    int bslen = strlen(bs);
> +    int ret;
> +
> +    unmetafy(as, &ret);
> +    unmetafy(bs, &ret);
> +    ret = strpcmp(&as, &bs);
> +    zfree(as, aslen);
> +    zfree(bs, bslen);
> +    return ret;
>  }

Looking in more detail, cd_sort is passed as an argument to qsort(), so
it may be called many times while sorting the arguments.  It's therefore
probably better to do the conversion in the caller, of which there's
only one at line 270, i.e. the str elements of the values in the array
grps should be copied to an array of unmetafied strings, and then simply
call strpcmp() in cd_sort.  In fact, since you're doing a copy anyway
you could probably convert the array to a format where you can pass
strpcmp() as the final argument and eliminate cd_sort() altogether.

That ought to improve your speed problem quite a bit.

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: PATCH: Re: Broken completion with UTF-8 description
  2006-09-21 20:02     ` Peter Stephenson
@ 2006-09-23  7:05       ` Andrey Borzenkov
  2006-09-23  7:36         ` Andrey Borzenkov
  0 siblings, 1 reply; 9+ messages in thread
From: Andrey Borzenkov @ 2006-09-23  7:05 UTC (permalink / raw)
  To: zsh-workers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Friday 22 September 2006 00:02, Peter Stephenson wrote:
>
> Looking in more detail, cd_sort is passed as an argument to qsort(), so
> it may be called many times while sorting the arguments.  It's therefore
> probably better to do the conversion in the caller, of which there's
> only one at line 270, i.e. the str elements of the values in the array
> grps should be copied to an array of unmetafied strings, and then simply
> call strpcmp() in cd_sort.  In fact, since you're doing a copy anyway
> you could probably convert the array to a format where you can pass
> strpcmp() as the final argument and eliminate cd_sort() altogether.
>
> That ought to improve your speed problem quite a bit.

Actually it became slow even before I added unmetafy; this is likely due to 
O(n**2) string width calculations done when copying description.

Anyway, you are right of course. The modified patch does it, and also relaxes 
string width calculation - it now assumes that width of string is sum of 
widths of constituting multibyte characters. If this turns out to be wrong, 
we have to put up with slow down ...

I will commit it in this form if there are now objections.

- -andrey

Index: Src/subst.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/subst.c,v
retrieving revision 1.61
diff -u -p -r1.61 subst.c
- --- Src/subst.c	20 Sep 2006 09:22:34 -0000	1.61
+++ Src/subst.c	23 Sep 2006 06:59:19 -0000
@@ -581,7 +581,7 @@ strcatsub(char **d, char *pb, char *pe, 
 typedef int (*CompareFn) _((const void *, const void *));
 
 /**/
- -int
+mod_export int
 strpcmp(const void *a, const void *b)
 {
 #ifdef HAVE_STRCOLL
Index: Src/Zle/computil.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/Zle/computil.c,v
retrieving revision 1.95
diff -u -p -r1.95 computil.c
- --- Src/Zle/computil.c	17 Sep 2006 19:23:39 -0000	1.95
+++ Src/Zle/computil.c	23 Sep 2006 06:59:19 -0000
@@ -33,6 +33,13 @@
 
 /* Help for `_describe'. */
 
+/*
+ * FIXME this should be defined globally. I have to find other places
+ * where it is used
+ * */
+
+#define INTERMATCH_GAP 2
+
 typedef struct cdset *Cdset;
 typedef struct cdstr *Cdstr;
 typedef struct cdrun *Cdrun;
@@ -40,16 +47,18 @@ typedef struct cdrun *Cdrun;
 struct cdstate {
     int showd;			/* != 0 if descriptions should be shown */
     char *sep;			/* the separator string */
- -    int slen;			/* its length */
+    int slen;			/* its metafied length */
+    int swidth;			/* its screen width */
     int maxmlen;                /* maximum length to allow for the matches */
     Cdset sets;			/* the sets of matches */
- -    int pre;                    /* longest prefix (before description) */
+    int pre;                    /* longest prefix length (before description) 
*/
+    int premaxw;		/* ... and its screen width */
     int suf;                    /* longest suffix (description) */
     int maxg;                   /* size of largest group */
     int maxglen;                /* columns for matches of largest group */
     int groups;                 /* number of groups */
     int descs;                  /* number of non-group matches with desc */
- -    int gpre;                   /* prefix length for group display */
+    int gprew;                   /* prefix screen width for group display */
     Cdrun runs;                 /* runs to report to shell code */
 };
 
@@ -58,7 +67,9 @@ struct cdstr {
     char *str;                  /* the string to display */
     char *desc;                 /* the description or NULL */
     char *match;                /* the match to add */
+    char *sortstr;		/* unmetafied string used to sort matches */
     int len;                    /* length of str or match */
+    int width;			/* ... and its screen width */
     Cdstr other;                /* next string with the same description */
     int kind;                   /* 0: not in a group, 1: the first, 2: other 
*/
     Cdset set;                  /* the set this string is in */
@@ -102,6 +113,7 @@ freecdsets(Cdset p)
 	    freearray(p->opts);
         for (s = p->strs; s; s = sn) {
             sn = s->next;
+	    zfree(s->sortstr, strlen(s->str) + 1);
             zsfree(s->str);
             zsfree(s->desc);
             if (s->match != s->str)
@@ -123,7 +135,7 @@ cd_group(int maxg)
 {
     Cdset set1, set2;
     Cdstr str1, str2, *strp;
- -    int num, len;
+    int num, width;
 
     cd_state.groups = cd_state.descs = cd_state.maxglen = 0;
     cd_state.maxg = 0;
@@ -140,20 +152,20 @@ cd_group(int maxg)
                 continue;
 
             num = 1;
- -            len = str1->len + cd_state.slen;
- -            if (len > cd_state.maxglen)
- -                cd_state.maxglen = len;
+            width = str1->width + cd_state.swidth;
+            if (width > cd_state.maxglen)
+                cd_state.maxglen = width;
             strp = &(str1->other);
 
             for (set2 = set1; set2; set2 = set2->next) {
                 for (str2 = (set2 == set1 ? str1->next : set2->strs);
                      str2; str2 = str2->next)
                     if (str2->desc && !strcmp(str1->desc, str2->desc)) {
- -                        len += 2 + str2->len;
- -                        if (len > cd_state.maxmlen || num == maxg)
+                        width += INTERMATCH_GAP + str2->width;
+                        if (width > cd_state.maxmlen || num == maxg)
                             break;
- -                        if (len > cd_state.maxglen)
- -                            cd_state.maxglen = len;
+                        if (width > cd_state.maxglen)
+                            cd_state.maxglen = width;
                         str1->kind = 1;
                         str2->kind = 2;
                         num++;
@@ -194,6 +206,8 @@ cd_calc()
             set->count++;
             if ((l = strlen(str->str)) > cd_state.pre)
                 cd_state.pre = l;
+            if ((l = MB_METASTRWIDTH(str->str)) > cd_state.premaxw)
+                cd_state.premaxw = l;
             if (str->desc) {
                 set->desc++;
                 if ((l = strlen(str->desc)) > cd_state.suf)
@@ -206,7 +220,7 @@ cd_calc()
 static int
 cd_sort(const void *a, const void *b)
 {
- -    return strcmp((*((Cdstr *) a))->str, (*((Cdstr *) b))->str);
+    return strpcmp(&(*((Cdstr *) a))->sortstr, &(*((Cdstr *) b))->sortstr);
 }
 
 static int
@@ -234,8 +248,8 @@ cd_prep()
             for (str = set->strs; str; str = str->next) {
                 if (str->kind != 1) {
                     if (!str->kind && str->desc) {
- -                        if (str->len > wids[0])
- -                            wids[0] = str->len;
+                        if (str->width > wids[0])
+                            wids[0] = str->width;
                         str->other = NULL;
                         *strp++ = str;
                     }
@@ -248,25 +262,34 @@ cd_prep()
                 for (; gp; gp = gn) {
                     gn = gp->other;
                     gp->other = NULL;
- -                    for (gpp = &gs; *gpp && (*gpp)->len > gp->len;
+                    for (gpp = &gs; *gpp && (*gpp)->width > gp->width;
                          gpp = &((*gpp)->other));
                     gp->other = *gpp;
                     *gpp = gp;
                 }
                 for (gp = gs, i = 0; gp; gp = gp->other, i++)
- -                    if (gp->len > wids[i])
- -                        wids[i] = gp->len;
+                    if (gp->width > wids[i])
+                        wids[i] = gp->width;
 
                 *strp++ = gs;
             }
 
- -        cd_state.gpre = 0;
- -        for (i = 0; i < cd_state.maxg; i++)
- -            cd_state.gpre += wids[i] + 2;
+        cd_state.gprew = 0;
+        for (i = 0; i < cd_state.maxg; i++) {
+            cd_state.gprew += wids[i] + INTERMATCH_GAP;
+	}
 
- -        if (cd_state.gpre > cd_state.maxmlen && cd_state.maxglen > 1)
+        if (cd_state.gprew > cd_state.maxmlen && cd_state.maxglen > 1)
             return 1;
 
+	for (i = 0; i < lines; i++) {
+	    Cdstr s = grps[i];
+	    int dummy;
+
+	    s->sortstr = ztrdup(s->str);
+	    unmetafy(s->sortstr, &dummy);
+	}
+
         qsort(grps, lines, sizeof(Cdstr), cd_sort);
 
         for (i = lines, strp = grps; i > 1; i--, strp++) {
@@ -443,12 +466,13 @@ cd_init(char *nam, char *hide, char *mle
     }
     setp = &(cd_state.sets);
     cd_state.sep = ztrdup(sep);
- -    cd_state.slen = ztrlen(sep);
+    cd_state.slen = strlen(sep);
+    cd_state.swidth = MB_METASTRWIDTH(sep);
     cd_state.sets = NULL;
     cd_state.showd = disp;
     cd_state.maxg = cd_state.groups = cd_state.descs = 0;
     cd_state.maxmlen = atoi(mlen);
- -    itmp = columns - cd_state.slen - 4;
+    itmp = columns - cd_state.swidth - 4;
     if (cd_state.maxmlen > itmp)
         cd_state.maxmlen = itmp;
     if (cd_state.maxmlen < 4)
@@ -489,6 +513,8 @@ cd_init(char *nam, char *hide, char *mle
             *tmp = '\0';
             str->str = str->match = ztrdup(rembslash(*ap));
             str->len = strlen(str->str);
+            str->width = MB_METASTRWIDTH(str->str);
+	    str->sortstr = NULL;
         }
         if (str)
             str->next = NULL;
@@ -600,38 +626,61 @@ cd_get(char **params)
 
         case CRT_DESC:
             {
+		/*
+		 * The buffer size:
+		 *     max prefix length (cd_state.pre) +
+		 *     max padding (cd_state.premaxw generously :) +
+		 *     separator length (cd_state.slen) +
+		 *     inter matches gap (INTERMATCH_GAP) +
+		 *     max description length (cd_state.suf) +
+		 *     trailing \0
+		 */
                 VARARR(char, buf,
- -                       cd_state.pre + cd_state.suf + cd_state.slen + 3);
- -                char *sufp = NULL;
- -
- -                memcpy(buf + cd_state.pre + 2, cd_state.sep, cd_state.slen);
- -                buf[cd_state.pre] = buf[cd_state.pre + 1] = ' ';
- -                sufp = buf + cd_state.pre + cd_state.slen + 2;
- -
+                       cd_state.pre + cd_state.suf +
+		       cd_state.premaxw + cd_state.slen + 3);
                 mats = mp = (char **) zalloc((run->count + 1) * sizeof(char 
*));
                 dpys = dp = (char **) zalloc((run->count + 1) * sizeof(char 
*));
 
                 for (str = run->strs; str; str = str->run) {
+		    char *p = buf, *pp, *d;
+		    int l, remw, w;
+
                     *mp++ = ztrdup(str->match);
- -                    memset(buf, ' ', cd_state.pre);
- -                    memcpy(buf, str->str, str->len);
- -                    strcpy(sufp, str->desc);
- -                    if (MB_METASTRWIDTH(buf) >= columns - 1) {
- -			char *termptr = buf;
- -			int w;
- -			MB_METACHARINIT();
- -			for (w = columns - 1; *termptr && w > 0; ) {
- -			    convchar_t cchar;
- -			    int cw;
- -			    termptr += MB_METACHARLENCONV(termptr, &cchar);
- -			    cw = WCWIDTH(cchar);
- -			    if (cw >= 0)
- -				w -= cw;
- -			    else
- -				w--;
+		    strcpy(p, str->str);
+		    p += str->len;
+                    memset(p, ' ', (l = (cd_state.premaxw - str->width + 
INTERMATCH_GAP)));
+		    p += l;
+		    strcpy(p, cd_state.sep);
+		    p += cd_state.slen;
+
+		    /*
+		     * copy a character at once until no more screen width
+		     * is available. Leave 1 character at the end of screen
+		     * as safety margin
+		     */
+		    remw = columns - cd_state.premaxw - cd_state.swidth - 3;
+		    d = str->desc;
+		    w = MB_METASTRWIDTH(d);
+		    if (w <= remw)
+			strcpy(p, d);
+		    else {
+			pp = p;
+			while (remw > 0 && *d) {
+			    l = MB_METACHARLEN(d);
+			    memcpy(pp, d, l);
+			    pp[l] = '\0';
+			    w = MB_METASTRWIDTH(pp);
+			    if (w > remw) {
+				*pp = '\0';
+				break;
+			    }
+
+			    pp += l;
+			    d += l;
+			    remw -= w;
 			}
- -                        *termptr = '\0';
 		    }
+
                     *dp++ = ztrdup(buf);
                 }
                 *mp = *dp = NULL;
@@ -684,10 +733,10 @@ cd_get(char **params)
 	default: /* This silences the "might be used uninitialized" warnings */
         case CRT_EXPL:
             {
- -                int dlen = columns - cd_state.gpre - cd_state.slen;
- -                VARARR(char, dbuf, dlen + cd_state.slen);
- -                char buf[20];
- -                int i = run->count;
+		/* add columns as safety margin */
+                VARARR(char, dbuf, cd_state.suf + cd_state.slen + columns);
+                char buf[20], *p, *pp, *d;
+                int i = run->count, remw, w, l;
 
                 sprintf(buf, "-E%d", i);
 
@@ -699,13 +748,36 @@ cd_get(char **params)
                         *dp++ = ztrdup("");
                         continue;
                     }
- -                    memset(dbuf + cd_state.slen, ' ', dlen - 1);
- -                    dbuf[dlen + cd_state.slen - 1] = '\0';
+
                     strcpy(dbuf, cd_state.sep);
- -                    memcpy(dbuf + cd_state.slen,
- -                           str->desc,
- -                           ((int)strlen(str->desc) >= dlen ? dlen - 1 :
- -                            (int)strlen(str->desc)));
+		    remw = columns - cd_state.gprew - cd_state.swidth - INTERMATCH_GAP;
+		    p = pp = dbuf + cd_state.slen;
+		    d = str->desc;
+		    w = MB_METASTRWIDTH(d);
+		    if (w <= remw) {
+			strcpy(p, d);
+			remw -= w;
+			pp += strlen(d);
+		    } else
+			while (remw > 0 && *d) {
+			    l = MB_METACHARLEN(d);
+			    memcpy(pp, d, l);
+			    pp[l] = '\0';
+			    w = MB_METASTRWIDTH(pp);
+			    if (w > remw) {
+				*pp = '\0';
+				break;
+			    }
+
+			    pp += l;
+			    d += l;
+			    remw -= w;
+			}
+
+		    while (remw-- > 0)
+			*pp++ = ' ';
+		    *pp = '\0';
+
                     *dp++ = ztrdup(dbuf);
                 }
                 mats[0] = *dp = NULL;
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFFFNy8R6LMutpd94wRAtuXAKCdagFuUTvkXWYTHWo1whRcmwbwhQCfQv8c
pSa0X+lYrgYPkPV3GwF689w=
=78F3
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: PATCH: Re: Broken completion with UTF-8 description
  2006-09-23  7:05       ` Andrey Borzenkov
@ 2006-09-23  7:36         ` Andrey Borzenkov
  0 siblings, 0 replies; 9+ messages in thread
From: Andrey Borzenkov @ 2006-09-23  7:36 UTC (permalink / raw)
  To: zsh-workers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Saturday 23 September 2006 11:05, Andrey Borzenkov wrote:
> I will commit it in this form if there are now objections.
s/now/no/ :)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFFFOQKR6LMutpd94wRAnE3AJ4mwqBv/uCxH9CRhJBNBa0ti9x4bwCgvpcj
rwXlgr/meBDbY7YSuJclm+o=
=90E/
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: PATCH: Re: Broken completion with UTF-8 description
  2006-09-21 17:22     ` Peter Stephenson
  2006-09-21 17:44       ` Andrey Borzenkov
@ 2006-10-07  7:58       ` Andrey Borzenkov
  1 sibling, 0 replies; 9+ messages in thread
From: Andrey Borzenkov @ 2006-10-07  7:58 UTC (permalink / raw)
  To: zsh-workers

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Thursday 21 September 2006 21:22, Peter Stephenson wrote:
> > +#define INTERMATCH_GAP 2
>
> Ah, I see you've found another undocumented magic number used without
> any indication.  I found a few of those.

well, it was documented just not consistently used everywhere.

Index: Src/Zle/comp.h
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/Zle/comp.h,v
retrieving revision 1.15
diff -u -p -r1.15 comp.h
- --- Src/Zle/comp.h	7 Mar 2006 12:52:28 -0000	1.15
+++ Src/Zle/comp.h	7 Oct 2006 07:57:13 -0000
@@ -407,3 +407,7 @@ struct chdata {
     Cmatch cur;			/* current match or NULL */
 };
 
+/* The number of columns to leave empty between rows of matches. */
+
+#define CM_SPACE  2
+
Index: Src/Zle/compresult.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/Zle/compresult.c,v
retrieving revision 1.67
diff -u -p -r1.67 compresult.c
- --- Src/Zle/compresult.c	6 Oct 2006 16:48:42 -0000	1.67
+++ Src/Zle/compresult.c	7 Oct 2006 07:57:14 -0000
@@ -30,10 +30,6 @@
 #include "complete.mdh"
 #include "compresult.pro"
 
- -/* The number of columns to leave empty between rows of matches. */
- -
- -#define CM_SPACE  2
- -
 /* This counts how often the list of completions was invalidated.
  * Can be used to detect if we have a new list.  */
 
Index: Src/Zle/computil.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/Zle/computil.c,v
retrieving revision 1.98
diff -u -p -r1.98 computil.c
- --- Src/Zle/computil.c	5 Oct 2006 21:53:27 -0000	1.98
+++ Src/Zle/computil.c	7 Oct 2006 07:57:15 -0000
@@ -33,13 +33,6 @@
 
 /* Help for `_describe'. */
 
- -/*
- - * FIXME this should be defined globally. I have to find other places
- - * where it is used
- - * */
- -
- -#define INTERMATCH_GAP 2
- -
 typedef struct cdset *Cdset;
 typedef struct cdstr *Cdstr;
 typedef struct cdrun *Cdrun;
@@ -161,7 +154,7 @@ cd_group(int maxg)
                 for (str2 = (set2 == set1 ? str1->next : set2->strs);
                      str2; str2 = str2->next)
                     if (str2->desc && !strcmp(str1->desc, str2->desc)) {
- -                        width += INTERMATCH_GAP + str2->width;
+                        width += CM_SPACE + str2->width;
                         if (width > cd_state.maxmlen || num == maxg)
                             break;
                         if (width > cd_state.maxglen)
@@ -276,7 +269,7 @@ cd_prep()
 
         cd_state.gprew = 0;
         for (i = 0; i < cd_state.maxg; i++) {
- -            cd_state.gprew += wids[i] + INTERMATCH_GAP;
+            cd_state.gprew += wids[i] + CM_SPACE;
 	}
 
         if (cd_state.gprew > cd_state.maxmlen && cd_state.maxglen > 1)
@@ -631,7 +624,7 @@ cd_get(char **params)
 		 *     max prefix length (cd_state.pre) +
 		 *     max padding (cd_state.premaxw generously :) +
 		 *     separator length (cd_state.slen) +
- -		 *     inter matches gap (INTERMATCH_GAP) +
+		 *     inter matches gap (CM_SPACE) +
 		 *     max description length (cd_state.suf) +
 		 *     trailing \0
 		 */
@@ -648,7 +641,7 @@ cd_get(char **params)
                     *mp++ = ztrdup(str->match);
 		    strcpy(p, str->str);
 		    p += str->len;
- -                    memset(p, ' ', (l = (cd_state.premaxw - str->width + 
INTERMATCH_GAP)));
+                    memset(p, ' ', (l = (cd_state.premaxw - str->width + 
CM_SPACE)));
 		    p += l;
 		    strcpy(p, cd_state.sep);
 		    p += cd_state.slen;
@@ -750,7 +743,7 @@ cd_get(char **params)
                     }
 
                     strcpy(dbuf, cd_state.sep);
- -		    remw = columns - cd_state.gprew - cd_state.swidth - INTERMATCH_GAP;
+		    remw = columns - cd_state.gprew - cd_state.swidth - CM_SPACE;
 		    p = pp = dbuf + cd_state.slen;
 		    d = str->desc;
 		    w = MB_METASTRWIDTH(d);

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)

iD8DBQFFJ14uR6LMutpd94wRAg2sAJ0UEGWlS8a71UgyyrvsLymk9bMdEwCeOsUD
HOonqPkgXe2cAo/JfTaxXNI=
=mk3C
-----END PGP SIGNATURE-----


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2006-10-07  7:59 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-09-17 14:53 Broken completion with UTF-8 description Andrey Borzenkov
2006-09-17 19:16 ` Peter Stephenson
2006-09-21 17:04   ` PATCH: " Andrey Borzenkov
2006-09-21 17:22     ` Peter Stephenson
2006-09-21 17:44       ` Andrey Borzenkov
2006-10-07  7:58       ` Andrey Borzenkov
2006-09-21 20:02     ` Peter Stephenson
2006-09-23  7:05       ` Andrey Borzenkov
2006-09-23  7:36         ` Andrey Borzenkov

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).