From mboxrd@z Thu Jan 1 00:00:00 1970
Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm
List-Id: Zsh Workers List
X-Seq: 40763
From: Bart Schaefer
Message-Id: <170306091043.ZM17901@torch.brasslantern.com>
Date: Mon, 6 Mar 2017 09:10:43 -0800
In-Reply-To: <20170306094744.290c1fbd@pwslap01u.europe.root.pri>
Comments: In reply to Peter Stephenson
        "Re: [BUG] SIGSEGV under certain circumstances" (Mar 6, 9:47am)
References: <170304151137.ZM30694@torch.brasslantern.com>
        <170305080054.ZM24832@torch.brasslantern.com>
        <20170305161720.6f3773d6@ntlworld.com>
        <170305104239.ZM25231@torch.brasslantern.com>
        <5096E600-D76C-4F71-BE93-C46F256BA7D7@ntlworld.com>
        <170305134513.ZM26364@torch.brasslantern.com>
        <20170306094744.290c1fbd@pwslap01u.europe.root.pri>
X-Mailer: OpenZMail Classic (0.9.2 24April2005)
To: zsh-workers@zsh.org
Subject: Re: [BUG] SIGSEGV under certain circumstances
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

On Mar 6, 9:47am, Peter Stephenson wrote:
} Subject: Re: [BUG] SIGSEGV under certain circumstances
}
} Bart Schaefer wrote:
} > Am I going astray here?
}
} I don't understand your point since the functions being called here are
} all adapted for metafied multibyte characters, but I'm obviously missing
} something.

No, it's me that's missing something.
The patch I sent in 40754 is more complicated than necessary, because
I didn't bother to check whether the word "meta" in the name
"mb_metacharlenconv" actually meant that it was handling zsh metafied
pairs.

At the same time, this still means that pattern_match() was using the
unmetafied wide characters to do comparisons, but cfp_matcher_pats()
was counting bytes as if there were at most one metafied pair making
up the whole of each character.

There's also the question of whether there is one entry in the "ms"
array of Cmatcher for every byte in the input string (including the
Meta bytes), or one entry for every character, or (the worst case) one
entry for every byte-or-metafied-pair of which there may be more than
one of each per wide character.

I think the answer is that since the ms array is being filled here to
be passed down to cfp_matcher_range(), it should be one entry for each
character.  cfp_matcher_range() does mp++ at the end of its loop.  So
incrementing mp by one each time in cfp_matcher_pats() must also be the
right thing.

Here's a corrected patch with unmeta_one() simplified.  Again this is
against the master, not on top of previous patches.

I'm reasonably confident in this now, so inclined to commit and wait
for explosions.  Y02compmatch passes, so I don't think anything basic
is broken.

diff --git a/Src/Zle/compmatch.c b/Src/Zle/compmatch.c
index aedf463..1cdbb8a 100644
--- a/Src/Zle/compmatch.c
+++ b/Src/Zle/compmatch.c
@@ -1548,27 +1548,11 @@ pattern_match(Cpattern p, char *s, Cpattern wp, char *ws)
 {
     convchar_t c, wc;
     convchar_t ind, wind;
-    int len = 0, wlen, mt, wmt;
-#ifdef MULTIBYTE_SUPPORT
-    mbstate_t lstate, wstate;
-
-    memset(&lstate, 0, sizeof(lstate));
-    memset(&wstate, 0, sizeof(wstate));
-#endif
+    int len = 0, wlen = 0, mt, wmt;
 
     while (p && wp && *s && *ws) {
         /* First test the word character */
-#ifdef MULTIBYTE_SUPPORT
-        wlen = mb_metacharlenconv_r(ws, &wc, &wstate);
-#else
-        if (*ws == Meta) {
-            wc = STOUC(ws[1]) ^ 32;
-            wlen = 2;
-        } else {
-            wc = STOUC(*ws);
-            wlen = 1;
-        }
-#endif
+        wc = unmeta_one(ws, &wlen);
         wind = pattern_match1(wp, wc, &wmt);
         if (!wind)
             return 0;
@@ -1576,18 +1560,7 @@ pattern_match(Cpattern p, char *s, Cpattern wp, char *ws)
         /*
          * Now the line character.
          */
-#ifdef MULTIBYTE_SUPPORT
-        len = mb_metacharlenconv_r(s, &c, &lstate);
-#else
-        /* We have the character itself. */
-        if (*s == Meta) {
-            c = STOUC(s[1]) ^ 32;
-            len = 2;
-        } else {
-            c = STOUC(*s);
-            len = 1;
-        }
-#endif
+        c = unmeta_one(s, &len);
         /*
          * If either is "?", they match each other; no further tests.
          * Apply this even if the character wasn't convertable;
@@ -1627,17 +1600,7 @@ pattern_match(Cpattern p, char *s, Cpattern wp, char *ws)
     }
 
     while (p && *s) {
-#ifdef MULTIBYTE_SUPPORT
-        len = mb_metacharlenconv_r(s, &c, &lstate);
-#else
-        if (*s == Meta) {
-            c = STOUC(s[1]) ^ 32;
-            len = 2;
-        } else {
-            c = STOUC(*s);
-            len = 1;
-        }
-#endif
+        c = unmeta_one(s, &len);
         if (!pattern_match1(p, c, &mt))
             return 0;
         p = p->next;
@@ -1645,17 +1608,7 @@ pattern_match(Cpattern p, char *s, Cpattern wp, char *ws)
     }
 
     while (wp && *ws) {
-#ifdef MULTIBYTE_SUPPORT
-        wlen = mb_metacharlenconv_r(ws, &wc, &wstate);
-#else
-        if (*ws == Meta) {
-            wc = STOUC(ws[1]) ^ 32;
-            wlen = 2;
-        } else {
-            wc = STOUC(*ws);
-            wlen = 1;
-        }
-#endif
+        wc = unmeta_one(ws, &wlen);
         if (!pattern_match1(wp, wc, &wmt))
             return 0;
         wp = wp->next;
diff --git a/Src/Zle/computil.c b/Src/Zle/computil.c
index 325da6d..e704f9f 100644
--- a/Src/Zle/computil.c
+++ b/Src/Zle/computil.c
@@ -4465,17 +4465,24 @@ cfp_matcher_pats(char *matcher, char *add)
     if (m && m != pcm_err) {
         char *tmp;
         int al = strlen(add), zl = ztrlen(add), tl, cl;
-        VARARR(Cmatcher, ms, zl);
+        VARARR(Cmatcher, ms, zl);       /* One Cmatcher per character */
         Cmatcher *mp;
         Cpattern stopp;
         int stopl = 0;
 
+        /* zl >= (number of wide characters) is guaranteed */
         memset(ms, 0, zl * sizeof(Cmatcher));
 
         for (; m && *add; m = m->next) {
             stopp = NULL;
             if (!(m->flags & (CMF_LEFT|CMF_RIGHT))) {
                 if (m->llen == 1 && m->wlen == 1) {
+                    /*
+                     * In this loop and similar loops below we step
+                     * through tmp one (possibly wide) character at a
+                     * time.  pattern_match() compares only the first
+                     * character using unmeta_one() so keep in step.
+                     */
                     for (tmp = add, tl = al, mp = ms; tl; ) {
                         if (pattern_match(m->line, tmp, NULL, NULL)) {
                             if (*mp) {
@@ -4485,10 +4492,10 @@ cfp_matcher_pats(char *matcher, char *add)
                             } else
                                 *mp = m;
                         }
-                        cl = (*tmp == Meta) ? 2 : 1;
+                        (void) unmeta_one(tmp, &cl);
                         tl -= cl;
                         tmp += cl;
-                        mp += cl;
+                        mp++;
                     }
                 } else {
                     stopp = m->line;
@@ -4505,10 +4512,10 @@ cfp_matcher_pats(char *matcher, char *add)
                             } else
                                 *mp = m;
                         }
-                        cl = (*tmp == Meta) ? 2 : 1;
+                        (void) unmeta_one(tmp, &cl);
                         tl -= cl;
                         tmp += cl;
-                        mp += cl;
+                        mp++;
                     }
                 } else if (m->llen) {
                     stopp = m->line;
@@ -4531,7 +4538,7 @@ cfp_matcher_pats(char *matcher, char *add)
                     al = tmp - add;
                     break;
                 }
-                cl = (*tmp == Meta) ? 2 : 1;
+                (void) unmeta_one(tmp, &cl);
                 tl -= cl;
                 tmp += cl;
             }
diff --git a/Src/utils.c b/Src/utils.c
index 7f3ddad..839575b 100644
--- a/Src/utils.c
+++ b/Src/utils.c
@@ -4788,6 +4788,48 @@ unmeta(const char *file_name)
 }
 
 /*
+ * Unmetafy just one character and store the number of bytes it occupied.
+ */
+/**/
+mod_export convchar_t
+unmeta_one(const char *in, int *sz)
+{
+    convchar_t wc;
+    int newsz;
+#ifdef MULTIBYTE_SUPPORT
+    int ulen;
+    mbstate_t wstate;
+#endif
+
+    if (!sz)
+        sz = &newsz;
+    *sz = 0;
+
+    if (!in || !*in)
+        return 0;
+
+#ifdef MULTIBYTE_SUPPORT
+    memset(&wstate, 0, sizeof(wstate));
+    ulen = mb_metacharlenconv_r(in, &wc, &wstate);
+    while (ulen-- > 0) {
+        if (in[*sz] == Meta)
+            *sz += 2;
+        else
+            *sz += 1;
+    }
+#else
+    if (in[0] == Meta) {
+      *sz = 2;
+      wc = STOUC(in[1] ^ 32);
+    } else {
+      *sz = 1;
+      wc = STOUC(in[0]);
+    }
+#endif
+    return wc;
+}
+
+/*
  * Unmetafy and compare two strings, comparing unsigned character values.
  * "a\0" sorts after "a".
  *
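
For anyone following along without the zsh tree handy, the counting
problem described above can be reproduced in a few lines of standalone
C.  This is only a sketch under stated assumptions, not code from zsh:
META, utf8_len(), next_byte(), one_char(), count_chars() and
count_naive() are all hypothetical names, the input is assumed to be
UTF-8, and the only zsh behaviour borrowed is that a metafied byte is
stored as a marker byte followed by the original byte XOR 32.

```c
#include <stddef.h>

#define META 0x83   /* stand-in for zsh's Meta byte (assumption of this sketch) */

/* Length of a UTF-8 sequence, judged from its lead byte. */
static int utf8_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;       /* invalid lead byte: treat it as a single byte */
}

/* Return the unmetafied byte at in[*pos] and advance *pos past it. */
static unsigned char next_byte(const unsigned char *in, int *pos)
{
    if (in[*pos] == META && in[*pos + 1]) {
        unsigned char b = in[*pos + 1] ^ 32;
        *pos += 2;
        return b;
    }
    return in[(*pos)++];
}

/*
 * Hypothetical analogue of unmeta_one(): decode one character from a
 * metafied UTF-8 string and store in *sz the number of metafied bytes
 * it occupied.
 */
unsigned long one_char(const unsigned char *in, int *sz)
{
    int pos = 0, n, i;
    unsigned long cp;
    unsigned char b;

    *sz = 0;
    if (!in || !*in)
        return 0;
    b = next_byte(in, &pos);
    n = utf8_len(b);
    cp = (n == 1) ? b : (unsigned long)(b & (0xFF >> (n + 1)));
    for (i = 1; i < n && in[pos]; i++)
        cp = (cp << 6) | (next_byte(in, &pos) & 0x3F);
    *sz = pos;
    return cp;
}

/* Step one whole character at a time, as the patched loops do. */
int count_chars(const unsigned char *s)
{
    int n = 0, sz;

    while (*s) {
        (void) one_char(s, &sz);
        s += sz;
        n++;
    }
    return n;
}

/* Step the old way: look only at whether the current byte is META. */
int count_naive(const unsigned char *s)
{
    int n = 0;

    while (*s) {
        s += (*s == META && s[1]) ? 2 : 1;
        n++;
    }
    return n;
}
```

Take the bytes { 'a', 0xC4, 0x83, 0xA3, 'b', 0 }, i.e. "a", then U+0103
whose second UTF-8 byte 0x83 has been metafied to 0x83 0xA3, then "b".
count_chars() sees 3 characters and one_char() reports that the middle
one occupies 3 bytes, while count_naive() takes 4 steps because it
splits the wide character at the embedded pair.  That mismatch is the
reason for stepping tmp by cl but mp by exactly one.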