From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 27354 invoked by alias); 17 Nov 2015 17:40:04 -0000 Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm Precedence: bulk X-No-Archive: yes List-Id: Zsh Workers List List-Post: List-Help: X-Seq: 37128 Received: (qmail 11111 invoked from network); 17 Nov 2015 17:40:02 -0000 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on f.primenet.com.au X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=ham autolearn_force=no version=3.4.0 X-AuditID: cbfec7f4-f79c56d0000012ee-c3-564b64142be3 Date: Tue, 17 Nov 2015 17:29:51 +0000 From: Peter Stephenson To: zsh-workers@zsh.org Subject: Re: Bug in alias expansion Message-id: <20151117172951.01bf1825@pwslap01u.europe.root.pri> In-reply-to: <20151115200356.0f3a80a2@ntlworld.com> References: <20151115200356.0f3a80a2@ntlworld.com> Organization: Samsung Cambridge Solution Centre X-Mailer: Claws Mail 3.7.9 (GTK+ 2.22.0; i386-redhat-linux-gnu) MIME-version: 1.0 Content-type: text/plain; charset=US-ASCII Content-transfer-encoding: 7bit X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFjrMLMWRmVeSWpSXmKPExsVy+t/xq7oiKd5hBuf7lC0ONj9kcmD0WHXw A1MAYxSXTUpqTmZZapG+XQJXxpaOfSwFb5UqDn9qZWpgnCzdxcjJISFgInHo3V1mCFtM4sK9 9WxdjFwcQgJLGSUO7tnLDuHMYJJ4cmoTI4SzjVHi7Kf/QA4HB4uAqsSPVm+QbjYBQ4mpm2Yz gtgiAuISZ9eeZwGxhQWUJVYcmAC2gVfAXuLhvw9gNZwCxhLX9iwAiwsJVEssWrmJDcTmF9CX uPr3ExPERfYSM6+cYYToFZT4Mfke2ExmAS2JzduaWCFseYnNa95CzVGXuHF3N/sERqFZSFpm IWmZhaRlASPzKkbR1NLkguKk9FxDveLE3OLSvHS95PzcTYyQoP2yg3HxMatDjAIcjEo8vALH vcKEWBPLiitzDzFKcDArifByWnmHCfGmJFZWpRblxxeV5qQWH2KU5mBREuedu+t9iJBAemJJ anZqakFqEUyWiYNTqoEx0I3r2Nr7io++GB1g28UbwVP8LUa6P61N5fZpyar63FJHn4BlkckC ks92VO0OWV2i1ViwITFb8Mm5u7aaf64u+jEra1mDXcfTNbs3Pt9u+nWVdkPPdCbB6unPbN++ ZfM8tM9AU2T3cf5Yg4/LVFVFDBh/L9h96Wn3G/cTSikHAn0OT3in3yaqxFKckWioxVxUnAgA erkDXlYCAAA= On Sun, 15 Nov 2015 20:03:56 +0000 Peter Stephenson wrote: > On Fri, 13 Nov 2015 13:03:52 -0500 > Kynn Jones wrote: > > % typeset -A frobozz > > % alias -g foo='echo xyz' > > % frobozz[$(foo)]=9 > > zsh: not an identifier: frobozz[$(fooech9 >... > This is assuming the tokstr you get back is the same you sent down, > which it may not because it might need reallocating --- it does indeed > in this case because we are expanding an alias inside. It confused me a > bit that it's only the end of the string (s, from lexbuf.ptr) that's > being passed back, but that's presumably because it's assuming the start > doesn't move, which is the (or an) erroneous assumption. It's possible > the fix therefore is simply ensuring that both the start and the end of > the string are pased back and the start passed in forgotten. I abandoned this after finding out that a whole tower of functions (is that a collective noun for functions, or only in completion code?) sitting on top of parse_subscript() is relying on the in-place tokenisation that was screwing up the string. We can get rid of the currently visible problems by allocating a new string for the token and copying back the result to the original. It's not very efficient, and it still has the problem that if something funny happens to the length --- and I don't think there's any guarantee of length preservation built into the lexer even though it happens to work in the cases we've looked at so far --- then somebody's going to get hurt. However, it does remove the immediate problem (is functionally no worse than before the problem was introduced) and the inefficiency should be minor unless the subscripts get really huge. In the error case, we could possibly just not do the copy but move the end pointer; however, that case doesn't need to be efficient and I don't want to change more of the logic than is necessary. diff --git a/Src/lex.c b/Src/lex.c index 89af961..81904c1 100644 --- a/Src/lex.c +++ b/Src/lex.c @@ -1617,7 +1617,7 @@ parsestrnoerr(char **s) mod_export char * parse_subscript(char *s, int sub, int endchar) { - int l = strlen(s), err; + int l = strlen(s), err, toklen; char *t; if (!*s || *s == endchar) @@ -1626,18 +1626,34 @@ parse_subscript(char *s, int sub, int endchar) untokenize(t = dupstring(s)); inpush(t, 0, NULL); strinbeg(0); + /* + * Warning to Future Generations: + * + * This way of passing the subscript through the lexer is brittle. + * Code above this for several layers assumes that when we tokenise + * the input it goes into the same place as the original string. + * However, the lexer may overwrite later bits of the string or + * reallocate it, in particular when expanding aliaes. To get + * around this, we copy the string and then copy it back. This is a + * bit more robust but still relies on the underlying assumption of + * length preservation. + */ lexbuf.len = 0; - lexbuf.ptr = tokstr = s; + lexbuf.ptr = tokstr = dupstring(s); lexbuf.siz = l + 1; err = dquote_parse(endchar, sub); + toklen = (int)(lexbuf.ptr - tokstr); + DPUTS(toklen > l, "Bad length for parsed subscript"); + memcpy(s, tokstr, toklen); if (err) { - err = *lexbuf.ptr; - *lexbuf.ptr = '\0'; + char *strend = s + toklen; + err = *strend; + *strend = '\0'; untokenize(s); - *lexbuf.ptr = err; + *strend = err; s = NULL; } else { - s = lexbuf.ptr; + s += toklen; } strinend(); inpop(); diff --git a/Test/D06subscript.ztst b/Test/D06subscript.ztst index cffca74..1449236 100644 --- a/Test/D06subscript.ztst +++ b/Test/D06subscript.ztst @@ -249,3 +249,20 @@ string[0]=! 1:Can't set only element zero of string ?(eval):1: string: assignment to invalid subscript range + + typeset -A assoc=(leader topcat officer dibble sidekick choochoo) + alias myind='echo leader' myletter='echo 1' myletter2='echo 4' + print ${assoc[$(myind)]} + print $assoc[$(myind)] + print ${assoc[$(myind)][$(myletter)]}${assoc[$(myind)][$(myletter2)]} + assoc[$(myind)]='of the gang' + print ${assoc[$(myind)]} + print $assoc[$(myind)] + print $assoc[leader] +0: Parsing subscript with non-trivial tokenisation +>topcat +>topcat +>tc +>of the gang +>of the gang +>of the gang