Mailing-List: contact zsh-workers-help@sunsite.dk; run by ezmlm
X-Seq: 24264
Date: Sun, 16 Dec 2007 13:52:27 +0000
From: Peter Stephenson
To: "Zsh Hackers' List"
Subject: Re: PATCH: internal parameter flags (resend)
Message-Id:
 <20071216135227.1ca879c2.p.w.stephenson@ntlworld.com>
In-Reply-To: <071214215619.ZM2188@torch.brasslantern.com>
References: <20071213204318.2ff3e43c.p.w.stephenson@ntlworld.com>
 <071213204651.ZM446@torch.brasslantern.com>
 <200712141014.lBEAEmZt001008@news01.csr.com>
 <071214215619.ZM2188@torch.brasslantern.com>
X-Mailer: Sylpheed 2.4.7 (GTK+ 2.12.1; x86_64-redhat-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Virus-Scanned: ClamAV using ClamSMTP

On Fri, 14 Dec 2007 21:56:19 -0800
Bart Schaefer wrote:
> So, although I still think that ${(P)param} should use the un-altered
> value

In any case, I've thought of a killer argument why at least one point
where we fetch a value internally must retrieve the unaltered value,
which is vared. There's no argument in that case. As my whole idea was
to avoid complicated rules about what happens where, applying the old
behaviour to (P) etc. as well is about the only neat way out.

> I'm willing to let the other part of this go because of the
> following bad inconsistency with the "old way":
>
> torch% typeset -Z5 x=6
> torch% print $#x
> 5
> torch% print $x[4]
>
> torch%
>
> Either $#x should report the "real" length, or $x[4] should index into
> the string whose length was counted. I suspect that changing $#x in
> this case would break a lot more things than changing the subscript.

I'm much more convinced about this part than the other part. What I've
done is keep the ordering of flag application and subscripting from the
new patch, but added an extra value flag to indicate a call from the
substitution code at the point where the flags used to be applied. So
this restricts the effect to something that looks uncontroversial.

I've also (for the first time that I can see) tried to document the
point at which the flags are applied.
The simplest way of doing this I could see was to say that generally
they are only applied with $param expansions, but explicitly to say
they're not applied with the substitution (P) flag.

Ismail Dönmez wrote:
> Indeed only one warning left:
>
> expn.yo:1065: No macro: time(...)

This was a typo in the original patch.

Index: Doc/Zsh/builtins.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/builtins.yo,v
retrieving revision 1.101
diff -u -r1.101 builtins.yo
--- Doc/Zsh/builtins.yo 12 Dec 2007 18:43:29 -0000 1.101
+++ Doc/Zsh/builtins.yo 16 Dec 2007 13:42:30 -0000
@@ -1429,6 +1429,11 @@
 flags, and all those flags are introduced with tt(PLUS()), the matching
 parameter names are printed but their values are not.
 
+Attribute flags that transform the final value (tt(-L), tt(-R), tt(-Z),
+tt(-l), tt(-u)) are only applied to the expanded value at the point
+of a `tt($)' expansion. They are not applied when a parameter is
+retrieved internally by the shell for any purpose.
+
 The following attribute flags may be specified:
 
 startitem()
Index: Doc/Zsh/expn.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/expn.yo,v
retrieving revision 1.85
diff -u -r1.85 expn.yo
--- Doc/Zsh/expn.yo 13 Dec 2007 21:57:18 -0000 1.85
+++ Doc/Zsh/expn.yo 16 Dec 2007 13:42:32 -0000
@@ -785,8 +785,12 @@
 )
 item(tt(P))(
 This forces the value of the parameter var(name) to be interpreted as a
-further parameter name, whose value will be used where appropriate. If
-used with a nested parameter or command substitution, the result of that
+further parameter name, whose value will be used where appropriate.
+Note that flags set with one of the tt(typeset) family of commands
+(in particular case transformations) are not applied to the value of
+var(name) used in this fashion.
+
+If used with a nested parameter or command substitution, the result of that
 will be taken as a parameter name in the same way. For example, if you
 have `tt(foo=bar)' and `tt(bar=baz)', the strings tt(${(P)foo}),
 tt(${(P)${foo}}), and tt(${(P)$(echo bar)}) will be expanded to `tt(baz)'.
@@ -1062,7 +1066,7 @@
 substitution then applies the modifier tt(:h) and takes the directory part
 of the path.)
 )
-time(tt(2.) em(Internal Parameter Flags))(
+item(tt(2.) em(Internal Parameter Flags))(
 Any parameter flags set by one of the tt(typeset) family of commands, in
 particular the tt(L), tt(R), tt(Z), tt(u) and tt(l) flags for padding
 and capitalization, are applied directly to the parameter value.
Index: Src/params.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/params.c,v
retrieving revision 1.138
diff -u -r1.138 params.c
--- Src/params.c 13 Dec 2007 20:52:56 -0000 1.138
+++ Src/params.c 16 Dec 2007 13:42:35 -0000
@@ -1889,129 +1889,131 @@
         break;
     }
 
-    if (v->pm->node.flags & (PM_LEFT|PM_RIGHT_B|PM_RIGHT_Z)) {
-        int fwidth = v->pm->width ? v->pm->width : MB_METASTRLEN(s);
-        switch (v->pm->node.flags & (PM_LEFT | PM_RIGHT_B | PM_RIGHT_Z)) {
-            char *t, *tend;
-            unsigned int t0;
-
-        case PM_LEFT:
-        case PM_LEFT | PM_RIGHT_Z:
-            t = s;
-            if (v->pm->node.flags & PM_RIGHT_Z)
-                while (*t == '0')
-                    t++;
-            else
-                while (iblank(*t))
-                    t++;
-            MB_METACHARINIT();
-            for (tend = t, t0 = 0; t0 < fwidth && *tend; t0++)
-                tend += MB_METACHARLEN(tend);
-            /*
-             * t0 is the number of characters from t used,
-             * hence (fwidth - t0) is the number of padding
-             * characters. fwidth is a misnomer: we use
-             * character counts, not character widths.
-             *
-             * (tend - t) is the number of bytes we need
-             * to get fwidth characters or the entire string;
-             * the characters may be multiple bytes.
-             */
-            fwidth -= t0; /* padding chars remaining */
-            t0 = tend - t; /* bytes to copy from string */
-            s = (char *) hcalloc(t0 + fwidth + 1);
-            memcpy(s, t, t0);
-            if (fwidth)
-                memset(s + t0, ' ', fwidth);
-            s[t0 + fwidth] = '\0';
-            break;
-        case PM_RIGHT_B:
-        case PM_RIGHT_Z:
-        case PM_RIGHT_Z | PM_RIGHT_B:
-            {
-                int zero = 1;
-                /* Calculate length in possibly multibyte chars */
-                unsigned int charlen = MB_METASTRLEN(s);
-
-                if (charlen < fwidth) {
-                    char *valprefend = s;
-                    int preflen;
-                    if (v->pm->node.flags & PM_RIGHT_Z) {
-                        /*
-                         * This is a documented feature: when deciding
-                         * whether to pad with zeroes, ignore
-                         * leading blanks already in the value;
-                         * only look for numbers after that.
-                         * Not sure how useful this really is.
-                         * It's certainly confusing to code around.
-                         */
-                        for (t = s; iblank(*t); t++)
-                            ;
-                        /*
-                         * Allow padding after initial minus
-                         * for numeric variables.
-                         */
-                        if ((v->pm->node.flags &
-                             (PM_INTEGER|PM_EFLOAT|PM_FFLOAT)) &&
-                            *t == '-')
-                            t++;
+    if (v->flags & VALFLAG_SUBST) {
+        if (v->pm->node.flags & (PM_LEFT|PM_RIGHT_B|PM_RIGHT_Z)) {
+            int fwidth = v->pm->width ? v->pm->width : MB_METASTRLEN(s);
+            switch (v->pm->node.flags & (PM_LEFT | PM_RIGHT_B | PM_RIGHT_Z)) {
+                char *t, *tend;
+                unsigned int t0;
+
+            case PM_LEFT:
+            case PM_LEFT | PM_RIGHT_Z:
+                t = s;
+                if (v->pm->node.flags & PM_RIGHT_Z)
+                    while (*t == '0')
+                        t++;
+                else
+                    while (iblank(*t))
+                        t++;
+                MB_METACHARINIT();
+                for (tend = t, t0 = 0; t0 < fwidth && *tend; t0++)
+                    tend += MB_METACHARLEN(tend);
+                /*
+                 * t0 is the number of characters from t used,
+                 * hence (fwidth - t0) is the number of padding
+                 * characters. fwidth is a misnomer: we use
+                 * character counts, not character widths.
+                 *
+                 * (tend - t) is the number of bytes we need
+                 * to get fwidth characters or the entire string;
+                 * the characters may be multiple bytes.
+                 */
+                fwidth -= t0; /* padding chars remaining */
+                t0 = tend - t; /* bytes to copy from string */
+                s = (char *) hcalloc(t0 + fwidth + 1);
+                memcpy(s, t, t0);
+                if (fwidth)
+                    memset(s + t0, ' ', fwidth);
+                s[t0 + fwidth] = '\0';
+                break;
+            case PM_RIGHT_B:
+            case PM_RIGHT_Z:
+            case PM_RIGHT_Z | PM_RIGHT_B:
+                {
+                    int zero = 1;
+                    /* Calculate length in possibly multibyte chars */
+                    unsigned int charlen = MB_METASTRLEN(s);
+
+                    if (charlen < fwidth) {
+                        char *valprefend = s;
+                        int preflen;
+                        if (v->pm->node.flags & PM_RIGHT_Z) {
+                            /*
+                             * This is a documented feature: when deciding
+                             * whether to pad with zeroes, ignore
+                             * leading blanks already in the value;
+                             * only look for numbers after that.
+                             * Not sure how useful this really is.
+                             * It's certainly confusing to code around.
+                             */
+                            for (t = s; iblank(*t); t++)
+                                ;
+                            /*
+                             * Allow padding after initial minus
+                             * for numeric variables.
+                             */
+                            if ((v->pm->node.flags &
+                                 (PM_INTEGER|PM_EFLOAT|PM_FFLOAT)) &&
+                                *t == '-')
+                                t++;
+                            /*
+                             * Allow padding after initial 0x or
+                             * base# for integer variables.
+                             */
+                            if (v->pm->node.flags & PM_INTEGER) {
+                                if (isset(CBASES) &&
+                                    t[0] == '0' && t[1] == 'x')
+                                    t += 2;
+                                else if ((valprefend = strchr(t, '#')))
+                                    t = valprefend + 1;
+                            }
+                            valprefend = t;
+                            if (!*t)
+                                zero = 0;
+                            else if (v->pm->node.flags &
+                                     (PM_INTEGER|PM_EFLOAT|PM_FFLOAT)) {
+                                /* zero always OK */
+                            } else if (!idigit(*t))
+                                zero = 0;
+                        }
+                        /* number of characters needed for padding */
+                        fwidth -= charlen;
+                        /* bytes from original string */
+                        t0 = strlen(s);
+                        t = (char *) hcalloc(fwidth + t0 + 1);
+                        /* prefix guaranteed to be single byte chars */
+                        preflen = valprefend - s;
+                        memset(t + preflen,
+                               (((v->pm->node.flags & PM_RIGHT_B)
+                                 || !zero) ? ' ' : '0'), fwidth);
                         /*
-                         * Allow padding after initial 0x or
-                         * base# for integer variables.
+                         * Copy - or 0x or base# before any padding
+                         * zeroes.
                         */
-                        if (v->pm->node.flags & PM_INTEGER) {
-                            if (isset(CBASES) &&
-                                t[0] == '0' && t[1] == 'x')
-                                t += 2;
-                            else if ((valprefend = strchr(t, '#')))
-                                t = valprefend + 1;
-                        }
-                        valprefend = t;
-                        if (!*t)
-                            zero = 0;
-                        else if (v->pm->node.flags &
-                                 (PM_INTEGER|PM_EFLOAT|PM_FFLOAT)) {
-                            /* zero always OK */
-                        } else if (!idigit(*t))
-                            zero = 0;
+                        if (preflen)
+                            memcpy(t, s, preflen);
+                        memcpy(t + preflen + fwidth,
+                               valprefend, t0 - preflen);
+                        t[fwidth + t0] = '\0';
+                        s = t;
+                    } else {
+                        /* Need to skip (charlen - fwidth) chars */
+                        for (t0 = charlen - fwidth; t0; t0--)
+                            s += MB_METACHARLEN(s);
                     }
-                    /* number of characters needed for padding */
-                    fwidth -= charlen;
-                    /* bytes from original string */
-                    t0 = strlen(s);
-                    t = (char *) hcalloc(fwidth + t0 + 1);
-                    /* prefix guaranteed to be single byte chars */
-                    preflen = valprefend - s;
-                    memset(t + preflen,
-                           (((v->pm->node.flags & PM_RIGHT_B)
-                             || !zero) ? ' ' : '0'), fwidth);
-                    /*
-                     * Copy - or 0x or base# before any padding
-                     * zeroes.
-                     */
-                    if (preflen)
-                        memcpy(t, s, preflen);
-                    memcpy(t + preflen + fwidth,
-                           valprefend, t0 - preflen);
-                    t[fwidth + t0] = '\0';
-                    s = t;
-                } else {
-                    /* Need to skip (charlen - fwidth) chars */
-                    for (t0 = charlen - fwidth; t0; t0--)
-                        s += MB_METACHARLEN(s);
                 }
+                break;
             }
+        }
+        switch (v->pm->node.flags & (PM_LOWER | PM_UPPER)) {
+        case PM_LOWER:
+            s = casemodify(s, CASMOD_LOWER);
+            break;
+        case PM_UPPER:
+            s = casemodify(s, CASMOD_UPPER);
             break;
         }
     }
-    switch (v->pm->node.flags & (PM_LOWER | PM_UPPER)) {
-    case PM_LOWER:
-        s = casemodify(s, CASMOD_LOWER);
-        break;
-    case PM_UPPER:
-        s = casemodify(s, CASMOD_UPPER);
-        break;
-    }
 
     if (v->start == 0 && v->end == -1)
         return s;
Index: Src/subst.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/subst.c,v
retrieving revision 1.81
diff -u -r1.81 subst.c
--- Src/subst.c 13 Dec 2007 20:52:56 -0000 1.81
+++ Src/subst.c 16 Dec 2007 13:42:36 -0000
@@ -2059,8 +2059,11 @@
                      * There really is a value. Padding and case
                      * transformations used to be handled here, but
                      * are now handled in getstrvalue() for greater
-                     * consistency.
+                     * consistency. However, we get unexpected effects
+                     * if we allow them to be applied on every call, so
+                     * set the flag that allows them to be substituted.
                      */
+                    v->flags |= VALFLAG_SUBST;
                     val = getstrvalue(v);
                 }
             }
Index: Src/zsh.h
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/zsh.h,v
retrieving revision 1.117
diff -u -r1.117 zsh.h
--- Src/zsh.h 6 Jul 2007 21:52:39 -0000 1.117
+++ Src/zsh.h 16 Dec 2007 13:42:37 -0000
@@ -599,7 +599,8 @@
 
 enum {
     VALFLAG_INV = 0x0001, /* We are performing inverse subscripting */
-    VALFLAG_EMPTY = 0x0002 /* Subscripted range is empty */
+    VALFLAG_EMPTY = 0x0002, /* Subscripted range is empty */
+    VALFLAG_SUBST = 0x0004 /* Substitution, so apply padding, case flags */
 };
 
 #define MAX_ARRLEN 262144
Index: Test/B02typeset.ztst
===================================================================
RCS file: /cvsroot/zsh/zsh/Test/B02typeset.ztst,v
retrieving revision 1.17
diff -u -r1.17 B02typeset.ztst
--- Test/B02typeset.ztst 13 Dec 2007 20:52:56 -0000 1.17
+++ Test/B02typeset.ztst 16 Dec 2007 13:42:37 -0000
@@ -430,17 +430,17 @@
   local case1=upper
   typeset -u case1
   print $case1
-  UPPER="VALUE OF \$UPPER"
+  upper="VALUE OF \$UPPER"
   print ${(P)case1}
-0:Upper case conversion
+0:Upper case conversion, does not apply to values used internally
 >UPPER
 >VALUE OF $UPPER
 
   local case2=LOWER
   typeset -l case2
   print $case2
-  lower="value of \$lower"
+  LOWER="value of \$lower"
   print ${(P)case2}
-0:Lower case conversion
+0:Lower case conversion, does not apply to values used internally
 >lower
 >value of $lower

-- 
Peter Stephenson
Web page now at http://homepage.ntlworld.com/p.w.stephenson/
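
For anyone who wants to see the idea behind VALFLAG_SUBST in isolation,
here is a minimal standalone sketch. It is not zsh source: every sk_
name is a hypothetical stand-in, the buffer handling is simplified, and
only zero padding and upper-casing are modelled (truncation of values
longer than the width is left out). It assumes nothing beyond what the
patch above describes, namely that the caller sets a value flag and the
fetch routine applies padding and case conversion only when that flag
is present.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for PM_UPPER / PM_RIGHT_Z and VALFLAG_SUBST. */
enum { SK_PM_UPPER = 0x1, SK_PM_RIGHT_Z = 0x2 };
enum { SK_VALFLAG_SUBST = 0x4 };

struct sk_param {
    const char *str;  /* stored, unaltered value */
    int flags;        /* SK_PM_* attribute flags */
    size_t width;     /* field width used with SK_PM_RIGHT_Z */
};

struct sk_value {
    const struct sk_param *pm;
    int flags;        /* SK_VALFLAG_* */
};

/*
 * Copy the parameter's value into buf.  Padding and case conversion
 * are applied only when the caller has set SK_VALFLAG_SUBST; any other
 * caller sees the stored value.  Output that does not fit in buf is
 * silently cut short (a simplification for this sketch).
 */
static char *
sk_getstrvalue(const struct sk_value *v, char *buf, size_t buflen)
{
    const char *s = v->pm->str;
    size_t len = strlen(s), pad = 0, n = 0, i;
    int subst = (v->flags & SK_VALFLAG_SUBST) != 0;

    assert(buf != NULL && buflen > 0);

    /* Zero padding only for substitution-style calls. */
    if (subst && (v->pm->flags & SK_PM_RIGHT_Z) && len < v->pm->width)
        pad = v->pm->width - len;

    for (i = 0; i < pad && n + 1 < buflen; i++)
        buf[n++] = '0';
    for (i = 0; i < len && n + 1 < buflen; i++)
        buf[n++] = s[i];
    buf[n] = '\0';

    /* Case conversion is gated by the same flag. */
    if (subst && (v->pm->flags & SK_PM_UPPER))
        for (i = 0; i < n; i++)
            buf[i] = (char) toupper((unsigned char) buf[i]);

    return buf;
}
```

With a parameter stored as "6" with width 5 and SK_PM_RIGHT_Z, a fetch
with no value flags returns "6", while a fetch with SK_VALFLAG_SUBST set
returns "00006". Gating on a per-call flag rather than on the parameter
is what lets internal users such as vared and (P) keep seeing the
stored value.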