From: Oliver Kiddle
Date: Thu, 18 Oct 2001 15:15:34 +0100
To: zsh-workers@sunsite.dk
Subject: PATCH: specifying arguments in printf formats
Message-ID: <3BCEE406.7E3FEF6C@yahoo.co.uk>
X-Seq: 16080

This patch now adds the feature I mentioned before where arguments can be
specified with conversion specifications like '%1$*2$.*3$d' instead of
using arguments in order. You shouldn't mix the two styles of specifying
arguments - I recommend against it in the manual which is the same as the
case with printf(3). If you do mix them, it will work but I may change the
exact semantics if it allows me to improve the code.

Peter Stephenson wrote:
>
> Oliver Kiddle wrote:
> > The question is how should this interact with the printf(1) feature of
> > reusing the format if more arguments remain. The easy answer would be
> > to not reuse the format if this feature had been used. As an
> > experiment, I've made it remove all arguments up to the last one used.
> > This allows interesting things like:
> >
> > % printf '%2$s %1$s ' 1 2 3 4 5 6 ;echo
> > 2 1 4 3 6 5
> >
> > I can see this having some uses but I can also see it being a problem
> > as this is likely to be used for picking out fields where the arguments
> > are some command in $(...).
> >
> Even in that case, the problem is really with the reuse of the format,
> rather than the special argument-picking syntax. Maybe it would be best to
> have a command-line option to turn it (the reuse of the format specifier,
> that is) off --- or even on, since it might be regarded as a little florid
> for default behaviour. But I suppose we're going to have to stick with ksh
> if we're trying to match it.

Reuse of arguments is defined in POSIX so it isn't ksh I'm matching there.
And, this new argument specifying feature is not in ksh.

I've decided that an option as Peter suggests is the best way to go here.
It'll have to turn reuse off so that we are keeping to POSIX/ksh which is
a slight pity as the opposite would perhaps be better. Any good
suggestions on the choice of option letter? -r and -R (for reuse) are both
gone but we want to indicate the opposite of that anyway. For the moment,
you can use -r but I've not documented that and will change it later. For
ksh compatibility, -r can't be used with -f but I'll probably suggest that
be changed on the shell list.

Oliver

Index: Doc/Zsh/builtins.yo
===================================================================
RCS file: /cvsroot/zsh/zsh/Doc/Zsh/builtins.yo,v
retrieving revision 1.39
diff -u -r1.39 builtins.yo
--- Doc/Zsh/builtins.yo	2001/10/15 11:34:27	1.39
+++ Doc/Zsh/builtins.yo	2001/10/18 14:11:41
@@ -725,9 +725,9 @@
 item(tt(printf) var(format) [ var(arg) ... ])(
 Print the arguments according to the format specification. Formatting
 rules are the same as used in C. The same escape sequences as for tt(echo)
-are recognised in the format. All C format specifications ending in one of
-csdiouxXeEfgGn are handled. In addition to this, `tt(%b)' can be used
-instead of `tt(%s)' to cause escape sequences in the argument to be
+are recognised in the format. All C conversion specifications ending in
+one of csdiouxXeEfgGn are handled. In addition to this, `tt(%b)' can be
+used instead of `tt(%s)' to cause escape sequences in the argument to be
 recognised and `tt(%q)' can be used to quote the argument in such a way
 that allows it to be reused as shell input. With the numeric format
 specifiers, if the corresponding argument starts with a quote character,
@@ -736,6 +736,13 @@
 noderef(Arithmetic Evaluation) for a description of arithmetic
 expressions. With `tt(%n)', the corresponding argument is taken as an
 identifier which is created as an integer parameter.
+
+Normally, conversion specifications are applied to each argument in order
+but they can explicitly specify the var(n)th argument is to be used by
+replacing `tt(%)' by `tt(%)var(n)tt($)' and `tt(*)' by `tt(*)var(n)tt($)'.
+It is recommended that you do not mix references of this explicit style
+with the normal style and the handling of such mixed styles may be subject
+to future change.

 If arguments remain unused after formatting, the format string is reused
 until all arguments have been consumed. If more arguments are required by
If more arguments are required by Index: Src/builtin.c =================================================================== RCS file: /cvsroot/zsh/zsh/Src/builtin.c,v retrieving revision 1.59 diff -u -r1.59 builtin.c --- Src/builtin.c 2001/10/16 11:16:11 1.59 +++ Src/builtin.c 2001/10/18 14:11:41 @@ -2892,10 +2892,11 @@ int bin_print(char *name, char **args, char *ops, int func) { - int flen, width, prec, type, argc, n, nnl = 0, ret = 0; + int flen, width, prec, type, argc, n, narg; + int nnl = 0, ret = 0, maxarg = 0; int flags[5], *len; char *start, *endptr, *c, *d, *flag, spec[11], *fmt = NULL; - char **first, *flagch = "0+- #", save, nullstr = '\0'; + char **first, *curarg, *flagch = "0+- #", save, nullstr = '\0'; zlong count = 0; FILE *fout = stdout; @@ -3095,6 +3096,11 @@ /* printf style output */ *spec='%'; do { + if (maxarg) { + first += maxarg; + argc -= maxarg; + maxarg = 0; + } for (c = fmt;c-fmt < flen;c++) { if (*c != '%') { putc(*c, fout); @@ -3111,11 +3117,29 @@ type = prec = -1; width = 0; + curarg = NULL; d = spec + 1; + if (*c >= '1' && *c <= '9') { + narg = strtoul(c, &endptr, 0); + if (*endptr == '$') { + c = endptr + 1; + DPUTS(narg <= 0, "specified zero or negative arg"); + if (narg > argc) { + zwarnnam(name, "%d: argument specifier out of range", + 0, narg); + return 1; + } else { + if (narg > maxarg) maxarg = narg; + curarg = *(first + narg - 1); + } + } + } + + /* copy only one of each flag as spec has finite size */ memset(flags, 0, sizeof(flags)); - while (flag = strchr(flagch, *c)) { + while ((flag = strchr(flagch, *c))) { if (!flags[flag - flagch]) { flags[flag - flagch] = 1; *d++ = *c; @@ -3123,28 +3147,60 @@ c++; } - if (*c == '*') { - if (*args) width = (int)mathevali(*args++); - if (errflag) { - errflag = 0; - ret = 1; - } - c++; - } else if (idigit(*c)) { + if (idigit(*c)) { width = strtoul(c, &endptr, 0); c = endptr; + } else if (*c == '*') { + if (idigit(*++c)) { + narg = strtoul(c, &endptr, 0); + if (*endptr == '$') { + c = endptr 
+ 1; + if (narg > argc || narg <= 0) { + zwarnnam(name, + "%d: argument specifier out of range", + 0, narg); + return 1; + } else { + if (narg > maxarg) maxarg = narg; + args = first + narg - 1; + } + } + } + if (*args) { + width = (int)mathevali(*args++); + if (errflag) { + errflag = 0; + ret = 1; + } + } } *d++ = '*'; if (*c == '.') { - c++; - if (*c == '*') { - prec = (*args) ? (int)mathevali(*args++) : 0; - if (errflag) { - errflag = 0; - ret = 1; + if (*++c == '*') { + if (idigit(*++c)) { + narg = strtoul(c, &endptr, 0); + if (*endptr == '$') { + c = endptr + 1; + if (narg > argc || narg <= 0) { + zwarnnam(name, + "%d: argument specifier out of range", + 0, narg); + return 1; + } else { + if (narg > maxarg) maxarg = narg; + args = first + narg - 1; + } + } + } + + if (*args) { + prec = (int)mathevali(*args++); + if (errflag) { + errflag = 0; + ret = 1; + } } - c++; } else if (idigit(*c)) { prec = strtoul(c, &endptr, 0); c = endptr; @@ -3155,30 +3211,30 @@ /* ignore any size modifier */ if (*c == 'l' || *c == 'L' || *c == 'h') c++; + if (!curarg && *args) curarg = *args++; d[1] = '\0'; switch (*d = *c) { case 'c': - if (*args) { - intval = **args; - args++; + if (curarg) { + intval = *curarg; } else intval = 0; print_val(intval); break; case 's': - stringval = *args ? *args++ : &nullstr; + stringval = curarg ? curarg : &nullstr; print_val(stringval); break; case 'b': - if (*args) { + if (curarg) { int l; - char *b = getkeystring(*args++, &l, ops['b'] ? 2 : 0, &nnl); + char *b = getkeystring(curarg, &l, ops['b'] ? 2 : 0, &nnl); fwrite(b, l, 1, fout); count += l; } break; case 'q': - stringval = *args ? bslashquote(*args++, NULL, 0) : &nullstr; + stringval = curarg ? 
bslashquote(curarg, NULL, 0) : &nullstr; *d = 's'; print_val(stringval); break; @@ -3200,7 +3256,7 @@ type=3; break; case 'n': - if (*args) setiparam(*args++, count); + if (curarg) setiparam(curarg, count); break; default: if (*c) { @@ -3208,20 +3264,21 @@ c[1] = '\0'; } zwarnnam(name, "%s: invalid directive", start, 0); - ret = 1; if (*c) c[1] = save; + if (fout != stdout) + fclose(fout); + return 1; } if (type > 0) { - if (*args && (**args == '\'' || **args == '"' )) { + if (curarg && (*curarg == '\'' || *curarg == '"' )) { if (type == 2) { - doubleval = (unsigned char)(*args)[1]; + doubleval = (unsigned char)curarg[1]; print_val(doubleval); } else { - intval = (unsigned char)(*args)[1]; + intval = (unsigned char)curarg[1]; print_val(intval); } - args++; } else { switch (type) { case 1: @@ -3229,7 +3286,7 @@ *d++ = 'l'; #endif *d++ = 'l', *d++ = *c, *d = '\0'; - zlongval = (*args) ? mathevali(*args++) : 0; + zlongval = (curarg) ? mathevali(curarg) : 0; if (errflag) { zlongval = 0; errflag = 0; @@ -3238,8 +3295,8 @@ print_val(zlongval) break; case 2: - if (*args) { - mnumval = matheval(*args++); + if (curarg) { + mnumval = matheval(curarg); doubleval = (mnumval.type & MN_FLOAT) ? mnumval.u.d : (double)mnumval.u.l; } else doubleval = 0; @@ -3255,9 +3312,9 @@ *d++ = 'l'; #endif *d++ = 'l', *d++ = *c, *d = '\0'; - zulongval = (*args) ? mathevali(*args++) : 0; + zulongval = (curarg) ? 
mathevali(curarg) : 0; if (errflag) { - doubleval = 0; + zulongval = 0; errflag = 0; ret = 1; } @@ -3265,10 +3322,13 @@ } } } + if (maxarg && (args - first > maxarg)) + maxarg = args - first; } + if (maxarg) args = first + maxarg; /* if there are remaining args, reuse format string */ - } while (*args && args != first); + } while (*args && args != first && !ops['r']); if (fout != stdout) fclose(fout); Index: Test/B03print.ztst =================================================================== RCS file: /cvsroot/zsh/zsh/Test/B03print.ztst,v retrieving revision 1.2 diff -u -r1.2 B03print.ztst --- Test/B03print.ztst 2001/10/16 11:16:11 1.2 +++ Test/B03print.ztst 2001/10/18 14:11:41 @@ -50,7 +50,9 @@ 0:test b format specifier > \ -# test %q here - it doesn't quite work yet + printf '%q\n' '=a=b \ c!' +0: test q format specifier +>\=a=b\ \\\ c! printf '%c\n' char 0:test c format specifier @@ -108,6 +110,19 @@ ?(eval):1: bad math expression: operator expected at `a' >0 + printf '%12$s' 1 2 3 +1:out of range argument specifier +?(eval):printf:1: 12: argument specifier out of range + + printf '%2$s\n' 1 2 3 +1:out of range argument specifier on format reuse +?(eval):printf:1: 2: argument specifier out of range +>2 + + printf '%*0$d' +1:out of range argument specifier on width +?(eval):printf:1: 0: argument specifier out of range + print -m -f 'format - %s.\n' 'z' a b c 0:format not printed if no arguments left after -m removal @@ -129,7 +144,8 @@ >two b:0x2% >three c:0x3% - printf '%0+- #-08.5dx\n' 123 +# this should fill spec string with '%0+- #*.*d\0' - 11 characters + printf '%1$0+- #-08.5dx\n' 123 0:maximal length format specification >+00123 x @@ -140,3 +156,41 @@ printf '%.*g\n' -1 .1 0:negative precision specified >0.1 + + printf '%2$s %1$d\n' 1 2 +0:specify argument to output explicitly +>2 1 + + printf '%3$.*1$d\n' 4 0 3 +0:specify output and precision arguments explicitly +>0003 + + printf '%2$d%1$d\n' 1 2 3 4 +0:reuse format where arguments are explictly 
specified +>21 +>43 + + printf '%1$*2$d' 1 2 3 4 5 6 7 8 9 10;echo +0:reuse of specified arguments +> 1 3 5 7 9 + + printf '%1$0+.3d\n' 3 +0:flags mixed with specified argument +>+003 + +# The following usage, as stated in the manual, is not recommended and the +# results are undefined. Tests are here anyway to ensure some form of +# half-sane behaviour. + + printf '%2$s %s %3$s\n' Morning Good World +0:mixed style of argument selection +>Good Morning World + + printf '%*1$.*d\n' 1 2 +0:argument specified for width only +>00 + + print -f '%*.*1$d\n' 1 2 3 +0:argument specified for precision only +>2 +>000
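
For anyone reading the patch without the source to hand, the "n$" check
that opens each conversion specification can be sketched in isolation
roughly as below. This is only an illustration: parse_argspec() and its
return convention are hypothetical and are not part of builtin.c, which
does the same work inline on c, endptr, narg and argc. The sketch covers
the leading "%n$" case only; the "*n$" width and precision cases in the
patch additionally reject n <= 0.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * parse_argspec() is a hypothetical helper, not code from the patch.
 * It looks for an "n$" argument specifier at *pp.
 *   returns n (>= 1) and moves *pp past the '$' when one is present,
 *   returns 0 and leaves *pp alone when there is none,
 *   returns -1 and leaves *pp alone when n exceeds argc.
 */
static int
parse_argspec(const char **pp, int argc)
{
    const char *c;
    char *end;
    unsigned long n;

    assert(pp != NULL && *pp != NULL);
    c = *pp;
    /* a leading '0' is a flag, so an index must start with 1-9 */
    if (*c < '1' || *c > '9')
        return 0;
    n = strtoul(c, &end, 10);
    if (*end != '$')
        return 0;   /* plain digits: a field width, not an index */
    if (argc < 0 || n > (unsigned long)argc)
        return -1;
    *pp = end + 1;
    return (int)n;
}
```

Called on "2$s" with three arguments available this returns 2 and leaves
the pointer at the 's'. Called again with only one argument left, as
happens on the second pass of printf '%2$s\n' 1 2 3, it returns -1, which
corresponds to the "argument specifier out of range" case in the tests.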