* printf \045 (or whatever the character code for % is)
From: Jilles Tjoelker @ 2010-12-29 21:11 UTC
To: Zsh Hackers' List

While trying to run the FreeBSD sh testsuite with zsh, various tests
fail because the printf builtin interprets \045 (ASCII) as a percent
sign introducing a format specification instead of a literal percent
sign. The \045 arises because I create all 255 non-zero byte values via
octal escapes.

POSIX's description assumes that the backslash escapes and format
specifications are processed in one pass and simply says that an octal
escape sequence shall write the corresponding byte. If they are separate
passes the backslash escape removal step needs to know about percent
signs.

The sequences \% and \x25 are not specified and there seems little
reason to use them.

Input:
printf '\045\n'
Expected result:
Succeeds and prints "%".
Actual result:
Fails and prints error message "printf: %\n: invalid directive".

Input:
printf '\045d.%d\n' 4 5
Expected result:
%d.4
%d.5
Actual result:
4.5

--
Jilles Tjoelker
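To make the one-pass reading of POSIX concrete, here is a minimal sketch in C. It is not zsh code: the names mini_printf and put are hypothetical, and only \n, octal escapes, %% and %d are handled. Because each escape is written out as a byte the moment it is seen, the '%' produced by \045 is never examined as the start of a directive. The format is reused while arguments remain, which is how POSIX printf treats extra arguments.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append n bytes of s to out, never writing past outsz - 1. */
static void put(char *out, size_t outsz, size_t *used, const char *s, size_t n)
{
    for (size_t i = 0; i < n && *used + 1 < outsz; i++)
        out[(*used)++] = s[i];
    out[*used] = '\0';
}

/*
 * Hypothetical single-pass formatter (illustration only).
 * Escapes and directives are handled in the same loop, so a byte
 * produced by an octal escape is emitted and never re-scanned.
 * A \000 escape would terminate the C string early; this sketch
 * does not try to handle that case.
 */
static void mini_printf(char *out, size_t outsz, const char *fmt,
                        const int *args, size_t nargs)
{
    size_t used = 0, next = 0, before;

    assert(out != NULL && outsz > 0);
    out[0] = '\0';
    do {
        before = next;
        for (const char *s = fmt; *s; ) {
            if (s[0] == '\\' && s[1] == 'n') {
                put(out, outsz, &used, "\n", 1);
                s += 2;
            } else if (s[0] == '\\' && s[1] >= '0' && s[1] <= '7') {
                /* Octal escape: at most three digits. */
                int v = 0, digits = 0;
                char byte;
                for (s++; digits < 3 && *s >= '0' && *s <= '7'; s++, digits++)
                    v = v * 8 + (*s - '0');
                byte = (char)v;
                put(out, outsz, &used, &byte, 1);
            } else if (s[0] == '%' && s[1] == '%') {
                put(out, outsz, &used, "%", 1);
                s += 2;
            } else if (s[0] == '%' && s[1] == 'd') {
                char num[32];
                int val = 0;   /* a missing argument is treated as 0 */
                if (next < nargs)
                    val = args[next++];
                int n = snprintf(num, sizeof num, "%d", val);
                put(out, outsz, &used, num, (size_t)n);
                s += 2;
            } else {
                put(out, outsz, &used, s, 1);
                s++;
            }
        }
        /* Reuse the format only if this round consumed an argument. */
    } while (next < nargs && next > before);
}
```

Called with the format '\045d.%d\n' and the arguments 4 and 5, this sketch produces the two lines %d.4 and %d.5 that the report above expects.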
* Re: printf \045 (or whatever the character code for % is)
From: Alexey I. Froloff @ 2010-12-29 23:55 UTC
To: zsh-workers

On Wed, Dec 29, 2010 at 10:11:55PM +0100, Jilles Tjoelker wrote:
> POSIX's description assumes ...

$ runas /bin/zsh sh -c 'printf '\''\045\n'\'
%
$ printf '%%\n'
%
$

--
Regards,    --
Sir Raorn.   --- http://thousandsofhate.blogspot.com/
* Re: printf \045 (or whatever the character code for % is)
From: Peter Stephenson @ 2011-01-05 17:39 UTC
To: Zsh Hackers' List

On Wed, 29 Dec 2010 22:11:55 +0100
Jilles Tjoelker <jilles@stack.nl> wrote:
> While trying to run the FreeBSD sh testsuite with zsh, various tests
> fail because the printf builtin interprets \045 (ASCII) as a percent
> sign introducing a format specification instead of a literal percent
> sign. The \045 arises because I create all 255 non-zero byte values via
> octal escapes.
>
> POSIX's description assumes that the backslash escapes and format
> specifications are processed in one pass and simply says that an octal
> escape sequence shall write the corresponding byte. If they are separate
> passes the backslash escape removal step needs to know about percent
> signs.

That's a reasonable assumption, but the function handling print is an
appalling mess so it's not easy to fix without a major rewrite.

The code for printf doesn't really have any business being associated
with the code for print, they're there for different purposes entirely
based on completely different specifications. At the moment printf does
the same as 'print -f', so it has all the same oddities as print whether
it should or not.

(In my opinion, anyone deliberately asking for combined print and printf
behaviour deserves everything they get so I'm perfectly happy to let
'print -f' fester while standardising printf.)

However, I never get volunteers for tidying the shell up, so we're
probably stuck until someone gets fed up enough to look into it.

--
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/
* Re: printf \045 (or whatever the character code for % is)
From: Bart Schaefer @ 2011-01-06 4:46 UTC
To: Zsh Hackers' List

On Jan 5, 5:39pm, Peter Stephenson wrote:
}
} > POSIX's description assumes that the backslash escapes and format
} > specifications are processed in one pass and simply says that an
} > octal escape sequence shall write the corresponding byte. If they
} > are separate passes the backslash escape removal step needs to know
} > about percent signs.
}
} That's a reasonable assumption, but the function handling print is an
} appalling mess so it's not easy to fix without a major rewrite.

The octal escapes are all handled by getkeystring(). There's already a
special macro GETKEYS_PRINTF_FMT which (unfortunately?) is used for both
"printf" and "print -f".

} The code for printf doesn't really have any business being associated
} with the code for print, they're there for different purposes entirely
} based on completely different specifications. At the moment printf
} does the same as 'print -f', so it has all the same oddities as print
} whether it should or not.

I'm not entirely sure that's true. The printf builtin doesn't accept
any options, which means that except for the initial getkeystring(),
nearly everything in bin_print() is ignored until you get down to the
part that handles the format spec ... and that can't be replaced by
e.g. sprintf() because of misc. special formats like %b and %q.

} (In my opinion, anyone deliberately asking for combined print and
} printf behaviour deserves everything they get so I'm perfectly happy
} to let 'print -f' fester while standardising printf.)

Although we could rip the format handling out of bin_print() and create
a new bin_printf() [which would be called by "print -f"?] we'd still
need something akin to getkeystring() for the octal escapes.

} However, I never get volunteers for tidying the shell up, so we're
} probably stuck until someone gets fed up enough to look into it.

GETKEYS_PRINTF_FMT expands to GETKEY_OCTAL_ESC|GETKEY_BACKSLASH_C ...
seems as though an additional flag to getkeystring() could be used to
cause \045 to expand to %% as a special case, something like this in
utils.c:

@@ -5517,6 +5522,8 @@
 		    }
 		*t++ = zstrtol(s + (*s == 'x'), &s,
 			       (*s == 'x') ? 16 : 8);
+		if ((how & GETKEY_PRINTF) && t[-1] == '%')
+		    *t++ = '%';
 		if (svchar) {
 		    u[3] = svchar;
 		    svchar = '\0';

The flag bits for "how" are an enum in zsh.h and I'm undecided whether
to renumber them or just add another to the end, so I haven't included
a complete patch. Also I don't know whether the intent is that \045
(and \x25) should become %% only for "printf" or also for "print -f",
so no patch for builtin.c yet either.
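The proposed special case can be sketched outside zsh as follows. This is an illustration under assumptions, not the getkeystring() code: decode_escapes, digit_value and DECODE_PRINTF_PERCENT are hypothetical names, and only \n, \ooo and \xHH are decoded. With the flag set, a '%' that comes from a numeric escape is written as '%%', so a later format pass sees a literal percent sign.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical flag, standing in for the one proposed in the thread. */
enum { DECODE_PRINTF_PERCENT = 1 << 0 };

/* Value of digit c in the given base (8 or 16), or -1 if not a digit. */
static int digit_value(char c, int base)
{
    int d;

    if (c >= '0' && c <= '9')
        d = c - '0';
    else if (c >= 'a' && c <= 'f')
        d = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F')
        d = c - 'A' + 10;
    else
        return -1;
    return d < base ? d : -1;
}

/*
 * Escape-removal pass of a two-pass design (illustration only).
 * Decodes \n, \ooo (up to three octal digits) and \xHH (up to two
 * hex digits) from `in` into `out`, which must hold at least
 * strlen(in) + 1 bytes.
 */
static void decode_escapes(const char *in, char *out, int how)
{
    const char *s = in;
    char *t = out;

    assert(in != NULL && out != NULL);
    while (*s) {
        if (s[0] == '\\' && s[1] == 'n') {
            *t++ = '\n';
            s += 2;
        } else if (s[0] == '\\' &&
                   (s[1] == 'x' || (s[1] >= '0' && s[1] <= '7'))) {
            int hex = (s[1] == 'x');
            int base = hex ? 16 : 8;
            int maxdigits = hex ? 2 : 3;
            int v = 0, n = 0;

            s += hex ? 2 : 1;   /* skip the backslash and any 'x' */
            for (; n < maxdigits; s++, n++) {
                int d = digit_value(*s, base);
                if (d < 0)
                    break;
                v = v * base + d;
            }
            *t++ = (char)v;
            /* The special case: double a percent sign that came
             * from a numeric escape so the format pass sees "%%". */
            if ((how & DECODE_PRINTF_PERCENT) && t[-1] == '%')
                *t++ = '%';
        } else {
            *t++ = *s++;
        }
    }
    *t = '\0';
}
```

Decoding '\045d.%d\n' with the flag set gives the format %%d.%d followed by a newline, which a printf-style pass with the argument 4 renders as %d.4. The shortest numeric escape for a percent sign, \45, is three bytes long and its replacement %% is two, so in this sketch the output never outgrows the input.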
* Re: printf \045 (or whatever the character code for % is)
From: Peter Stephenson @ 2011-01-06 12:09 UTC
To: Zsh Hackers' List

On Wed, 5 Jan 2011 20:46:12 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> } The code for printf doesn't really have any business being associated
> } with the code for print, they're there for different purposes entirely
> } based on completely different specifications. At the moment printf
> } does the same as 'print -f', so it has all the same oddities as print
> } whether it should or not.
>
> I'm not entirely sure that's true. The printf builtin doesn't accept
> any options, which means that except for the initial getkeystring(),
> nearly everything in bin_print() is ignored until you get down to the
> part that handles the format spec ... and that can't be replaced by
> e.g. sprintf() because of misc. special formats like %b and %q.

That's kind of why print & printf really ought to be separate, with
common subroutines where needed (the lack of modularity in bin_print()
is one of the big issues). But that's moot for now...

> GETKEYS_PRINTF_FMT expands to GETKEY_OCTAL_ESC|GETKEY_BACKSLASH_C ...
> seems as though an additional flag to getkeystring() could be used to
> cause \045 to expand to %% as a special case, something like this in
> utils.c:

That's sneaky, that should be OK.

> @@ -5517,6 +5522,8 @@
>  		    }
>  		*t++ = zstrtol(s + (*s == 'x'), &s,
>  			       (*s == 'x') ? 16 : 8);
> +		if ((how & GETKEY_PRINTF) && t[-1] == '%')
> +		    *t++ = '%';
>  		if (svchar) {
>  		    u[3] = svchar;
>  		    svchar = '\0';

Presumably since we're contracting an escape sequence there's always
enough allocated space for the extra '%'.

> The flag bits for "how" are an enum in zsh.h and I'm undecided whether
> to renumber them or just add another to the end, so I haven't included
> a complete patch. Also I don't know whether the intent is that \045
> (and \x25) should become %% only for "printf" or also for "print -f",
> so no patch for builtin.c yet either.

There's no particularly well-defined order in the enum, although the
more recherché options tend to be later (but only because they were
added later).

Unless we go down the route of separate builtin handlers, I think it
would be better to keep printf and print -f in sync for now. For one
thing, making them different puts yet another strain on bin_print(); for
another, we haven't yet gone into the details of where printf actually
needs to be different from print (we'd need to look at the relevant
standards for printf to see where the code is doing the wrong thing at
present).

--
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK
* Re: printf \045 (or whatever the character code for % is)
From: Bart Schaefer @ 2011-01-06 16:01 UTC
To: Zsh Hackers' List

On Jan 6, 12:09pm, Peter Stephenson wrote:
> Subject: Re: printf \045 (or whatever the character code for % is)
>
> On Wed, 5 Jan 2011 20:46:12 -0800
> Bart Schaefer <schaefer@brasslantern.com> wrote:
> > GETKEYS_PRINTF_FMT expands to GETKEY_OCTAL_ESC|GETKEY_BACKSLASH_C ...
> > seems as though an additional flag to getkeystring() could be used to
> > cause \045 to expand to %% as a special case, something like this in
> > utils.c:
>
> That's sneaky, that should be OK.
>
> Presumably since we're contracting an escape sequence there's always enough
> allocated space for the extra '%'.

It's a bit hard to follow getkeystring() in the multibyte branches, but
as that should always be allocating more rather than less space, I
believe the answer is "that's correct". Minimum it's \45 -> %%.

> Unless we go down the route of separate builtin handlers, I think it
> would be better to keep printf and print -f in sync for now.

In that case, no need to touch builtin.c at all.

> [...] we haven't yet gone into the details of where printf actually
> needs to be different from print (we'd need to look at the relevant
> standards for printf to see where the code is doing the wrong thing at
> present).

For one thing, hasn't austin-group been discussing \Cx where zsh at
present uses the old Emacs syntax of \C-x ?

(Revision numbers below are from my local repository, ignore them.)

Index: Src/utils.c
===================================================================
RCS file: /extra/cvsroot/zsh/zsh-4.0/Src/utils.c,v
retrieving revision 1.40
diff -c -r1.40 utils.c
--- utils.c	21 Dec 2010 16:41:16 -0000	1.40
+++ utils.c	6 Jan 2011 15:43:40 -0000
@@ -5517,6 +5522,8 @@
 		    }
 		*t++ = zstrtol(s + (*s == 'x'), &s,
 			       (*s == 'x') ? 16 : 8);
+		if ((how & GETKEY_PRINTF_PERCENT) && t[-1] == '%')
+		    *t++ = '%';
 		if (svchar) {
 		    u[3] = svchar;
 		    svchar = '\0';
Index: Src/zsh.h
===================================================================
RCS file: /extra/cvsroot/zsh/zsh-4.0/Src/zsh.h,v
retrieving revision 1.43
diff -c -r1.43 zsh.h
--- zsh.h	21 Dec 2010 16:41:16 -0000	1.43
+++ zsh.h	6 Jan 2011 15:50:06 -0000
@@ -2492,7 +2492,11 @@
      * Yes, I know that doesn't seem to make much sense.
      * It's for use in completion, comprenez?
      */
-    GETKEY_UPDATE_OFFSET = (1 << 7)
+    GETKEY_UPDATE_OFFSET = (1 << 7),
+    /*
+     * When replacing numeric escapes for printf format strings, % -> %%
+     */
+    GETKEY_PRINTF_PERCENT = (1 << 8)
 };
 
 /*
@@ -2501,8 +2505,9 @@
  */
 /* echo builtin */
 #define GETKEYS_ECHO	(GETKEY_BACKSLASH_C)
-/* printf format string:  \123 -> S, \0123 -> NL 3 */
-#define GETKEYS_PRINTF_FMT	(GETKEY_OCTAL_ESC|GETKEY_BACKSLASH_C)
+/* printf format string:  \123 -> S, \0123 -> NL 3, \045 -> %% */
+#define GETKEYS_PRINTF_FMT	\
+	(GETKEY_OCTAL_ESC|GETKEY_BACKSLASH_C|GETKEY_PRINTF_PERCENT)
 /* printf argument:       \123 -> \123, \0123 -> S */
 #define GETKEYS_PRINTF_ARG	(GETKEY_BACKSLASH_C)
 /* Full print without -e */
Thread overview: 6+ messages

2010-12-29 21:11 printf \045 (or whatever the character code for % is) Jilles Tjoelker
2010-12-29 23:55 ` Alexey I. Froloff
2011-01-05 17:39 ` Peter Stephenson
2011-01-06  4:46   ` Bart Schaefer
2011-01-06 12:09     ` Peter Stephenson
2011-01-06 16:01       ` Bart Schaefer