zsh-workers
* printf \045 (or whatever the character code for % is)
@ 2010-12-29 21:11 Jilles Tjoelker
  2010-12-29 23:55 ` Alexey I. Froloff
  2011-01-05 17:39 ` Peter Stephenson
  0 siblings, 2 replies; 6+ messages in thread
From: Jilles Tjoelker @ 2010-12-29 21:11 UTC
  To: Zsh Hackers' List

While trying to run the FreeBSD sh testsuite with zsh, various tests
fail because the printf builtin interprets \045 (ASCII) as a percent
sign introducing a format specification instead of a literal percent
sign. The \045 arises because I create all 255 non-zero byte values via
octal escapes.

POSIX's description assumes that the backslash escapes and format
specifications are processed in one pass and simply says that an octal
escape sequence shall write the corresponding byte. If they are separate
passes the backslash escape removal step needs to know about percent
signs.
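
[Editor's illustration, not part of the original message.] The two-pass problem described above can be sketched in C. This is a minimal model under stated assumptions, not zsh's code: decode_octal() and count_directives() are hypothetical names, only backslash plus one to three octal digits is decoded, and an escape that decodes to a NUL byte is not handled specially.

```c
#include <assert.h>
#include <stddef.h>

/* Pass 1 (sketch): replace each backslash followed by 1-3 octal digits
 * with the byte it names.  Everything else, including other backslash
 * sequences, is copied unchanged.  `out` must hold strlen(in) + 1 bytes.
 * Returns the number of bytes written, excluding the terminator. */
size_t decode_octal(const char *in, char *out)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    while (*in) {
        if (in[0] == '\\' && in[1] >= '0' && in[1] <= '7') {
            unsigned value = 0;
            int digits = 0;

            in++;                       /* skip the backslash */
            while (digits < 3 && *in >= '0' && *in <= '7') {
                value = value * 8 + (unsigned)(*in - '0');
                in++;
                digits++;
            }
            out[n++] = (char)value;
        } else {
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
    return n;
}

/* Pass 2 (sketch): count conversion directives, treating "%%" as a
 * literal percent sign.  By this point a '%' that came from \045 looks
 * exactly like one the user typed. */
int count_directives(const char *fmt)
{
    int count = 0;

    assert(fmt != NULL);
    while (*fmt) {
        if (fmt[0] == '%' && fmt[1] == '%') {
            fmt += 2;                   /* literal percent sign */
        } else if (fmt[0] == '%') {
            count++;                    /* start of a directive */
            fmt++;
        } else {
            fmt++;
        }
    }
    return count;
}
```

Feeding the text `\045d.%d` through decode_octal() yields `%d.%d`, which count_directives() reports as two directives, whereas a one-pass reading would see only one. That mismatch is the behaviour reported in the examples below.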

The sequences \% and \x25 are not specified and there seems little
reason to use them.

Input:
printf '\045\n'
Expected result:
Succeeds and prints "%".
Actual result:
Fails and prints error message "printf: %\n: invalid directive".

Input:
printf '\045d.%d\n' 4 5
Expected result:
%d.4
%d.5
Actual result:
4.5

-- 
Jilles Tjoelker



* Re: printf \045 (or whatever the character code for % is)
  2010-12-29 21:11 printf \045 (or whatever the character code for % is) Jilles Tjoelker
@ 2010-12-29 23:55 ` Alexey I. Froloff
  2011-01-05 17:39 ` Peter Stephenson
  1 sibling, 0 replies; 6+ messages in thread
From: Alexey I. Froloff @ 2010-12-29 23:55 UTC
  To: zsh-workers


On Wed, Dec 29, 2010 at 10:11:55PM +0100, Jilles Tjoelker wrote:
> POSIX's description assumes ...
$ runas /bin/zsh sh -c 'printf '\''\045\n'\'
%
$ printf '%%\n'  
%
$

-- 
Regards,    --
Sir Raorn.   --- http://thousandsofhate.blogspot.com/



* Re: printf \045 (or whatever the character code for % is)
  2010-12-29 21:11 printf \045 (or whatever the character code for % is) Jilles Tjoelker
  2010-12-29 23:55 ` Alexey I. Froloff
@ 2011-01-05 17:39 ` Peter Stephenson
  2011-01-06  4:46   ` Bart Schaefer
  1 sibling, 1 reply; 6+ messages in thread
From: Peter Stephenson @ 2011-01-05 17:39 UTC
  To: Zsh Hackers' List

On Wed, 29 Dec 2010 22:11:55 +0100
Jilles Tjoelker <jilles@stack.nl> wrote:
> While trying to run the FreeBSD sh testsuite with zsh, various tests
> fail because the printf builtin interprets \045 (ASCII) as a percent
> sign introducing a format specification instead of a literal percent
> sign. The \045 arises because I create all 255 non-zero byte values via
> octal escapes.
> 
> POSIX's description assumes that the backslash escapes and format
> specifications are processed in one pass and simply says that an octal
> escape sequence shall write the corresponding byte. If they are separate
> passes the backslash escape removal step needs to know about percent
> signs.

That's a reasonable assumption, but the function handling print is an
appalling mess so it's not easy to fix without a major rewrite.  The
code for printf doesn't really have any business being associated with
the code for print, they're there for different purposes entirely based
on completely different specifications.  At the moment printf does the
same as 'print -f', so it has all the same oddities as print whether it
should or not.  (In my opinion, anyone deliberately asking for combined
print and printf behaviour deserves everything they get so I'm perfectly
happy to let 'print -f' fester while standardising printf.)  However, I
never get volunteers for tidying the shell up, so we're probably stuck
until someone gets fed up enough to look into it.

-- 
Peter Stephenson <p.w.stephenson@ntlworld.com>
Web page now at http://homepage.ntlworld.com/p.w.stephenson/



* Re: printf \045 (or whatever the character code for % is)
  2011-01-05 17:39 ` Peter Stephenson
@ 2011-01-06  4:46   ` Bart Schaefer
  2011-01-06 12:09     ` Peter Stephenson
  0 siblings, 1 reply; 6+ messages in thread
From: Bart Schaefer @ 2011-01-06  4:46 UTC
  To: Zsh Hackers' List

On Jan 5,  5:39pm, Peter Stephenson wrote:
}
} > POSIX's description assumes that the backslash escapes and format
} > specifications are processed in one pass and simply says that an
} > octal escape sequence shall write the corresponding byte. If they
} > are separate passes the backslash escape removal step needs to know
} > about percent signs.
} 
} That's a reasonable assumption, but the function handling print is an
} appalling mess so it's not easy to fix without a major rewrite.

The octal escapes are all handled by getkeystring().  There's already
a special macro GETKEYS_PRINTF_FMT which (unfortunately?) is used for
both "printf" and "print -f".

} The code for printf doesn't really have any business being associated
} with the code for print, they're there for different purposes entirely
} based on completely different specifications. At the moment printf
} does the same as 'print -f', so it has all the same oddities as print
} whether it should or not.

I'm not entirely sure that's true.  The printf builtin doesn't accept
any options, which means that except for the initial getkeystring(),
nearly everything in bin_print() is ignored until you get down to the
part that handles the format spec ... and that can't be replaced by
e.g. sprintf() because of misc. special formats like %b and %q.

} (In my opinion, anyone deliberately asking for combined print and
} printf behaviour deserves everything they get so I'm perfectly happy
} to let 'print -f' fester while standardising printf.)

Although we could rip the format handling out of bin_print() and
create a new bin_printf() [which would be called by "print -f"?] we'd
still need something akin to getkeystring() for the octal escapes.

} However, I never get volunteers for tidying the shell up, so we're
} probably stuck until someone gets fed up enough to look into it.

GETKEYS_PRINTF_FMT expands to GETKEY_OCTAL_ESC|GETKEY_BACKSLASH_C ...
seems as though an additional flag to getkeystring() could be used to
cause \045 to expand to %% as a special case, something like this in
utils.c:

@@ -5517,6 +5522,8 @@
 		    }
 		    *t++ = zstrtol(s + (*s == 'x'), &s,
 				   (*s == 'x') ? 16 : 8);
+		    if ((how & GETKEY_PRINTF) && t[-1] == '%')
+		        *t++ = '%';
 		    if (svchar) {
 			u[3] = svchar;
 			svchar = '\0';

The flag bits for "how" are an enum in zsh.h and I'm undecided whether
to renumber them or just add another to the end, so I haven't included
a complete patch.  Also I don't know whether the intent is that \045
(and \x25) should become %% only for "printf" or also for "print -f",
so no patch for builtin.c yet either.
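
[Editor's illustration, not part of the original message.] The idea in the hunk above can be tried outside zsh with a small stand-alone decoder. This is a sketch under stated assumptions: SKETCH_PRINTF_PERCENT, digit_value() and decode_numeric() are hypothetical names that only mimic the role of the proposed getkeystring() flag, and the parsing rules (one to three octal digits, or `\x` plus one or two hex digits) are a simplification rather than a description of what getkeystring() does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag for this sketch; it stands in for the proposed
 * getkeystring() flag and is not zsh's definition. */
enum { SKETCH_PRINTF_PERCENT = 1 << 0 };

/* Value of digit `c` in `base` (8 or 16), or -1 if it is not one. */
static int digit_value(char c, int base)
{
    int v;

    if (c >= '0' && c <= '9')
        v = c - '0';
    else if (c >= 'a' && c <= 'f')
        v = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F')
        v = c - 'A' + 10;
    else
        return -1;
    return v < base ? v : -1;
}

/* Decode numeric escapes from `in` into `out`.  When `how` contains
 * SKETCH_PRINTF_PERCENT, a '%' produced by an escape is written as "%%"
 * so that a later format pass treats it as a literal.  `out` must hold
 * strlen(in) + 1 bytes.  Returns the number of bytes written. */
size_t decode_numeric(const char *in, char *out, int how)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    while (*in) {
        const char *p = in;
        int base = 0, max_digits = 0;

        if (p[0] == '\\' && p[1] == 'x' && digit_value(p[2], 16) >= 0) {
            base = 16; max_digits = 2; p += 2;
        } else if (p[0] == '\\' && digit_value(p[1], 8) >= 0) {
            base = 8; max_digits = 3; p += 1;
        }

        if (base) {
            unsigned value = 0;
            int digits = 0;

            while (digits < max_digits && digit_value(*p, base) >= 0) {
                value = value * (unsigned)base
                        + (unsigned)digit_value(*p, base);
                p++;
                digits++;
            }
            out[n++] = (char)value;
            /* Same test as the proposed hunk: double an escaped '%'. */
            if ((how & SKETCH_PRINTF_PERCENT) && out[n - 1] == '%')
                out[n++] = '%';
            in = p;
        } else {
            out[n++] = *in++;           /* ordinary byte, copied as-is */
        }
    }
    out[n] = '\0';
    return n;
}
```

With the flag set, `\045d.%d` decodes to `%%d.%d`: the escaped percent becomes a literal while the `%d` typed by the user is left alone. Without the flag the result is `%d.%d`, as before.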

-- 



* Re: printf \045 (or whatever the character code for % is)
  2011-01-06  4:46   ` Bart Schaefer
@ 2011-01-06 12:09     ` Peter Stephenson
  2011-01-06 16:01       ` Bart Schaefer
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Stephenson @ 2011-01-06 12:09 UTC
  To: Zsh Hackers' List

On Wed, 5 Jan 2011 20:46:12 -0800
Bart Schaefer <schaefer@brasslantern.com> wrote:
> } The code for printf doesn't really have any business being associated
> } with the code for print, they're there for different purposes entirely
> } based on completely different specifications. At the moment printf
> } does the same as 'print -f', so it has all the same oddities as print
> } whether it should or not.
> 
> I'm not entirely sure that's true.  The printf builtin doesn't accept
> any options, which means that except for the initial getkeystring(),
> nearly everything in bin_print() is ignored until you get down to the
> part that handles the format spec ... and that can't be replaced by
> e.g. sprintf() because of misc. special formats like %b and %q.

That's kind of why print & printf really ought to be separate, with
common subroutines where needed (the lack of modularity in bin_print()
is one of the big issues).  But that's moot for now...

> GETKEYS_PRINTF_FMT expands to GETKEY_OCTAL_ESC|GETKEY_BACKSLASH_C ...
> seems as though an additional flag to getkeystring() could be used to
> cause \045 to expand to %% as a special case, something like this in
> utils.c:

That's sneaky, that should be OK.

> @@ -5517,6 +5522,8 @@
>  		    }
>  		    *t++ = zstrtol(s + (*s == 'x'), &s,
>  				   (*s == 'x') ? 16 : 8);
> +		    if ((how & GETKEY_PRINTF) && t[-1] == '%')
> +		        *t++ = '%';
>  		    if (svchar) {
>  			u[3] = svchar;
>  			svchar = '\0';

Presumably since we're contracting an escape sequence there's always enough
allocated space for the extra '%'.

> The flag bits for "how" are an enum in zsh.h and I'm undecided whether
> to renumber them or just add another to the end, so I haven't included
> a complete patch.  Also I don't know whether the intent is that \045
> (and \x25) should become %% only for "printf" or also for "print -f",
> so no patch for builtin.c yet either.

There's no particularly well-defined order in the enum, although the
more recherché options tend to be later (but only because they were
added later).

Unless we go down the route of separate builtin handlers, I think it
would be better to keep printf and print -f in sync for now. For one
thing, making them different puts yet another strain on bin_print(); for
another, we haven't yet gone into the details of where printf actually
needs to be different from print (we'd need to look at the relevant
standards for printf to see where the code is doing the wrong thing at
present).

-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK





* Re: printf \045 (or whatever the character code for % is)
  2011-01-06 12:09     ` Peter Stephenson
@ 2011-01-06 16:01       ` Bart Schaefer
  0 siblings, 0 replies; 6+ messages in thread
From: Bart Schaefer @ 2011-01-06 16:01 UTC
  To: Zsh Hackers' List

On Jan 6, 12:09pm, Peter Stephenson wrote:
> Subject: Re: printf \045 (or whatever the character code for % is)
>
> On Wed, 5 Jan 2011 20:46:12 -0800
> Bart Schaefer <schaefer@brasslantern.com> wrote:
> > GETKEYS_PRINTF_FMT expands to GETKEY_OCTAL_ESC|GETKEY_BACKSLASH_C ...
> > seems as though an additional flag to getkeystring() could be used to
> > cause \045 to expand to %% as a special case, something like this in
> > utils.c:
> 
> That's sneaky, that should be OK.
> 
> Presumably since we're contracting an escape sequence there's always enough
> allocated space for the extra '%'.

It's a bit hard to follow getkeystring() in the multibyte branches,
but as that should always be allocating more rather than less space,
I believe the answer is "that's correct".  Minimum it's \45 -> %%.
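
[Editor's illustration, not part of the original message.] The space argument for the single-byte case is simple arithmetic, sketched here in C. The function name percent_rewrite_slack() is hypothetical, and the sketch says nothing about the multibyte branches mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes left over when a numeric escape spelling of '%' (for example
 * the three characters \45) is replaced in place by the two bytes "%%".
 * A negative result would mean the rewrite outgrows its source text. */
long percent_rewrite_slack(const char *spelling)
{
    assert(spelling != NULL);
    return (long)strlen(spelling) - 2;
}
```

For the spellings discussed in this thread, `\45` leaves one spare byte, and `\045` and `\x25` leave two each, so the doubled percent sign never needs more room than the escape it replaces.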

> Unless we go down the route of separate builtin handlers, I think it
> would be better to keep printf and print -f in sync for now.

In that case, no need to touch builtin.c at all.

> [...] we haven't yet gone into the details of where printf actually
> needs to be different from print (we'd need to look at the relevant
> standards for printf to see where the code is doing the wrong thing at
> present).

For one thing, hasn't austin-group been discussing \Cx where zsh at
present uses the old Emacs syntax of \C-x ?

(Revision numbers below are from my local repository, ignore them.)

Index: Src/utils.c
===================================================================
RCS file: /extra/cvsroot/zsh/zsh-4.0/Src/utils.c,v
retrieving revision 1.40
diff -c -r1.40 utils.c
--- utils.c	21 Dec 2010 16:41:16 -0000	1.40
+++ utils.c	6 Jan 2011 15:43:40 -0000
@@ -5517,6 +5522,8 @@
 		    }
 		    *t++ = zstrtol(s + (*s == 'x'), &s,
 				   (*s == 'x') ? 16 : 8);
+		    if ((how & GETKEY_PRINTF_PERCENT) && t[-1] == '%')
+		        *t++ = '%';
 		    if (svchar) {
 			u[3] = svchar;
 			svchar = '\0';
Index: Src/zsh.h
===================================================================
RCS file: /extra/cvsroot/zsh/zsh-4.0/Src/zsh.h,v
retrieving revision 1.43
diff -c -r1.43 zsh.h
--- zsh.h	21 Dec 2010 16:41:16 -0000	1.43
+++ zsh.h	6 Jan 2011 15:50:06 -0000
@@ -2492,7 +2492,11 @@
      * Yes, I know that doesn't seem to make much sense.
      * It's for use in completion, comprenez?
      */
-    GETKEY_UPDATE_OFFSET = (1 << 7)
+    GETKEY_UPDATE_OFFSET = (1 << 7),
+    /*
+     * When replacing numeric escapes for printf format strings, % -> %%
+     */
+    GETKEY_PRINTF_PERCENT = (1 << 8)
 };
 
 /*
@@ -2501,8 +2505,9 @@
  */
 /* echo builtin */
 #define GETKEYS_ECHO	(GETKEY_BACKSLASH_C)
-/* printf format string:  \123 -> S, \0123 -> NL 3 */
-#define GETKEYS_PRINTF_FMT	(GETKEY_OCTAL_ESC|GETKEY_BACKSLASH_C)
+/* printf format string:  \123 -> S, \0123 -> NL 3, \045 -> %% */
+#define GETKEYS_PRINTF_FMT	\
+        (GETKEY_OCTAL_ESC|GETKEY_BACKSLASH_C|GETKEY_PRINTF_PERCENT)
 /* printf argument:  \123 -> \123, \0123 -> S */
 #define GETKEYS_PRINTF_ARG	(GETKEY_BACKSLASH_C)
 /* Full print without -e */


