zsh-workers
* Example / partial fix for printf with math expressions
@ 2024-02-23 23:23 Bart Schaefer
  2024-02-24 14:40 ` Stephane Chazelas
  0 siblings, 1 reply; 7+ messages in thread
From: Bart Schaefer @ 2024-02-23 23:23 UTC (permalink / raw)
  To: Zsh hackers list

No attachment, so beware line wrap.

This repairs (printf "%d" ...) but obviously hasn't touched any of the
several other places where bin_print() calls matheval() and friends.
Upshot, math expects to get metafied strings, bin_print() does not (or
at least not always) pass them that way.

I also did not investigate whether curarg needs a known length rather
than calling metafy with -1, but since math is internally going to
stop at nul bytes anyway, I don't think it matters.

Also because the strings are metafied, checkunary() needs some
MULTIBYTE tweaking to get the count right and not split characters in
the middle when showing an error.  Someone who works more with that
should investigate.
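
For readers unfamiliar with the term, here is a minimal sketch of what "metafied" means. It is a toy model, not the code in zsh: the names toy_metafy, toy_needs_meta and TOY_META are made up for illustration, and the real metafy() escapes a wider set of bytes (zsh's internal tokens) than the two handled here. The idea is that a special byte is replaced by a marker byte followed by the original byte XOR 32, so the encoded string never contains a NUL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: zsh's real metafy() escapes more bytes than these two. */
#define TOY_META 0x83

static int toy_needs_meta(unsigned char c)
{
    return c == 0 || c == TOY_META;
}

/* Return a malloc'd, NUL-terminated, escaped copy of buf[0..len-1].
 * Each special byte becomes TOY_META followed by (byte ^ 32). */
static char *toy_metafy(const char *buf, size_t len)
{
    size_t extra = 0, i;
    char *out, *d;

    assert(buf != NULL);
    /* First pass: count how many bytes need an escape. */
    for (i = 0; i < len; i++)
        if (toy_needs_meta((unsigned char)buf[i]))
            extra++;
    out = malloc(len + extra + 1);
    assert(out != NULL);
    /* Second pass: copy, expanding special bytes to two bytes. */
    d = out;
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        if (toy_needs_meta(c)) {
            *d++ = (char)TOY_META;
            *d++ = (char)(c ^ 32);
        } else {
            *d++ = (char)c;
        }
    }
    *d = '\0';
    return out;
}
```

Calling toy_metafy() on the six bytes of a[\0]++ yields a seven-byte C string in which the NUL has become the marker byte followed by a space (0 ^ 32), so code that walks the string up to the terminator sees the whole subscript.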

diff --git a/Src/builtin.c b/Src/builtin.c
index dd352c146..5fccbac6d 100644
--- a/Src/builtin.c
+++ b/Src/builtin.c
@@ -5465,7 +5465,8 @@ bin_print(char *name, char **args, Options ops, int func)
                        *d++ = 'l';
 #endif
                        *d++ = 'l', *d++ = *c, *d = '\0';
-                       zlongval = (curarg) ? mathevali(curarg) : 0;
+                       zlongval = (curarg) ?
+                           mathevali(metafy(curarg, -1, META_HEAPDUP)) : 0;
                        if (errflag) {
                            zlongval = 0;
                            errflag &= ~ERRFLAG_ERROR;



* Re: Example / partial fix for printf with math expressions
  2024-02-23 23:23 Example / partial fix for printf with math expressions Bart Schaefer
@ 2024-02-24 14:40 ` Stephane Chazelas
  2024-02-24 15:10   ` Stephane Chazelas
                     ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Stephane Chazelas @ 2024-02-24 14:40 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: Zsh hackers list

2024-02-23 15:23:52 -0800, Bart Schaefer:
> No attachment, so beware line wrap.
> 
> This repairs (printf "%d" ...) but obviously hasn't touched any of the
> several other places where bin_print() calls matheval() and friends.
> Upshot, math expects to get metafied strings, bin_print() does not (or
> at least not always) pass them that way.
> 
> I also did not investigate whether curarg needs a known length rather
> than calling metafy with -1, but since math is internally going to
> stop at nul bytes anyway, I don't think it matters.
[...]

Ah sorry, I hadn't seen that message when replying in the other
thread.

The math parser seems to work OK with NULs at least in:

$ typeset -A a
$ typeset -p a
typeset -A a=( [$'\C-@']=1 )
$ let $'b = (a = ##\0) + 32'; echo $a $b
0 32


With and without your example patch applied however:

$ printf '%d\n' $'a[\0]++'
zsh: invalid subscript
0

With the one below using curlen / len[argp-args]:

$ typeset -A a
$ printf '%d\n' $'a[\0]++'
0
$ typeset -p a
typeset -A a=( [$'\C-@']=1 )

diff --git a/Src/builtin.c b/Src/builtin.c
index dd352c146..f72d14da4 100644
--- a/Src/builtin.c
+++ b/Src/builtin.c
@@ -5247,7 +5247,8 @@ bin_print(char *name, char **args, Options ops, int func)
 		    }
 		}
 		if (*argp) {
-		    width = (int)mathevali(*argp++);
+		    width = (int)mathevali(metafy(*argp, len[argp - args], META_USEHEAP));
+		    argp++;
 		    if (errflag) {
 			errflag &= ~ERRFLAG_ERROR;
 			ret = 1;
@@ -5281,7 +5282,8 @@ bin_print(char *name, char **args, Options ops, int func)
 		    }
 
 		    if (*argp) {
-			prec = (int)mathevali(*argp++);
+			prec = (int)mathevali(metafy(*argp, len[argp - args], META_USEHEAP));
+			argp++;
 			if (errflag) {
 			    errflag &= ~ERRFLAG_ERROR;
 			    ret = 1;
@@ -5465,7 +5467,7 @@ bin_print(char *name, char **args, Options ops, int func)
  		    	*d++ = 'l';
 #endif
 		    	*d++ = 'l', *d++ = *c, *d = '\0';
-			zlongval = (curarg) ? mathevali(curarg) : 0;
+			zlongval = (curarg) ? mathevali(metafy(curarg, curlen, META_HEAPDUP)) : 0;
 			if (errflag) {
 			    zlongval = 0;
 			    errflag &= ~ERRFLAG_ERROR;
@@ -5516,7 +5518,7 @@ bin_print(char *name, char **args, Options ops, int func)
 			if (!curarg)
 			    zulongval = (zulong)0;
 			else if (!zstrtoul_underscore(curarg, &zulongval))
-			    zulongval = mathevali(curarg);
+			    zulongval = mathevali(metafy(curarg, curlen, META_HEAPDUP));
 			if (errflag) {
 			    zulongval = 0;
 			    errflag &= ~ERRFLAG_ERROR;




* Re: Example / partial fix for printf with math expressions
  2024-02-24 14:40 ` Stephane Chazelas
@ 2024-02-24 15:10   ` Stephane Chazelas
  2024-02-24 21:51   ` Bart Schaefer
  2024-02-25  6:54   ` Stephane Chazelas
  2 siblings, 0 replies; 7+ messages in thread
From: Stephane Chazelas @ 2024-02-24 15:10 UTC (permalink / raw)
  To: Bart Schaefer, Zsh hackers list

2024-02-24 14:40:41 +0000, Stephane Chazelas:
[...]
> The math parser seems to work OK with NULs at least in:
> 
> $ typeset -A a
> $ typeset -p a
> typeset -A a=( [$'\C-@']=1 )
[...]

Sorry, I messed up my copy-paste again. Should have been:

$ typeset -A a
$ let $'a[\0]++'
$ typeset -p a
typeset -A a=( [$'\C-@']=1 )

-- 
Stephane



* Re: Example / partial fix for printf with math expressions
  2024-02-24 14:40 ` Stephane Chazelas
  2024-02-24 15:10   ` Stephane Chazelas
@ 2024-02-24 21:51   ` Bart Schaefer
  2024-02-25  6:54   ` Stephane Chazelas
  2 siblings, 0 replies; 7+ messages in thread
From: Bart Schaefer @ 2024-02-24 21:51 UTC (permalink / raw)
  To: Zsh hackers list

Thanks for the builtin.c patch.  Just a couple of notes:

On Sat, Feb 24, 2024 at 6:40 AM Stephane Chazelas
<stephane@chazelas.org> wrote (with correction from a later message):
>
> The math parser seems to work OK with NULs at least in:
>
> $ typeset -A a
> $ let $'a[\0]++'

That's not exactly the math parser; mathparse() gets as far as noticing
it's a parameter reference and then goes through
params.c:getnumvalue() which actually does the subscript parsing.
You're "cheating" with this construction (and of course "let" is doing
the metafication properly to pass the reference through).

> $ let $'b = (a = ##\0) + 32'; echo $a $b

For the ## operator the parse is reading just one byte, so you're
still not involving anything that might hit a nul-terminated string.

This is the math parser and only the math parser, but again with
proper metafication:

% let $'a\0b = 2'
zsh: bad math expression: illegal character: \M-C
% let $'ab \0 2'
zsh: bad math expression: illegal character: \M-C
% let $'ab \02'
zsh: bad math expression: illegal character: ^B
% let $'ab y\02'
zsh: bad math expression: operator expected at `y^B'
% let $'ab = \02'
zsh: bad math expression: operand expected at `^B'

I'm not sure it's actually very useful to display more than the first
character in the latter two cases, but we can try to improve what's
currently displayed in multibyte cases.

Compare math parser without metafication:

% a=42
% printf "%d\n" $'a\0b = 2'
42

This is what I meant by "stops at NUL".

> With and without your example patch applied however:
>
> $ printf '%d\n' $'a[\0]++'
> zsh: invalid subscript

That does demonstrate that the length is needed in bin_print() to
properly metafy the subscript, yes.
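
A small sketch of why the length matters, under the assumption (taken from the discussion above) that passing -1 makes the routine stop at the first NUL. The helper name toy_scan_len is hypothetical and only models how many input bytes a metafy-like routine would look at; it is not a zsh function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: how many bytes a metafy-like routine would scan.
 * A negative len means "no length known, stop at the first NUL". */
static size_t toy_scan_len(const char *buf, long len)
{
    assert(buf != NULL);
    return len < 0 ? strlen(buf) : (size_t)len;
}
```

For the six-byte argument a[\0]++, a length of -1 gives a scan of only the two bytes a[ so the subscript is cut short, while passing the tracked length (curlen in the patch) covers all six bytes including the embedded NUL.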



* Re: Example / partial fix for printf with math expressions
  2024-02-24 14:40 ` Stephane Chazelas
  2024-02-24 15:10   ` Stephane Chazelas
  2024-02-24 21:51   ` Bart Schaefer
@ 2024-02-25  6:54   ` Stephane Chazelas
  2024-02-26  0:04     ` Bart Schaefer
  2 siblings, 1 reply; 7+ messages in thread
From: Stephane Chazelas @ 2024-02-25  6:54 UTC (permalink / raw)
  To: Bart Schaefer, Zsh hackers list

2024-02-24 14:40:41 +0000, Stephane Chazelas:
[...]
> With the one below using curlen / len[argp-args]:
[...]
> +			zlongval = (curarg) ? mathevali(metafy(curarg, curlen, META_HEAPDUP)) : 0;
[...]

That incurs a performance penalty:

$ time zsh -c 'repeat 1000 printf %d 123_456+{1..10000}' > /dev/null
zsh -c 'repeat 1000 printf %d 123_456+{1..10000}' > /dev/null  6.07s user 0.46s system 99% cpu 6.544 total
$ time ./Src/zsh -c 'repeat 1000 printf %d 123_456+{1..10000}' > /dev/null
./Src/zsh -c 'repeat 1000 printf %d 123_456+{1..10000}' > /dev/null  9.97s user 0.62s system 99% cpu 10.602 total

The numbers end up being metafied (earlier), unmetafied by
printf, metafied again by printf here and that metafication
processed by the math handler I guess.

Maybe the performance could be *improved* instead if the
unmetafication could be skipped altogether in printf in those
cases?

-- 
Stephane



* Re: Example / partial fix for printf with math expressions
  2024-02-25  6:54   ` Stephane Chazelas
@ 2024-02-26  0:04     ` Bart Schaefer
  2024-02-26 17:12       ` Bart Schaefer
  0 siblings, 1 reply; 7+ messages in thread
From: Bart Schaefer @ 2024-02-26  0:04 UTC (permalink / raw)
  To: Zsh hackers list

On Sat, Feb 24, 2024 at 10:54 PM Stephane Chazelas
<stephane@chazelas.org> wrote:
>
> 2024-02-24 14:40:41 +0000, Stephane Chazelas:
> [...]
> > +                     zlongval = (curarg) ? mathevali(metafy(curarg, curlen, META_HEAPDUP)) : 0;
>
> That incurs a performance penalty:
>
> The numbers end up being metafied (earlier), unmetafied by
> printf, metafied again by printf here and that metafication
> processed by the math handler I guess.

The reason this went unobserved for so long is that metafication only
matters to a couple of things in math:
1) proper handling of parameter references, like your a[x]++ example
2) those pesky error messages that started this whole thread

As far as I can tell there are no other circumstances in which
metafication is necessary for the syntax of math expressions.

> Maybe the performance could be *improved* instead if the
> unmetafication could be skipped altogether in printf in those
> cases?

That's certainly possible, but it would mean that every branch of the
big "case" statements for %-replacement would have to do their own
unmetafy(), rather than a single loop at the top covering the entire
argv array.  The -s/-S/-z options also reverse the unmetafy(), so some
heavy refactoring of bin_print() is needed to unroll this completely.

However, I think you could change META_HEAPDUP and META_USEHEAP to
META_NOALLOC, because you're re-metafying back into the same space
that was originally unmetafied?  That might cut the performance
penalty a lot.
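
To illustrate the "same space" argument, here is a toy sketch of decoding in place and then re-encoding in place without allocating. The function names and TOY_META are invented for this example and only two special bytes are handled; it is not zsh's unmetafy()/metafy(), just a model of why a buffer that once held the encoded string is always large enough to hold it again.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_META 0x83

/* Decode in place; returns the decoded length (result may contain NULs). */
static size_t toy_unmetafy(char *s)
{
    char *r = s, *w = s;

    assert(s != NULL);
    while (*r) {
        if ((unsigned char)*r == TOY_META && r[1]) {
            *w++ = (char)(r[1] ^ 32);
            r += 2;
        } else {
            *w++ = *r++;
        }
    }
    *w = '\0';
    return (size_t)(w - s);
}

/* Re-encode buf[0..len-1] in place. The caller guarantees buf has room for
 * the encoded form plus a terminator, which holds when buf previously held
 * that same encoded string. */
static void toy_metafy_inplace(char *buf, size_t len)
{
    size_t extra = 0, i;
    char *src, *dst;

    assert(buf != NULL);
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        if (c == 0 || c == TOY_META)
            extra++;
    }
    /* Work backwards so unread input is never overwritten. */
    src = buf + len;
    dst = buf + len + extra;
    *dst = '\0';
    while (src > buf) {
        unsigned char c = (unsigned char)*--src;
        if (c == 0 || c == TOY_META) {
            *--dst = (char)(c ^ 32);
            *--dst = (char)TOY_META;
        } else {
            *--dst = (char)c;
        }
    }
}
```

Decoding and then re-encoding a buffer this way restores the original bytes exactly, with no heap traffic, which is the saving the NOALLOC suggestion is after.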



* Re: Example / partial fix for printf with math expressions
  2024-02-26  0:04     ` Bart Schaefer
@ 2024-02-26 17:12       ` Bart Schaefer
  0 siblings, 0 replies; 7+ messages in thread
From: Bart Schaefer @ 2024-02-26 17:12 UTC (permalink / raw)
  To: Zsh hackers list

On Sun, Feb 25, 2024 at 4:04 PM Bart Schaefer <schaefer@brasslantern.com> wrote:
>
> However, I think you could change META_HEAPDUP and META_USEHEAP to
> META_NOALLOC, because you're re-metafying back into the same space
> that was originally unmetafied?  That might cut the performance
> penalty a lot.

No metafication:
zsh -c 'repeat 1000 printf %d 123_456+{1..10000}' > /dev/null  7.51s user 0.39s system 99% cpu 7.907 total

USEHEAP:
Src/zsh -c 'repeat 1000 printf %d 123_456+{1..10000}' > /dev/null  8.62s user 0.59s system 99% cpu 9.218 total

NOALLOC:
Src/zsh -c 'repeat 1000 printf %d 123_456+{1..10000}' > /dev/null  7.89s user 0.36s system 99% cpu 8.255 total

All of this is with full debugging; an optimized compile might change it.




Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/
