zsh-workers
 help / color / mirror / code / Atom feed
* Re: problem with RANDOM and arrays
       [not found] <chaz20100111201312.GB3902@seebyte.com>
@ 2010-01-11 20:58 ` Stephane Chazelas
  2010-01-11 22:01   ` Stephane Chazelas
  2010-01-20 13:32 ` subscript expanded twice in a[subcript]++ (Was: problem with RANDOM and arrays) Stephane Chazelas
  1 sibling, 1 reply; 8+ messages in thread
From: Stephane Chazelas @ 2010-01-11 20:58 UTC (permalink / raw)
  To: Zsh hackers list

2010-01-11 20:13:12 +0000, Stephane Chazelas:
[...]
> Similar problem:
> 
> $ x=0; ((a[++x]++)); echo $x
> 2
> 
> All the +=, -=, *=... operators are also affected.
[...]

Strange, their must be something in my .zshrc that affects the
behavior as I get:

$ zsh -c 'x=0; ((a[++x]++)); echo $x'
1

(same with an interactive zsh -f)

The ((a[RANDOM%2+1]++)) problem shows up in any case though.

-- 
Stephane


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: problem with RANDOM and arrays
  2010-01-11 20:58 ` problem with RANDOM and arrays Stephane Chazelas
@ 2010-01-11 22:01   ` Stephane Chazelas
  2010-01-11 22:14     ` Chet Ramey
  0 siblings, 1 reply; 8+ messages in thread
From: Stephane Chazelas @ 2010-01-11 22:01 UTC (permalink / raw)
  To: Zsh hackers list

2010-01-11 20:58:46 +0000, Stephane Chazelas:
> 2010-01-11 20:13:12 +0000, Stephane Chazelas:
> [...]
> > Similar problem:
> > 
> > $ x=0; ((a[++x]++)); echo $x
> > 2
> > 
> > All the +=, -=, *=... operators are also affected.
> [...]
> 
> Strange, their must be something in my .zshrc that affects the
> behavior as I get:
> 
> $ zsh -c 'x=0; ((a[++x]++)); echo $x'
> 1
[...]

Well no, actually it's the fact that "a" is previously unset
that affects the behavior:

~$ unset a
~$ x=0; ((a[++x]++)); echo $x
1
~$ unset a
~$ a[1]=0
~$ a[2]=0
~$ x=0; ((a[++x]++)); echo $x
2
~$ unset a
~$ a[3]=0
~$ x=0; ((a[++x]++)); echo $x
2


So the behavior is slighly worse than in bash or ksh93 (which
consistently give 2 above).

-- 
Stephane


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: problem with RANDOM and arrays
  2010-01-11 22:01   ` Stephane Chazelas
@ 2010-01-11 22:14     ` Chet Ramey
  0 siblings, 0 replies; 8+ messages in thread
From: Chet Ramey @ 2010-01-11 22:14 UTC (permalink / raw)
  To: stephane_chazelas; +Cc: zsh-workers, chet

> Well no, actually it's the fact that "a" is previously unset
> that affects the behavior:
> 
> ~$ unset a
> ~$ x=0; ((a[++x]++)); echo $x
> 1
> ~$ unset a
> ~$ a[1]=0
> ~$ a[2]=0
> ~$ x=0; ((a[++x]++)); echo $x
> 2
> ~$ unset a
> ~$ a[3]=0
> ~$ x=0; ((a[++x]++)); echo $x
> 2
> 
> 
> So the behavior is slighly worse than in bash or ksh93 (which
> consistently give 2 above).

Bash and ksh93 explicitly evaluate the subscript even if the variable is
unset, so any expected side effects happen.  I got a bug report on this
a while back.

Chet

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/


^ permalink raw reply	[flat|nested] 8+ messages in thread

* subscript expanded twice in a[subcript]++ (Was: problem with RANDOM and arrays)
       [not found] <chaz20100111201312.GB3902@seebyte.com>
  2010-01-11 20:58 ` problem with RANDOM and arrays Stephane Chazelas
@ 2010-01-20 13:32 ` Stephane Chazelas
  2010-01-20 13:48   ` Peter Stephenson
                     ` (2 more replies)
  1 sibling, 3 replies; 8+ messages in thread
From: Stephane Chazelas @ 2010-01-20 13:32 UTC (permalink / raw)
  To: Zsh hackers list

2010-01-11 20:13:12 +0000, Stephane Chazelas:
[...]
> someone reported the exact same problem with bash. And it also
> occurs with zsh and ksh93.

Note that this bug has already been fixed in ksh93
(http://mailman.research.att.com/pipermail/ast-developers/2010q1/000550.html),
not in bash yet.

> ~$ unset x; for i ({1..1000}) ((x[RANDOM%2+1]++)); echo $x
> 508 509
> 
> (see how the sum is not 1000 and how the numbers are close
> together).
> 
> My understanding is that
> 
> x[RANDOM%2+1]++
> 
> is handled as
> 
> x[RANDOM%2+1] = x[RANDOM%2] + 1
> 
> and the 2 RANDOMs above are expanded twice, so will generally be
> different. But I would still say it's a bug.
> 
> Similar problem:
> 
> $ x=0; ((a[++x]++)); echo $x
> 2
> 
> All the +=, -=, *=... operators are also affected.

-- 
Stephane


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: subscript expanded twice in a[subcript]++ (Was: problem with RANDOM and arrays)
  2010-01-20 13:32 ` subscript expanded twice in a[subcript]++ (Was: problem with RANDOM and arrays) Stephane Chazelas
@ 2010-01-20 13:48   ` Peter Stephenson
  2010-01-20 16:46   ` Peter Stephenson
  2010-01-20 17:16   ` Bart Schaefer
  2 siblings, 0 replies; 8+ messages in thread
From: Peter Stephenson @ 2010-01-20 13:48 UTC (permalink / raw)
  To: Zsh hackers list

Stephane Chazelas wrote:
> 2010-01-11 20:13:12 +0000, Stephane Chazelas:
> [...]
> > someone reported the exact same problem with bash. And it also
> > occurs with zsh and ksh93.
> 
> Note that this bug has already been fixed in ksh93
> (http://mailman.research.att.com/pipermail/ast-developers/2010q1/000550.html)
> ,
> not in bash yet.

If you're trying to get more people to work on the shell, feel free, but
I've had no luck.

-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK


Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: subscript expanded twice in a[subcript]++ (Was: problem with RANDOM and arrays)
  2010-01-20 13:32 ` subscript expanded twice in a[subcript]++ (Was: problem with RANDOM and arrays) Stephane Chazelas
  2010-01-20 13:48   ` Peter Stephenson
@ 2010-01-20 16:46   ` Peter Stephenson
  2010-01-20 18:28     ` Bart Schaefer
  2010-01-20 17:16   ` Bart Schaefer
  2 siblings, 1 reply; 8+ messages in thread
From: Peter Stephenson @ 2010-01-20 16:46 UTC (permalink / raw)
  To: Zsh hackers list

Had an idea...  the second level down in the parameter code contains the
evaluated subscript, so we can cache that, and be as paranoid as we feel
about the validity of the cached value.

Index: Src/math.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/math.c,v
retrieving revision 1.36
diff -u -r1.36 math.c
--- Src/math.c	16 Oct 2008 09:34:51 -0000	1.36
+++ Src/math.c	20 Jan 2010 16:41:54 -0000
@@ -27,6 +27,8 @@
  *
  */
 
+struct mathvalue;
+
 #include "zsh.mdh"
 #include "math.pro"
 
@@ -304,7 +306,16 @@
 static int sp = -1;			/* stack pointer */
 
 struct mathvalue {
+    /*
+     * If we need to get a variable, this is the string to be passed
+     * to the parameter code.  It may include a subscript.
+     */
     char *lval;
+    /*
+     * If this is not zero, we've retrieved a variable and this
+     * stores a reference to it.
+     */
+    Value pval;
     mnumber val;
 };
 
@@ -317,6 +328,26 @@
     MPREC_ARG
 };
 
+
+/*
+ * Get a number from a variable.
+ * Try to be clever about reusing subscripts by caching the Value structure.
+ */
+static mnumber
+getmathparam(struct mathvalue *mptr)
+{
+    if (!mptr->pval) {
+	char *s = mptr->lval;
+	mptr->pval = (Value)zhalloc(sizeof(struct value));
+	if (!getvalue(mptr->pval, &s, 1))
+	{
+	    mptr->pval = NULL;
+	    return zero_mnumber;
+	}
+    }
+    return getnumvalue(mptr->pval);
+}
+
 static mnumber
 mathevall(char *s, enum prec_type prec_tp, char **ep)
 {
@@ -384,7 +415,7 @@
 	ret.u.l = 0;
     } else {
 	if (stack[0].val.type == MN_UNSET)
-	    ret = getnparam(stack[0].lval);
+	    ret = getmathparam(stack);
 	else
 	    ret = stack[0].val;
     }
@@ -742,6 +773,7 @@
 	sp++;
     stack[sp].val = val;
     stack[sp].lval = lval;
+    stack[sp].pval = NULL;
     if (getme)
 	stack[sp].val.type = MN_UNSET;
 }
@@ -753,7 +785,7 @@
     struct mathvalue *mv = stack+sp;
 
     if (mv->val.type == MN_UNSET && !noget)
-	mv->val = getnparam(mv->lval);
+	mv->val = getmathparam(mv);
     sp--;
     return errflag ? zero_mnumber : mv->val;
 }
@@ -790,9 +822,29 @@
 
 /**/
 static mnumber
-setvar(char *s, mnumber v)
+setmathvar(struct mathvalue *mvp, mnumber v)
 {
-    if (!s) {
+    if (mvp->pval) {
+	/*
+	 * This value may have been hanging around for a while.
+	 * Be ultra-paranoid in checking the variable is still valid.
+	 */
+	char *s = mvp->lval, *ptr;
+	Param pm;
+	DPUTS(!mvp->lval, "no variable name but variable value in math");
+	if ((ptr = strchr(s, '[')))
+	    s = dupstrpfx(s, ptr - s);
+	pm = (Param) paramtab->getnode(paramtab, s);
+	if (pm == mvp->pval->pm) {
+	    if (noeval)
+		return v;
+	    setnumvalue(mvp->pval, v);
+	    return v;
+	}
+	/* Different parameter, start again from scratch */
+	mvp->pval = NULL;
+    }
+    if (!mvp->lval) {
 	zerr("lvalue required");
 	v.type = MN_INTEGER;
 	v.u.l = 0;
@@ -800,8 +852,8 @@
     }
     if (noeval)
 	return v;
-    untokenize(s);
-    setnparam(s, v);
+    untokenize(mvp->lval);
+    setnparam(mvp->lval, v);
     return v;
 }
 
@@ -1101,8 +1153,9 @@
 	    }
 	}
 	if (tp & (OP_E2|OP_E2IO)) {
+	    struct mathvalue *mvp = stack + sp + 1;
 	    lv = stack[sp+1].lval;
-	    push(setvar(lv,c), lv, 0);
+	    push(setmathvar(mvp,c), mvp->lval, 0);
 	} else
 	    push(c,NULL, 0);
 	return;
@@ -1110,7 +1163,7 @@
 
     spval = &stack[sp].val;
     if (stack[sp].val.type == MN_UNSET)
-	*spval = getnparam(stack[sp].lval);
+	*spval = getmathparam(stack + sp);
     switch (what) {
     case NOT:
 	if (spval->type & MN_FLOAT) {
@@ -1119,6 +1172,7 @@
 	} else
 	    spval->u.l = !spval->u.l;
 	stack[sp].lval = NULL;
+	stack[sp].pval = NULL;
 	break;
     case COMP:
 	if (spval->type & MN_FLOAT) {
@@ -1127,6 +1181,7 @@
 	} else
 	    spval->u.l = ~spval->u.l;
 	stack[sp].lval = NULL;
+	stack[sp].pval = NULL;
 	break;
     case POSTPLUS:
 	a = *spval;
@@ -1134,7 +1189,7 @@
 	    a.u.d++;
 	else
 	    a.u.l++;
-	(void)setvar(stack[sp].lval, a);
+	(void)setmathvar(stack + sp, a);
 	break;
     case POSTMINUS:
 	a = *spval;
@@ -1142,10 +1197,11 @@
 	    a.u.d--;
 	else
 	    a.u.l--;
-	(void)setvar(stack[sp].lval, a);
+	(void)setmathvar(stack + sp, a);
 	break;
     case UPLUS:
 	stack[sp].lval = NULL;
+	stack[sp].pval = NULL;
 	break;
     case UMINUS:
 	if (spval->type & MN_FLOAT)
@@ -1153,6 +1209,7 @@
 	else
 	    spval->u.l = -spval->u.l;
 	stack[sp].lval = NULL;
+	stack[sp].pval = NULL;
 	break;
     case QUEST:
 	DPUTS(sp < 2, "BUG: math: three shall be the number of the counting.");
@@ -1172,14 +1229,14 @@
 	    spval->u.d++;
 	else
 	    spval->u.l++;
-	setvar(stack[sp].lval, *spval);
+	setmathvar(stack + sp, *spval);
 	break;
     case PREMINUS:
 	if (spval->type & MN_FLOAT)
 	    spval->u.d--;
 	else
 	    spval->u.l--;
-	setvar(stack[sp].lval, *spval);
+	setmathvar(stack + sp, *spval);
 	break;
     default:
 	zerr("out of integers");
@@ -1196,7 +1253,7 @@
     int tst;
 
     if (stack[sp].val.type == MN_UNSET)
-	*spval = getnparam(stack[sp].lval);
+	*spval = getmathparam(stack + sp);
     tst = (spval->type & MN_FLOAT) ? (zlong)spval->u.d : spval->u.l; 
 
     switch (tk) {
@@ -1337,7 +1394,7 @@
 	    break;
 	case QUEST:
 	    if (stack[sp].val.type == MN_UNSET)
-		stack[sp].val = getnparam(stack[sp].lval);
+		stack[sp].val = getmathparam(stack + sp);
 	    q = (stack[sp].val.type == MN_FLOAT) ? (zlong)stack[sp].val.u.d :
 		stack[sp].val.u.l;
 
Index: Test/C01arith.ztst
===================================================================
RCS file: /cvsroot/zsh/zsh/Test/C01arith.ztst,v
retrieving revision 1.17
diff -u -r1.17 C01arith.ztst
--- Test/C01arith.ztst	16 Oct 2008 09:34:51 -0000	1.17
+++ Test/C01arith.ztst	20 Jan 2010 16:41:55 -0000
@@ -188,3 +188,25 @@
 >0xFF
 >FF
 
+  array=(1)
+  x=0
+  (( array[++x]++ ))
+  print $x
+  print $#array
+  print $array
+0:no double increment for subscript
+>1
+>1
+>2
+
+  # This is a bit naughty...  the value of array
+  # isn't well defined since there's no sequence point
+  # between the increments of x, however we just want
+  # to be sure that in this case, unlike the above,
+  # x does get incremented twice.
+  x=0
+  array=(1 2)
+  (( array[++x] = array[++x] + 1 ))
+  print $x
+0:double increment for repeated expression
+>2


-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK


Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: subscript expanded twice in a[subcript]++ (Was: problem with RANDOM and arrays)
  2010-01-20 13:32 ` subscript expanded twice in a[subcript]++ (Was: problem with RANDOM and arrays) Stephane Chazelas
  2010-01-20 13:48   ` Peter Stephenson
  2010-01-20 16:46   ` Peter Stephenson
@ 2010-01-20 17:16   ` Bart Schaefer
  2 siblings, 0 replies; 8+ messages in thread
From: Bart Schaefer @ 2010-01-20 17:16 UTC (permalink / raw)
  To: Zsh hackers list

On Jan 20,  1:32pm, Stephane Chazelas wrote:
}
} > My understanding is that
} > 
} > x[RANDOM%2+1]++
} > 
} > is handled as
} > 
} > x[RANDOM%2+1] = x[RANDOM%2] + 1

It's not quite that simple; if it were, this would also fail:

schaefer<519> for i ({1..1000}) ((x[$((RANDOM%2+1))]++)); echo $x
506 494

} > Similar problem:
} > 
} > $ x=0; ((a[++x]++)); echo $x
} > 2

I find a slightly different problem:

schaefer<520> unset x a; repeat 2 { x=0; ((a[++x]++)); print X $x A $a }
X 1 A 1
X 2 A 1 2

Note that the first time around, the op works as expected.  This seems
to have to do with whether $a is unset:

schaefer<530> unset x a; declare -a a
schaefer<531> repeat 2 { x=0; ((a[++x]++)); print X $x A $a }
X 2 A 1
X 2 A 1

However, the error *does* seem to be caused by both getnparam() and
setnparam() calling getvalue(), which eventually calls getindex() where
the subscript expression is parsed and evaluated by matheval().

The calls are at math.c:1113 and math.c:1137 in the specific case of
the "++" operator, but of course every op has its own setvar() call.

Interestingly the problem for zsh seems to be that array elements are
always scalar, never integer, so they must always be converted back
and forth to strings when arithmetic is performed on them.  (I can't
get to the ksh list archive to see if the problem there was similar.)

I don't see any simple fix for this.  We could parse a[++x] as if it
were a[$((++x))] before entering math context, but then ternaries
((x ? a[x++] : a[++x])) would incorrectly increment in both branches.
Anyway that's problematic as I don't believe the parser knows whether
$a references a scalar array or a hash.

Another option is a completely new set of APIs for fetching and
setting variable values, which would manipulate a handle to the internal
storage of a[N] rather than returning the value stored there so that
we could both read and update through the same handle.  This almost
exists e.g. getvalue()/getnumvalue() and setnumvalue(), but there are
potential recursion issues, e.g., getnumvalue() also may make a call
to matheval() because of this:

schaefer<549> a=('y+2'); y=5; print $((++a[1]))
8

I'd like to know if ksh supports that and if so how it avoids side-
effects on evaluating the string value of a[1] as a math expression.

schaefer<550> x=0; a=(x+=3 x+=2 x+=1 x=0); ((++a[++x])); print $a
x+=3 x+=2 x+=1 x=0 5


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: subscript expanded twice in a[subcript]++ (Was: problem with RANDOM and arrays)
  2010-01-20 16:46   ` Peter Stephenson
@ 2010-01-20 18:28     ` Bart Schaefer
  0 siblings, 0 replies; 8+ messages in thread
From: Bart Schaefer @ 2010-01-20 18:28 UTC (permalink / raw)
  To: Zsh hackers list

On Jan 20,  4:46pm, Peter Stephenson wrote:
}
} Had an idea...  the second level down in the parameter code contains the
} evaluated subscript, so we can cache that, and be as paranoid as we feel
} about the validity of the cached value.

Much better:

schaefer<503> x=0; a=(x+=3 x+=2 x+=1 x+=4 x+=5); ((++a[++x])); print $a $x
5 x+=2 x+=1 x+=4 x+=5 4

Still a little dicey, but probably not fixable:

schaefer<507> x=3; a=(x+=3 x+=2 x+=1 x+=4 x+=5); ((++a[++a[1]])); print $a $x
7 x+=2 x+=1 x+=4 x+=5 1 6

In that expression, we read a[1] and get (x+=3) so we increment a[1] as
(x+=3+1)=7 and then increment a[7] (which was previously unset) to get 1.

There's probably some really complicated side-effect threading you could
get going with this.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2010-01-20 18:28 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <chaz20100111201312.GB3902@seebyte.com>
2010-01-11 20:58 ` problem with RANDOM and arrays Stephane Chazelas
2010-01-11 22:01   ` Stephane Chazelas
2010-01-11 22:14     ` Chet Ramey
2010-01-20 13:32 ` subscript expanded twice in a[subcript]++ (Was: problem with RANDOM and arrays) Stephane Chazelas
2010-01-20 13:48   ` Peter Stephenson
2010-01-20 16:46   ` Peter Stephenson
2010-01-20 18:28     ` Bart Schaefer
2010-01-20 17:16   ` Bart Schaefer

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).