zsh-workers
 help / color / mirror / code / Atom feed
* PATCH: 3.1.5 - sample associative array implementation
@ 1998-11-11  6:44 Bart Schaefer
  1998-11-11 13:58 ` Peter Stephenson
  0 siblings, 1 reply; 21+ messages in thread
From: Bart Schaefer @ 1998-11-11  6:44 UTC (permalink / raw)
  To: zsh-workers

The intersection of our discussion about improved completion widgets with a
debugging session to track down the PWD/chpwd problem helped me see a short
path to associative arrays.  The following patch implements a new option to
"typeset":

zsh% typeset -H assoc	# create associative (`H'ashed) array `assoc'

The syntax for referring to elements of an associative array is exactly the
same as any other array:

zsh% assoc[bread]=butter
zsh% assoc[toast]=jam
zsh% echo $assoc
butter jam
zsh% echo $assoc[bread]
butter

However, there are some caveats:  (1) You can't assign to multiple elements
of an associative array in a single expression; you must always use the
subscript syntax.  (2) The text inside the [] is not subject to arithmetic
evaluation as it is with regular arrays.  (3) $assoc, $assoc[@], $assoc[*]
all produce strings, not arrays.  (4) because of (3), subscript modifiers
such as $assoc[(r)but*] (which should produce "butter") produce the whole
string.  (5) $assoc[bread,3] produces "but" (the first 3 characters of the
value) which I think is because getarg() doesn't return soon enough; it
really ought to either ignore or gripe about what comes after the comma.

(I'm not sure why (3) happens, as I thought I'd done the right things to
produce array results, but I wanted to get this out tonight for review.)

It's actually possible to do something very similar with ordinary arrays
and scalars, if you don't mind cluttering the namespace:

zsh% bread=1
zsh% toast=2
zsh% assoc[bread]=butter
zsh% assoc[toast]=jam

Which, of course, is exactly why it was so easy to make this patch.  It
does mean that you must declare associative arrays before you begin to
assign to them, because the default interpretation of the syntax is to
create an ordinary array.

This patch is not very "ready for prime time" and should be worked over by
someone more familiar than I am with the parameter manipulation code.  In
particular, I'm afraid there may be some memory leaks in the code to form
arrays from the values.  Further, the syntax for referring to associative
array elements should probably not be the same as that for regular arrays
(perhaps $assoc[[bread]], for example, which now is a math error) but I
didn't want to delve into the parsing.  You'll note there's a hunk of
text.c where I had no idea what to do.  Lastly, "typeset -H" is messily
implemented because I didn't want to renumber gobs of flags in zsh.h.

The implementation is fairly straightforward:  I modified "struct param"
and its component unions to allow parameters to point to hash tables.  I
then ripped some pieces out of the paramtab code and made it re-usable
so that I could create additional parameter hash tables on the fly.  An
associative array is then nothing more than a struct param that refers to
a hash table of other struct param.

When an associative array element is referenced, it's hash table slot is
created and initially marked PM_UNSET.  getarg() then plays a trick on
the parameter expansion code: it substitutes the struct param from the
"nested" hash table for the input one that refers to the entire array,
and claims it was really expanding that scalar all along.

Although the user-level assignment UI so far creates only associative
arrays of scalars, an internally-constructed special associative array
could conceivably contain elements that are themselves arrays or other
associations -- and it should all pretty much just work.

This patch should apply to a raw 3.1.5 distribution with only a few lines
of offset here and there.

Index: Src/builtin.c
===================================================================
--- builtin.c	1998/11/08 06:24:16	1.6
+++ builtin.c	1998/11/10 06:25:13
@@ -50,7 +50,7 @@
     BUILTIN("cd", 0, bin_cd, 0, 2, BIN_CD, NULL, NULL),
     BUILTIN("chdir", 0, bin_cd, 0, 2, BIN_CD, NULL, NULL),
     BUILTIN("continue", BINF_PSPECIAL, bin_break, 0, 1, BIN_CONTINUE, NULL, NULL),
-    BUILTIN("declare", BINF_TYPEOPTS | BINF_MAGICEQUALS | BINF_PSPECIAL, bin_typeset, 0, -1, 0, "LRUZfilrtux", NULL),
+    BUILTIN("declare", BINF_TYPEOPTS | BINF_MAGICEQUALS | BINF_PSPECIAL, bin_typeset, 0, -1, 0, "HLRUZfilrtux", NULL),
     BUILTIN("dirs", 0, bin_dirs, 0, -1, 0, "v", NULL),
     BUILTIN("disable", 0, bin_enable, 0, -1, BIN_DISABLE, "afmr", NULL),
     BUILTIN("disown", 0, bin_fg, 0, -1, BIN_DISOWN, NULL, NULL),
@@ -78,7 +78,7 @@
     BUILTIN("jobs", 0, bin_fg, 0, -1, BIN_JOBS, "dlpZrs", NULL),
     BUILTIN("kill", 0, bin_kill, 0, -1, 0, NULL, NULL),
     BUILTIN("let", 0, bin_let, 1, -1, 0, NULL, NULL),
-    BUILTIN("local", BINF_TYPEOPTS | BINF_MAGICEQUALS | BINF_PSPECIAL, bin_typeset, 0, -1, 0, "LRUZilrtu", NULL),
+    BUILTIN("local", BINF_TYPEOPTS | BINF_MAGICEQUALS | BINF_PSPECIAL, bin_typeset, 0, -1, 0, "HLRUZilrtu", NULL),
     BUILTIN("log", 0, bin_log, 0, 0, 0, NULL, NULL),
     BUILTIN("logout", 0, bin_break, 0, 1, BIN_LOGOUT, NULL, NULL),
 
@@ -107,7 +107,7 @@
     BUILTIN("trap", BINF_PSPECIAL, bin_trap, 0, -1, 0, NULL, NULL),
     BUILTIN("true", 0, bin_true, 0, -1, 0, NULL, NULL),
     BUILTIN("type", 0, bin_whence, 0, -1, 0, "ampfsw", "v"),
-    BUILTIN("typeset", BINF_TYPEOPTS | BINF_MAGICEQUALS | BINF_PSPECIAL, bin_typeset, 0, -1, 0, "LRUZfilrtuxm", NULL),
+    BUILTIN("typeset", BINF_TYPEOPTS | BINF_MAGICEQUALS | BINF_PSPECIAL, bin_typeset, 0, -1, 0, "HLRUZfilrtuxm", NULL),
     BUILTIN("umask", 0, bin_umask, 0, 1, 0, "S", NULL),
     BUILTIN("unalias", 0, bin_unhash, 1, -1, 0, "m", "a"),
     BUILTIN("unfunction", 0, bin_unhash, 1, -1, 0, "m", "f"),
@@ -947,7 +947,6 @@
 static void
 cd_new_pwd(int func, LinkNode dir, int chaselinks)
 {
-    Param pm;
     List l;
     char *new_pwd, *s;
     int dirstacksize;
@@ -1470,7 +1469,7 @@
     Param pm;
     Asgment asg;
     Comp com;
-    char *optstr = "iLRZlurtxU";
+    char *optstr = "iLRZlurtxU----H";
     int on = 0, off = 0, roff, bit = PM_INTEGER;
     int initon, initoff, of, i;
     int returnval = 0, printflags = 0;
@@ -1545,7 +1544,8 @@
 				!(pm->flags & PM_READONLY & ~off))
 				uniqarray((*pm->gets.afn) (pm));
 			    pm->flags = (pm->flags | on) & ~off;
-			    if (PM_TYPE(pm->flags) != PM_ARRAY) {
+			    if (PM_TYPE(pm->flags) != PM_ARRAY &&
+				PM_TYPE(pm->flags) != PM_HASHED) {
 				if ((on & (PM_LEFT | PM_RIGHT_B | PM_RIGHT_Z | PM_INTEGER)) && auxlen)
 				    pm->ct = auxlen;
 				/* did we just export this? */
@@ -1619,7 +1619,8 @@
 	    if ((on & (PM_LEFT | PM_RIGHT_B | PM_RIGHT_Z | PM_INTEGER)) &&
 		auxlen)
 		pm->ct = auxlen;
-	    if (PM_TYPE(pm->flags) != PM_ARRAY) {
+	    if (PM_TYPE(pm->flags) != PM_ARRAY &&
+		PM_TYPE(pm->flags) != PM_HASHED) {
 		if (pm->flags & PM_EXPORTED) {
 		    if (!(pm->flags & PM_UNSET) && !pm->env && !asg->value)
 			pm->env = addenv(asg->name, getsparam(asg->name));
Index: Src/exec.c
===================================================================
--- exec.c	1998/10/30 17:52:43	1.6
+++ exec.c	1998/11/10 20:07:09
@@ -1923,23 +1923,8 @@
 	    } else if (!(pm->flags & PM_READONLY) &&
 		       (unset(RESTRICTED) || !(pm->flags & PM_RESTRICTED))) {
 		Param tpm = (Param) alloc(sizeof *tpm);
-
 		tpm->nam = s;
-		tpm->flags = pm->flags;
-		switch (PM_TYPE(pm->flags)) {
-		case PM_SCALAR:
-		    tpm->u.str = ztrdup(pm->gets.cfn(pm));
-		    break;
-		case PM_INTEGER:
-		    tpm->u.val = pm->gets.ifn(pm);
-		    break;
-		case PM_ARRAY:
-		    PERMALLOC {
-			tpm->u.arr = arrdup(pm->gets.afn(pm));
-		    } LASTALLOC;
-		    break;
-		}
-		pm = tpm;
+		copyparam(tpm, pm);
 	    }
 	    addlinknode(*remove_p, s);
 	    addlinknode(*restore_p, pm);
@@ -1988,6 +1973,9 @@
 		    break;
 		case PM_ARRAY:
 		    tpm->sets.afn(tpm, pm->u.arr);
+		    break;
+		case PM_HASHED:
+		    tpm->sets.hfn(tpm, pm->u.hash);
 		    break;
 		}
 	    } else
Index: Src/hashtable.c
===================================================================
--- hashtable.c	1998/10/30 17:52:44	1.3
+++ hashtable.c	1998/11/10 08:49:26
@@ -1091,8 +1091,10 @@
     if (printflags & PRINT_TYPE) {
 	if (p->flags & PM_INTEGER)
 	    printf("integer ");
-	if (p->flags & PM_ARRAY)
+	else if (p->flags & PM_ARRAY)
 	    printf("array ");
+	else if (p->flags & PM_HASHED)
+	    printf("association ");
 	if (p->flags & PM_LEFT)
 	    printf("left justified %d ", p->ct);
 	if (p->flags & PM_RIGHT_B)
@@ -1142,6 +1144,16 @@
 		putchar(' ');
 		quotedzputs(*u++, stdout);
 	    }
+	}
+	printf(")\n");
+	break;
+    case PM_HASHED:
+	/* association */
+	putchar('(');
+	{
+            HashTable ht = p->gets.hfn(p);
+            if (ht)
+		scanhashtable(ht, 0, 0, 0, ht->printnode, 0);
 	}
 	printf(")\n");
 	break;
Index: Src/params.c
===================================================================
--- params.c	1998/10/30 17:52:46	1.3
+++ params.c	1998/11/11 03:41:38
@@ -247,7 +247,94 @@
  
 /**/
 HashTable paramtab;
- 
+
+/**/
+HashTable
+newparamtable(int size, char const *name)
+{
+    HashTable ht = newhashtable(size, name, NULL);
+
+    ht->hash        = hasher;
+    ht->emptytable  = NULL;
+    ht->filltable   = NULL;
+    ht->addnode     = addhashnode;
+    ht->getnode     = gethashnode2;
+    ht->getnode2    = gethashnode2;
+    ht->removenode  = removehashnode;
+    ht->disablenode = NULL;
+    ht->enablenode  = NULL;
+    ht->freenode    = freeparamnode;
+    ht->printnode   = printparamnode;
+
+    return ht;
+}
+
+/* Copy a parameter hash table */
+
+static HashTable outtable;
+
+/**/
+static void
+scancopyparams(HashNode hn, int flags)
+{
+    Param pm = (Param)hn;
+    Param tpm = (Param) alloc(sizeof *tpm);
+    tpm->nam = ztrdup(pm->nam);
+    copyparam(tpm, pm);
+    addhashnode(outtable, tpm->nam, tpm);
+}
+
+/**/
+HashTable
+copyparamtable(HashTable ht, char *name)
+{
+    HashTable nht = newparamtable(ht->hsize, name);
+    outtable = nht;
+    scanhashtable(ht, 0, 0, 0, scancopyparams, 0);
+    outtable = NULL;
+    return nht;
+}
+
+static unsigned numparamvals;
+
+/**/
+static void
+scancountparams(HashNode hn, int flags)
+{
+    ++numparamvals;
+}
+
+static char **paramvals;
+
+/**/
+static void
+scanparamvals(HashNode hn, int flags)
+{
+    struct value v;
+    v.isarr = (flags & 1);
+    v.pm = (Param)hn;
+    v.inv = (flags & 2);
+    v.a = 0;
+    v.b = -1;
+    paramvals[numparamvals++] = getstrvalue(&v);
+}
+
+/**/
+char **
+paramvalarr(HashTable ht)
+{
+    numparamvals = 0;
+    if (ht)
+	scanhashtable(ht, 0, 0, 0, scancountparams, 0);
+    paramvals = (char **) alloc(numparamvals * sizeof(char **) + 1);
+    if (ht) {
+	numparamvals = 0;
+	scanhashtable(ht, 0, 0, 0, scanparamvals, 0);
+    }
+    paramvals[numparamvals] = 0;
+    return paramvals;
+}
+
 /* Set up parameter hash table.  This will add predefined  *
  * parameter entries as well as setting up parameter table *
  * entries for environment variables we inherit.           */
@@ -261,19 +348,7 @@
     char buf[50], *str, *iname;
     int num_env;
 
-    paramtab = newhashtable(151, "paramtab", NULL);
-
-    paramtab->hash        = hasher;
-    paramtab->emptytable  = NULL;
-    paramtab->filltable   = NULL;
-    paramtab->addnode     = addhashnode;
-    paramtab->getnode     = gethashnode2;
-    paramtab->getnode2    = gethashnode2;
-    paramtab->removenode  = removehashnode;
-    paramtab->disablenode = NULL;
-    paramtab->enablenode  = NULL;
-    paramtab->freenode    = freeparamnode;
-    paramtab->printnode   = printparamnode;
+    paramtab = newparamtable(151, "paramtab");
 
     /* Add the special parameters to the hash table */
     for (ip = special_params; ip->nam; ip++)
@@ -427,6 +502,10 @@
 	    pm->sets.afn = arrsetfn;
 	    pm->gets.afn = arrgetfn;
 	    break;
+	case PM_HASHED:
+	    pm->sets.hfn = hashsetfn;
+	    pm->gets.hfn = hashgetfn;
+	    break;
 	default:
 	    DPUTS(1, "BUG: tried to create param node without valid flag");
 	    break;
@@ -436,6 +515,31 @@
     return pm;
 }
 
+/* Copy a parameter */
+
+/**/
+void
+copyparam(Param tpm, Param pm)
+{
+    PERMALLOC {
+	tpm->flags = pm->flags;
+	switch (PM_TYPE(pm->flags)) {
+	case PM_SCALAR:
+	    tpm->u.str = ztrdup(pm->gets.cfn(pm));
+	    break;
+	case PM_INTEGER:
+	    tpm->u.val = pm->gets.ifn(pm);
+	    break;
+	case PM_ARRAY:
+	    tpm->u.arr = arrdup(pm->gets.afn(pm));
+	    break;
+	case PM_HASHED:
+	    tpm->u.hash = copyparamtable(pm->gets.hfn(pm), pm->nam);
+	    break;
+	}
+    } LASTALLOC;
+}
+
 /* Return 1 if the string s is a valid identifier, else return 0. */
 
 /**/
@@ -577,6 +681,23 @@
     singsub(&s);
 
     if (!rev) {
+	if (PM_TYPE(v->pm->flags) == PM_HASHED) {
+	    HashTable ht = v->pm->gets.hfn(v->pm);
+	    if (!ht) {
+		ht = newparamtable(17, v->pm->nam);
+		v->pm->sets.hfn(v->pm, ht);
+	    }
+	    if (!(v->pm = (Param) ht->getnode(ht, s))) {
+		HashTable tht = paramtab;
+		paramtab = ht;
+		v->pm = createparam(s, PM_SCALAR|PM_UNSET);
+		paramtab = tht;
+	    }
+	    v->isarr = 0;
+	    v->a = 0;
+	    *w = v->b = -1;
+	    r = isset(KSHARRAYS) ? 1 : 0;
+	} else
 	if (!(r = mathevalarg(s, &s)) || (isset(KSHARRAYS) && r >= 0))
 	    r++;
 	if (word && !v->isarr) {
@@ -891,11 +1012,14 @@
 	}
 
 	switch(PM_TYPE(v->pm->flags)) {
+	case PM_HASHED:
+	    ss = paramvalarr(v->pm->gets.hfn(v->pm));	/* XXX Leaky? */
+	    LASTALLOC_RETURN sepjoin(ss, NULL);
 	case PM_ARRAY:
+	    ss = v->pm->gets.afn(v->pm);
 	    if (v->isarr)
-		s = sepjoin(v->pm->gets.afn(v->pm), NULL);
+		s = sepjoin(ss, NULL);
 	    else {
-		ss = v->pm->gets.afn(v->pm);
 		if (v->a < 0)
 		    v->a += arrlen(ss);
 		s = (v->a >= arrlen(ss) || v->a < 0) ? (char *) hcalloc(1) : ss[v->a];
@@ -1181,9 +1305,12 @@
 {
     Value v;
 
-    if (!idigit(*s) && (v = getvalue(&s, 0)) &&
-	PM_TYPE(v->pm->flags) == PM_ARRAY)
-	return v->pm->gets.afn(v->pm);
+    if (!idigit(*s) && (v = getvalue(&s, 0))) {
+	if (PM_TYPE(v->pm->flags) == PM_ARRAY)
+	    return v->pm->gets.afn(v->pm);
+	else if (PM_TYPE(v->pm->flags) == PM_HASHED)
+	    return paramvalarr(v->pm->gets.hfn(v->pm));	/* XXX Leaky? */
+    }
     return NULL;
 }
 
@@ -1363,6 +1490,7 @@
     switch (PM_TYPE(pm->flags)) {
 	case PM_SCALAR: pm->sets.cfn(pm, NULL); break;
 	case PM_ARRAY:  pm->sets.afn(pm, NULL); break;
+        case PM_HASHED: pm->sets.hfn(pm, NULL); break;
     }
     pm->flags |= PM_UNSET;
 }
@@ -1427,6 +1555,26 @@
     if (pm->flags & PM_UNIQUE)
 	uniqarray(x);
     pm->u.arr = x;
+}
+
+/* Function to get value of an association parameter */
+
+/**/
+static HashTable
+hashgetfn(Param pm)
+{
+    return pm->u.hash;
+}
+
+/* Function to set value of an association parameter */
+
+/**/
+static void
+hashsetfn(Param pm, HashTable x)
+{
+    if (pm->u.hash && pm->u.hash != x)
+	deletehashtable(pm->u.hash);
+    pm->u.hash = x;
 }
 
 /* This function is used as the set function for      *
Index: Src/text.c
===================================================================
--- text.c	1998/06/01 17:08:45	1.1.1.1
+++ text.c	1998/11/10 05:33:22
@@ -450,6 +450,8 @@
 	    taddchr('(');
 	    taddlist(v->arr);
 	    taddstr(") ");
+	} else if (PM_TYPE(v->type) == PM_HASHED) {
+	    /* XXX */
 	} else {
 	    taddstr(v->str);
 	    taddchr(' ');
Index: Src/zsh.h
===================================================================
--- zsh.h	1998/11/04 17:13:55	1.3
+++ zsh.h	1998/11/10 08:49:41
@@ -813,6 +813,7 @@
 	char **arr;		/* value if declared array   (PM_ARRAY)   */
 	char *str;		/* value if declared string  (PM_SCALAR)  */
 	long val;		/* value if declared integer (PM_INTEGER) */
+        HashTable hash;		/* value if declared assoc   (PM_HASHED)  */
     } u;
 
     /* pointer to function to set value of this parameter */
@@ -820,6 +821,7 @@
 	void (*cfn) _((Param, char *));
 	void (*ifn) _((Param, long));
 	void (*afn) _((Param, char **));
+        void (*hfn) _((Param, HashTable));
     } sets;
 
     /* pointer to function to get value of this parameter */
@@ -827,6 +829,7 @@
 	char *(*cfn) _((Param));
 	long (*ifn) _((Param));
 	char **(*afn) _((Param));
+        HashTable (*hfn) _((Param));
     } gets;
 
     /* pointer to function to unset this parameter */
@@ -845,8 +848,9 @@
 #define PM_SCALAR	0	/* scalar                                     */
 #define PM_ARRAY	(1<<0)	/* array                                      */
 #define PM_INTEGER	(1<<1)	/* integer                                    */
+#define PM_HASHED	(1<<15)	/* association                                */
 
-#define PM_TYPE(X) (X & (PM_SCALAR|PM_INTEGER|PM_ARRAY))
+#define PM_TYPE(X) (X & (PM_SCALAR|PM_INTEGER|PM_ARRAY|PM_HASHED))
 
 #define PM_LEFT		(1<<2)	/* left justify and remove leading blanks     */
 #define PM_RIGHT_B	(1<<3)	/* right justify and fill with leading blanks */

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-11  6:44 PATCH: 3.1.5 - sample associative array implementation Bart Schaefer
@ 1998-11-11 13:58 ` Peter Stephenson
  1998-11-11 14:43   ` Bruce Stephens
  1998-11-11 18:16   ` PATCH: 3.1.5 - sample associative array implementation Bart Schaefer
  0 siblings, 2 replies; 21+ messages in thread
From: Peter Stephenson @ 1998-11-11 13:58 UTC (permalink / raw)
  To: Zsh hackers list

"Bart Schaefer" wrote:
> zsh% typeset -H assoc	# create associative (`H'ashed) array `assoc'
> 
> zsh% assoc[bread]=butter
> zsh% assoc[toast]=jam
> zsh% echo $assoc
> butter jam
> zsh% echo $assoc[bread]
> butter
> 
> However, there are some caveats:  (1) You can't assign to multiple elements
> of an associative array in a single expression; you must always use the
> subscript syntax.

This can probably be fixed in a perl-like fasion by adapting
setarrvalue(), which should be reasonably painless, though I haven't
looked at the details yet.  One question is whether
  hash=(key1 val1 key2 val2)
replaces the array entirely, or just adds/replaces those elements.  In
the former case it's difficult to think of a way of replacing multiple
elements at once; maybe another new typeset flag.

> (2) The text inside the [] is not subject to arithmetic
> evaluation as it is with regular arrays.

This is obviously correct, it's really the same issue as the syntax.

> (3) $assoc, $assoc[@], $assoc[*]
> all produce strings, not arrays.  (4) because of (3), subscript modifiers
> such as $assoc[(r)but*] (which should produce "butter") produce the whole
> string.

The patch below fixes this by judicious use of v->isarr and making more
places where whole hashes can be returned as if they were arrays.  The
Value now caches the whole array to prevent this having to be done
more than once.  (Value's are short-lived, only for the length of one
substitution.)

> (5) $assoc[bread,3] produces "but" (the first 3 characters of the
> value) which I think is because getarg() doesn't return soon enough; it
> really ought to either ignore or gripe about what comes after the comma.

I haven't touched this.

> This patch is not very "ready for prime time" and should be worked over by
> someone more familiar than I am with the parameter manipulation code.  In
> particular, I'm afraid there may be some memory leaks in the code to form
> arrays from the values.

I put in a MUSTUSEHEAP to check for this.  It hasn't printed a message
so far.  One problem may be in the use of getaparam() from various
builtins; at present that's not very useful for hashes.  A single
HEAPALLOC could (presumably) cure the problem.

In fact, it's not absolutely clear to me this is an appropriate use
for getaparam(), whose return value is used to test whether things can
be shifted, etc.  It might be better to make a separate gethparam();
at the moment, that would only be needed in zle_tricky.c if you want
to use the values of hashes for completion, and later it may be needed
in typeset etc.

> Further, the syntax for referring to associative
> array elements should probably not be the same as that for regular arrays
> (perhaps $assoc[[bread]], for example, which now is a math error) but I
> didn't want to delve into the parsing.

It's a question of whether it's more convenient having a minimal
re-write, as now, or whether the fiddling to get the new syntax to
work is simple enough.  The double brackets are probably the easiest
to make work, but even so there could be a profusion of new tests.

> You'll note there's a hunk of
> text.c where I had no idea what to do.

This goes along with what to do about assigning whole hashes; it'll
probably end up looking the same as for ordinary arrays at this point.

> Lastly, "typeset -H" is messily
> implemented because I didn't want to renumber gobs of flags in zsh.h.

typeset is messily implemented anyway, because it's extremely hard to
cover all the cases of what to do when a parameter of a different type
or local level already exists (do we use it or hide it, do we convert
the existing value, etc.)

> An
> associative array is then nothing more than a struct param that refers to
> a hash table of other struct param.
> 
> When an associative array element is referenced, it's hash table slot is
> created and initially marked PM_UNSET.

This means (post patch):

% typeset -H hash
% hash[one]=eins
% print $hash[two]

% print -l "$hash[@]"
one
                       <- $hash[two] was created unset
% 

Is this really correct?  It's not normal for a non-existent shell
parameter to spring into existence when it's used, unlike Perl.

Another thing:  there's no way of getting the keys of the hash.
Something like $hash[(k)*] would be OK, except that * and @ don't seem
to work with flags at the moment.


*** Src/params.c.bart	Wed Nov 11 09:38:31 1998
--- Src/params.c	Wed Nov 11 14:00:15 1998
***************
*** 323,328 ****
--- 323,329 ----
  char **
  paramvalarr(HashTable ht)
  {
+     MUSTUSEHEAP("paramvalarr");
      numparamvals = 0;
      if (ht)
  	scanhashtable(ht, 0, 0, 0, scancountparams, 0);
***************
*** 335,340 ****
--- 336,358 ----
      return paramvals;
  }
  
+ /* Return the full array (no indexing) referred to by a Value. *
+  * The array value is cached for the lifetime of the Value.    */
+ 
+ /**/
+ static char **
+ getvaluearr(Value v)
+ {
+     if (v->arr)
+ 	return v->arr;
+     else if (PM_TYPE(v->pm->flags) == PM_ARRAY)
+ 	return v->arr = v->pm->gets.afn(v->pm);
+     else if (PM_TYPE(v->pm->flags) == PM_HASHED)
+ 	return v->arr = paramvalarr(v->pm->gets.hfn(v->pm));
+     else
+ 	return NULL;
+ }
+ 
  /* Set up parameter hash table.  This will add predefined  *
   * parameter entries as well as setting up parameter table *
   * entries for environment variables we inherit.           */
***************
*** 965,971 ****
  	if (!pm || (pm->flags & PM_UNSET))
  	    return NULL;
  	v = (Value) hcalloc(sizeof *v);
! 	if (PM_TYPE(pm->flags) == PM_ARRAY)
  	    v->isarr = isvarat ? -1 : 1;
  	v->pm = pm;
  	v->inv = 0;
--- 983,989 ----
  	if (!pm || (pm->flags & PM_UNSET))
  	    return NULL;
  	v = (Value) hcalloc(sizeof *v);
! 	if (PM_TYPE(pm->flags) & (PM_ARRAY|PM_HASHED))
  	    v->isarr = isvarat ? -1 : 1;
  	v->pm = pm;
  	v->inv = 0;
***************
*** 1013,1022 ****
  
  	switch(PM_TYPE(v->pm->flags)) {
  	case PM_HASHED:
- 	    ss = paramvalarr(v->pm->gets.hfn(v->pm));	/* XXX Leaky? */
- 	    LASTALLOC_RETURN sepjoin(ss, NULL);
  	case PM_ARRAY:
! 	    ss = v->pm->gets.afn(v->pm);
  	    if (v->isarr)
  		s = sepjoin(ss, NULL);
  	    else {
--- 1031,1038 ----
  
  	switch(PM_TYPE(v->pm->flags)) {
  	case PM_HASHED:
  	case PM_ARRAY:
! 	    ss = getvaluearr(v);
  	    if (v->isarr)
  		s = sepjoin(ss, NULL);
  	    else {
***************
*** 1070,1076 ****
  	s[0] = dupstring(buf);
  	return s;
      }
!     s = v->pm->gets.afn(v->pm);
      if (v->a == 0 && v->b == -1)
  	return s;
      if (v->a < 0)
--- 1086,1092 ----
  	s[0] = dupstring(buf);
  	return s;
      }
!     s = getvaluearr(v);
      if (v->a == 0 && v->b == -1)
  	return s;
      if (v->a < 0)
***************
*** 1305,1316 ****
  {
      Value v;
  
!     if (!idigit(*s) && (v = getvalue(&s, 0))) {
! 	if (PM_TYPE(v->pm->flags) == PM_ARRAY)
! 	    return v->pm->gets.afn(v->pm);
! 	else if (PM_TYPE(v->pm->flags) == PM_HASHED)
! 	    return paramvalarr(v->pm->gets.hfn(v->pm));	/* XXX Leaky? */
!     }
      return NULL;
  }
  
--- 1321,1328 ----
  {
      Value v;
  
!     if (!idigit(*s) && (v = getvalue(&s, 0)))
! 	return getvaluearr(v);
      return NULL;
  }
  
*** Src/zsh.h.bart	Wed Nov 11 11:36:32 1998
--- Src/zsh.h	Wed Nov 11 11:36:59 1998
***************
*** 538,543 ****
--- 538,544 ----
      int inv;		/* should we return the index ?        */
      int a;		/* first element of array slice, or -1 */
      int b;		/* last element of array slice, or -1  */
+     char **arr;		/* cache for hash turned into array */
  };
  
  /* structure for foo=bar assignments */

-- 
Peter Stephenson <pws@ibmth.df.unipi.it>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarotti 2, 56100 Pisa, Italy


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-11 13:58 ` Peter Stephenson
@ 1998-11-11 14:43   ` Bruce Stephens
  1998-11-11 20:00     ` Timothy Writer
  1998-11-11 18:16   ` PATCH: 3.1.5 - sample associative array implementation Bart Schaefer
  1 sibling, 1 reply; 21+ messages in thread
From: Bruce Stephens @ 1998-11-11 14:43 UTC (permalink / raw)
  To: Zsh hackers list

Peter Stephenson <pws@ibmth.df.unipi.it> writes:

> This can probably be fixed in a perl-like fasion by adapting
> setarrvalue(), which should be reasonably painless, though I haven't
> looked at the details yet.  One question is whether
>   hash=(key1 val1 key2 val2)
> replaces the array entirely, or just adds/replaces those elements.  In
> the former case it's difficult to think of a way of replacing multiple
> elements at once; maybe another new typeset flag.

What does ksh93 provide in the way of associative array functionality?
(I don't have it installed at work, so I can't look it up right not.)

I'm not suggesting that ksh93 is always right about everything, but it
would surely be a good starting point, and a zsh which contained ksh93
as a subset would be much more convenient than having gratuitous
syntactic differences.  Except in those places where ksh93 is just
wrong, of course.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-11 13:58 ` Peter Stephenson
  1998-11-11 14:43   ` Bruce Stephens
@ 1998-11-11 18:16   ` Bart Schaefer
  1998-11-13 16:16     ` Associative arrays and memory Peter Stephenson
  1 sibling, 1 reply; 21+ messages in thread
From: Bart Schaefer @ 1998-11-11 18:16 UTC (permalink / raw)
  To: Peter Stephenson, Zsh hackers list

Thanks for looking at this, Peter.

On Nov 11,  2:58pm, Peter Stephenson wrote:
} Subject: Re: PATCH: 3.1.5 - sample associative array implementation
}
} "Bart Schaefer" wrote:
} > However, there are some caveats:  (1) You can't assign to multiple elements
} > of an associative array in a single expression; you must always use the
} > subscript syntax.
} 
} One question is whether
}   hash=(key1 val1 key2 val2)
} replaces the array entirely, or just adds/replaces those elements.

The problem with that syntax is that you can't tell whether `hash' is
intended to be an associative array, so you still have to declare it
first.  (Perl handles this by always prefixing variable names with $
@ or % even when assigning.)

The more lispy
    hash=((key1 val1) (key2 val2))
appears to be syntactically invalid at present ("number expected" ??).
However, it appears that it WAS valid in 3.0.5, so perhaps that's a bug
in 3.1.5.

} In
} the former case it's difficult to think of a way of replacing multiple
} elements at once [...]
} Another thing:  there's no way of getting the keys of the hash.

What's needed is a syntax to have the hash produce its key/value pairs
in the appropriate form so that you can do e.g.

    hash=(${(kv)hash} key1 newval1 key2 newval2)

(where ${(k)hash} means substitute the keys, and ${(v)hash} the values,
which is also the default when (k) is not present).

Then simply decree that, after the whole hash is replaced, elements are
assigned in left-to-right order, replacing duplicates.  Then you can do

    hash=(key 1 newval1 key2 newval2 ${(kv)hash})

to insert key1 and key2 only if they were not already present.

} Something like $hash[(k)*] would be OK, except that * and @ don't seem
} to work with flags at the moment.

I think an expansion flag rather than a subscription flag is the way,
see sample above.

} > I'm afraid there may be some memory leaks in the code to form
} > arrays from the values.
} 
} I put in a MUSTUSEHEAP to check for this.  It hasn't printed a message
} so far.

What I'm worried about is that copyparam() uses PERMALLOC, but I think
the pointers that it returns are being placed in a heap-allocated array
in some cases.  There's also a ztrdup() in scancopyparams().  MUSTUSEHEAP
wouldn't report either problem.  This would come up e.g. when you have a
local with the same name as an existing hash, so you have to save and
restore the hash.

} > When an associative array element is referenced, it's hash table slot is
} > created and initially marked PM_UNSET.
} 
} This means (post patch):
} 
} % typeset -H hash
} % hash[one]=eins
} % print $hash[two]
} 
} % print -l "$hash[@]"
} one
}                        <- $hash[two] was created unset
} % 
} 
} Is this really correct?

We can fix that by changing the way the hashtable scan happens, so it
ignores unset parameters.  The patch below fixes this, and adds some
support for the ${(k)hash} substitution flag, but I didn't actually put
in those flags themselves.

Will v->pm->flags &= ~PM_UNSET hurt anything in setstrvalue()?  It didn't
seem as if it should, but there are many details I don't understand.

} > (5) $assoc[bread,3] produces "but" (the first 3 characters of the
} > value) which I think is because getarg() doesn't return soon enough; it
} > really ought to either ignore or gripe about what comes after the comma.
} 
} I haven't touched this.

I no longer think it's as simple as I previously thought, so I didn't do
any more with it either.

This patch of course requires both my previous one and Peter's.

Index: Src/params.c
===================================================================
--- params.c	1998/11/11 17:40:25	1.5
+++ params.c	1998/11/11 18:14:34
@@ -295,13 +295,21 @@
     return nht;
 }
 
+#define SCANPM_WANTVALS   (1<<0)
+#define SCANPM_WANTKEYS   (1<<1)
+#define SCANPM_WANTINDEX  (1<<2)
+
 static unsigned numparamvals;
 
 /**/
 static void
 scancountparams(HashNode hn, int flags)
 {
-    ++numparamvals;
+    if (!(((Param)hn)->flags & PM_UNSET)) {
+	++numparamvals;
+	if ((flags & SCANPM_WANTKEYS) && (flags & SCANPM_WANTVALS))
+	    ++numparamvals;
+    }
 }
 
 static char **paramvals;
@@ -311,26 +319,33 @@
 scanparamvals(HashNode hn, int flags)
 {
     struct value v;
-    v.isarr = (flags & 1);
     v.pm = (Param)hn;
-    v.inv = (flags & 2);
-    v.a = 0;
-    v.b = -1;
-    paramvals[numparamvals++] = getstrvalue(&v);
+    if (!(v.pm->flags & PM_UNSET)) {
+	if (flags & SCANPM_WANTKEYS) {
+	    paramvals[numparamvals++] = v.pm->nam;
+	    if (!(flags & SCANPM_WANTVALS))
+		return;
+	}
+	v.isarr = (PM_TYPE(v.pm->flags) & (PM_ARRAY|PM_HASHED));
+	v.inv = (flags & SCANPM_WANTINDEX);
+	v.a = 0;
+	v.b = -1;
+	paramvals[numparamvals++] = getstrvalue(&v);
+    }
 }
 
 /**/
 char **
-paramvalarr(HashTable ht)
+paramvalarr(HashTable ht, unsigned flags)
 {
     MUSTUSEHEAP("paramvalarr");
     numparamvals = 0;
     if (ht)
-	scanhashtable(ht, 0, 0, 0, scancountparams, 0);
+	scanhashtable(ht, 0, 0, 0, scancountparams, flags);
     paramvals = (char **) alloc(numparamvals * sizeof(char **) + 1);
     if (ht) {
 	numparamvals = 0;
-	scanhashtable(ht, 0, 0, 0, scanparamvals, 0);
+	scanhashtable(ht, 0, 0, 0, scanparamvals, flags);
     }
     paramvals[numparamvals] = 0;
     return paramvals;
@@ -348,7 +363,7 @@
     else if (PM_TYPE(v->pm->flags) == PM_ARRAY)
 	return v->arr = v->pm->gets.afn(v->pm);
     else if (PM_TYPE(v->pm->flags) == PM_HASHED)
-	return v->arr = paramvalarr(v->pm->gets.hfn(v->pm));
+	return v->arr = paramvalarr(v->pm->gets.hfn(v->pm), 0);
     else
 	return NULL;
 }
@@ -1133,6 +1148,7 @@
 	zsfree(val);
 	return;
     }
+    v->pm->flags &= ~PM_UNSET;
     switch (PM_TYPE(v->pm->flags)) {
     case PM_SCALAR:
 	MUSTUSEHEAP("setstrvalue");

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-11 14:43   ` Bruce Stephens
@ 1998-11-11 20:00     ` Timothy Writer
  1998-11-11 20:52       ` Bart Schaefer
  0 siblings, 1 reply; 21+ messages in thread
From: Timothy Writer @ 1998-11-11 20:00 UTC (permalink / raw)
  To: Bruce Stephens; +Cc: Zsh hackers list

Bruce Stephens <b.stephens@isode.com> writes:

> Peter Stephenson <pws@ibmth.df.unipi.it> writes:
> 
> > This can probably be fixed in a perl-like fasion by adapting
> > setarrvalue(), which should be reasonably painless, though I haven't
> > looked at the details yet.  One question is whether
> >   hash=(key1 val1 key2 val2)
> > replaces the array entirely, or just adds/replaces those elements.  In
> > the former case it's difficult to think of a way of replacing multiple
> > elements at once; maybe another new typeset flag.
> 
> What does ksh93 provide in the way of associative array functionality?
> (I don't have it installed at work, so I can't look it up right not.)

In ksh93 associative arrays are declared using "typeset -A".  They use the
same syntax as indexed arrays, e.g. "foo[bar]=baz"; the text within [] is
subject to variable expansion and whitespace counts.  The following special
notation is used to get all keys:

    "${!arrayname[@]}"

-- 
Tim Writer                                              Tim.Writer@ftlsol.com
FTL Solutions Inc.
Toronto, Ontario, CANADA


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-11 20:00     ` Timothy Writer
@ 1998-11-11 20:52       ` Bart Schaefer
  1998-11-12  8:22         ` Timothy Writer
  0 siblings, 1 reply; 21+ messages in thread
From: Bart Schaefer @ 1998-11-11 20:52 UTC (permalink / raw)
  To: Timothy Writer, zsh-workers

On Nov 11,  3:00pm, Timothy Writer wrote:
} Subject: Re: PATCH: 3.1.5 - sample associative array implementation
}
} In ksh93 associative arrays are declared using "typeset -A".

This could easily be mimicked.

} They use the
} same syntax as indexed arrays, e.g. "foo[bar]=baz"; the text within [] is
} subject to variable expansion and whitespace counts.

Hrm.  Zsh currently can't handle whitespace inside the [ ]; and I'm not
sure the paramtab implementation I did will work with keys that aren't
parsable as "identifier".

Otherwise this is good.

How can you (or can you not) assign to multiple elements simultaneously?

} The following special notation is used to get all keys:
} 
}     "${!arrayname[@]}"

Exactly what about that is the special notation?  The `!'?  Does ksh93
recognize other special characters in that position?  Is there any way
to get both the keys and the values in a single expansion?

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-11 20:52       ` Bart Schaefer
@ 1998-11-12  8:22         ` Timothy Writer
  1998-11-12  9:23           ` Bart Schaefer
  0 siblings, 1 reply; 21+ messages in thread
From: Timothy Writer @ 1998-11-12  8:22 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

"Bart Schaefer" <schaefer@brasslantern.com> writes:

> On Nov 11,  3:00pm, Timothy Writer wrote:
> } Subject: Re: PATCH: 3.1.5 - sample associative array implementation
> }
> } In ksh93 associative arrays are declared using "typeset -A".
> 
> This could easily be mimicked.

Good.

> } They use the
> } same syntax as indexed arrays, e.g. "foo[bar]=baz"; the text within [] is
> } subject to variable expansion and whitespace counts.
> 
> Hrm.  Zsh currently can't handle whitespace inside the [ ]; and I'm not
> sure the paramtab implementation I did will work with keys that aren't
> parsable as "identifier".
> 
> Otherwise this is good.
> 
> How can you (or can you not) assign to multiple elements simultaneously?

I don't know, I don't think you can.  Unfortunately, I'm not a ksh93 expert,
I just paraphrased the above info from the ksh93 reference in "Desktop Korn
Shell Graphical Programming".

> } The following special notation is used to get all keys:
> } 
> }     "${!arrayname[@]}"
> 
> Exactly what about that is the special notation?  The `!'?  Does ksh93

I guess so, "special" is the book's term.

> recognize other special characters in that position?  Is there any way
> to get both the keys and the values in a single expansion?

As far as I can tell, only "!" and (of course) "#" are "special" in that
position.  However, "${!variable}" is also special syntax for name reference
variables, e.g.:

    B=2
    nameref A=B
    print $A		# prints: 2
    print ${!A}		# prints: B

Does zsh have namerefs?

I don't think there's a way to get both keys and values in a single
expression.

-- 
Tim Writer                                              Tim.Writer@ftlsol.com
FTL Solutions Inc.
Toronto, Ontario, CANADA


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-12  8:22         ` Timothy Writer
@ 1998-11-12  9:23           ` Bart Schaefer
  1998-11-13  0:04             ` Timothy Writer
  0 siblings, 1 reply; 21+ messages in thread
From: Bart Schaefer @ 1998-11-12  9:23 UTC (permalink / raw)
  To: zsh-workers

On Nov 12,  3:22am, Timothy Writer wrote:
} Subject: Re: PATCH: 3.1.5 - sample associative array implementation
}
} > On Nov 11,  3:00pm, Timothy Writer wrote:
} > } The following special notation is used to get all keys:
} > } 
} > }     "${!arrayname[@]}"
} 
} As far as I can tell, only "!" and (of course) "#" are "special" in that
} position.  However, "${!variable}" is also special syntax for name reference
} variables [...]
} 
} Does zsh have namerefs?

No, zsh does not.  It wouldn't be hard to implement, though.  Can you tell
what ${!arrayname} would do?  (My recent patch simply ignores the `!' in
that event.)

} "Bart Schaefer" <schaefer@brasslantern.com> writes:
} > How can you (or can you not) assign to multiple elements simultaneously?
} 
} I don't know, I don't think you can.
[...]
} I don't think there's a way to get both keys and values in a single
} expression.

Thanks for the info.

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-12  9:23           ` Bart Schaefer
@ 1998-11-13  0:04             ` Timothy Writer
  1998-11-13  1:32               ` Bart Schaefer
  0 siblings, 1 reply; 21+ messages in thread
From: Timothy Writer @ 1998-11-13  0:04 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

"Bart Schaefer" <schaefer@brasslantern.com> writes:

> On Nov 12,  3:22am, Timothy Writer wrote:
> } Does zsh have namerefs?
> 
> No, zsh does not.  It wouldn't be hard to implement, though.  Can you tell
> what ${!arrayname} would do?  (My recent patch simply ignores the `!' in
> that event.)

Sure

    typeset -A foo
    foo[bar]=baz
    echo ${!foo}

just prints "foo".  Makes sense because foo is not a nameref.

-- 
Tim Writer                                              Tim.Writer@ftlsol.com
FTL Solutions Inc.
Toronto, Ontario, CANADA


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-13  0:04             ` Timothy Writer
@ 1998-11-13  1:32               ` Bart Schaefer
  1998-11-14  1:55                 ` Timothy Writer
  0 siblings, 1 reply; 21+ messages in thread
From: Bart Schaefer @ 1998-11-13  1:32 UTC (permalink / raw)
  To: Timothy Writer, zsh-workers

On Nov 12,  7:04pm, Timothy Writer wrote:
> Subject: Re: PATCH: 3.1.5 - sample associative array implementation
[Referring to ksh93]
> 
>     typeset -A foo
>     foo[bar]=baz
>     echo ${!foo}
> 
> just prints "foo".  Makes sense because foo is not a nameref.

That's not what I would have expected; I would have expected it to either
(a) behave the same as ${foo} [which means what, when foo is an associative
array?] or (b) print nothing, because there's no variable to which the value
of foo can possibly refer.  Printing its own *name* like that isn't sensible
to me.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Associative arrays and memory
  1998-11-11 18:16   ` PATCH: 3.1.5 - sample associative array implementation Bart Schaefer
@ 1998-11-13 16:16     ` Peter Stephenson
  1998-11-13 17:57       ` Bart Schaefer
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Stephenson @ 1998-11-13 16:16 UTC (permalink / raw)
  To: Zsh hackers list

"Bart Schaefer" wrote:
> What I'm worried about is that copyparam() uses PERMALLOC, but I think
> the pointers that it returns are being placed in a heap-allocated array
> in some cases.  There's also a ztrdup() in scancopyparams().  MUSTUSEHEAP
> wouldn't report either problem.  This would come up e.g. when you have a
> local with the same name as an existing hash, so you have to save and
> restore the hash.

It does seem the existing ztrdup()'s in what's now copyparam() are
memory leaks.  This will need fixing in any case, independently from
assoc arrays.

However, it looks like the only purpose of copyparam() is simply to
save special parameters and only in the following case:
  foo=bar some-builtin
to restore $foo (it's separate from the usual local restore mechanism).
There are currently no special assoc arrays, of course, and it should
probably be possible to prevent there being any --- the fact that the
shell uses a parameter doesn't make it special, it's only necessary
when the parameter has an otherwise inappropriate relationship with a
shell variable, that's why a lot of parameters have recently been
despecialised.  So I'd recommend removing all references to assoc
arrays from copyparam(), i.e. putting that part of the code back how
it was, maybe with a
  DPUTS(PM_TYPE(pm->flags) == PM_HASHED, "attempt to copy assoc array")
to be on the safe side.

As an aside, the ordinary `local' mechanism never deletes or copies
anything, it just stacks the old parameters.  So there shouldn't be
any big issues there.

Combining that with my other suggestion of separating assoc array
returns out of getaparam() would be a big help in warding off possible
memory leaks.  (Unless `shift hash', which is where getaparam() comes
in, is really supposed to work? It does something at the moment, but
what does it mean --- randomly remove element from hash??  Same goes
for the call in `set -A': there's no point getting an array only to
find you can't set it because it's the wrong type.)

Taking this all into account, I don't believe proper use of assoc
arrays is going to cause memory leaks, even at present.  Ensuring
improper use doesn't either is the hard part.

-- 
Peter Stephenson <pws@ibmth.df.unipi.it>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarroti 2, 56100 Pisa, Italy


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Associative arrays and memory
  1998-11-13 16:16     ` Associative arrays and memory Peter Stephenson
@ 1998-11-13 17:57       ` Bart Schaefer
  1998-11-14 15:26         ` Peter Stephenson
  0 siblings, 1 reply; 21+ messages in thread
From: Bart Schaefer @ 1998-11-13 17:57 UTC (permalink / raw)
  To: Peter Stephenson, Zsh hackers list

On Nov 13,  5:16pm, Peter Stephenson wrote:
} Subject: Associative arrays and memory
} 
} It does seem the existing ztrdup()'s in what's now copyparam() are
} memory leaks.  This will need fixing in any case, independently from
} assoc arrays.

I'm not so sure any more.  When restore_params() is called, it uses the
pm->sets functions to assign directly to the struct param already in the
paramtab.  So that memory is now owned by the paramtab and should be
freed when the param next becomes unset ... right?

} However, it looks like the only purpose of copyparam() is simply to
} save special parameters and only in the following case:
}   foo=bar some-builtin
} to restore $foo (it's separate from the usual local restore mechanism).

I appear to have broken it somehow, then:

zagzig<17> HISTSIZE=1000 
zagzig<18> HISTSIZE=0 echo hello
hello
zagzig<19> echo $HISTSIZE
2

Nonspecial variables work:

zagzig<23> FOO=before
zagzig<24> FOO=after echo hello
hello
zagzig<25> echo $FOO
before

} There are currently no special assoc arrays, of course, and it should
} probably be possible to prevent there being any

What about the discussion that started all this -- using an associative
array to give user access to shell-internal completion data?

} So I'd recommend removing all references to assoc
} arrays from copyparam(), i.e. putting that part of the code back how
} it was, maybe with a
}   DPUTS(PM_TYPE(pm->flags) == PM_HASHED, "attempt to copy assoc array")
} to be on the safe side.
} 
} Combining that with my other suggestion of separating assoc array
} returns out of getaparam() would be a big help in warding off possible
} memory leaks.  (Unless `shift hash', which is where getaparam() comes
} in, is really supposed to work? It does something at the moment, but
} what does it mean --- randomly remove element from hash??  Same goes
} for the call in `set -A': there's no point getting an array only to
} find you can't set it because it's the wrong type.)

If you have time and inclination, please go ahead and do this.  I'm not
going to be able to work on it again until tomorrow at the earliest.

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-13  1:32               ` Bart Schaefer
@ 1998-11-14  1:55                 ` Timothy Writer
  1998-11-14  6:41                   ` Bart Schaefer
  1998-11-15 20:47                   ` PATCH: 3.1.5 + associative arrays under ksh emulation Bart Schaefer
  0 siblings, 2 replies; 21+ messages in thread
From: Timothy Writer @ 1998-11-14  1:55 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

"Bart Schaefer" <schaefer@brasslantern.com> writes:

> On Nov 12,  7:04pm, Timothy Writer wrote:
> > Subject: Re: PATCH: 3.1.5 - sample associative array implementation
> [Referring to ksh93]
> > 
> >     typeset -A foo
> >     foo[bar]=baz
> >     echo ${!foo}
> > 
> > just prints "foo".  Makes sense because foo is not a nameref.
> 
> That's not what I would have expected; I would have expected it to either
> (a) behave the same as ${foo} [which means what, when foo is an associative
> array?] or (b) print nothing, because there's no variable to which the value
> of foo can possibly refer.  Printing its own *name* like that isn't sensible
> to me.

In ksh93, $arrayname is a synonym for ${arrayname[0]}.  This appears to be
true for indexed arrays _and_ associative arrays.

If you think of nameref as being like a symbolic link and ${!foo} as being
like lstat(), it makes perfect sense (at least to my twisted mind) that
${!foo} is just "foo" for ordinary (non nameref) variables.

-- 
Tim Writer                                              Tim.Writer@ftlsol.com
FTL Solutions Inc.
Toronto, Ontario, CANADA


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-14  1:55                 ` Timothy Writer
@ 1998-11-14  6:41                   ` Bart Schaefer
  1998-11-15  8:42                     ` Timothy Writer
  1998-11-15 20:47                   ` PATCH: 3.1.5 + associative arrays under ksh emulation Bart Schaefer
  1 sibling, 1 reply; 21+ messages in thread
From: Bart Schaefer @ 1998-11-14  6:41 UTC (permalink / raw)
  To: Timothy Writer, zsh-workers

On Nov 13,  8:55pm, Timothy Writer wrote:
} Subject: Re: PATCH: 3.1.5 - sample associative array implementation
}
} In ksh93, $arrayname is a synonym for ${arrayname[0]}.  This appears to be
} true for indexed arrays _and_ associative arrays.

So it actually dereferences a slot with key "0" and substitutes the value
that is there?  And substitutes nothing if there is no key "0"?  Or ...?

(In the zsh implementation from my + PWS's patches, with `emulate ksh' in
effect, $arrayname prints the first hash table element in whatever order
the hash happens to get traversed, which may change as elements are added.

That leads to the question, how do you remove a particular key/value pair
from a ksh93 associative array?)

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Associative arrays and memory
  1998-11-13 17:57       ` Bart Schaefer
@ 1998-11-14 15:26         ` Peter Stephenson
  1998-11-14 18:47           ` Bart Schaefer
  0 siblings, 1 reply; 21+ messages in thread
From: Peter Stephenson @ 1998-11-14 15:26 UTC (permalink / raw)
  To: Zsh hackers list

"Bart Schaefer" wrote:
> } It does seem the existing ztrdup()'s in what's now copyparam() are
> } memory leaks.  This will need fixing in any case, independently from
> } assoc arrays.
> 
> I'm not so sure any more.  When restore_params() is called, it uses the
> pm->sets functions to assign directly to the struct param already in the
> paramtab.  So that memory is now owned by the paramtab and should be
> freed when the param next becomes unset ... right?

Yes, it looks like you're right.  Maybe it's still unnecessarily
confusing.  The struct param is really just a placeholder with a type
flag for the real information.

> zagzig<17> HISTSIZE=1000 
> zagzig<18> HISTSIZE=0 echo hello
> hello
> zagzig<19> echo $HISTSIZE
> 2

This seems to be OK after restoring the old behaviour, so I haven't
look further into what had gone wrong.

> } There are currently no special assoc arrays, of course, and it should
> } probably be possible to prevent there being any
> 
> What about the discussion that started all this -- using an associative
> array to give user access to shell-internal completion data?

Rats, I just looked at zle_params.c and you're right --- they're
marked special.  I had wrongly assumed it was just a case of setting
the value at the start and using it at the end, but there tied to
functions.

That doesn't mean assoc arrays need to be the same, though.  The case
that needs worrying about is the following:

  hash=(?...?) builtin_or_func

where the ?...? represents whatever we pick for whole array
assignments.  If there is none, there's no problem; that's the only
case where copying parameters is necessary.

There's also no problem if the hash is simply used for storing
information and is not tied to special variables or functions.  You
only need a special mechanism for restoration for something like
$path, where there's an internal variable that needs setting; simply
restoring the struct param won't do that.  If we can agree that use of
assoc arrays is simply going to be by direct access to the parameter
we can avoid ever copying it: the above shell pseudocode simply makes
the supplied $hash available for the duration of builtin_func.  It
would certainly keep things much simpler, and I can provide a patch
straight away.

So, the question is are there any uses for special hashes which would
require tying them directly to an internal variable or function, or
can they always be accessed by the standard parameter functions?  I
would think the whole point of using assoc arrays is to avoid any
unpleasantness of the former kind.  Probably Sven can answer that
better than anyone.

-- 
Peter Stephenson <pws@ibmth.df.unipi.it>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarroti 2, 56100 Pisa, Italy


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Associative arrays and memory
  1998-11-14 15:26         ` Peter Stephenson
@ 1998-11-14 18:47           ` Bart Schaefer
  1998-11-15 15:54             ` Peter Stephenson
  0 siblings, 1 reply; 21+ messages in thread
From: Bart Schaefer @ 1998-11-14 18:47 UTC (permalink / raw)
  To: Peter Stephenson, Zsh hackers list

On Nov 14,  4:26pm, Peter Stephenson wrote:
} Subject: Re: Associative arrays and memory
}
} "Bart Schaefer" wrote:
} > zagzig<17> HISTSIZE=1000 
} > zagzig<18> HISTSIZE=0 echo hello
} > hello
} > zagzig<19> echo $HISTSIZE
} > 2
} 
} This seems to be OK after restoring the old behaviour, so I haven't
} look further into what had gone wrong.

I found it ... the assignment "pm = tpm;" was mistakenly deleted from
below the code that was replaced by the call to copyparam().  That is,
in save_params() in exec.c, it should look like

	tpm->nam = s;
	copyparam(tpm, pm);
	pm = tpm;

So copyparam() itself is OK.

} > } There are currently no special assoc arrays, of course, and it should
} > } probably be possible to prevent there being any
} > 
} > What about the discussion that started all this -- using an associative
} > array to give user access to shell-internal completion data?
} 
} Rats, I just looked at zle_params.c and you're right --- they're
} marked special.

Of course, a non-special associative array can contain special parameters.
That was part of the idea of using a parameter hash as the implementation.
And that would be the way to do it, rather than make the associative array
itself special.

} That doesn't mean assoc arrays need to be the same, though.  The case
} that needs worrying about is the following:
} 
}   hash=(?...?) builtin_or_func
} 
} where the ?...? represents whatever we pick for whole array
} assignments.

It doesn't matter that a whole array is being assigned to the hash, does
it?  Even creating a scalar with the same parameter name would require
that it be saved/restored.

Or are you talking *about* special parameters within the associative array
that need to be saved, and that's why whole-array assignment would matter?

} There's also no problem if the hash is simply used for storing
} information and is not tied to special variables or functions.  You
} only need a special mechanism for restoration for something like
} $path, where there's an internal variable that needs setting; simply
} restoring the struct param won't do that.

That is, you need to call the pm->sets functions when restoring specials,
which is what restore_params() does.  But my implementation does not call
anything equivalent to restore_params() on the parameter table stored in
the association, so it isn't helping to have save_params() copy it in the
first place.

} If we can agree that use of
} assoc arrays is simply going to be by direct access to the parameter
} we can avoid ever copying it: the above shell pseudocode simply makes
} the supplied $hash available for the duration of builtin_func.

I'm not entirely sure what you mean about "direct access."  If we're
going to put stuff like the present $BUFFER into an association, it needs
to be the case that assignment to (say) zle[BUFFER] actually modifies the
line editor state.

So the question is, should it ever be the case that

    zle=(? BUFFER "this is the command line" ... ?) builtin_or_func

causes the old command line to be restored following builtin_or_func?
And if we save and restore such a hash by the simple stacking method, is
$zle[BUFFER] going to give the right thing after builtin_or_func?

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-14  6:41                   ` Bart Schaefer
@ 1998-11-15  8:42                     ` Timothy Writer
  1998-11-15 20:03                       ` Bart Schaefer
  0 siblings, 1 reply; 21+ messages in thread
From: Timothy Writer @ 1998-11-15  8:42 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

"Bart Schaefer" <schaefer@brasslantern.com> writes:

> On Nov 13,  8:55pm, Timothy Writer wrote:
> } Subject: Re: PATCH: 3.1.5 - sample associative array implementation
> }
> } In ksh93, $arrayname is a synonym for ${arrayname[0]}.  This appears to be
> } true for indexed arrays _and_ associative arrays.
> 
> So it actually dereferences a slot with key "0" and substitutes the value
> that is there?  And substitutes nothing if there is no key "0"?  Or ...?

It appears to.

> (In the zsh implementation from my + PWS's patches, with `emulate ksh' in
> effect, $arrayname prints the first hash table element in whatever order
> the hash happens to get traversed, which may change as elements are added.

Perhaps that's why they go to the trouble of dereferencing a slot with key
"0", so its predictable.

> That leads to the question, how do you remove a particular key/value pair
> from a ksh93 associative array?)

"unset foo[bar]" seems to work.

-- 
Tim Writer                                              Tim.Writer@ftlsol.com
FTL Solutions Inc.
Toronto, Ontario, CANADA


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Associative arrays and memory
  1998-11-14 18:47           ` Bart Schaefer
@ 1998-11-15 15:54             ` Peter Stephenson
  0 siblings, 0 replies; 21+ messages in thread
From: Peter Stephenson @ 1998-11-15 15:54 UTC (permalink / raw)
  To: Zsh hackers list

"Bart Schaefer" wrote:
> Of course, a non-special associative array can contain special parameters.
> That was part of the idea of using a parameter hash as the implementation.
> And that would be the way to do it, rather than make the associative array
> itself special.

OK, I missed this point, then you're stuck with something of the
complexity of what you had if you want to restore it.  But what's
wrong with making the associative array special?  It could do some
sort of dispatch based on the values of the hash it stored.  Then you
could get away with doing only a shallow save of the assoc array at
this point (but see below), i.e. pretty much like what happens with
other special parameters.

Possibly in practice the parent AA will have to be marked as special,
even if its own functionality is as normal.  The alternatives to that
are (1) check all non-special AA's for special contents at the point
in question, which probably isn't on (2) the elements of the AA stop
being special, which is a bit counterintuitive.

> It doesn't matter that a whole array is being assigned to the hash, does
> it?  Even creating a scalar with the same parameter name would require
> that it be saved/restored.

That's true if the assoc array is not special, but its elements are.
If the AA is itself special, it retains that behaviour on set/restore.
Consider
   PATH=foo:bar builtin_or_func
--- here PATH still works the same way as always, just with special
save/restore.  Then assigning a scalar to zle would give some kind of
error.

> So the question is, should it ever be the case that
> 
>     zle=(? BUFFER "this is the command line" ... ?) builtin_or_func
> 
> causes the old command line to be restored following builtin_or_func?
> And if we save and restore such a hash by the simple stacking method, is
> $zle[BUFFER] going to give the right thing after builtin_or_func?

If the bits of the AA are special, and you want to restore them, then
as I said above it seems you'll need the full complexity of the
copyarams() mechanism because you can't guarantee restoring the state
otherwise.  Should it work?  Probably yes.  At some stage we will want
whole assoc array assignments like 'hash2=$hash1', which presumably
requires something like copyparams() anyway.  Maybe I'll just have to
bit the bullet and check for memory leaks properly :-(.

-- 
Peter Stephenson <pws@ibmth.df.unipi.it>       Tel: +39 050 844536
WWW:  http://www.ifh.de/~pws/
Dipartimento di Fisica, Via Buonarroti 2, 56100 Pisa, Italy


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-15  8:42                     ` Timothy Writer
@ 1998-11-15 20:03                       ` Bart Schaefer
  1998-11-16 19:16                         ` Timothy Writer
  0 siblings, 1 reply; 21+ messages in thread
From: Bart Schaefer @ 1998-11-15 20:03 UTC (permalink / raw)
  To: Timothy Writer, zsh-workers

On Nov 15,  3:42am, Timothy Writer wrote:
} Subject: Re: PATCH: 3.1.5 - sample associative array implementation
}
} > That leads to the question, how do you remove a particular key/value pair
} > from a ksh93 associative array?
} 
} "unset foo[bar]" seems to work.

Ooh, ick.  That means in ksh93 `unset' is a keyword, not just a builtin, and
changes the parse of what follows it.

I rather don't like that ... can you make a shell function named "unset" in
ksh93?

(Shell function or not, zsh globs foo[bar], which could mean unsetting the
variables fooa, foob, and foor.)

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* PATCH: 3.1.5 + associative arrays under ksh emulation
  1998-11-14  1:55                 ` Timothy Writer
  1998-11-14  6:41                   ` Bart Schaefer
@ 1998-11-15 20:47                   ` Bart Schaefer
  1 sibling, 0 replies; 21+ messages in thread
From: Bart Schaefer @ 1998-11-15 20:47 UTC (permalink / raw)
  To: zsh-workers

On Nov 13,  8:55pm, Timothy Writer wrote:
} Subject: Re: PATCH: 3.1.5 - sample associative array implementation
}
} In ksh93, $arrayname is a synonym for ${arrayname[0]}.  This appears to be
} true for indexed arrays _and_ associative arrays.

Index: Src/params.c
===================================================================
--- params.c	1998/11/12 09:21:35	1.8
+++ params.c	1998/11/15 20:39:19
@@ -1055,6 +1055,13 @@
 
 	switch(PM_TYPE(v->pm->flags)) {
 	case PM_HASHED:
+	    /* (!v->isarr) should be impossible unless emulating ksh */
+	    if (!v->isarr && emulation == EMULATE_KSH) {
+		s = dupstring("[0]");
+		if (getindex(&s, v) == 0)
+		    s = getstrvalue(v);
+		LASTALLOC_RETURN s;
+	    } /* else fall through */
 	case PM_ARRAY:
 	    ss = getvaluearr(v);
 	    if (v->isarr)

-- 
Bart Schaefer                                 Brass Lantern Enterprises
http://www.well.com/user/barts              http://www.brasslantern.com


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: PATCH: 3.1.5 - sample associative array implementation
  1998-11-15 20:03                       ` Bart Schaefer
@ 1998-11-16 19:16                         ` Timothy Writer
  0 siblings, 0 replies; 21+ messages in thread
From: Timothy Writer @ 1998-11-16 19:16 UTC (permalink / raw)
  To: Bart Schaefer; +Cc: zsh-workers

"Bart Schaefer" <schaefer@brasslantern.com> writes:

> On Nov 15,  3:42am, Timothy Writer wrote:
> } Subject: Re: PATCH: 3.1.5 - sample associative array implementation
> }
> } > That leads to the question, how do you remove a particular key/value pair
> } > from a ksh93 associative array?
> } 
> } "unset foo[bar]" seems to work.
> 
> Ooh, ick.  That means in ksh93 `unset' is a keyword, not just a builtin, and
> changes the parse of what follows it.
> 
> I rather don't like that ... can you make a shell function named "unset" in
> ksh93?

No:

    $ unset() {
    > builtin unset "$@"
    > }
    /usr/dt/bin/dtksh: unset: illegal function name
    $

-- 
Tim Writer                                              Tim.Writer@ftlsol.com
FTL Solutions Inc.
Toronto, Ontario, CANADA


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~1998-11-16 19:19 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1998-11-11  6:44 PATCH: 3.1.5 - sample associative array implementation Bart Schaefer
1998-11-11 13:58 ` Peter Stephenson
1998-11-11 14:43   ` Bruce Stephens
1998-11-11 20:00     ` Timothy Writer
1998-11-11 20:52       ` Bart Schaefer
1998-11-12  8:22         ` Timothy Writer
1998-11-12  9:23           ` Bart Schaefer
1998-11-13  0:04             ` Timothy Writer
1998-11-13  1:32               ` Bart Schaefer
1998-11-14  1:55                 ` Timothy Writer
1998-11-14  6:41                   ` Bart Schaefer
1998-11-15  8:42                     ` Timothy Writer
1998-11-15 20:03                       ` Bart Schaefer
1998-11-16 19:16                         ` Timothy Writer
1998-11-15 20:47                   ` PATCH: 3.1.5 + associative arrays under ksh emulation Bart Schaefer
1998-11-11 18:16   ` PATCH: 3.1.5 - sample associative array implementation Bart Schaefer
1998-11-13 16:16     ` Associative arrays and memory Peter Stephenson
1998-11-13 17:57       ` Bart Schaefer
1998-11-14 15:26         ` Peter Stephenson
1998-11-14 18:47           ` Bart Schaefer
1998-11-15 15:54             ` Peter Stephenson

Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).