zsh-workers
* MAX_ARRLEN
@ 2012-04-23 15:27 Peter Stephenson
  2012-04-23 16:10 ` MAX_ARRLEN Mikael Magnusson
  2012-04-23 16:38 ` MAX_ARRLEN Bart Schaefer
  0 siblings, 2 replies; 11+ messages in thread
From: Peter Stephenson @ 2012-04-23 15:27 UTC (permalink / raw)
  To: Zsh Hackers' List

I've just hit MAX_ARRLEN.  The array in question wasn't much larger than
the limit and when I commented out the checks everything just worked fine.
So it looks like an arbitrary limit isn't much use --- no great
surprise, I don't think anyone here is a big fan of them.

What's the right thing to do?  There are various grades ranging from
making it compilable out, through making it compile-time configurable
with an option to compile out, through making it an option to have the
check turned on, to having a variable that we check using getiparam()
each time, to having a special variable so that we don't need to get it
each time.  I think the last option with a clearly named variable such
as ZSH_MAX_ARRAY_LENGTH that can be set to 0 to turn it off is probably
the best.
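
For illustration only, the last option could look roughly like the
sketch below.  The names max_array_length and subscript_within_limit
are hypothetical, not anything in the zsh source: the idea is just
that the special variable's value lives in a C variable, so the check
needs no getiparam() lookup, and that 0 turns the check off.

```c
#include <stdbool.h>

/*
 * Hypothetical cached value of a special parameter such as
 * ZSH_MAX_ARRAY_LENGTH.  0 means "no limit".
 */
static long max_array_length = 0;

/* Return true if the subscript is acceptable under the current limit. */
static bool subscript_within_limit(long subscript)
{
    if (max_array_length <= 0)      /* limit disabled */
        return true;
    return subscript <= max_array_length;
}
```

With max_array_length left at 0 every subscript is accepted; setting
it to 262144 would reproduce the current behaviour.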

-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK


Member of the CSR plc group of companies. CSR plc registered in England and Wales, registered number 4187346, registered office Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, United Kingdom
More information can be found at www.csr.com. Follow CSR on Twitter at http://twitter.com/CSR_PLC and read our blog at www.csr.com/blog



* Re: MAX_ARRLEN
  2012-04-23 15:27 MAX_ARRLEN Peter Stephenson
@ 2012-04-23 16:10 ` Mikael Magnusson
  2012-04-23 16:21   ` MAX_ARRLEN Bart Schaefer
  2012-04-23 16:38 ` MAX_ARRLEN Bart Schaefer
  1 sibling, 1 reply; 11+ messages in thread
From: Mikael Magnusson @ 2012-04-23 16:10 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh Hackers' List

On 2012-04-23, Peter Stephenson <Peter.Stephenson@csr.com> wrote:
> I've just hit MAX_ARRLEN.  The array in question wasn't much larger than
> the limit and when I commented out the checks everything just worked fine.
> So it looks like an arbitrary limit isn't much use --- no great
> surprise, I don't think anyone here is a big fan of them.
>
> What's the right thing to do?  There are various grades ranging from
> making it compilable out, through making it compile-time configurable
> with an option to compile out, through making it an option to have the
> check turned on, to having a variable that we check using getiparam()
> each time, to having a special variable so that we don't need to get it
> each time.  I think the last option with a clearly named variable such
> as ZSH_MAX_ARRAY_LENGTH that can be set to 0 to turn it off is probably
> the best.

http://www.zsh.org/mla/workers/2010/msg00013.html

-- 
Mikael Magnusson



* Re: MAX_ARRLEN
  2012-04-23 16:10 ` MAX_ARRLEN Mikael Magnusson
@ 2012-04-23 16:21   ` Bart Schaefer
  2012-04-23 16:27     ` MAX_ARRLEN Peter Stephenson
  0 siblings, 1 reply; 11+ messages in thread
From: Bart Schaefer @ 2012-04-23 16:21 UTC (permalink / raw)
  To: Zsh Hackers' List

On Apr 23,  6:10pm, Mikael Magnusson wrote:
} 
} http://www.zsh.org/mla/workers/2010/msg00013.html

And

http://www.zsh.org/mla/workers/2010/msg00015.html

-- 
Barton E. Schaefer



* Re: MAX_ARRLEN
  2012-04-23 16:21   ` MAX_ARRLEN Bart Schaefer
@ 2012-04-23 16:27     ` Peter Stephenson
  2012-04-23 16:36       ` MAX_ARRLEN Mikael Magnusson
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Stephenson @ 2012-04-23 16:27 UTC (permalink / raw)
  To: Zsh Hackers' List

On Mon, 23 Apr 2012 09:21:23 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Apr 23,  6:10pm, Mikael Magnusson wrote:
> } 
> } http://www.zsh.org/mla/workers/2010/msg00013.html
> 
> And
> 
> http://www.zsh.org/mla/workers/2010/msg00015.html

Those are basically saying yes, the current set up has problems but we'd
quite like something.  Hence my immediate suggestions of what we
*actually* do.

The only additional matter arising is that it appears quite a lot of
people would be happy with the limit defaulting off.

-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK



* Re: MAX_ARRLEN
  2012-04-23 16:27     ` MAX_ARRLEN Peter Stephenson
@ 2012-04-23 16:36       ` Mikael Magnusson
  2012-04-23 16:40         ` MAX_ARRLEN Peter Stephenson
  0 siblings, 1 reply; 11+ messages in thread
From: Mikael Magnusson @ 2012-04-23 16:36 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh Hackers' List

On 2012-04-23, Peter Stephenson <Peter.Stephenson@csr.com> wrote:
> On Mon, 23 Apr 2012 09:21:23 -0700
> Bart Schaefer <schaefer@brasslantern.com> wrote:
>> On Apr 23,  6:10pm, Mikael Magnusson wrote:
>> }
>> } http://www.zsh.org/mla/workers/2010/msg00013.html
>>
>> And
>>
>> http://www.zsh.org/mla/workers/2010/msg00015.html
>
> Those are basically saying yes, the current set up has problems but we'd
> quite like something.  Hence my immediate suggestions of what we
> *actually* do.
>
> The only additional matter arising is that it appears quite a lot of
> people would be happy with the limit defaulting off.

I replied with the link because you didn't refer to the previous
discussion at all, so I wasn't sure if you remembered it :).

The problem with the current approach is that it only limits accessing
an array beyond a certain index, even if it's already larger, and you
can create arrays of any size by other means. To me it seems like the
limit is applied in the wrong place at least. If there are places that
unexpectedly create large arrays, we should add the safeguards in
those places, if possible.
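
As a rough sketch of what "the right place" might mean, a check could
look at the array's current length and veto only subscripts that
would grow it past the limit.  The function name and signature below
are made up for illustration and are not taken from params.c.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: return true only when using `subscript`
 * (1-based) would grow an array of `current_len` elements beyond
 * `limit`.  Elements that already exist are never vetoed.
 */
static bool subscript_would_overgrow(size_t current_len, size_t subscript,
                                     size_t limit)
{
    if (limit == 0)                 /* 0 disables the check */
        return false;
    if (subscript <= current_len)   /* element already exists */
        return false;
    return subscript > limit;
}
```

With this shape, reading element 299999 of an array that already has
300000 elements is allowed, while a subscript of 12345678 on a
10-element array is still caught.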

-- 
Mikael Magnusson



* Re: MAX_ARRLEN
  2012-04-23 15:27 MAX_ARRLEN Peter Stephenson
  2012-04-23 16:10 ` MAX_ARRLEN Mikael Magnusson
@ 2012-04-23 16:38 ` Bart Schaefer
  2012-04-24 13:37   ` MAX_ARRLEN Peter Stephenson
  1 sibling, 1 reply; 11+ messages in thread
From: Bart Schaefer @ 2012-04-23 16:38 UTC (permalink / raw)
  To: Zsh Hackers' List

On Apr 23,  4:27pm, Peter Stephenson wrote:
}
} What's the right thing to do?  There are various grades ranging from
} making it compilable out, through making it compile-time configurable
} with an option to compile out, through making it an option to have the
} check turned on, to having a variable that we check using getiparam()
} each time, to having a special variable so that we don't need to get it
} each time.  I think the last option with a clearly named variable such
} as ZSH_MAX_ARRAY_LENGTH that can be set to 0 to turn it off is probably
} the best.

I think something based on one of the process limits would be good.
Maybe combined with stashing it in a variable that can be modified.
Maybe even putting that variable in a module so it's not visible in a
barebones shell.

datasize, stacksize, and addressspace are all candidates for how to
figure out the limit.  (Do we ever allocate arrays for zsh parameter
expansion on the stack?)



* Re: MAX_ARRLEN
  2012-04-23 16:36       ` MAX_ARRLEN Mikael Magnusson
@ 2012-04-23 16:40         ` Peter Stephenson
  2012-04-23 16:45           ` MAX_ARRLEN Mikael Magnusson
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Stephenson @ 2012-04-23 16:40 UTC (permalink / raw)
  To: Zsh Hackers' List

On Mon, 23 Apr 2012 18:36:22 +0200
Mikael Magnusson <mikachu@gmail.com> wrote:
> The problem with the current approach is that it only limits accessing
> an array beyond a certain index, even if it's already larger, and you
> can create arrays of any size by other means. To me it seems like the
> limit is applied in the wrong place at least. If there are places that
> unexpectedly create large arrays, we should add the safeguards in
> those places, if possible.

That's a good point --- and, in fact, creating *whole* arrays isn't a
problem, either, unless we think we're protecting against

  stuff=($(<bigfile))

and I don't think we are; why is that more problematic than the
unprotected

  stuff="$(<bigfile)"

?

However, this has now turned from a simple change to a research project.

-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK





* Re: MAX_ARRLEN
  2012-04-23 16:40         ` MAX_ARRLEN Peter Stephenson
@ 2012-04-23 16:45           ` Mikael Magnusson
  0 siblings, 0 replies; 11+ messages in thread
From: Mikael Magnusson @ 2012-04-23 16:45 UTC (permalink / raw)
  To: Peter Stephenson; +Cc: Zsh Hackers' List

On 2012-04-23, Peter Stephenson <Peter.Stephenson@csr.com> wrote:
> On Mon, 23 Apr 2012 18:36:22 +0200
> Mikael Magnusson <mikachu@gmail.com> wrote:
>> The problem with the current approach is that it only limits accessing
>> an array beyond a certain index, even if it's already larger, and you
>> can create arrays of any size by other means. To me it seems like the
>> limit is applied in the wrong place at least. If there are places that
>> unexpectedly create large arrays, we should add the safeguards in
>> those places, if possible.
>
> That's a good point --- and, in fact, creating *whole* arrays isn't a
> problem, either, unless we think we're protecting against
>
>   stuff=($(<bigfile))
>
> and I don't think we are; why is that more problematic than the
> unprotected
>
>   stuff="$(<bigfile)"
>
> ?
>
> However, this has now turned from a simple change to a research project.

FWIW, I've had the patch applied that removes the check since that
other thread was started and I haven't OOMed myself because of it yet.
Maybe we should leave it for 5.1.0 though? :)

-- 
Mikael Magnusson



* Re: MAX_ARRLEN
  2012-04-23 16:38 ` MAX_ARRLEN Bart Schaefer
@ 2012-04-24 13:37   ` Peter Stephenson
  2012-04-24 19:45     ` MAX_ARRLEN Bart Schaefer
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Stephenson @ 2012-04-24 13:37 UTC (permalink / raw)
  To: Zsh Hackers' List

On Mon, 23 Apr 2012 09:38:12 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:
> On Apr 23,  4:27pm, Peter Stephenson wrote:
> } What's the right thing to do?  There are various grades ranging from
> } making it compilable out, through making it compile-time configurable
> } with an option to compile out, through making it an option to have the
> } check turned on, to having a variable that we check using getiparam()
> } each time, to having a special variable so that we don't need to get it
> } each time.  I think the last option with a clearly named variable such
> } as ZSH_MAX_ARRAY_LENGTH that can be set to 0 to turn it off is probably
> } the best.
> 
> I think something based on one of the process limits would be good.
> Maybe combined with stashing it in a variable that can be modified.
> Maybe even putting that variable in a module so it's not visible in a
> barebones shell.
> 
> datasize, stacksize, and addressspace are all candidates for how to
> figure out the limit.  (Do we ever allocate arrays for zsh parameter
> expansion on the stack?)

I don't really know how to estimate the array size, however, since it
consists of the array length and the size occupied by each element, and
it depends what the elements are doing.  So the relationship between the
limit and the size of the array is very vague, to the point where I'm
not sure if it's useful.  Nor is it clear to me that basing per-array
limits on global limits is useful --- it's only helpful in practice if
you've got one, single large array.  If you treat it as some sort of
finger-in-the-air estimate as to how much space the user is likely to be
able to cram into an array, I'm still not sure you can do it well enough to
make it useful.  Furthermore, here are my limits (which, like most
users, I haven't touched):

cputime         unlimited
filesize        unlimited
datasize        unlimited
stacksize       8MB
coredumpsize    unlimited
memoryuse       unlimited
maxproc         1024
descriptors     1024
memorylocked    64kB
addressspace    unlimited
maxfilelocks    unlimited
sigpending      15927
msgqueue        819200
nice            0
rt_priority     0

The only relevant useful one is stacksize; but array length doesn't have
direct implications for the stack.
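
For anyone who wants to see what the shell would have to work with,
here is a minimal sketch that reads a soft limit with the POSIX
getrlimit() call.  The helper name print_soft_limit is hypothetical,
and the mapping of datasize, stacksize and addressspace to
RLIMIT_DATA, RLIMIT_STACK and RLIMIT_AS is my assumption about what
those names correspond to.

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Print one soft limit as "unlimited" or a byte count.
 * Returns 0 on success, -1 if getrlimit() fails.
 */
static int print_soft_limit(const char *name, int resource)
{
    struct rlimit rl;

    if (getrlimit(resource, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("%-14s unlimited\n", name);
    else
        printf("%-14s %llu bytes\n", name,
               (unsigned long long)rl.rlim_cur);
    return 0;
}
```

Calling it with RLIMIT_DATA and RLIMIT_STACK on a system configured
like mine would give RLIM_INFINITY for the first, which offers
nothing to derive an array limit from.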

So I don't see anything I personally would be able to implement here,
though if anyone else has ideas they're certainly welcome to look.

Indeed, given the original intention, is it actually useful to apply the
limit to anything other than the argument array?

As something to do now, I'd be tempted either to "#if 0" the code until
someone can come up with a replacement that is demonstrably useful, or
implement $ZSH_MAX_ARRAY_LENGTH and initialise it to 0 (no limit),
applying it at the current definitely non-optimal location.  Either
option at least gives us something basic usable, which the current code
isn't really.  Anything beyond that still seems to be somewhat
ill-defined and I'd like finally to have something non-broken ASAP.

-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK



* Re: MAX_ARRLEN
  2012-04-24 13:37   ` MAX_ARRLEN Peter Stephenson
@ 2012-04-24 19:45     ` Bart Schaefer
  2012-04-25  9:01       ` MAX_ARRLEN Peter Stephenson
  0 siblings, 1 reply; 11+ messages in thread
From: Bart Schaefer @ 2012-04-24 19:45 UTC (permalink / raw)
  To: Zsh Hackers' List

On Apr 24,  2:37pm, Peter Stephenson wrote:
} 
} As something to do now, I'd be tempted either to "#if 0" the code until
} someone can come up with a replacement that is demonstrably useful, or
} implement $ZSH_MAX_ARRAY_LENGTH and initialise it to 0 (no limit),
} applying it at the current definitely non-optimal location.  Either
} option at least gives us something basic usable, which the current code
} isn't really.  Anything beyond that still seems to be somewhat
} ill-defined and I'd like finally to have something non-broken ASAP.

I'm OK with just removing the check entirely.  It's not like we don't
have other places where the shell might run out of memory.  This one
was just particularly egregious back in the day because you could eat
vast amounts of memory with what looked like an innocuous subscript
expression.  (I think the strange number for the limit was based on
the assumption that you were creating a mostly empty array. If we used
linked lists [ala bash] this would never come up.)
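
To put a rough number on that, the sketch below estimates the memory
taken by the pointer slots alone when a dense array has to reach a
given subscript.  It assumes the array is a NULL-terminated vector of
char pointers; the function name is hypothetical and the figure
ignores the element strings themselves.

```c
#include <stddef.h>

/*
 * Bytes needed for the slots of a dense, NULL-terminated pointer
 * array whose highest subscript (1-based) is `subscript`.
 */
static size_t dense_slots_bytes(size_t subscript)
{
    /* one slot per element up to the subscript, plus the terminator */
    return (subscript + 1) * sizeof(char *);
}
```

With 8-byte pointers, a subscript of 262144 costs about 2 MB of
slots, and one in the tens of millions costs on the order of 100 MB,
all for a single assigned element.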



* Re: MAX_ARRLEN
  2012-04-24 19:45     ` MAX_ARRLEN Bart Schaefer
@ 2012-04-25  9:01       ` Peter Stephenson
  0 siblings, 0 replies; 11+ messages in thread
From: Peter Stephenson @ 2012-04-25  9:01 UTC (permalink / raw)
  To: Zsh Hackers' List

On Tue, 24 Apr 2012 12:45:23 -0700
Bart Schaefer <schaefer@brasslantern.com> wrote:

> On Apr 24,  2:37pm, Peter Stephenson wrote:
> } 
> } As something to do now, I'd be tempted either to "#if 0" the code until
> } someone can come up with a replacement that is demonstrably useful, or
> } implement $ZSH_MAX_ARRAY_LENGTH and initialise it to 0 (no limit),
> } applying it at the current definitely non-optimal location.  Either
> } option at least gives us something basic usable, which the current code
> } isn't really.  Anything beyond that still seems to be somewhat
> } ill-defined and I'd like finally to have something non-broken ASAP.
> 
> I'm OK with just removing the check entirely.

I think I'll do that, pending any fully developed proposal for doing it
better.

> (I think the strange number for the limit was based on
> the assumption that you were creating a mostly empty array. If we used
> linked lists [ala bash] this would never come up.)

The trade-off with linked lists is in favour of pointer arrays if the large
arrays you're creating aren't going to be sparse, though, which was the
case I ran up against this week.

I've changed references to "parameter expansion" to "parameter
substitution" below, since that's more standard.

===================================================================
RCS file: /cvsroot/zsh/zsh/NEWS,v
retrieving revision 1.54
diff -p -u -r1.54 NEWS
--- NEWS	15 Apr 2012 19:35:29 -0000	1.54
+++ NEWS	25 Apr 2012 08:57:51 -0000
@@ -52,31 +52,38 @@ Expansion (parameters, globbing, etc.) a
   This is useful for expanding paths with many variable components as
   commonly found in software development.
 
-- Parameter expansion has the ${NAME:OFFSET} and ${NAME:OFFSET:LENGTH}
+- Parameter substitution has the ${NAME:OFFSET} and ${NAME:OFFSET:LENGTH}
   syntax for compatibility with other shells (and zero-based indexing
   is used to enhance compatibility).  LENGTH may be negative to count
   from the end.
 
-- The parameter expansion flag (D) abbreviates directories in parameters
+- The arbitrary limit on parameter subscripts (262144) has been removed.
+  As it was not configurable and tested in an inconvenient place it
+  was deemed preferable to remove it completely.  The limit was originally
+  introduced to prevent accidental creation of a large parameter array
+  by typos that generated assignments along the lines of "12345678=0".
+  The general advice is not to do that.
+
+- The parameter substitution flag (D) abbreviates directories in parameters
   using the familiar ~ form.
 
-- The parameter expansion flag (g) can take delimited arguments o, e and
+- The parameter substitution flag (g) can take delimited arguments o, e and
   c to provide echo- and print-style expansion: (g::) provides basic
   echo-style expansion; (g:e:) provides the extended capabilities of
   print; (g:o:) provides octal escapes without a leading zero; (g:c:)
   additionally expands "^c" style control characters as for bindkey.
   Options may be combined, e.g. (g:eoc:).
 
-- The parameter expansion flag (m) indicates that string lengths used
+- The parameter substitution flag (m) indicates that string lengths used
   calculated by the (l) and (r) flags or the # operator should take
   account of the printing width of characters in multibyte mode, whether
   0, 1 or more.  (mm) causes printing characters to count as 1 and
   non-printing chracters to count as 0.
 
-- The parameter expansion flag (q-) picks the most minimal way of
+- The parameter substitution flag (q-) picks the most minimal way of
   quoting the parameter words, to make the result as readable as possible.
 
-- The parameter expansion flag (Z), a variant of (z), takes arguments
+- The parameter substitution flag (Z), a variant of (z), takes arguments
   describing how to split a variable using shell syntax: (Z:c:) parses
   comments as strings (the default is not to treat comment characters
   specially); (Z:C:) parses comments and strips them; (Z:n:) treats
Index: Src/params.c
===================================================================
RCS file: /cvsroot/zsh/zsh/Src/params.c,v
retrieving revision 1.181
diff -p -u -r1.181 params.c
--- Src/params.c	22 Apr 2012 18:10:43 -0000	1.181
+++ Src/params.c	25 Apr 2012 08:57:52 -0000
@@ -1905,6 +1905,18 @@ fetchvalue(Value v, char **pptr, int bra
     if (!bracks && *s)
 	return NULL;
     *pptr = s;
+#if 0
+    /*
+     * Check for large subscripts that might be erroneous.
+     * This code is too gross in several ways:
+     * - the limit is completely arbitrary
+     * - the test vetoes operations on existing arrays
+     * - it's not at all clear a general test on large arrays of
+     *   this kind is any use.
+     *
+     * Until someone comes up with workable replacement code it's
+     * therefore commented out.
+     */
     if (v->start > MAX_ARRLEN) {
 	zerr("subscript too %s: %d", "big", v->start + !isset(KSHARRAYS));
 	return NULL;
@@ -1921,6 +1933,7 @@ fetchvalue(Value v, char **pptr, int bra
 	zerr("subscript too %s: %d", "small", v->end);
 	return NULL;
     }
+#endif
     return v;
 }
 
-- 
Peter Stephenson <pws@csr.com>            Software Engineer
Tel: +44 (0)1223 692070                   Cambridge Silicon Radio Limited
Churchill House, Cambridge Business Park, Cowley Road, Cambridge, CB4 0WZ, UK



Thread overview: 11+ messages
2012-04-23 15:27 MAX_ARRLEN Peter Stephenson
2012-04-23 16:10 ` MAX_ARRLEN Mikael Magnusson
2012-04-23 16:21   ` MAX_ARRLEN Bart Schaefer
2012-04-23 16:27     ` MAX_ARRLEN Peter Stephenson
2012-04-23 16:36       ` MAX_ARRLEN Mikael Magnusson
2012-04-23 16:40         ` MAX_ARRLEN Peter Stephenson
2012-04-23 16:45           ` MAX_ARRLEN Mikael Magnusson
2012-04-23 16:38 ` MAX_ARRLEN Bart Schaefer
2012-04-24 13:37   ` MAX_ARRLEN Peter Stephenson
2012-04-24 19:45     ` MAX_ARRLEN Bart Schaefer
2012-04-25  9:01       ` MAX_ARRLEN Peter Stephenson
