From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 10256 invoked by alias); 27 Nov 2015 07:37:34 -0000 Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm Precedence: bulk X-No-Archive: yes List-Id: Zsh Workers List List-Post: List-Help: X-Seq: 37236 Received: (qmail 7777 invoked from network); 27 Nov 2015 07:37:33 -0000 X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on f.primenet.com.au X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,T_DKIM_INVALID autolearn=ham autolearn_force=no version=3.4.0 DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= daniel.shahaf.name; h=content-transfer-encoding:content-type :date:from:message-id:mime-version:subject:to:x-sasl-enc :x-sasl-enc; s=mesmtp; bh=ALwuwsQZsVCoewUhbk/ShzbhWHk=; b=errz+O jCth9qHuZKuDeuYB+zwZTwv2HtWydCOW96aFIHi+CUZZKTyW/HxpWYSsGFwtM94x krHwc471zJPid9cvukt+dwT1iKSjgKQIfavW5blZROP5aLJklsGQMUA20eWj7PyP TshyX9K00s3J4PPMuwRV1rmNgPk3+N/SDQ8NE= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d= messagingengine.com; h=content-transfer-encoding:content-type :date:from:message-id:mime-version:subject:to:x-sasl-enc :x-sasl-enc; s=smtpout; bh=ALwuwsQZsVCoewUhbk/ShzbhWHk=; b=KbZFF ks442cpnWSoSX+mDYGuipw0c0Dron0wYO2l6pfKizR/BwHUFVDY2EP2sIBjjxVlq 9sdzhCLqaHa/FvXkLjYJviz2T1Ea0vHx0LMr2FTnHKQ6Y6DR6ie6AD26547yZxjR k6Dbg7rBJRsCvDg16PwsBG0PbxqpOd1zL5HJMM= X-Sasl-enc: TscQwxY4XHTrQBNNd3ekSNDQs+ZbBHjEly9BR5n9ddRw 1448609852 Date: Fri, 27 Nov 2015 07:37:30 +0000 From: Daniel Shahaf To: zsh-workers@zsh.org Subject: Array appends are quadratic Message-ID: <20151127073730.GI1899@tarsus.local2> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit User-Agent: Mutt/1.5.21 (2010-09-15) Making N appends takes O(N²) time: % zsh -fc 'local -a a; time ( repeat 5000 { a+=(foo) } )' ( repeat 5000; do; a+=(foo) ; done; ) 1.26s user 0.00s system 99% cpu 1.268 total % zsh -fc 'local -a a; time ( repeat 10000 { a+=(foo) } )' ( repeat 10000; do; a+=(foo) ; done; ) 5.02s user 0.00s system 99% cpu 5.028 total % zsh -fc 'local -a a; time ( repeat 20000 { a+=(foo) } )' ( repeat 20000; do; a+=(foo) ; done; ) 19.94s user 0.08s system 99% cpu 20.036 total % zsh -fc 'local -a a; N=5000; time ( a[N]=(); integer i; while (( i++ < N )) { a[i]=foo } )' ( a[N]=() ; integer i; while (( i++ < N )); do; a[i]=foo ; done; ) 2.10s user 0.00s system 99% cpu 2.004 total % zsh -fc 'local -a a; N=10000; time ( a[N]=(); integer i; while (( i++ < N )) { a[i]=foo } )' ( a[N]=() ; integer i; while (( i++ < N )); do; a[i]=foo ; done; ) 8.33s user 0.41s system 99% cpu 8.506 total % zsh -fc 'local -a a; N=20000; time ( a[N]=(); integer i; while (( i++ < N )) { a[i]=foo } )' ( a[N]=() ; integer i; while (( i++ < N )); do; a[i]=foo ; done; ) 34.92s user 0.00s system 99% cpu 34.246 total This is because each append reallocates the array with one extra element. This may be confirmed by these debug prints: diff --git a/Src/params.c b/Src/params.c index b121bd6..a4ca3ce 100644 --- a/Src/params.c +++ b/Src/params.c @@ -2579,6 +2579,7 @@ setarrvalue(Value v, char **val) if (v->end <= n) ll += n - v->end + 1; + DPUTS2(1, "setarrvalue: Reallocating '%s' to size %d", v->pm->node.nam, ll); p = new = (char **) zshcalloc(sizeof(char *) * (ll + 1)); for (i = 0; i < v->start; i++) @@ -2909,6 +2910,7 @@ assignaparam(char *s, char **val, int flags) char **new; int lv = arrlen(val); + DPUTS2(1, "assignaparam: Reallocating '%s' to size %d", t, lv+1); new = (char **) zalloc(sizeof(char *) * (lv + 2)); *new = ztrdup(getstrvalue(v)); memcpy(new+1, val, sizeof(char *) * (lv + 1)); The first example triggers the assignaparam() debug print with values 1,2,…,5000. The second example triggers the setarrvalue() debug print with values 5000,5000,…,5000, and has five thousand iterations. Complexity could be reduced to O(N) through doubling reallocation (whenever reallocating, allocate not oldsize + O(1) but oldsize * O(1)). Is there a reason not to do that? Any edge cases to take care of? Cheers, Daniel