zsh-workers
 help / color / mirror / code / Atom feed
From: Daniel Shahaf <d.s@daniel.shahaf.name>
To: zsh-workers@zsh.org
Subject: Array appends are quadratic
Date: Fri, 27 Nov 2015 07:37:30 +0000	[thread overview]
Message-ID: <20151127073730.GI1899@tarsus.local2> (raw)

Making N appends takes O(N²) time:

    % zsh -fc 'local -a a; time ( repeat 5000 { a+=(foo) } )' 
    ( repeat 5000; do; a+=(foo) ; done; )  1.26s user 0.00s system 99% cpu 1.268 total
    % zsh -fc 'local -a a; time ( repeat 10000 { a+=(foo) } )' 
    ( repeat 10000; do; a+=(foo) ; done; )  5.02s user 0.00s system 99% cpu 5.028 total
    % zsh -fc 'local -a a; time ( repeat 20000 { a+=(foo) } )' 
    ( repeat 20000; do; a+=(foo) ; done; )  19.94s user 0.08s system 99% cpu 20.036 total
    
    % zsh -fc 'local -a a; N=5000; time ( a[N]=(); integer i; while (( i++ < N )) { a[i]=foo } )' 
    ( a[N]=() ; integer i; while (( i++ < N )); do; a[i]=foo ; done; )  2.10s user 0.00s system 99% cpu 2.004 total
    % zsh -fc 'local -a a; N=10000; time ( a[N]=(); integer i; while (( i++ < N )) { a[i]=foo } )' 
    ( a[N]=() ; integer i; while (( i++ < N )); do; a[i]=foo ; done; )  8.33s user 0.41s system 99% cpu 8.506 total
    % zsh -fc 'local -a a; N=20000; time ( a[N]=(); integer i; while (( i++ < N )) { a[i]=foo } )' 
    ( a[N]=() ; integer i; while (( i++ < N )); do; a[i]=foo ; done; )  34.92s user 0.00s system 99% cpu 34.246 total

This is because each append reallocates the array with one extra
element.  This may be confirmed by these debug prints:

    diff --git a/Src/params.c b/Src/params.c
    index b121bd6..a4ca3ce 100644
    --- a/Src/params.c
    +++ b/Src/params.c
    @@ -2579,6 +2579,7 @@ setarrvalue(Value v, char **val)
            if (v->end <= n)
                ll += n - v->end + 1;
     
    +       DPUTS2(1, "setarrvalue: Reallocating '%s' to size %d", v->pm->node.nam, ll);
            p = new = (char **) zshcalloc(sizeof(char *) * (ll + 1));
     
            for (i = 0; i < v->start; i++)
    @@ -2909,6 +2910,7 @@ assignaparam(char *s, char **val, int flags)
                    char **new;
                    int lv = arrlen(val);
     
    +               DPUTS2(1, "assignaparam: Reallocating '%s' to size %d", t, lv+1);
                    new = (char **) zalloc(sizeof(char *) * (lv + 2));
                    *new = ztrdup(getstrvalue(v));
                    memcpy(new+1, val, sizeof(char *) * (lv + 1));

The first example triggers the assignaparam() debug print with values
1,2,…,5000.  The second example triggers the setarrvalue() debug print
with values 5000,5000,…,5000, and has five thousand iterations.

Complexity could be reduced to O(N) through doubling reallocation
(whenever reallocating, allocate not oldsize + O(1) but oldsize * O(1)).

Is there a reason not to do that?  Any edge cases to take care of?

Cheers,

Daniel


             reply	other threads:[~2015-11-27  7:37 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-27  7:37 Daniel Shahaf [this message]
2015-11-27  8:26 ` Mikael Magnusson
2015-11-27 10:59   ` Bart Schaefer
2015-11-28 18:05     ` Daniel Shahaf
2015-11-28 19:36       ` Peter Stephenson
2015-11-30  3:20         ` Daniel Shahaf
2015-12-03 23:36         ` Daniel Shahaf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151127073730.GI1899@tarsus.local2 \
    --to=d.s@daniel.shahaf.name \
    --cc=zsh-workers@zsh.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).