zsh-workers
 help / color / mirror / code / Atom feed
From: Daniel Shahaf <d.s@daniel.shahaf.name>
To: zsh workers <zsh-workers@zsh.org>
Subject: Re: Array appends are quadratic
Date: Sat, 28 Nov 2015 18:05:17 +0000	[thread overview]
Message-ID: <20151128180517.GL2498@tarsus.local2> (raw)
In-Reply-To: <151127025958.ZM23933@torch.brasslantern.com>

Bart Schaefer wrote on Fri, Nov 27, 2015 at 02:59:58 -0800:
> On Nov 27,  9:26am, Mikael Magnusson wrote:
> } Subject: Re: Array appends are quadratic
> }
> } According to your debug prints, the array is reallocated every time it
> } is assigned to, even when the required element is already allocated.
> } Doubling the allocation size would do nothing to fix that, presumably.
> 
> Right, this is generic code to handle array splicing, so starting from
> an array with 200 elements,
> 
>     a[100]=foo
> 
> and
> 
>     a[50,150]=(foo)
> 
> use the same loop even though the former does not change the size of
> the resulting array and the latter will waste 100 slots at the top.

(The reason it'll waste 100 slots is that it will allocate 200 elements
even though only 200-(150-50+1) are needed.)

> I vaguely recall a conversation in which we decided that fewer code
> branches was better than attempting to efficiently re-use array slots.

I'm not looking to implement sparse arrays.   I just wish it didn't
reallocate unless it needed to.  That _would_ mean some additional code
complexity, but it seems it would be manageable: having the array
remember the number of allocated slots, using memmove() to shift the
past-the-slice portion, copying the slice directly into the slots.

That should make this part of the code O(slice) rather than O(array
length).

The effort involved in implementing the redoubling allocation (for the
other problem indicated in 37236) would be similar: mainly having the
array remember the number of allocated slots.  I'll have to track done
all places in the code that allocate/reallocate arrays, hopefully there
aren't too many.

There may be other parts in the code that do O(n) processing instead
of O(1) — Mikael points out arrlen() as an example — but those can be
improved separately.

> } % a[100]=foo
> }  params.c:2585: setarrvalue: Reallocating 'a' to size 100
> } % a[5]=bar
> }  params.c:2585: setarrvalue: Reallocating 'a' to size 101
> } % a[10]=bar
> }  params.c:2585: setarrvalue: Reallocating 'a' to size 101
> } 
> } Do we have an... off-by-two bug somewhere?
> 
> No, it just always allocates 1 more than the largest occupied array
> index position if the array isn't empty to begin with.
> 

Cheers,

Daniel

P.S. It seems that bin_compquote is calling ztrdup() on every element of
an array parameter passed as an argument, so it would reallocate all array
members even if none of them need quoting.  I'm not planning to look
into it further, though, since I haven't run into it in practice yet.


  reply	other threads:[~2015-11-28 18:13 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-27  7:37 Daniel Shahaf
2015-11-27  8:26 ` Mikael Magnusson
2015-11-27 10:59   ` Bart Schaefer
2015-11-28 18:05     ` Daniel Shahaf [this message]
2015-11-28 19:36       ` Peter Stephenson
2015-11-30  3:20         ` Daniel Shahaf
2015-12-03 23:36         ` Daniel Shahaf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151128180517.GL2498@tarsus.local2 \
    --to=d.s@daniel.shahaf.name \
    --cc=zsh-workers@zsh.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.vuxu.org/mirror/zsh/

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).