From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 17961 invoked by alias); 7 May 2011 19:11:19 -0000 Mailing-List: contact zsh-workers-help@zsh.org; run by ezmlm Precedence: bulk X-No-Archive: yes List-Id: Zsh Workers List List-Post: List-Help: X-Seq: 29175 Received: (qmail 4626 invoked from network); 7 May 2011 19:11:17 -0000 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on f.primenet.com.au X-Spam-Level: X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received-SPF: none (ns1.primenet.com.au: domain at closedmail.com does not designate permitted sender hosts) From: Bart Schaefer Message-id: <110507121112.ZM9261@torch.brasslantern.com> Date: Sat, 07 May 2011 12:11:10 -0700 In-reply-to: Comments: In reply to Mikael Magnusson "the source of slow large for loops" (May 7, 7:00pm) References: X-Mailer: OpenZMail Classic (0.9.2 24April2005) To: zsh workers Subject: Re: the source of slow large for loops MIME-version: 1.0 Content-type: text/plain; charset=us-ascii On May 7, 7:00pm, Mikael Magnusson wrote: } Subject: the source of slow large for loops } } I don't know if anyone looked into why things like } for i in {1..700000}; do true; done } is extremely slow in zsh, so I did now. Turns out zhalloc is extremely } slow Hmm ... I think you've misdiagnosed. zhalloc() is reasonably fast. What's amazingly slow is freeheap(). If I stick a "return;" at the top of freeheap() the above loop runs in just over 4 seconds on my 3GHz P4. (AFAICT the memory is still eventually freed by popheap(), it just grows a lot more before releasing any.) } and for the above loop, one particular line runs 12049901132 } times according to gcov. It would have been nice if you'd told us which line. :-) However, I don't think that's actually the problem. There are a couple of places where zhalloc() and friends search the heap for free space by doing a linear scan over a linked list, which will cause a huge number of loop iterations, but each of those iterations is at most a couple of machine instructions. You need to look at how much time something takes, not just how often it's done. The real time sink is this bit of freeheap(): for (h = heaps; h; h = hn) { hn = h->next; if (h->sp) { #ifdef ZSH_MEM_DEBUG memset(arena(h) + h->sp->used, 0xff, h->used - h->sp->used); #endif h->used = h->sp->used; if (!fheap && h->used < ARENA_SIZEOF(h)) fheap = h; hl = h; } else { #ifdef USE_MMAP munmap((void *) h, h->size); #else zfree(h, HEAPSIZE); #endif } } Of course if you've used --enable-zsh-mem-debug for configure then this is also zeroing all the memory on each free and the performance is even worse, but that isn't really the issue. Most specifically, take out the stuff inside "if (h->sp)" and the above 700k-iterations loop runs in 11 seconds. If I understand what's going on here ... Index: Src/mem.c =================================================================== RCS file: /extra/cvsroot/zsh/zsh-4.0/Src/mem.c,v retrieving revision 1.10 diff -c -r1.10 mem.c --- Src/mem.c 4 Nov 2008 04:47:53 -0000 1.10 +++ Src/mem.c 7 May 2011 19:06:23 -0000 @@ -220,8 +220,28 @@ h_free++; #endif + /* At this point we used to do: fheap = NULL; - for (h = heaps; h; h = hn) { + * + * When pushheap() is called, it sweeps over the entire heaps list of + * arenas and marks every one of them with the amount of free space in + * that arena at that moment. zhalloc() is then allowed to grab bits + * out of any of those arenas that have free space. + * + * With the above reset of fheap, the loop below sweeps back over the + * entire heap list again, resetting the free space in every arena to + * the amount stashed by pushheap() and finding the first arena with + * free space to optimize zhalloc()'s next search. When there's a lot + * of stuff already on the heap, this is an enormous amount of work, + * and peformance goes to hell. + * + * However, there doesn't seem to be any reason to reset fheap before + * beginning this loop. Either it's already correct, or it has never + * been set and this loop will do it, or it'll be reset from scratch + * on the next popheap(). So all that's needed here is to pick up + * the scan wherever the last pass [or the last popheap()] left off. + */ + for (h = (fheap ? fheap : heaps); h; h = hn) { hn = h->next; if (h->sp) { #ifdef ZSH_MEM_DEBUG @@ -242,7 +262,7 @@ if (hl) hl->next = NULL; else - heaps = NULL; + heaps = fheap = NULL; unqueue_signals(); } --