From: Bart Schaefer
Message-id: <121120090300.ZM5552@torch.brasslantern.com>
Date: Tue, 20 Nov 2012 09:03:00 -0800
In-reply-to: <20121120130457.GD2500@localhost.localdomain>
Comments: In reply to Han Pingtian
        "Re: argv subscript range uses too many memory" (Nov 20, 9:04pm)
References: <20121108084001.GA7594@localhost.localdomain>
        <20121108100226.575b0788@pwslap01u.europe.root.pri>
        <20121110105811.GA7136@localhost.localdomain>
        <121110065709.ZM4781@torch.brasslantern.com>
        <20121120130457.GD2500@localhost.localdomain>
To: zsh-users@zsh.org
Subject: Re: argv subscript range uses too many memory
List-Id: Zsh Users List
X-Seq: 17424

[C code discussion proceeds below, so those zsh-users who don't care
about the internals can skip this message.  Once again, we should move
the rest of this thread to zsh-workers, thanks.]

Han, thanks for the diagnosis.

On Nov 20, 9:04pm, Han Pingtian wrote:
} Subject: Re: argv subscript range uses too many memory
}
} On Sat, Nov 10, 2012 at 06:57:09AM -0800, Bart Schaefer wrote:
} > In a loop, the heap allocations are not popped until the loop is done,
} > IIRC, so you'll end up with a large number of copies of the original
} > array in the heap with slice results pointing into different parts of
} > each copy.
} > Maybe there's a narrower scope in which a pushheap/popheap
} > could be inserted.
}
} Looks like I have found the reason of this problem. If I revert this commit:
}
}     commit 61505654942cb9895a9811fde1dcbb662fd7d66a
}     Author: Bart Schaefer
}     Date:   Sat May 7 19:32:57 2011 +0000
}
}     29175: optimize freeheap

Aha; this jibes with both the excerpted text from me above and also with
what PWS said in workers/30791:

: What's puzzling me is that loops, including the "while" involved here,
: execute freeheap() at the end of each iteration.  That should restore
: the pristine state of the loop

According to the comment in workers/29175:

+ * However, there doesn't seem to be any reason to reset fheap before
+ * beginning this loop.  Either it's already correct, or it has never
+ * been set and this loop will do it, or it'll be reset from scratch
+ * on the next popheap().  So all that's needed here is to pick up
+ * the scan wherever the last pass [or the last popheap()] left off.

The consequence of this optimization is that, in the name of speed, we
don't do a full-fledged garbage collection upon freeheap(), only upon
popheap().  So the freeheap() on each loop iteration does not "restore
the pristine state" and "a narrower scope [of] pushheap/popheap" would
be one potential solution.

Unfortunately as far as I can tell these two issues (the speed problem
in last year's "the source of slow large for loops" thread and the space
problem in this thread) are directly in conflict with one another.  The
speed problem requires that the heap not be fully garbage collected on
every loop pass, but the space problem requires that it be collected at
some point before the loop is done.

Maybe there's a hybrid where freeheap() can examine the difference in
position (fheaps - heaps) and do a full garbage collect only when the
heap has become "too full".  The question then is, what difference in
position is large enough to trigger a collection?
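
To make the hybrid idea concrete, here is a rough sketch in C.  It is a
toy model, not zsh's mem.c: the arena layout, the names (toy_heap,
toy_freeheap, simulate_loop) and the FULL_GC_THRESHOLD value are all
assumptions made up for illustration.  It only shows the shape of the
trade-off: a cheap rewind on most passes, and a full release once the
allocation point has drifted "far enough" from the start of the chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only; none of these names come from zsh's mem.c. */
#define ARENA_SIZE 1024        /* bytes per arena (arbitrary) */
#define FULL_GC_THRESHOLD 4    /* hypothetical "too full" distance */

typedef struct arena {
    struct arena *next;
    size_t used;
    char data[ARENA_SIZE];
} arena;

typedef struct {
    arena *head;              /* start of the chain, standing in for "heaps" */
    arena *fheap;             /* arena currently being allocated from */
    size_t full_collections;  /* how many times the slow path ran */
} toy_heap;

arena *arena_new(void)
{
    arena *a = calloc(1, sizeof *a);
    assert(a != NULL);
    return a;
}

void toy_init(toy_heap *h)
{
    h->head = h->fheap = arena_new();
    h->full_collections = 0;
}

/* Bump-allocate n bytes, moving to a fresh arena when the current one is full. */
void *toy_alloc(toy_heap *h, size_t n)
{
    assert(n <= ARENA_SIZE);
    if (h->fheap->used + n > ARENA_SIZE) {
        if (h->fheap->next == NULL)
            h->fheap->next = arena_new();
        h->fheap = h->fheap->next;
        h->fheap->used = 0;
    }
    void *p = h->fheap->data + h->fheap->used;
    h->fheap->used += n;
    return p;
}

/* Total number of arenas currently in the chain. */
size_t toy_arena_count(const toy_heap *h)
{
    size_t n = 0;
    for (const arena *a = h->head; a != NULL; a = a->next)
        n++;
    return n;
}

/* Distance, in arenas, between the start of the chain and fheap. */
size_t toy_distance(const toy_heap *h)
{
    size_t d = 0;
    for (const arena *a = h->head; a != h->fheap; a = a->next)
        d++;
    return d;
}

/* Hybrid freeheap: cheap rewind normally, full release when "too full". */
void toy_freeheap(toy_heap *h)
{
    if (toy_distance(h) >= FULL_GC_THRESHOLD) {
        /* Slow path: release every arena after the first one. */
        arena *a = h->head->next;
        while (a != NULL) {
            arena *next = a->next;
            free(a);
            a = next;
        }
        h->head->next = NULL;
        h->head->used = 0;
        h->fheap = h->head;
        h->full_collections++;
    } else {
        /* Fast path: rewind only the current arena; earlier ones stay full. */
        h->fheap->used = 0;
    }
}

/* Model a shell loop: allocate per pass, call toy_freeheap() at the end.
 * Returns the largest arena count seen before any freeheap call. */
size_t simulate_loop(toy_heap *h, size_t iterations)
{
    size_t peak = 0;
    for (size_t i = 0; i < iterations; i++) {
        for (int k = 0; k < 3; k++)
            toy_alloc(h, 500);
        size_t n = toy_arena_count(h);
        if (n > peak)
            peak = n;
        toy_freeheap(h);
    }
    return peak;
}

void toy_destroy(toy_heap *h)
{
    arena *a = h->head;
    while (a != NULL) {
        arena *next = a->next;
        free(a);
        a = next;
    }
    h->head = h->fheap = NULL;
}
```

In this toy, calling toy_init() and then simulate_loop() keeps the chain
bounded at FULL_GC_THRESHOLD + 1 arenas no matter how many passes run,
while most passes still take the fast path.  What the threshold should
be in the real allocator is exactly the open question above; the value 4
here is arbitrary.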