From: Daniel Pittman
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Huge memory consumption on accessing large newsgroup
Date: Mon, 01 Oct 2007 11:04:54 +1000
Message-ID: <87odfjprux.fsf@enki.rimspace.net>
References: <87wsw4u21m.fsf@gmx.de> <87sl6rh886.fsf@gmx.de> <87hcldp4j1.fsf@srcf.ucam.org>
To: ding@gnus.org

Ted Zlatanov writes:
> On Sat, 29 Sep 2007 22:04:18 +0100 Gaute Strokkenes wrote:
>
> GS> I wonder if it would be possible to make Gnus work solely with
> GS> compressed ranges (i.e. lists where dotted pairs are used to
> GS> represent runs of consecutive integers)?
>
> GS> Unless there is some deeper reason why this cannot work, I might
> GS> have a stab at it (eventually).
>
> I think this would be a good idea.

Ah.

> Consider using inversion lists.

This is almost certainly unnecessary, not to mention that it would
involve building an entire parallel infrastructure to handle them.

The nnimap code had a similar performance-killing "feature" where it
would expand two 'range' lists completely, intersect them, then compress
them again.

This was trivially resolved by using the existing code from
`gnus-range.el' to process this on the compressed versions.

You should be able to find the appropriate bit of history tucked away in
the history of the nnimap.el code via CVS.  I am also happy to try and
dig up my memories of the work though, frankly, they were trivial
enough.

The one specific piece of advice I would give you:

Write your code so it is "self testing" and run with that for a long while.
I ended up having the code do the gnus-range-based calculations and
compare them to the non-range calculations, then signal an error if they
disagreed.

This cost extra CPU time for the couple of weeks I used it in production
but gave me (and, I think, the rest of the list) a much higher sense of
security that the changes were, in practice, correct.

(You might want to leave that in with a debug option to turn it on so
that the rest of the Gnus CVS userbase also tests this, to catch faults
that your own use doesn't show.  I didn't do that and I vaguely regret
it now.)

> They don't require pairs of integers; each value represents a flip.
> They have other nice properties too, though it's been a while since I
> looked at them so I can't name them.

Mmm.  The existing Gnus range code is pretty much as efficient, has
lower computational complexity for the operations Gnus uses, and already
exists and is tested.

I don't think that you would see sufficient benefit from introducing the
additional data type to justify spending your time on it -- but since I
am not volunteering to do the work I can't tell you how to do it. ;)

Regards,
        Daniel
-- 
Daniel Pittman                            Phone: 03 9621 2377
Level 4, 10 Queen St, Melbourne             Web: http://www.cyber.com.au
Cybersource: Australia's Leading Linux and Open Source Solutions Company
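
For illustration only, the "self testing" approach described above could
look roughly like the sketch below.  It is written in Python rather than
Emacs Lisp and does not reproduce the gnus-range.el API or the nnimap
change; every function name in it is hypothetical.  It assumes a
compressed range is a sorted list whose items are either a single
integer or an inclusive (lo, hi) pair, with no overlapping or adjacent
runs, which mirrors the dotted-pair idea from the quoted text.

```python
from typing import Iterable, List, Tuple, Union

# Hypothetical model of a compressed range: a lone integer or an inclusive run.
Item = Union[int, Tuple[int, int]]


def bounds(item: Item) -> Tuple[int, int]:
    """Return the inclusive (lo, hi) bounds of one range item."""
    return (item, item) if isinstance(item, int) else item


def compress(numbers: Iterable[int]) -> List[Item]:
    """Turn a collection of integers into a sorted compressed range."""
    runs: List[List[int]] = []
    for n in sorted(set(numbers)):
        if runs and runs[-1][1] + 1 == n:
            runs[-1][1] = n          # extend the current run
        else:
            runs.append([n, n])      # start a new run
    return [lo if lo == hi else (lo, hi) for lo, hi in runs]


def expand(ranges: List[Item]) -> List[int]:
    """Turn a compressed range back into the full list of integers."""
    result: List[int] = []
    for item in ranges:
        lo, hi = bounds(item)
        result.extend(range(lo, hi + 1))
    return result


def intersect_compressed(a: List[Item], b: List[Item]) -> List[Item]:
    """Intersect two compressed ranges without expanding either one."""
    out: List[Item] = []
    i = j = 0
    while i < len(a) and j < len(b):
        alo, ahi = bounds(a[i])
        blo, bhi = bounds(b[j])
        lo, hi = max(alo, blo), min(ahi, bhi)
        if lo <= hi:                 # the two runs overlap
            out.append(lo if lo == hi else (lo, hi))
        if ahi < bhi:                # advance whichever run ends first
            i += 1
        else:
            j += 1
    return out


def intersect_checked(a: List[Item], b: List[Item],
                      self_test: bool = True) -> List[Item]:
    """Fast intersection, optionally cross-checked against the slow path."""
    fast = intersect_compressed(a, b)
    if self_test:
        # Slow path: expand both, intersect as sets, compress again.
        slow = compress(set(expand(a)) & set(expand(b)))
        if fast != slow:
            raise AssertionError(
                f"range self-test failed: fast={fast!r} slow={slow!r}")
    return fast


if __name__ == "__main__":
    print(intersect_checked([1, (3, 7), 10], [(2, 4), (6, 12)]))
```

The self_test flag plays the part of the debug option suggested above:
left on, it pays for the expanded computation on every call; turned off,
only the compressed-range path runs.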