From: Daniel Pittman
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Huge memory consumption on accessing large newsgroup
Date: Tue, 02 Oct 2007 22:17:46 +1000
Organization: Cybersource: Australia's Leading Linux and Open Source Solutions Company
Message-ID: <871wcdww0l.fsf@enki.rimspace.net>
References: <87wsw4u21m.fsf@gmx.de> <87sl6rh886.fsf@gmx.de> <87hcldp4j1.fsf@srcf.ucam.org> <87odfjprux.fsf@enki.rimspace.net> <874phatd0z.fsf@enki.rimspace.net>
To: ding@gnus.org
In-Reply-To: (Ted Zlatanov's message of "Tue, 02 Oct 2007 06:11:40 -0500")
User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/23.0.0 (gnu/linux)

Ted Zlatanov writes:
> On Tue, 02 Oct 2007 13:23:56 +1000 Daniel Pittman wrote:
>
> DP> The issue with large memory consumption, as far as I could see
> DP> from the thread, was that the compressed range data type was
> DP> expanded to a flat list. This caused, no surprise, huge memory
> DP> use.
>
> DP> The original problem is that the code calls
> DP> `gnus-uncompress-range' on the data *at all* -- and, so, turns a
> DP> nicely brief data structure into a vast bloated million-number
> DP> list.
>
> DP> The *solution* is to rebuild the algorithm to operate on the
> DP> compressed version (regardless of the internal representation),
> DP> not to change the representation.
>
> You're right, I didn't know this. I thought the memory problems were
> caused by the original list.

*nod* I figured as much. :)

[...]

> DP> (And because this has been a stupidly annoying couple of weeks in
> DP> other areas, and because this is nice simple and essentially
> DP> stress-free work I am getting tempted to fix it myself.
>
> DP> So, maybe inversion lists were the way to get the code fixed after
> DP> all, if not quite so directly as expected. ;)
>
> The `gnus-uncompress-range' function is not called in too many places:

[...]

> In gnus-number-of-unseen-articles-in-group for example it's called
> only to find the length of the list, and in gnus-move-group-to-server
> only to see if the list is not nil.

Mmm. The not nil version should really be killed, as a guide, and the
length isn't too awful.
The length version can just call `gnus-range-length' and be done with
it...

> There's room for improvement. Maybe there should be a group API that
> abstracts the data structures, so subordinate code doesn't have to
> know if it's a list or a compressed range? That would be a good first
> step to cleaning things up.

Maybe. I think in most cases it is just that the range code wasn't
pushed everywhere it could be because, in most cases, you never /really/
see the pain.

Regards,
        Daniel
-- 
Daniel Pittman                           Phone: 03 9621 2377
Level 4, 10 Queen St, Melbourne          Web: http://www.cyber.com.au
Cybersource: Australia's Leading Linux and Open Source Solutions Company
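
To make the point about length and "not nil" checks concrete, here is a rough sketch in Python rather than Emacs Lisp. It is not the Gnus code. It assumes a compressed range can be modelled as a list whose items are either a single article number or an inclusive (low, high) pair, and the names `uncompress`, `range_length` and `range_is_empty` are hypothetical stand-ins for illustration only.

```python
from typing import List, Tuple, Union

# Assumed model of a compressed range: each item is either one article
# number or an inclusive (low, high) pair.
RangeItem = Union[int, Tuple[int, int]]


def uncompress(ranges: List[RangeItem]) -> List[int]:
    """Expand to a flat list; memory grows with the number of articles."""
    flat: List[int] = []
    for item in ranges:
        if isinstance(item, tuple):
            low, high = item
            flat.extend(range(low, high + 1))
        else:
            flat.append(item)
    return flat


def range_length(ranges: List[RangeItem]) -> int:
    """Count articles without expanding; work grows with the item count."""
    total = 0
    for item in ranges:
        if isinstance(item, tuple):
            low, high = item
            total += high - low + 1
        else:
            total += 1
    return total


def range_is_empty(ranges: List[RangeItem]) -> bool:
    """The 'is it nil?' check needs no expansion at all."""
    return not ranges


# Three items describe just over a million articles.
seen = [(1, 1_000_000), 1_000_005, (1_000_010, 1_000_020)]
print(range_length(seen))
print(range_is_empty(seen))
```

Both answers come straight from the three-item structure; calling `uncompress(seen)` first would build a list of over a million numbers only to throw it away.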