From: Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann)
Newsgroups: gmane.emacs.gnus.general
Subject: Agent downloads too many headers
Date: Tue, 22 Oct 2002 08:20:46 +0200
Organization: University of Dortmund, Germany
Message-ID: <84u1jfvvsx.fsf@crybaby.cs.uni-dortmund.de>
User-Agent: Gnus/5.090008 (Oort Gnus v0.08) Emacs/21.3.50 (i686-pc-linux-gnu)

The situation is as follows: in gnus-agent-fetch-headers, if
gnus-agent-consider-all-articles is non-nil, the list of articles is
set to the active range of the group.  The active range could be
something like (1 . 4711), so the list of articles would be
(1 2 ... 4710 4711).  Then we remove from that list the list of
already-downloaded articles.

But the user probably started reading the group when the article
numbers were already greater than 1.  So most probably there are a lot
of article numbers in the low range which do not exist in the group at
all.

Then the agent fetches the headers from that group for this list of
articles.  So the agent ends up fetching (almost) all headers for all
groups.

What can we do?  We want to avoid fetching headers for the
low-numbered articles where we already learned yesterday that these
articles don't exist.

One possibility would be to tell the agent never to fetch articles
with numbers lower than what we've already fetched.  This would be
(fairly) easy to implement, but it would lead to a problem.  Consider
people who start using the Agent, download the (unread) message 4711,
and then decide they want to download old articles, too.  For them,
the agent would never download articles with numbers lower than 4711,
because that's the lowest number fetched already.  One workaround
would be to enter the group and type `C-u J u', which would fetch even
those articles.  But I think it is not nice to require them to do
that; they might have lots of groups.

Another possibility is to keep an "unactive list (of ranges)".
This would be a list of ranges of article numbers known not to exist.
This would require storing more data in the agent.  But it would also
be precise.

I see two problems with this, a minor one and a major one.  The minor
problem is that I don't know how to store additional data in the
agent.  The major problem is that I don't know if it will be
efficient: the unactive ranges might grow quite large.  For example,
if you start reading a group starting with article 1 and the agent
fetches every tenth article, then the unactive ranges will be
((2 . 10) (12 . 20) ...), and after a few tens of thousands of
articles there will still be information about the neighbourhood of
those long-gone articles 1, 11, 21, which have presumably long since
been expired from the agent.

What do you think?  Please help!

kai
-- 
~/.signature is: umop ap!sdn       (Frank Nobis)
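
To make the two behaviours concrete, here is a minimal sketch in Python. It is not Gnus code and says nothing about how the agent stores its data today. The names `expand`, `compress`, `headers_to_request` and `record_missing` are made up for illustration, and ranges are inclusive `(low, high)` pairs standing in for the Lisp `(low . high)`. It only models the set arithmetic described above: active range minus downloaded articles, optionally minus the "unactive" ranges.

```python
from typing import Iterable, List, Set, Tuple

# Inclusive (low, high) pair; a stand-in for the Lisp range (low . high).
Range = Tuple[int, int]


def expand(ranges: Iterable[Range]) -> Set[int]:
    """Expand inclusive ranges into a set of article numbers."""
    numbers: Set[int] = set()
    for low, high in ranges:
        numbers.update(range(low, high + 1))
    return numbers


def compress(numbers: Iterable[int]) -> List[Range]:
    """Compress article numbers into sorted, inclusive ranges."""
    ranges: List[Range] = []
    for n in sorted(set(numbers)):
        if ranges and n == ranges[-1][1] + 1:
            # n extends the last range by one.
            ranges[-1] = (ranges[-1][0], n)
        else:
            ranges.append((n, n))
    return ranges


def headers_to_request(active: Range,
                       downloaded: Iterable[int],
                       unactive: Iterable[Range] = ()) -> List[int]:
    """Article numbers to ask the server about.

    With an empty `unactive` list this models the behaviour described
    in the post: everything in the active range that is not downloaded.
    """
    wanted = expand([active]) - set(downloaded) - expand(unactive)
    return sorted(wanted)


def record_missing(unactive: Iterable[Range],
                   requested: Iterable[int],
                   received: Iterable[int]) -> List[Range]:
    """Add numbers that were requested but not returned to the unactive ranges."""
    missing = set(requested) - set(received)
    return compress(expand(unactive) | missing)
```

With an active range of (1 . 4711) and only articles 4700 to 4711 downloaded, `headers_to_request((1, 4711), range(4700, 4712))` asks for 4699 numbers on every run. If the server returns nothing for them, `record_missing` collapses them into the single range `(1, 4699)`, and the next call with that unactive list asks for nothing. The efficiency worry shows up in the other example: if only every tenth article in 1..1000 exists, `record_missing` produces 100 separate ranges, starting with `(2, 10)` and `(12, 20)`.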