From: Harry Putnam
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Diffinitive archiving method sought - Big prize money for best entrant
Date: Fri, 06 Sep 2002 14:42:08 -0700

Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:

> Collect 6 months worth of articles in a directory.  Then archive the
> ones older than 3 months into your archive and remove them from the
> directory.  Then you wait another three months and again archive the
> old messages and remove them.
>
> Now comes the problem how to remove articles.  If you're careful, it
> should be possible by removing the articles themselves plus the
> overview entries plus perhaps adjusting the active file.  Another
> possibility is to figure out which function F is called from
> gnus-agent-expire to actually delete, then get a list of messages
> archived and call that function F on those messages.

I think I understand the procedure, Kai, but it sounds even more
labor-intensive than anything I had come up with.  My whole aim is to
find a lazy way to do it.  Piecing out the overview files and such
doesn't fit into my lazy man's scheme.. hehe.

Also, I may not have made clear that the archive itself is not
maintained under gnus.  It just gets fed from there.

I'm thinking a script that works on message dates will be about
right.  The odd message that comes in 3 months late in a thread won't
be important enough to try to allow for.  I considered using file
dates instead, but there are too many ways a file date might get
changed over time, given OS changes or complete revamps etc.
Even mishaps where a section, or all of it, is destroyed and rebuilt
from online archives or the like.

Thanks for the idea..  I guess I've just been too lazy to write some
perl to do this.  I've begun to get a semi outline in mind, so I guess
I'll get started on it.  Maybe I should repost this stuff to
gnu.emacs.gnus too:

Something like this should work:

1) Date regex like this:
   look for /^Date:.*(Jan|Feb|Mar)/
   Take action if blank line is seen

   Action to take:
2) (There is perl stuff to do these things, but I have to refresh
   myself on it.)
   Using pwd, establish this file's address, then mkdir -p that
   address while changing the first directory name in the path to the
   correct quarter.  Then use perl's rename to put the file at the end
   of the newly created (or existing) address.

This process would have to be carried out on each file.  In the
current case that is 500,000+, but it would be much smaller next time
around.  The current archive would not need to be concerned about
year, but that shouldn't be too tough to add to the regex either,
unless the dates I have use lots of different weird date syntax.
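
The two steps above could be sketched roughly as follows. This is only a minimal illustration, written in Python rather than the perl mentioned above. It assumes one message per file, a Date: header in the usual mail format, and that including the year in the quarter directory is wanted. All function names and the "2002-Q3" directory naming are made up for the example. It also parses the whole date instead of matching month names with a regex, and a message with a missing or odd date is simply left where it is.

```python
import re
import shutil
from email.utils import parsedate_to_datetime
from pathlib import Path

DATE_RE = re.compile(r"^Date:\s*(.+)$", re.IGNORECASE)


def header_date(path):
    """Return the datetime from the first Date: header, or None.

    Scanning stops at the first blank line, which ends the headers,
    so a "Date:" string in the message body is never picked up.
    """
    with open(path, "r", encoding="latin-1") as fh:
        for line in fh:
            if line.strip() == "":
                break
            match = DATE_RE.match(line)
            if match:
                try:
                    return parsedate_to_datetime(match.group(1).strip())
                except (TypeError, ValueError):
                    return None  # unparseable ("weird") date syntax
    return None


def quarter_name(when):
    """Hypothetical directory name for a date, e.g. 2002-Q3."""
    quarter = (when.month - 1) // 3 + 1
    return f"{when.year}-Q{quarter}"


def destination(rel_path, when):
    """Swap the first directory name in rel_path for the quarter name."""
    parts = Path(rel_path).parts
    if len(parts) < 2:
        raise ValueError("expected at least one directory above the file")
    return Path(quarter_name(when), *parts[1:])


def archive_message(root, rel_path):
    """Move one message file under root into its quarter directory.

    Returns the new path, or None if no usable Date: header was found.
    """
    src = Path(root) / rel_path
    when = header_date(src)
    if when is None:
        return None
    dest = Path(root) / destination(rel_path, when)
    dest.parent.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    shutil.move(str(src), str(dest))
    return dest
```

Running archive_message over every file from a directory walk would cover the "each file" part. With 500,000+ files it would probably be wise to try it on a copy of a small subtree first.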