Gnus development mailing list
From: Harry Putnam <reader@newsguy.com>
Subject: Re: Diffinitive archiving method sought - Big prize money for best entrant
Date: Fri, 06 Sep 2002 14:42:08 -0700	[thread overview]
Message-ID: <m3y9aelrrj.fsf@newsguy.com> (raw)
In-Reply-To: <vafit1j86dc.fsf@lucy.cs.uni-dortmund.de> (Kai.Grossjohann@CS.Uni-Dortmund.DE's message of "Fri, 06 Sep 2002 17:50:23 +0200")


Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:

> Collect 6 months worth of articles in a directory.  Then archive the
> ones older than 3 months into your archive and remove them from the
> directory.  Then you wait another three months and again archive the
> old messages and remove them.
>
> Now comes the problem how to remove articles.  If you're careful, it
> should be possible by removing the articles themselves plus the
> overview entries plus perhaps adjusting the active file.  Another
> possibility is to figure out which function F is called from
> gnus-agent-expire to actually delete, then get a list of messages
> archived and call that function F on those messages.

I think I understand the procedure, Kai, but it sounds even more
labor intensive than anything I had come up with.  My whole aim is to
find a lazy way to do it.

Piecing out the overview files and such doesn't fit into my lazy man
scheme.. hehe.

Also I may not have made clear that the archive itself is not
maintained under gnus.  It just gets fed from there.

I'm thinking a script that works on message dates will be about
right.  The odd message that comes in 3 months late in a thread won't
be important enough to try to allow for.  I considered using file
dates instead but there are too many ways a file date might get
changed over time given OS changes or complete revamps etc etc. 
Even mishaps where a section or all is destroyed and rebuilt from
online archives or the like. 

Thanks for the idea..

I guess I've just been too lazy to write some perl to do this.

I've begun to get a semi outline in mind so guess I'll get started on
it.  Maybe I should repost this stuff to gnu.emacs.gnus too:

Something like this should work:

1) date regex like this: look for  /^Date:.*(Jan|Feb|Mar)/
                        Take action if blank line is seen

Action to take
2) (There is perl stuff to do these things but I have to refresh
    myself on it).
   Using pwd, establish this file's address, then
   mkdir -p that address while changing the first directory name in
   the path to the correct quarter.

   Then use perl's rename to put the file at the end of the newly
   created (or existing) address.
   
This process would have to be carried out on each file.  In the
current case that is 500,000+ but it would be much smaller next time
around.
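
A minimal sketch of steps 1 and 2, written in Python rather than the
perl mentioned above.  It rests on several assumptions that are not in
the outline: the blank line is read as the end of the headers (so
scanning stops there), the quarter directories are named Q1..Q4, and
the names quarter_from_headers, archive_path and archive_file are made
up for illustration.

```python
import os
import re
import shutil

# Month abbreviation -> quarter directory name (Q1..Q4 are assumed names).
QUARTER_OF = {
    "Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
    "Apr": "Q2", "May": "Q2", "Jun": "Q2",
    "Jul": "Q3", "Aug": "Q3", "Sep": "Q3",
    "Oct": "Q4", "Nov": "Q4", "Dec": "Q4",
}
DATE_RE = re.compile(r"^Date:.*\b(" + "|".join(QUARTER_OF) + r")\b")


def quarter_from_headers(path):
    """Return the quarter named by the Date: header, or None if the
    headers end (blank line) before a usable Date: line is seen."""
    with open(path, "r", encoding="latin-1") as fh:
        for line in fh:
            if line.strip() == "":
                return None  # end of headers, stop before the body
            match = DATE_RE.match(line)
            if match:
                return QUARTER_OF[match.group(1)]
    return None


def archive_path(rel_path, quarter):
    """Swap the first directory name of a relative path for the quarter."""
    parts = rel_path.split(os.sep)
    if len(parts) < 2:
        return os.path.join(quarter, rel_path)  # no directory to swap
    parts[0] = quarter
    return os.path.join(*parts)


def archive_file(src_root, rel_path, dest_root):
    """Move one message into its quarter tree; return the new path,
    or None if no date was found and the file was left alone."""
    source = os.path.join(src_root, rel_path)
    quarter = quarter_from_headers(source)
    if quarter is None:
        return None
    target = os.path.join(dest_root, archive_path(rel_path, quarter))
    os.makedirs(os.path.dirname(target), exist_ok=True)  # like mkdir -p
    shutil.move(source, target)
    return target
```

Running this over the whole tree would mean walking it (os.walk, say)
and calling archive_file once per message.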

The current archive would not need to be concerned about year, but that
shouldn't be too tough to add to the regex either, unless the dates I
have come in lots of different weird date syntaxes.
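
One way to pick up the year without growing the regex, sketched in
Python: hand the Date: value to the standard library's mail date
parser, which already copes with a fair range of date syntaxes.  The
function name year_and_quarter is made up for illustration, and values
the parser cannot read are simply reported as None here so they could
be set aside for a look by hand.

```python
from email.utils import parsedate_to_datetime


def year_and_quarter(date_value):
    """Return (year, quarter number 1..4) for a Date: header value,
    or None if the value cannot be parsed."""
    try:
        parsed = parsedate_to_datetime(date_value)
    except (TypeError, ValueError):
        # Older Pythons raise TypeError, newer ones ValueError.
        return None
    return parsed.year, (parsed.month - 1) // 3 + 1


print(year_and_quarter("Fri, 06 Sep 2002 14:42:08 -0700"))  # (2002, 3)
```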


