From: Harry Putnam <reader@newsguy.com>
Subject: Re: Major splitting problem ... Advice please
Date: Fri, 12 Oct 2001 21:28:43 -0700
Message-ID: <m14rp488pq.fsf@reader.newsguy.com>
In-Reply-To: <878zegyfip.fsf@raven.i.defaultvalue.org> (Rob Browning's message of "Fri, 12 Oct 2001 11:44:30 -0500")
Rob Browning <rlb@defaultvalue.org> writes:
>> However, there are enough different date styles to make that kind of
>> split pretty hard to program. Also the problem of some messages
>> that came late to a thread, landing in a different group arises.
>> Keeping all thread members in one group may not even be possible,
>> except by hand. I'm not sure.
>
> Hmm. I had just been planning to use gnus date functions. I hadn't
> considered that those might not be sufficient.
My comments may have been a little misleading. They were directed at
the idea of splitting messages by date with tools such as awk and
perl. What I was getting at is that it is hard to write regular
expressions that match every possible date format, such as these
(taken from a sample of headers on comp.unix.solaris):
Date: 24 Sep 2001 09:07:45 GMT
Date: Mon, 8 Oct 2001 15:30:18 +0100
Date: 8 Oct 2001 14:30:26 GMT
Date: 08 Oct 2001 16:42:08 +0200
Date: Sun, 7 Oct 2001 20:02:06 +0200
Date: Sun, 07 Oct 2001 17:45:17 GMT
There are even odder formats to be found. It is probably not
impossible to write a regexp that works for them all, just a pain.
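
For what it's worth, here is a minimal sketch of one way around the
regexp problem: hand the header to a date parser instead of matching it.
This uses Python rather than awk or perl, which is my substitution and
not something anyone in the thread proposed. `month_bucket' is a made-up
helper name; `parsedate_to_datetime' is from the standard library. The
bucket is taken from the date as written in the header, with no
conversion to UTC.

```python
from email.utils import parsedate_to_datetime

# The sample headers quoted above, copied verbatim.
SAMPLE_HEADERS = [
    "Date: 24 Sep 2001 09:07:45 GMT",
    "Date: Mon, 8 Oct 2001 15:30:18 +0100",
    "Date: 8 Oct 2001 14:30:26 GMT",
    "Date: 08 Oct 2001 16:42:08 +0200",
    "Date: Sun, 7 Oct 2001 20:02:06 +0200",
    "Date: Sun, 07 Oct 2001 17:45:17 GMT",
]


def month_bucket(header_line):
    """Return 'YYYY-MM' for a 'Date: ...' line, or None if it cannot be parsed.

    Hypothetical helper: the name and the 'YYYY-MM' bucket format are
    assumptions made for this sketch.
    """
    # Drop the 'Date:' field name and keep only the value.
    _, _, value = header_line.partition(":")
    try:
        parsed = parsedate_to_datetime(value.strip())
    except (TypeError, ValueError):
        # Older Pythons raise TypeError, newer ones ValueError, on bad input.
        return None
    return parsed.strftime("%Y-%m")


buckets = [month_bucket(line) for line in SAMPLE_HEADERS]

if __name__ == "__main__":
    for line, bucket in zip(SAMPLE_HEADERS, buckets):
        print(bucket, "<-", line)
```

The odder formats mentioned above may still fail to parse, in which case
the helper returns None and those messages would need handling by hand.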
If you plan to use the date functions that do limiting, such as
`/ t' and `C-u / t', it may not be a problem. I wanted to do the
splitting outside gnus because it is such a large archive
(approx. 250,000 messages, from about a dozen groups).
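
A rough sketch of what splitting outside gnus might look like, assuming
the archive is available as an mbox file (that is an assumption; the
storage format isn't stated here) and using Python's standard `mailbox'
module rather than awk or perl. The names `month_key' and
`split_mbox_by_month', and the one-file-per-month layout, are
hypothetical. Messages with a missing or unparseable Date header go to
an "undated" file. It does nothing to keep late thread members together
with the rest of their thread.

```python
import mailbox
import os
from email.utils import parsedate_to_datetime


def month_key(msg):
    """Return 'YYYY-MM' from the message's Date header, else 'undated'."""
    try:
        return parsedate_to_datetime(msg["Date"]).strftime("%Y-%m")
    except (TypeError, ValueError):
        # Missing or unparseable Date header.
        return "undated"


def split_mbox_by_month(source_path, dest_dir):
    """Copy each message of an mbox into dest_dir/<YYYY-MM>.mbox.

    Returns a dict mapping each key to the number of messages written.
    The source mbox is only read, never modified.
    """
    os.makedirs(dest_dir, exist_ok=True)
    outputs = {}  # key -> open destination mbox
    counts = {}   # key -> number of messages copied
    source = mailbox.mbox(source_path, create=False)
    try:
        # Messages are read one at a time, so the whole archive
        # is never held in memory at once.
        for msg in source:
            key = month_key(msg)
            if key not in outputs:
                path = os.path.join(dest_dir, key + ".mbox")
                outputs[key] = mailbox.mbox(path)
            outputs[key].add(msg)
            counts[key] = counts.get(key, 0) + 1
    finally:
        for box in outputs.values():
            box.close()  # close() also flushes pending writes
        source.close()
    return counts
```

Usage would be something like
`split_mbox_by_month("archive.mbox", "by-month")', with both paths made
up for the example. One destination file stays open per month seen,
which should be fine for a few years of mail.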
I haven't tried this, but I suspect one could do it by first setting
up nnmail split methods that split by date into mnemonically named
groups, then entering the monster groups and splitting them with
`M P a <RET> B r <RET>', adjusting the split method rules for each
group. But here again, I would expect some extensive experimentation
to get the date regexp right. And I think it would be very time
intensive to do that inside gnus, assuming the groups are above
25,000 messages or so.