From: Michael Shields <shields@msrl.com>
Cc: Jake Colman <colman@ppllc.com>, ding@gnus.org
Subject: Re: Training for ham and training for spam
Date: Thu, 30 Oct 2003 14:39:10 +0000
Message-ID: <87y8v2hjsh.fsf@mulligatwani.msrl.com>
In-Reply-To: <m38yn4t90x.fsf@biostaff03.nuigalway.ie> (Ian Dobbie's message of "Wed, 29 Oct 2003 14:26:22 +0000")
In message <m38yn4t90x.fsf@biostaff03.nuigalway.ie>,
Ian Dobbie <ian.dobbie@nuigalway.ie> wrote:
> I would recommend against having large volume mailing lists in the ham
> filter all the time. Maybe train on a few hundred messages and then
> dont bother. However I am using an old, slow machine so CPU is a major
> factor.
If learning is slow, then maybe you should do it asynchronously: use
the settings that copy ham and spam into folders for training, and
then have a cron job learn from those folders in the background. I use:

    (setq gnus-spam-process-newsgroups
          '(("^INBOX" (gnus-group-ham-exit-processor-copy))))
    (setq gnus-spam-process-destinations
          '(("^INBOX" "INBOX.SA-spam")))
    (setq gnus-ham-process-destinations
          '(("^INBOX" "INBOX.SA-ham")))
    (setq spam-move-spam-nonspam-groups-only nil)

The major motivation for that feature was to have filtering done on a
different machine, the IMAP server. But it also means that I don't
have to wait for either learning or filtering, since they happen in
the background from cron and procmail respectively.
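
For illustration only, the learning half of that setup might look
something like the sketch below. It assumes the filter is SpamAssassin
and that the two training folders are readable as directories on the
machine running cron; the paths and the names TRAINING_DIRS,
learn_command and learn_all are hypothetical. The only external command
used is sa-learn with its --spam and --ham options.

```python
import subprocess
from pathlib import Path

# Hypothetical paths: wherever the copied training messages end up on disk.
TRAINING_DIRS = {
    "spam": Path("/path/to/SA-spam"),
    "ham": Path("/path/to/SA-ham"),
}


def learn_command(kind, directory):
    """Build the sa-learn argument list for one training directory."""
    if kind not in ("spam", "ham"):
        raise ValueError(f"kind must be 'spam' or 'ham', got {kind!r}")
    return ["sa-learn", f"--{kind}", str(directory)]


def learn_all(dirs=TRAINING_DIRS):
    """Run sa-learn once per training directory; raise if a run fails."""
    for kind, directory in dirs.items():
        subprocess.run(learn_command(kind, directory), check=True)
```

A script that ends with a call to learn_all() could then be run from
cron at whatever interval suits the machine, so Gnus never has to wait
for the learner.
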
Another idea would be a knob that lets you train on only every n-th
message. Training would be roughly n times faster, and since Bayesian
filters are statistical, they would still work OK. You would set this
knob only after building up a database of a few hundred messages.
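
To make the idea concrete, here is a rough sketch of how such a knob
might choose messages. Nothing like this exists in spam.el as far as
this message says; the function name, its parameters, and the default
warm-up of 300 messages are all made up for illustration.

```python
def select_for_training(messages, n, trained_so_far=0, warmup=300):
    """Pick the messages to train on from one batch.

    Every message is kept until `warmup` messages have been trained in
    total; after that only every n-th message of the batch is kept.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    selected = []
    for index, message in enumerate(messages):
        still_warming_up = trained_so_far + len(selected) < warmup
        if still_warming_up or index % n == 0:
            selected.append(message)
    return selected
```

With n set to 5 and the warm-up already done, a batch of ten messages
would be thinned to two, so the learner does about a fifth of the work.
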
--
Shields.