Gnus development mailing list
From: Michael Shields <shields@msrl.com>
Cc: Jake Colman <colman@ppllc.com>, ding@gnus.org
Subject: Re: Training for ham and training for spam
Date: Thu, 30 Oct 2003 14:39:10 +0000
Message-ID: <87y8v2hjsh.fsf@mulligatwani.msrl.com>
In-Reply-To: <m38yn4t90x.fsf@biostaff03.nuigalway.ie> (Ian Dobbie's message of "Wed, 29 Oct 2003 14:26:22 +0000")

In message <m38yn4t90x.fsf@biostaff03.nuigalway.ie>,
Ian Dobbie <ian.dobbie@nuigalway.ie> wrote:
> I would recommend against having large volume mailing lists in the ham
> filter all the time. Maybe train on a few hundred messages and then
> don't bother. However, I am using an old, slow machine so CPU is a major
> factor.

If learning is slow, then maybe you should do it asynchronously: use
the settings that copy ham and spam into folders for training, and
then have a cron job learn from those in the background.  I use:

    (setq gnus-spam-process-newsgroups
          '(("^INBOX" (gnus-group-ham-exit-processor-copy))))
    (setq gnus-spam-process-destinations
          '(("^INBOX" "INBOX.SA-spam")))
    (setq gnus-ham-process-destinations
          '(("^INBOX" "INBOX.SA-ham")))
    (setq spam-move-spam-nonspam-groups-only nil)

The major motivation for that feature was to have filtering done on a
different machine, the IMAP server.  But it also means that I don't
have to wait for either learning or filtering, since they happen in
the background from cron and procmail respectively.

Another idea would be to have a knob that allowed you to train on only
every n-th message; it would make processing roughly n times faster,
and since Bayesian filters are statistical, they would still work OK.
You would only set this knob after building up a database of a few
hundred messages.
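
To make the idea concrete, here is a minimal sketch of the sampling
step in Emacs Lisp.  Both names, my-spam-train-every-nth and
my-spam-sample-articles, are hypothetical; nothing like them exists in
spam.el today.  The sketch only thins a list of articles and assumes
the caller passes the result to whatever training routine it already
uses.

```elisp
;; Hypothetical sketch: neither name below exists in spam.el.
(defvar my-spam-train-every-nth 1
  "Train on only every Nth article; 1 means train on all of them.")

(defun my-spam-sample-articles (articles)
  "Return every `my-spam-train-every-nth'-th element of ARTICLES.
Values below 1 are treated as 1, so nothing is skipped by accident."
  (let ((n (max 1 my-spam-train-every-nth))
        (i 0)
        (kept '()))
    ;; Walk the list once, keeping the Nth, 2Nth, 3Nth ... articles.
    (dolist (article articles (nreverse kept))
      (setq i (1+ i))
      (when (zerop (% i n))
        (push article kept)))))
```

With the variable set to 3, a list of seven articles would be thinned
to the third and sixth, so the learner sees about a third of the
traffic.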
-- 
Shields.





Thread overview: 5+ messages
2003-10-29 19:39 Jake Colman
2003-10-29 19:50 ` Ted Zlatanov
2003-10-29 19:54 ` Michael Shields
2003-10-29 14:26   ` Ian Dobbie
2003-10-30 14:39     ` Michael Shields [this message]
