Gnus development mailing list
From: Hubert Chan <hubert@uhoreg.ca>
Subject: Re: spam.el: generic bayes interface?
Date: Tue, 20 Jan 2004 23:02:15 -0500
Message-ID: <87d69e54qg.fsf@uhoreg.ca>
In-Reply-To: <4nptdei2oh.fsf@collins.bwh.harvard.edu> (Ted Zlatanov's message of "Tue, 20 Jan 2004 19:08:14 -0500")

>>>>> "Ted" == Ted Zlatanov <tzz@lifelogs.com> writes:

[...]

Ted> Yes, spam-use-regex-headers will do the right thing for splitting
Ted> incoming mail, but there's no SA specific backend.  Hubert Chan
Ted> wrote a SA backend, and I have been late in replying to his
Ted> questions.  It's coming, though.

I see it in CVS now. ;-)  I promised to write documentation too, but
that won't happen until next week at the earliest.  In the meantime,
though, the variable documentation should probably suffice for most
people.

[...]

Ted> The problem is that then you force people into just one Bayesian
Ted> approach (how would SA and bogofilter work together?), and I'm not
Ted> sure it's a good idea.  Granted, most people use just one Bayesian
Ted> filter, so it's probably nice to switch filters with just one
Ted> thing.

Well, there are at least some good reasons that someone might want to
use multiple Bayesian filters.  For example, one might want to just try
out the effectiveness of one filter, while retaining their original
filter as a backup.  Also, if one wishes to switch Bayesian filters,
and does not have a corpus of spam/ham to train the filter, there would
have to be a transition time during which the new filter is trained,
while the old filter is still being used for splitting.  And, of
course, during this time, one would still want to keep training the old
filter at the same time.

This got me thinking, though, Ted, that the registration code for the
spam/ham processors is pretty similar.  They seem to mostly work in one
of two ways -- either register one article at a time, or register
multiple articles at a time in an mbox-style format.  I think they all
feed the articles via standard input.  I would imagine that we would be
able to share a lot of common code.  Maybe write a function that feeds
the article(s) to the registration program, and pass the name of the
program and its arguments as arguments to that function.  Then the
registration functions just have to call that function with the
appropriate arguments.  Hmm.  I'll have to look at the code to see if
that would actually work...
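
To make the idea concrete, here is a rough sketch of such a shared
helper.  It is written in Python purely for illustration (spam.el
itself is Emacs Lisp), and the names feed_articles and
register_spam_bogofilter are hypothetical, not anything that exists in
spam.el.  It assumes each article is already a complete message string,
and that in batch mode the caller has formatted the articles so that
simple concatenation yields valid mbox-style input.

```python
import subprocess
from typing import Iterable, Sequence


def feed_articles(program: str, args: Sequence[str],
                  articles: Iterable[str],
                  one_at_a_time: bool = True) -> int:
    """Pipe articles to `program` on standard input.

    Returns the number of times the program was run.
    """
    articles = list(articles)
    if not articles:
        return 0  # nothing to register, so do not start the program
    # One run per article, or a single run over the concatenated batch.
    batches = articles if one_at_a_time else ["".join(articles)]
    for text in batches:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run([program, *args], input=text, text=True, check=True)
    return len(batches)


def register_spam_bogofilter(articles: Iterable[str]) -> int:
    """Hypothetical per-backend wrapper: only the program name and its
    arguments differ from one backend to the next."""
    # "-s" is bogofilter's switch for registering its input as spam.
    return feed_articles("bogofilter", ["-s"], articles)
```

With something like this, each registration function shrinks to a
single call, and the one-at-a-time versus batch distinction becomes one
argument instead of separate code paths.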

-- 
Hubert Chan <hubert@uhoreg.ca> - http://www.uhoreg.ca/
PGP/GnuPG key: 1024D/124B61FA
Fingerprint: 96C5 012F 5F74 A5F7 1FF7  5291 AF29 C719 124B 61FA
Key available at wwwkeys.pgp.net.   Encrypted e-mail preferred.





Thread overview: 6+ messages
2004-01-20 21:17 Reiner Steib
2004-01-21  0:08 ` Ted Zlatanov
2004-01-21  4:02   ` Hubert Chan [this message]
2004-01-21 18:47     ` Ted Zlatanov
2004-01-21 20:24       ` Hubert Chan
2004-01-22 18:23         ` Ted Zlatanov
