From: Ted Zlatanov
Newsgroups: gmane.emacs.gnus.general
Subject: Re: spam.el: generic bayes interface?
Date: Tue, 20 Jan 2004 19:08:14 -0500
Organization: Теодор Златанов @ Cienfuegos
Message-ID: <4nptdei2oh.fsf@collins.bwh.harvard.edu>
To: ding@gnus.org
Cc: Hubert Chan
In-Reply-To: (Reiner Steib's message of "Tue, 20 Jan 2004 22:17:06 +0100")

On Tue, 20 Jan 2004, 4.uce.03.r.s@nurfuerspam.de wrote:

> in the German Gnus group someone asked how to use the
> SpamAssassin/Bayes (see sa-learn(1)) thingie with Gnus.  I happily
> pointed him to `spam.el' and the fine manual.  But it turned out
> that there is no interface for SpamAssassin/Bayes in `spam.el' (or
> at least I couldn't locate it).

Yes, spam-use-regex-headers will do the right thing for splitting
incoming mail, but there's no SA-specific backend.  Hubert Chan wrote
an SA backend, and I have been late in replying to his questions.
It's coming, though.

> I assume that SpamAssassin/Bayes works very similar to bogofilter
> [1], so it probably works by abusing the `spam-bogofilter-*' [2]
> variables.  But this is a quite dubious approach, IMHO.  Wouldn't it
> make sense to add a generic bayes interface with say
> `spam-bayes-...' variables (similar to the `browse-url-generic*'
> variables) instead of adding a set of variables for each (new)
> Bayesian filter?

The problem is that then you force people into just one Bayesian
approach (how would SA and bogofilter work together?), and I'm not
sure it's a good idea.

Granted, most people use just one Bayesian filter, so it's probably
nice to switch filters with just one thing.  But consider that the
registry must track which Bayesian backend has registered which
message.
Let's say the registry knows that spam-use-bayesian has registered
message A, and that was Bogofilter at the time, but the user switches
to SA later.  Now the registry doesn't know that SA has not registered
message A, and spam.el will not re-register message A.  It's just an
example, but things will be slightly harder to track in general.

Also, I can't drop the current Bayesian spam-use-* backends that users
are using.  So now we will have the general case of spam-use-bayesian
plus the specific backends.  Seems pretty confusing.

I would prefer to make adding new Bayesian backends easy, but give
them separate spam-use-BACKEND symbols.  Hubert's work will be helpful
here, because I've been too lazy/busy to write a good example :)

Ted
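
A rough sketch of the registry concern above, written as a toy Python
model rather than anything taken from spam.el.  The Registry class and
its methods are hypothetical, and spam-use-spamassassin is only a
stand-in for a spam-use-BACKEND symbol; this illustrates the
bookkeeping argument under those assumptions and is not how the
registry is actually implemented.

```python
# Toy model of the bookkeeping problem; this is NOT spam.el code.
# Registry, register and is_registered are hypothetical names.


class Registry:
    """Remembers which symbol has already registered which message."""

    def __init__(self):
        # message id -> set of symbols that registered that message
        self._seen = {}

    def register(self, message_id, symbol):
        self._seen.setdefault(message_id, set()).add(symbol)

    def is_registered(self, message_id, symbol):
        return symbol in self._seen.get(message_id, set())


# Case 1: one generic symbol shared by every Bayesian filter.
generic = Registry()
generic.register("A", "spam-use-bayesian")  # Bogofilter was active then
# The user switches to SA.  The lookup key is unchanged, so the
# registry still reports message A as done and SA never sees it.
skipped_after_switch = generic.is_registered("A", "spam-use-bayesian")

# Case 2: one symbol per backend.
specific = Registry()
specific.register("A", "spam-use-bogofilter")
# After the switch the lookup uses the new backend's own symbol
# (placeholder name), so message A shows up as not yet registered.
needs_registration = not specific.is_registered("A", "spam-use-spamassassin")

print(skipped_after_switch)  # True: message A is wrongly skipped
print(needs_registration)    # True: message A is picked up again
```

With a single shared symbol, the registry cannot tell which filter did
the work; with one symbol per backend, the same lookup answers the
question directly.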