From: Ted Zlatanov
Newsgroups: gmane.emacs.gnus.general
Subject: Re: spam.el: generic bayes interface?
Date: Wed, 21 Jan 2004 13:47:00 -0500
Message-ID: <4n7jzldtqz.fsf@collins.bwh.harvard.edu>
References: <4nptdei2oh.fsf@collins.bwh.harvard.edu> <87d69e54qg.fsf@uhoreg.ca>
In-Reply-To: <87d69e54qg.fsf@uhoreg.ca> (Hubert Chan's message of "Tue, 20 Jan 2004 23:02:15 -0500")
To: ding@gnus.org
Cc: Hubert Chan

On Tue, 20 Jan 2004, hubert@uhoreg.ca wrote:

> Well, there are at least some good reasons that someone might want
> to use multiple Bayesian filters. For example, one might want to
> just try out the effectiveness of one filter, while retaining their
> original filter as a backup. Also, if one wishes to switch Bayesian
> filters, and does not have a corpus of spam/ham to train the filter,
> there would have to be a transition time during which the new filter
> is trained, while the old filter is still being used for splitting.
> And, of course, during this time, one would still want to keep
> training the old filter at the same time.

I'm OK with that, we can add a spam-use-generic-bayesian if it's
necessary. I just think customization, registry tracking, and other
things won't work so well when we generalize the interface too much.
If you or someone else produces that generic-bayesian backend, I
don't see a problem with putting it in. We can't anticipate the new
bayesian filters people might want to use, after all.

> This got me thinking, though, Ted, that the registration code for
> the spam/ham processors is pretty similar. They seem to mostly work
> in one of two ways -- either register one at a time, or register
> multiple articles at a time in a mbox-style format.

Yes, I've noticed that too after the 3rd time I wrote that code :)

> I think they all feed the articles via standard input. I would
> imagine that we would be able to share a lot of common code. Maybe
> write a function that feeds the article(s) to the registration
> program, and pass the name of the program and its arguments as
> arguments to that function. Then the registration functions just
> have to call that function with the appropriate arguments. Hmm.
> I'll have to look at the code to see if that would actually work...

It could work. I've been trying to make the functions generic on the
API side, now it's time to make them generic on the backend side as
well. I'm afraid it will make the code more complex, but adding new
backends should be significantly easier.

I'll work on gnus-encrypt.el first though, so feel free to start on
this if you have the interest :)

Thanks
Ted
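
For illustration only, here is a rough sketch of the shared helper Hubert describes: one function that pipes articles to a registration program on standard input, taking the program name and its arguments as parameters, and covering both the one-at-a-time and the mbox-style batch cases. It is written in Python rather than Emacs Lisp and is not code from spam.el. The names `feed_articles`, `register_demo`, and `MBOX_FROM_LINE` are hypothetical, and the Python interpreter stands in for a real filter program so the example runs anywhere.

```python
import subprocess
import sys
from typing import List, Sequence

# Hypothetical separator line (an assumption for this sketch); real mbox
# variants differ in their "From " line contents and quoting rules.
MBOX_FROM_LINE = "From nobody Thu Jan  1 00:00:00 1970\n"


def feed_articles(program: str, args: Sequence[str],
                  articles: Sequence[str], batch: bool = False) -> List[int]:
    """Pipe articles to `program` on stdin and return the exit codes.

    batch=False starts one process per article.
    batch=True starts a single process fed one mbox-style stream.
    """
    if batch:
        # Join all articles into one stream, each preceded by a "From " line.
        stream = "".join(
            MBOX_FROM_LINE + article.rstrip("\n") + "\n\n"
            for article in articles
        )
        payloads = [stream]
    else:
        payloads = list(articles)

    exit_codes = []
    for payload in payloads:
        result = subprocess.run([program, *args], input=payload, text=True)
        exit_codes.append(result.returncode)
    return exit_codes


def register_demo(articles: Sequence[str], batch: bool = False) -> List[int]:
    """Stand-in 'backend': exits 0 if it received any input, 1 otherwise."""
    script = "import sys; sys.exit(0 if sys.stdin.read() else 1)"
    return feed_articles(sys.executable, ["-c", script], articles, batch=batch)
```

With a helper of this shape, each backend's registration function reduces to a single call that supplies its own program name, arguments, and batching mode, which is the code sharing the quoted paragraph suggests.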