From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id NAA10709; Mon, 21 Oct 2002 13:58:38 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id NAA10574 for ; Mon, 21 Oct 2002 13:58:37 +0200 (MET DST) Received: from lri.lri.fr (lri.lri.fr [129.175.15.1]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id g9LBwa508151 for ; Mon, 21 Oct 2002 13:58:36 +0200 (MET DST) Received: from serveur-mail.lri.fr (serveur-mail [129.175.8.90]) by lri.lri.fr (8.11.6/jtpda-5.3.2) with ESMTP id g9LBpaR07024 ; Mon, 21 Oct 2002 13:51:36 +0200 (MEST) Received: from serveur-demons (mail@serveur-demons [129.175.8.130]) by serveur-mail.lri.fr (8.11.6/jtpda-5.3.2) with ESMTP id g9LBpZx29415 ; Mon, 21 Oct 2002 13:51:36 +0200 (MEST) Received: from marche by serveur-demons with local (Exim 3.35 #1 (Debian)) id 183b5r-0003ot-00; Mon, 21 Oct 2002 13:51:35 +0200 From: Claude Marche MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Message-ID: <15795.59975.509171.829475@mailhost.lri.fr> Date: Mon, 21 Oct 2002 13:51:35 +0200 To: jmarant@nerim.net (=?iso-8859-1?q?J=E9r=F4me?= Marant) Cc: caml-list@inria.fr Subject: Re: [Caml-list] Announcement: SpamOracle In-Reply-To: <87vg3wvn8b.fsf@marant.org> References: <20020826151138.A32572@pauillac.inria.fr> <20021020104354.GA11059@iliana> <20021020204946.GP31578@cs.unibo.it> <87vg3wvn8b.fsf@marant.org> X-Mailer: VM 6.93 under Emacs 20.7.2 X-MailScanner: Found to be clean Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk >>>>> "Jérôme" == Jérôme Marant writes: Jérôme> Stefano Zacchiroli writes: >> On Sun, Oct 20, 2002 at 12:43:54PM +0200, Sven Luther wrote: >>> That said, what i really wanted to know, is if you have some idea of how >>> spamoracle would scale in case of heavy load, if you use it to filter >>> mailing lists input for example ? For example, do you use it to filter >>> the ocaml mailing lists or something such ? Or do you think it would be >>> possible to filter the debian mailing lists and not have the mailserver >>> overload or something such ? >> >> BTW, have you performed any comparison with spamassassin? Jérôme> Hi, Jérôme> I've already tried spamoracle: I fed it with about 2000 spams and Jérôme> 3000 good mails and it too often considered good mail as spam. Hi, I use Spamoracle almost since it has been announced. Before, I was using SpamAssassin. Currently, my Spamoracle database contains roughly 20000 good mails and 1000 spams (not including asiatic language spams which are filtered differently). Now, I usually get 0 or 1 spam per day not filtered, usually because there are written in french and my database is not large enough for those. I check my spamoracle folder some time to time, I had almost no good mail classified as spam, and if I get one, I immediately move the mail in a `good' folder and rebuild the database. I suggest you should check to way you built your database, may be you made some mistakes. With respect to SpamAssassin, SpamOracle runs much faster, this would not surprise anyone here since SpamAssassin is a perl script. Moreover, I had problems with SpamAssassin because I receive my mails on several machines, not running the very same version of perl, that sometime leads to runtime error in execution of SpamAssassin. Finally, one should be aware that the filtering methods of SpamAssassin and SpamOracle are very different, and I like very much the idea, in SpamOracle, that the filter should be tuned by the user personal idea of what is a spam. I recommend reading Paul Graham's paper (http://www.paulgraham.com/spam.html) on which SpamOracle filter method is based. I wish you a happy spam filtering ! - Claude -- | Claude Marché | mailto:Claude.Marche@lri.fr | | LRI - Bât. 490 | http://www.lri.fr/~marche/ | | Université de Paris-Sud | phoneto: +33 1 69 15 64 85 | | F-91405 ORSAY Cedex | faxto: +33 1 69 15 65 86 | ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners