To: xavier.leroy@inria.fr
Subject: [Caml-list] Quick spamoracle questions
Cc: caml-list@inria.fr
Reply-to: bcpierce@cis.upenn.edu
Date: Fri, 15 Nov 2002 12:23:13 EST
Message-ID: <9246.1037380993@saul.cis.upenn.edu>
From: "Benjamin C. Pierce"

Quick question about spamoracle: Is it important to try to keep the
number of sample spams and sample non-spams roughly the same when
constructing the database? (E.g., if I present it with 100 times as many
non-spam examples as spam examples, will that tend to make it very
conservative, or even too conservative?)

And another (or maybe it's the same one): When it correctly judges a
message to be spam, is it a good idea to add this message to the database
as a new spam example, to reinforce what happened?

Thanks for a great tool, Xavier!

    Benjamin