From: Ted Zlatanov
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Paul Graham on fighting SPAM
Date: Mon, 19 Aug 2002 07:29:44 -0400
Organization: Теодор Златанов @ Cienfuegos
To: Alex Schroeder
Cc: ding@gnus.org
References: <87d6sf42ys.fsf@emacswiki.org>
In-Reply-To: <87d6sf42ys.fsf@emacswiki.org> (Alex Schroeder's message of "Mon, 19 Aug 2002 11:23:07 +0200")
User-Agent: Gnus/5.090008 (Oort Gnus v0.08) Emacs/21.2 (i386-redhat-linux-gnu)

On Mon, 19 Aug 2002, alex@emacswiki.org wrote:

> I posted some code to g.e.sources to implement the basics.  If
> anybody feels like fooling around with it, I'd be happy to read
> about it.  There's also a comment by Kai on it in g.e.help.
>
> * http://www.emacswiki.org/cgi-bin/wiki.pl?SpamStat
>
> Things I'd like to see: Efficient storage and retrieval of the data
> from disk.  Based on 3351 mails, 298 of them being spam, I got a
> dictionary of 650k; preparing it used an intermediary file of 7m.
> Once saving is fast, I'd like to update the stats as we go along to
> avoid the long preparation times.  Updating the stats requires the
> original 7m of data, however.  So before delving into all of this,
> I'd prefer to see whether it works, see what other people think,
> collect some ideas and patches...

Do you want to integrate it with the current spam.el contents?  You
just need to add a function that uses spam-stats.el and can be invoked
on a message buffer, returning t or a number if spam is detected and
nil otherwise.  See the spam-split function: it already invokes the
blackholes and whitelist/blacklist checks, and would invoke your
function as well.

I still have to write the code to make those checks user-selectable
via some symbols, but that's a separate thing.

-- 
Teodor Zlatanov
"Brevis oratio penetrat colos, longa potatio evacuat ciphos." -Rabelais
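
To make the contract described above concrete (a check that runs on a
message buffer and returns t or a number for spam, nil otherwise), here
is a minimal Emacs Lisp sketch.  It is an illustration only: every name
starting with `my-' is hypothetical, it does not call the actual
spam-stat code or reproduce spam-split, and the scoring is a simplified
form of the combination from Paul Graham's essay (it uses every known
word in the buffer rather than only the most interesting ones).

```elisp
;;; Sketch only: all `my-' names are hypothetical and are not part of
;;; spam.el or spam-stat.el.

(defvar my-spam-word-probabilities (make-hash-table :test 'equal)
  "Hypothetical table mapping a lowercase word to its spam probability.")

(defvar my-spam-threshold 0.9
  "Hypothetical cutoff: scores above this count as spam.")

(defvar my-spam-checks '(my-spam-check-stat)
  "Hypothetical list of check functions, tried in order.")

(defun my-spam-buffer-score ()
  "Combine per-word spam probabilities for the current buffer.
Returns P / (P + Q), where P is the product of the probabilities of
the known words and Q the product of their complements.  Each word
counts once; words missing from the table are ignored."
  (let ((p 1.0)
        (q 1.0)
        (seen (make-hash-table :test 'equal)))
    (save-excursion
      (goto-char (point-min))
      (while (re-search-forward "[[:alpha:]]+" nil t)
        (let* ((word (downcase (match-string 0)))
               (prob (gethash word my-spam-word-probabilities)))
          (when (and prob (not (gethash word seen)))
            (puthash word t seen)
            (setq p (* p prob)
                  q (* q (- 1.0 prob)))))))
    (/ p (+ p q))))

(defun my-spam-check-stat ()
  "Return the score if the current buffer looks like spam, else nil."
  (let ((score (my-spam-buffer-score)))
    (and (> score my-spam-threshold) score)))

(defun my-spam-run-checks ()
  "Call each function in `my-spam-checks' on the current buffer.
Return the first non-nil result, or nil if no check fires."
  (run-hook-with-args-until-success 'my-spam-checks))
```

With this shape, a caller only has to look at whether the result is
nil; further checks (for example a blacklist lookup) could be added to
the hypothetical `my-spam-checks' list without touching the caller.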