From: Ted Zlatanov
Newsgroups: gmane.emacs.gnus.general
Subject: Re: spam-stat regeneration notes
Date: Sun, 15 Jun 2003 14:38:48 -0400
Message-ID: <4nvfv7jidj.fsf@lockgroove.bwh.harvard.edu>
To: Bill White
Cc: ding
In-Reply-To: (Bill White's message of "Fri, 13 Jun 2003 19:43:06 -0500")

On Fri, 13 Jun 2003, billw@wolfram.com wrote:

> I've been using spam-stat for about 6 months now, and noticed lately
> that spam processing was getting mighty slow - thanks (I suspect)
> to the hashbusters spammers are putting in their messages.  I even
> do (spam-stat-reduce-size) when quitting gnus each day, so the thing
> was as small as possible.

Maybe we should add code that filters out terms seen only once from
the spam/ham database.  Doing that once a week should work fine.

Also, spam-stat.el creates hashtables without any optimizations:

    (make-hash-table :test 'equal)

Maybe that could be improved.  There may be many such optimization
points.  I don't use spam-stat personally, but maybe you can
instrument it and see where the slowdowns are.

> So today I did my first rebuild of the spam-stat database.  That's
> not bad in my book - 6 months for one constantly-growing database.
> Here's the code, which I should probably put in a function
> "spam-reset" or something.

Please do!

> - Is there an easy way to run a function over an entire directory
>   tree, while specifying which dirs to include or avoid?

Maybe find-dired will help?  I don't know of anything like that
built in...

Ted
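
P.S. To make the "terms seen only once" idea concrete, here is a rough
sketch.  It is not spam-stat code: it assumes a plain hash table that
maps each term to a cons (GOOD . BAD) of occurrence counts, which may
not match how spam-stat.el really stores its entries.  The names
my-prune-rare-terms and my-term-db are made up for illustration, and
the :size value is only a guess at a sensible starting hint.

```elisp
;;; Sketch only: drop rarely seen terms from a term-count table.
;;; ASSUMPTION: TABLE maps a term (string) to a cons (GOOD . BAD) of
;;; counts.  spam-stat's real entry layout may differ.

(defun my-prune-rare-terms (table &optional min-count)
  "Remove from TABLE every term seen fewer than MIN-COUNT times.
TABLE maps terms to (GOOD . BAD) count pairs.  MIN-COUNT defaults
to 2, which drops terms seen only once.  Return the number of
entries removed."
  (let ((min-count (or min-count 2))
        (doomed '()))
    ;; First pass: collect the terms whose total count is too low.
    (maphash (lambda (term counts)
               (when (< (+ (car counts) (cdr counts)) min-count)
                 (push term doomed)))
             table)
    ;; Second pass: delete the collected terms from the table.
    (dolist (term doomed)
      (remhash term table))
    (length doomed)))

;; Usage: a hypothetical table, created with a :size hint so it does
;; not start out tiny.
(let ((my-term-db (make-hash-table :test 'equal :size 1000)))
  (puthash "viagra"  '(0 . 12) my-term-db)
  (puthash "xqzv81"  '(0 . 1)  my-term-db) ; hashbuster-style one-off
  (puthash "meeting" '(7 . 0)  my-term-db)
  (my-prune-rare-terms my-term-db))        ; removes only "xqzv81"
```

Something like this could run weekly from whatever hook already
calls (spam-stat-reduce-size); whether it helps in practice would need
measuring on a real database.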