From: Alex Schroeder
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Paul Graham on fighting SPAM
Date: Tue, 27 Aug 2002 01:19:15 +0200
Message-ID: <87hehh5hu4.fsf@emacswiki.org>

Alex Schroeder writes:

> An ifile user suggested I write code to reduce the dictionary size
> again -- perhaps I should remove all the words occurring less than 5
> times, and all words whose spaminess is close to 0.5 (common words
> occurring both in spam and non-spam)

I was bored, so I implemented this.  A new version is in
gnu.emacs.sources.  It reduced the dictionary file from over 500k to
below 100k.  Sounds good!  :)

Alex.
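
The pruning rule described above could be sketched roughly as follows. This is not the code posted to gnu.emacs.sources; it is a minimal Python illustration under assumptions of my own: the dictionary maps each word to a (spam count, non-spam count) pair, spaminess is the plain ratio spam / (spam + non-spam) rather than Paul Graham's weighted formula, and "close to 0.5" means within a hypothetical `neutral_band` of 0.1. The name `prune_dictionary` and both parameters are made up for the example.

```python
def prune_dictionary(counts, min_count=5, neutral_band=0.1):
    """Return a copy of `counts` without rare or uninformative words.

    `counts` maps word -> (spam_count, ham_count).  This layout and the
    default thresholds are assumptions for illustration only.
    """
    pruned = {}
    for word, (spam, ham) in counts.items():
        total = spam + ham
        # Rule 1: drop words seen fewer than `min_count` times overall.
        if total < min_count:
            continue
        # Simplified spaminess: fraction of occurrences found in spam.
        spaminess = spam / total
        # Rule 2: drop words whose spaminess is close to 0.5, i.e. common
        # words that appear about equally in spam and non-spam.
        if abs(spaminess - 0.5) < neutral_band:
            continue
        pruned[word] = (spam, ham)
    return pruned


if __name__ == "__main__":
    sample = {
        "viagra": (40, 0),    # frequent, clearly spammy: kept
        "the": (500, 510),    # frequent but neutral: dropped
        "rare": (2, 1),       # seen only 3 times: dropped
        "gnus": (1, 30),      # frequent, clearly non-spam: kept
    }
    print(sorted(prune_dictionary(sample)))
```

Both rules discard words that contribute little to a classification decision, which is presumably why the dictionary file shrinks so much; how far the thresholds can be pushed before accuracy suffers would have to be measured on real mail.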