From: Ted Zlatanov
Newsgroups: gmane.emacs.gnus.general
Subject: Re: filtering nntp messages
Date: Sat, 24 Oct 2009 00:04:58 -0500
Message-ID: <87y6n1crgl.fsf@lifelogs.com>
To: ding@gnus.org

On Fri, 23 Oct 2009 20:51:45 -0500 Harry Putnam wrote:

HP> Ted Zlatanov writes:
>> You can filter nntp with spam.el, same as any other message source.  You
>> just can't move spam articles out of a group, but you can copy them to
>> another backend or feed directly into the spam.el backends for spam
>> training.
>>
>> Statistical spam backends will require fetching every message body,
>> though, which could be painful.  Unfortunately that's the best solution
>> nowadays.  You may want to look into integrating some anti-spam solution
>> with leafnode on arrival or something like it (I don't know if it's
>> possible!).  Then you can just score on headers, with spam.el or not.

HP> Sounds like scoring would be a better and easier solution eh Ted?.  I mean
HP> you can score on just the headers right... and mark things read by
HP> scoring or the like, not having to download all bodies.  Or am I
HP> missing your point?

Spam changes too fast for scoring rules.  See "A Plan for Spam" which
sort of started the statistical analysis of spam a while back.
Nowadays companies use a mix of statistical, blackhole, and static
filters, but individual users can't keep up with the latter two.

If you find static (scoring) rules sufficient, that's wonderful.  If the
posting path for your NNTP spam always goes through a particular host,
for example, use that in your scoring rules.  Your NNTP host may already
be doing some spam filtering; check the headers.

If static scoring won't work, you can try setting up statistical filters
just on the headers.  It won't be as accurate as filtering on the whole
body, but it will be fast.

Ted
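
To make the "statistical filters just on the headers" idea concrete, here
is a rough sketch in Python.  It is not how spam.el or any of its backends
work; it is only a tiny naive Bayes classifier that never looks at a
message body.  The names HeaderFilter, header_tokens and HEADERS, and the
choice of which headers to tokenize, are all made up for illustration.

```python
import math
import re
from collections import Counter
from email.parser import HeaderParser

# Headers to tokenize.  An illustrative assumption, not something
# spam.el prescribes; pick whatever your server actually provides.
HEADERS = ("From", "Subject", "Path", "NNTP-Posting-Host")


def header_tokens(raw_headers):
    """Return 'header:word' tokens from the chosen headers of one article."""
    msg = HeaderParser().parsestr(raw_headers)
    tokens = []
    for name in HEADERS:
        for value in msg.get_all(name, []):
            for word in re.findall(r"[A-Za-z0-9.$'-]+", str(value).lower()):
                tokens.append(f"{name.lower()}:{word}")
    return tokens


class HeaderFilter:
    """Hypothetical naive Bayes classifier trained on header tokens only."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.articles = {"spam": 0, "ham": 0}

    def train(self, raw_headers, label):
        # Count each token once per article (presence, not frequency).
        self.counts[label].update(set(header_tokens(raw_headers)))
        self.articles[label] += 1

    def spam_probability(self, raw_headers):
        # Start from the smoothed prior odds, then add one log-likelihood
        # ratio per token present, using add-one smoothing throughout.
        log_odds = math.log(
            (self.articles["spam"] + 1) / (self.articles["ham"] + 1)
        )
        for tok in set(header_tokens(raw_headers)):
            p_spam = (self.counts["spam"][tok] + 1) / (self.articles["spam"] + 2)
            p_ham = (self.counts["ham"][tok] + 1) / (self.articles["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        # Clamp so that exp() cannot overflow on extreme scores.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1 / (1 + math.exp(-log_odds))
```

You would call train() with the raw header text of articles you have
already judged, then use spam_probability() on new headers and pick your
own cutoff.  Because only headers are read, nothing here needs the
article body, which is the speed-for-accuracy trade described above.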