From mboxrd@z Thu Jan 1 00:00:00 1970
From: Florent Rougon
Newsgroups: gmane.mail.spam.spambayes.devel,gmane.emacs.gnus.general
Subject: An alternative to spambayes.el for those using Gnus
Date: Sun, 12 Nov 2006 00:09:48 +0100
Message-ID: <87zmaxr35v.fsf@florent.maison>
User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/21.4 (gnu/linux)
List-Id: Development of the Pythonic Bayesian classifier
Xref: news.gmane.org gmane.mail.spam.spambayes.devel:3808 gmane.emacs.gnus.general:63962
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=

Hi,

I've been running my own interface code between Gnus and Spambayes for a
while, and improved it a bit today to the point that I think it should
be ready for public consumption.

It can do the same things as spambayes.el, but in a way that should be
cleaner and slightly faster (using `call-process-region' instead of
`shell-command-on-region', for instance).

It also provides a few more things, most notably:

  - a command for (re-)running the classifier on an article (or
    process-marked articles). Useful when you've recently trained
    Spambayes and want to see how the newly-trained filter
    performs---and maybe even respool some articles with this new
    filter.

  - a command to examine what the Spambayes filter thinks of an article
    (read-only operation): whether it is classified as ham or spam, the
    overall spam score as well as the various spam clues with their
    respective scores (from the 'X-Spambayes-Evidence' header).

This has been tested with GNU Emacs 21.4, Spambayes 1.0.3 and No Gnus
v0.6 (also with Gnus v5.10.7). It works well for me, and I hope others
will find it useful.

--=-=-=
Content-Type: application/emacs-lisp
Content-Disposition: attachment; filename=flo-spambayes.el
Content-Transfer-Encoding: quoted-printable
Content-Description: Interface between Spambayes and Gnus

;;; flo-spambayes.el --- Integrate Spambayes filtering into Gnus

;; Copyright (C) 2005, 2006 Florent Rougon
;;
;; Author: Florent Rougon
;; Version: 1.0
;; Keywords: spambayes, gnus, spam, filtering

;; This program is free software; you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation; version 2 dated June, 1991.
;;
;; This program is distributed in the hope that it will be useful, but
;; WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
;; General Public License for more details.
;;
;; You should have received a copy of the GNU General Public License
;; along with this program; see the file COPYING. If not, write to the
;; Free Software Foundation, Inc.,
;; 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.

;;; Commentary:

;; This file provides a means of integrating Spambayes filtering into the Gnus
;; message reader. It was initially inspired by the spambayes.el file by
;; Neale Pickett as shipped in the Spambayes distribution.
;;
;; To use this file, put it in a directory that is part of your `load-path'
;; and add the code indicated in this Commentary to your Gnus initialization
;; file (~/.gnus.el). Of course, you also need to have Spambayes installed and
;; correctly configured.
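;;
;; If that directory is not in `load-path' yet, one possible way to add it
;; (a sketch only: ~/elisp is an assumed example location, nothing in this
;; file depends on it) is to put a line such as the following in ~/.gnus.el,
;; above the `require' form described next:
;;
;;   (add-to-list 'load-path "~/elisp")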
;;
;; The first thing you should add to your ~/.gnus.el is the following line,
;; which will load this file when Gnus is started (if not already done):
;;
;;   (require 'flo-spambayes)
;;
;; Then, make sure the code in this file can locate the Spambayes filter
;; (which is called sb_filter.py at the time of this writing). To that end,
;; you should customize `flo-spambayes-filter-program', e.g. with:
;;
;;   (setq flo-spambayes-filter-program "/usr/local/bin/sb_filter.py")
;;
;; You should also indicate a file that will be used to store the filter
;; output on stderr (usually empty, but might provide valuable information in
;; case something goes wrong). For instance:
;;
;;   (setq flo-spambayes-logfile-for-filter-stderr "~/tmp/spambayes-stderr")
;;
;; The following two lines tell Gnus to pipe incoming mail through the
;; Spambayes classifier:
;;
;;   (add-hook 'nnmail-prepare-incoming-message-hook
;;             'flo-spambayes-filter-buffer)
;;
;; This will add the X-Spambayes-Classification header to your incoming mail,
;; which you can then use to do your filtering in the usual ways offered by
;; Gnus (e.g., with `nnmail-split-methods'). For instance, my
;; `nnmail-split-methods' starts like this:
;;
;;   (("Spambayes-spam" "^X-Spambayes-Classification: spam; ")
;;    ("Unsure" "^X-Spambayes-Classification: unsure; ")
;;
;; and is followed by my other filtering rules (those not relying on
;; X-Spambayes-Classification). As a consequence, every mail Spambayes
;; considers as spam ends up in the Spambayes-spam group, every mail Spambayes
;; considers as unsure goes to the Unsure group and the rest is split using my
;; usual rules (mailing lists, friends, etc.).
;;
;; You can also tell Spambayes to retrain the filter on an article, telling it
;; whether it is ham or spam. This can be done with
;; `flo-spambayes-gnus-refile-as-ham' and `flo-spambayes-gnus-refile-as-spam',
;; respectively. Such an action alters the Spambayes database, therefore an
;; article that was classified as, e.g., ham, could then be classified as spam.
;; As a result, you'll sometimes want to rerun an article through the
;; classifier. This can be done with `flo-spambayes-gnus-classify'.
;;
;; The following lines provide simple bindings for these three functions in
;; Gnus Summary mode. These bindings will respool the articles after the
;; action is performed. This is very useful: for instance, suppose you have a
;; wrongly-classified spam message in your Inbox. Typing 'B s' will:
;;   - retrain the filter telling it "Hey, dude, this message is spam!"
;;   - update the article with the X-Spambayes-Classification header obtained
;;     after this retraining (as well as an X-Spambayes-Trained header,
;;     indicating that you told the filter the article was spam)
;;   - respool the article, which will most probably move it to the
;;     Spambayes-spam group, if you are filtering based on the
;;     X-Spambayes-Classification header as suggested above.
;;
;;   (define-key gnus-summary-mode-map "Bs"
;;     #'(lambda () (interactive) (flo-spambayes-gnus-refile-as-spam t)))
;;   (define-key gnus-summary-mode-map "Bh"
;;     #'(lambda () (interactive) (flo-spambayes-gnus-refile-as-ham t)))
;;   (define-key gnus-summary-mode-map "Bf"
;;     #'(lambda () (interactive) (flo-spambayes-gnus-classify t)))
;;
;; Sometimes, you might want to only pipe the article to Spambayes (updating
;; it with the new X-Spambayes-Classification, but not respooling it
;; afterwards). This can be done with the following bindings (which can also
;; be used to respool if you use a prefix argument).
;;
;;   (define-key gnus-summary-mode-map "BS" 'flo-spambayes-gnus-refile-as-spam)
;;   (define-key gnus-summary-mode-map "BH" 'flo-spambayes-gnus-refile-as-ham)
;;   (define-key gnus-summary-mode-map "BF" 'flo-spambayes-gnus-classify)
;;
;; Finally, the following binding is useful to see what Spambayes thinks of an
;; article (classification, score and spam clues) without affecting it at all
;; (this is a read-only operation, whereas flo-spambayes-gnus-classify will
;; update the article with the new X-Spambayes-Classification header).
;;
;;   (define-key gnus-summary-mode-map [C-f1] 'flo-spambayes-examine-article)

;;; Code:

(defvar flo-spambayes-filter-program "/usr/bin/sb_filter.py"
  "Path to the sb_filter program.")

(defvar flo-spambayes-logfile-for-filter-stderr "~/tmp/spambayes-stderr"
  "File used to store the stderr output of `flo-spambayes-filter-program'.")

;; This function is short and not flexible, on purpose: it's optimized for
;; speed (it will most likely be run for every message received).
(defun flo-spambayes-filter-buffer ()
  "Run `flo-spambayes-filter-program' on the current buffer.
Run the program referenced by `flo-spambayes-filter-program' on the contents
of the current buffer, with option \"-f\". The buffer contents are replaced
with the output of the filter."
  (call-process-region (point-min) (point-max)
                       flo-spambayes-filter-program
                       t `(t ,flo-spambayes-logfile-for-filter-stderr) nil
                       "-f"))

(require 'gnus-art)
(require 'gnus-int)

(defun flo-spambayes-gnus-filter-article (num group action)
  "Pipe an article through `flo-spambayes-filter-program'.
Pipe the article number NUM from group GROUP through the program referenced by
`flo-spambayes-filter-program'. ACTION must be a symbol from the following
list:

  classify      pipe the article through the classifier
  spam          retrain the article as spam
  ham           retrain the article as ham

The contents of the article are replaced by the output of the filter program."
  (with-temp-buffer
    (gnus-request-article-this-buffer num group)
    (let (arglist action-string res)
      (setq arglist '("-f"))
      (cond ((eq action 'spam)
             (setq arglist (cons "-s" arglist))
             (setq action-string "Retraining"))
            ((eq action 'ham)
             (setq arglist (cons "-g" arglist))
             (setq action-string "Retraining"))
            ((eq action 'classify)
             (setq action-string "Classifying"))
            (t (error "Invalid action: '%s'" action)))
      (setq res
            (with-temp-message (concat action-string "...")
              (eval `(call-process-region
                      (point-min) (point-max)
                      flo-spambayes-filter-program
                      t `(t ,flo-spambayes-logfile-for-filter-stderr) nil
                      ,@arglist))))
      ;; `call-process-region' returns a string (a signal description) when
      ;; the process is killed by a signal, so make sure RES is an integer
      ;; before handing it to `zerop'.
      (cond ((and (integerp res) (zerop res))
             (message "%s...done" action-string)
             (gnus-request-replace-article num group (current-buffer)))
            ((stringp res)
             (error "%s: %s" flo-spambayes-filter-program res))
            ((integerp res)
             (error "%s returned exit status %d"
                    flo-spambayes-filter-program res))
            (t (error "call-process-region on '%s' returned '%s'"
                      flo-spambayes-filter-program res))))))

(require 'gnus-sum)

(defun flo-spambayes-gnus-filter (action &optional respool)
  "Filter all processable articles, or the one under the cursor.
ACTION must be a symbol from the following list:

  classify      pipe the articles through the classifier
  spam          retrain the articles as spam
  ham           retrain the articles as ham

If RESPOOL is non-nil, respool the articles afterwards."
  (let ((group gnus-newsgroup-name)
        (list (gnus-summary-work-articles nil)))
    (while list
      (flo-spambayes-gnus-filter-article (car list) group action)
      (setq list (cdr list)))
    (if respool
        (gnus-summary-respool-article
         nil (gnus-group-method gnus-newsgroup-name)))
    (gnus-summary-unmark-all-processable)))

(defun flo-spambayes-gnus-classify (respool)
  "Reclassify all process-marked articles.
If RESPOOL is non-nil, respool the articles afterwards."
  (interactive "P")
  (flo-spambayes-gnus-filter 'classify respool))

(defun flo-spambayes-gnus-refile-as-spam (respool)
  "Retrain and reclassify all process-marked articles as spam.
If RESPOOL is non-nil, respool the articles afterwards."
  (interactive "P")
  (flo-spambayes-gnus-filter 'spam respool))

(defun flo-spambayes-gnus-refile-as-ham (respool)
  "Retrain and reclassify all process-marked articles as ham.
If RESPOOL is non-nil, respool the articles afterwards."
  (interactive "P")
  (flo-spambayes-gnus-filter 'ham respool))

(defun flo-spambayes-examine-article ()
  "Examine the information provided by Spambayes for the current article.
The current article is piped through the classifier with an option causing
all spam clues to be inserted as headers. This does NOT modify the article.
If you want to replace the article with the classifier output (e.g., if you
have modified the Spambayes database and want to reclassify the article with
this new database), you should use `flo-spambayes-gnus-classify' instead."
  (interactive)
  (let ((group gnus-newsgroup-name)
        (num (gnus-summary-article-number))
        (buf (get-buffer-create " *Spambayes output*"))
        res pos)
    (with-current-buffer buf
      (let ((inhibit-read-only t))
        (erase-buffer)
        (gnus-request-article-this-buffer num group)
        (setq res (call-process-region
                   (point-min) (point-max)
                   flo-spambayes-filter-program
                   t `(t ,flo-spambayes-logfile-for-filter-stderr) nil
                   "-f" "-o" "Headers:include_evidence:True"))
        ;; RES is a string (a signal description) when the filter is killed
        ;; by a signal, so check that it is an integer before using `zerop'.
        (if (and (integerp res) (zerop res))
            (progn
              (goto-char (point-min))
              (setq pos (if (re-search-forward "^X-Spambayes-" nil t)
                            (match-beginning 0)
                          (point-min)))
              (goto-char pos))
          (error "%s exited abnormally: %s"
                 flo-spambayes-filter-program res))))
    (set-window-start (display-buffer buf) pos)))

(provide 'flo-spambayes)

;;; flo-spambayes.el ends here
--=-=-=

-- 
Florent

--=-=-=--