Gnus development mailing list
* gnus-summary-sort-by-spamicity (bogofilter)
@ 2003-09-24 13:08 Reiner Steib
  2003-09-24 14:45 ` Ted Zlatanov
  2003-10-08  5:13 ` Michael Shields
  0 siblings, 2 replies; 6+ messages in thread
From: Reiner Steib @ 2003-09-24 13:08 UTC


Hi,

bogofilter inserts lines like...

| X-Bogosity: Yes, tests=bogofilter, spamicity=1.000000, version=0.14.3
| X-Bogosity: Unsure, tests=bogofilter, spamicity=0.377453, version=0.14.3
| X-Bogosity: No, tests=bogofilter, spamicity=0.025015, version=0.14.3

... in the mail headers.  The "spamicity" is a float between 0.0 and
1.0 indicating the spam probability.  To check for false positives
(ham) in my spam groups or to check for spam in my ham groups it would
be nice to sort the summary by spamicity.
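
As a starting point, the spamicity can be pulled out of a header value
with a regexp.  This is only a minimal sketch: `my-bogosity-spamicity'
is a hypothetical helper name, not something Gnus, spam.el or bogofilter
provides, and it assumes the header always has the `spamicity=<number>'
form shown in the three examples above.

```elisp
;; Hypothetical helper; not part of Gnus, spam.el or bogofilter.
(defun my-bogosity-spamicity (header)
  "Return the spamicity in HEADER, an X-Bogosity value, as a float.
Return nil when HEADER is not a string or has no spamicity field."
  (when (and (stringp header)
             (string-match "spamicity=\\([0-9]*\\.?[0-9]+\\)" header))
    ;; Group 1 is only the number, e.g. "0.377453".
    (string-to-number (match-string 1 header))))
```

Returning nil rather than 0.0 for a missing field lets the caller decide
where unscored articles should end up in the sort order.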

I couldn't gather from the existing sort function in `gnus-sum.el' how
to write a `gnus-summary-sort-by-spamicity' function.  Any hints?

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo--- PGP key available via WWW   http://rsteib.home.pages.de/





* Re: gnus-summary-sort-by-spamicity (bogofilter)
  2003-09-24 13:08 gnus-summary-sort-by-spamicity (bogofilter) Reiner Steib
@ 2003-09-24 14:45 ` Ted Zlatanov
  2003-10-08  5:13 ` Michael Shields
  1 sibling, 0 replies; 6+ messages in thread
From: Ted Zlatanov @ 2003-09-24 14:45 UTC


On Wed, 24 Sep 2003, 4.uce.03.r.s@nurfuerspam.de wrote:

>| X-Bogosity: Yes, tests=bogofilter, spamicity=1.000000, ...
> ... in the mail headers.  The "spamicity" is a float between 0.0 and
> 1.0 indicating the spam probability.  To check for false positives
> (ham) in my spam groups or to check for spam in my ham groups it
> would be nice to sort the summary by spamicity.
> 
> I couldn't gather from the existing sort function in `gnus-sum.el'
> how to write a `gnus-summary-sort-by-spamicity' function.  Any
> hints?

I'd love to see this too.  Keep in mind that the spam.el "spam score" can
be used for more than just spamicity; I plan to keep it normalized
between 0 and 1 for any future spam filter that supports scoring.
SpamAssassin for instance would be normalized to 0 - 0.5 for values
under your threshold, and 0.5 - 1 for values over the threshold.  So
you may want to have gnus-summary-sort-by-spam-score instead of
-spamicity.
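
One way such a normalization could look is sketched below.  It is not
the spam.el implementation (which does not exist yet); the function
name and the SPREAD parameter are assumptions made for illustration.
SpamAssassin scores are unbounded, so some cut-off is needed, and here
it is assumed that scores are clamped SPREAD points either side of the
threshold.

```elisp
;; Hypothetical sketch; not part of spam.el.
(defun my-normalize-sa-score (score threshold &optional spread)
  "Map SCORE to the range 0.0 .. 1.0 so that THRESHOLD lands on 0.5.
SPREAD (default 10.0) is an assumed half-width: scores at or below
THRESHOLD - SPREAD map to 0.0, scores at or above THRESHOLD + SPREAD
map to 1.0, and everything in between is interpolated linearly."
  (let* ((spread (float (or spread 10.0)))
         (offset (/ (- score threshold) (* 2.0 spread))))
    ;; Clamp so that extreme scores stay inside 0.0 .. 1.0.
    (max 0.0 (min 1.0 (+ 0.5 offset)))))
```

With a threshold of 5.0, a score of 5.0 comes out as 0.5 and anything
above it lands in the upper half, which matches the split described
above.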

Now all I have to do is implement the SA scoring :)

Ted




* Re: gnus-summary-sort-by-spamicity (bogofilter)
  2003-09-24 13:08 gnus-summary-sort-by-spamicity (bogofilter) Reiner Steib
  2003-09-24 14:45 ` Ted Zlatanov
@ 2003-10-08  5:13 ` Michael Shields
  2003-11-30 15:27   ` Adam Sjøgren
  1 sibling, 1 reply; 6+ messages in thread
From: Michael Shields @ 2003-10-08  5:13 UTC


In message <v9fzim72ry.fsf@marauder.physik.uni-ulm.de>,
Reiner Steib <4.uce.03.r.s@nurfuerspam.de> wrote:
> I couldn't gather from the existing sort function in `gnus-sum.el' how
> to write a `gnus-summary-sort-by-spamicity' function.  Any hints?

Here is what I use for SpamAssassin scores.  It should be easy to
adapt to bogofilter.

I find it is very helpful in sorting the spam-trap folder; this allows
me to see all the most likely false positives at the top.  Because of
this I've even been able to filter more aggressively.

(add-to-list 'nnmail-extra-headers 'X-Spam-Status)
(defun gnus-article-sort-by-spam-status (h1 h2)
  "Sort articles by score from the X-Spam-Status: header."
  (< (string-to-number (gnus-replace-in-string
			(gnus-extra-header 'X-Spam-Status h1)
			".*hits=" ""))
     (string-to-number (gnus-replace-in-string
			(gnus-extra-header 'X-Spam-Status h2)
			".*hits=" ""))))

-- 
Shields.
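
An adaptation of the function above to bogofilter might look roughly
like the following.  It is an untested sketch: both `my-' names are
hypothetical, and it assumes the X-Bogosity header has the
`spamicity=<number>' form quoted at the start of the thread.  Articles
without the header are treated as spamicity 0.0.

```elisp
;; gnus-extra-header comes from gnus-sum.el; this only silences the
;; byte compiler and does not load Gnus.
(declare-function gnus-extra-header "gnus-sum" (type &optional header))

;; Record X-Bogosity in the overview data once nnmail is loaded.
(with-eval-after-load 'nnmail
  (add-to-list 'nnmail-extra-headers 'X-Bogosity))

(defun my-spamicity-from-string (value)
  "Return the spamicity found in VALUE, or 0.0 if there is none."
  (if (and (stringp value)
           (string-match "spamicity=\\([0-9.]+\\)" value))
      (string-to-number (match-string 1 value))
    0.0))

(defun my-gnus-article-sort-by-spamicity (h1 h2)
  "Sort articles by the spamicity in X-Bogosity, lowest first."
  (< (my-spamicity-from-string (gnus-extra-header 'X-Bogosity h1))
     (my-spamicity-from-string (gnus-extra-header 'X-Bogosity h2))))
```

As with the SpamAssassin version, the function would go into
`gnus-article-sort-functions', so that the least spam-like articles
appear at the top of the summary.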





* Re: gnus-summary-sort-by-spamicity (bogofilter)
  2003-10-08  5:13 ` Michael Shields
@ 2003-11-30 15:27   ` Adam Sjøgren
  2003-12-02 11:30     ` Yair Friedman
  0 siblings, 1 reply; 6+ messages in thread
From: Adam Sjøgren @ 2003-11-30 15:27 UTC


On Wed, 08 Oct 2003 05:13:19 +0000, Michael wrote:

> Here is what I use for Spamassassin scores.  It should be easy to
> adapt to bogofilter.
[...]

> (add-to-list 'nnmail-extra-headers 'X-Spam-Status)
> (defun gnus-article-sort-by-spam-status (h1 h2)
>   "Sort articles by score from the X-Spam-Status: header."
>   (< (string-to-number (gnus-replace-in-string
> 			(gnus-extra-header 'X-Spam-Status h1)
> 			".*hits=" ""))
>      (string-to-number (gnus-replace-in-string
> 			(gnus-extra-header 'X-Spam-Status h2)
> 			".*hits=" ""))))

I'm trying to make this work on my nnml:spam-group, but for some
reason I can't figure out, X-Spam-Status isn't returned by
gnus-extra-header.

My group-parameters for the group are:

 ((gnus-show-threads nil)
  (gnus-extra-headers
   '(X-Spam-Status To Newsgroups))
  (gnus-article-sort-functions
   '(gnus-article-sort-by-spam-status)))

(I stumbled over the gnus-extra-headers variable and tried setting that
as well as nnmail-extra-headers, but it didn't seem to make any
difference.)
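
For reference, the global equivalent in ~/.gnus would be roughly the
following.  This is a sketch of the two settings mentioned above, not a
verified fix; as far as I understand, `gnus-extra-headers' is the Gnus
side and `nnmail-extra-headers' is what the mail back ends write into
their overview data.

```elisp
;; In ~/.gnus.  Setting both variables to the same list keeps what
;; Gnus asks for and what the back end stores in agreement.
(setq gnus-extra-headers '(To Newsgroups X-Spam-Status))
(setq nnmail-extra-headers gnus-extra-headers)
```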

I do a M-x edebug-defun on gnus-article-sort-by-spam-status (located
in my ~/.gnus) and when I step through the function (upon entering
nnml:spam), I get this in the *Message-log*:

 Fetching headers for nnml:spam...
 Fetching headers for nnml:spam...done
 Result: [23918 "hot adult stars and more" "\"Merlin Moody\" <m_moody_ro@artspas.uwaterloo.ca>" "Mon, 01 Dec 2003 11:36:12 +0000" "<OGEJDLKOBDBKJBHJFHOBPMNNIEAA.m_moody_ro@artspas.uwaterloo.ca>" "" 661 13 "virgil.koldfront.dk spam:23918" ((To . "asger@diku.dk, asjo@diku.dk, atlas@diku.dk"))]

Which I find odd, because the spam does have an X-Spam-Status line:

 $ grep X-Spam-Status Mail/spam/23918     
 X-Spam-Status: Yes, hits=13.3 required=5.0 tests=BAYES_99,BIZ_TLD,CLICK_BELOW,
 $
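
The last element of that header vector is the extra-headers alist, and
it only holds a To entry.  Assuming gnus-extra-header does little more
than an `assq' lookup in that alist and falls back to the empty string,
the effect can be reproduced outside Gnus.  `my-extra-header-lookup' and
`my-example-extra' are hypothetical stand-ins for illustration only.

```elisp
;; The extra-headers alist copied from the edebug result above.
(defvar my-example-extra
  '((To . "asger@diku.dk, asjo@diku.dk, atlas@diku.dk")))

(defun my-extra-header-lookup (type extra)
  "Return the value for TYPE in the alist EXTRA, or \"\" if absent."
  (or (cdr (assq type extra)) ""))

(defun my-example-score (extra)
  "Return the score the sort function would compute from EXTRA."
  (string-to-number
   (replace-regexp-in-string
    ".*hits=" "" (my-extra-header-lookup 'X-Spam-Status extra))))
```

With no X-Spam-Status entry in the alist, the lookup gives the empty
string and the score becomes 0 for every article, so the sort has
nothing to order by.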

I'm trying to figure out where to look next, but I'm a little lost
right now.

Any pointers?


  Best regards,

-- 
 "Do not feed the oysters under the clouds"                   Adam Sjøgren
                                                         asjo@koldfront.dk





* Re: gnus-summary-sort-by-spamicity (bogofilter)
  2003-11-30 15:27   ` Adam Sjøgren
@ 2003-12-02 11:30     ` Yair Friedman
  2003-12-02 15:34       ` Adam Sjøgren
  0 siblings, 1 reply; 6+ messages in thread
From: Yair Friedman @ 2003-12-02 11:30 UTC


On Sun, 30 Nov 2003 16:27:13 +0100, 
spamtrap@koldfront.dk (Adam Sjøgren) writes:

> On Wed, 08 Oct 2003 05:13:19 +0000, Michael wrote:
>
>> (add-to-list 'nnmail-extra-headers 'X-Spam-Status)
...
> Any pointers?

Does calling nnml-generate-nov-databases help?






* Re: gnus-summary-sort-by-spamicity (bogofilter)
  2003-12-02 11:30     ` Yair Friedman
@ 2003-12-02 15:34       ` Adam Sjøgren
  0 siblings, 0 replies; 6+ messages in thread
From: Adam Sjøgren @ 2003-12-02 15:34 UTC


On Tue, 02 Dec 2003 13:30:19 +0200, Yair wrote:

>> On Wed, 08 Oct 2003 05:13:19 +0000, Michael wrote:
>>> (add-to-list 'nnmail-extra-headers 'X-Spam-Status)
> ...
>> Any pointers?

> Does calling nnml-generate-nov-databases help?

Sorting works for new spam (I was about to announce that as
mysterious, but...), so you're probably right.

Let me just check.

Yes, that was it.

Thank you!
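
For anyone hitting the same thing: as far as I can tell, the overview
(NOV) entries for the old articles were written before X-Spam-Status
was in nnmail-extra-headers, so only newly arrived mail carried the
header.  A small wrapper for rebuilding them could look like this;
`my-rebuild-nnml-overview' is a hypothetical name, and the same thing
can be done directly with M-x nnml-generate-nov-databases.

```elisp
;; Hypothetical convenience command; defining it does not run anything.
(defun my-rebuild-nnml-overview ()
  "Regenerate nnml NOV files so stored articles pick up new extra headers."
  (interactive)
  (require 'nnml)
  (nnml-generate-nov-databases))
```

Regeneration rewrites the overview files for the nnml groups, so it may
take a while on large mail directories.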


  Best regards,

-- 
 "Do not feed the oysters under the clouds"                   Adam Sjøgren
                                                         asjo@koldfront.dk




