Gnus development mailing list
* `S t' always returns nil
From: Christopher Splinter @ 2003-01-26 15:35 UTC


Hi,

I think there is something wrong with `S t', as it always
returns a spamicity of 0, even when the mail in question is clearly
spam and bogofilter recognizes it as such.

I use the most recent version of bogofilter from CVS, which
returns lines like this:

| X-Bogosity: Spam, tests=bogofilter, spamicity=0.9056373309, version=0.10.1.1.cvs.20030125

The variables which are related to bogofilter are set properly.




* Re: `S t' always returns nil
From: Ted Zlatanov @ 2003-01-27 17:50 UTC
  Cc: ding

On Sun, 26 Jan 2003, chris@splinter.inka.de wrote:
> Hi,
> 
> I think there is something wrong with `S t', as it always
> returns a spamicity of 0, even when the mail in question is clearly
> spam and bogofilter recognizes it as such.
> 
> I use the most recent version of bogofilter from CVS, which
> returns lines like this:
> 
>| X-Bogosity: Spam, tests=bogofilter, spamicity=0.9056373309,
>| version=0.10.1.1.cvs.20030125
> 
> The variables which are related to bogofilter are set properly.

Hmm, the version I used for testing had lines like this:

X-Bogosity: Yes, tests=bogofilter, spamicity=0.567040, version=0.9.1.2

so I wrote spam-check-bogofilter-headers to look for "X-Bogosity: Yes"
headers, see below.

(defun spam-check-bogofilter-headers (&optional score)
  (let ((header (message-fetch-field spam-bogofilter-header)))
    (when (and header
               (string-match "^Yes" header))
      (if score
          (when (string-match "spamicity=\\([0-9.]+\\)" header)
            (match-string 1 header))
        spam-split-group))))

Fixing this is trivial, but should I match on something else, or
should I allow both ^Yes and ^Spam as valid spam indicators in the
header?

Thanks
Ted
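
A minimal sketch of the fix being discussed, for illustration only. It
assumes the X-Bogosity value is handed in as a string, so it runs
without Gnus; my-bogofilter-header-spamicity is a hypothetical helper,
not the function in spam.el, which fetches the header itself with
message-fetch-field.

```elisp
;; Hypothetical helper (not part of spam.el): HEADER is the value of an
;; X-Bogosity field, passed in as a plain string.
(defun my-bogofilter-header-spamicity (header)
  "Return the spamicity string in HEADER if HEADER marks spam, else nil.
Both \"Yes\" and \"Spam\" count as positive indicators."
  (let ((case-fold-search nil))
    (when (and header
               ;; Accept either verdict word at the start of the value.
               (string-match "\\`\\(?:Yes\\|Spam\\)\\b" header)
               ;; The last successful match supplies group 1 below.
               (string-match "spamicity=\\([0-9.]+\\)" header))
      (match-string 1 header))))

;; The original test, "^Yes", does not match the newer header at all:
(string-match "^Yes" "Spam, tests=bogofilter, spamicity=0.9056373309")
;; => nil

(my-bogofilter-header-spamicity
 "Spam, tests=bogofilter, spamicity=0.9056373309, version=0.10.1.1.cvs.20030125")
;; => "0.9056373309"
```

Because the check on the verdict word fails first, the spamicity is
never extracted, which would explain why `S t' reports nothing for a
header that starts with "Spam".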




* Re: `S t' always returns nil
From: Christopher Splinter @ 2003-01-27 19:01 UTC


Ted Zlatanov <tzz@lifelogs.com> writes:

> On Sun, 26 Jan 2003, chris@splinter.inka.de wrote:
>> I think there is something wrong with `S t', as it always
>> returns a spamicity of 0, even when the mail in question is clearly
>> spam and bogofilter recognizes it as such.
>> 
>> | X-Bogosity: Spam, tests=bogofilter, spamicity=0.9056373309,
>> | version=0.10.1.1.cvs.20030125
>> 
>> The variables which are related to bogofilter are set properly.
>
> Hmm, the version I used for testing had lines like this:
>
> X-Bogosity: Yes, tests=bogofilter, spamicity=0.567040, version=0.9.1.2

You still get this output (at least by default) using the
Robinson or the Graham algorithm. The newer Robinson-Fisher
though, which has three possible return values, can return
'Spam', 'Ham' or 'Unsure'.

Currently Robinson is the default algorithm, but AFAIK a switch to
Robinson-Fisher as the default is being considered.
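
To make the difference between the header styles concrete, here is a
small sketch that pulls the verdict word out of an X-Bogosity value.
my-bogosity-verdict is a hypothetical name, and the "Unsure" sample
value in the usage lines is made up for illustration.

```elisp
;; Hypothetical helper: HEADER is the value of an X-Bogosity field.
(defun my-bogosity-verdict (header)
  "Return the leading verdict word of HEADER as a lowercase symbol, or nil."
  (when (and header
             (string-match "\\`\\([A-Za-z]+\\)" header))
    (intern (downcase (match-string 1 header)))))

(my-bogosity-verdict "Yes, tests=bogofilter, spamicity=0.567040, version=0.9.1.2")
;; => yes
(my-bogosity-verdict "Unsure, tests=bogofilter, spamicity=0.5")
;; => unsure
```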

> so I wrote spam-check-bogofilter-headers to look for "X-Bogosity: Yes"
> headers, see below.
>
> (defun spam-check-bogofilter-headers (&optional score)
>   (let ((header (message-fetch-field spam-bogofilter-header)))
>     (when (and header
>                (string-match "^Yes" header))

Gna. Should've seen this.

> Fixing this is trivial, but should I match on something else, or
> should I allow both ^Yes and ^Spam as valid spam indicators in the
> header?

Since the indicators are configurable, it might be advisable to
allow the user to tell Gnus what to match on.

There's one thing to be decided if Robinson-Fisher is integrated into
spam.el, though: What should be done with the 'unsure' messages?
Should they be moved to a separate group?




* Re: `S t' always returns nil
From: Ted Zlatanov @ 2003-01-27 20:17 UTC
  Cc: ding

On Mon, 27 Jan 2003, chris@splinter.inka.de wrote:
> You still get this output (at least by default) using the
> Robinson or the Graham algorithm. The newer Robinson-Fisher
> though, which has three possible return values, can return
> 'Spam', 'Ham' or 'Unsure'.

OK, so "Spam" and "Yes" are both positive spam indicators.  I've added
that to spam.el.

>> Fixing this is trivial, but should I match on something else, or
>> should I allow both ^Yes and ^Spam as valid spam indicators in the
>> header?
> 
> Since the indicators are configurable, it might be advisable to
> allow the user to tell Gnus what to match on.

Done, another long variable comes into existence!

spam-bogofilter-bogosity-positive-spam-header is a regexp.  The
default allows for both "Yes" and "Spam".
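
A sketch of how such a user-settable regexp might be used.  The
variable and function names here are hypothetical stand-ins, and the
default value shown is an assumption, not copied from spam.el.

```elisp
;; Hypothetical stand-in for spam-bogofilter-bogosity-positive-spam-header.
;; The default value is an assumption made for this example.
(defvar my-bogosity-positive-spam-regexp "^\\(Yes\\|Spam\\)"
  "Regexp matching an X-Bogosity value that indicates spam.")

(defun my-bogosity-spam-p (header)
  "Return t if HEADER, an X-Bogosity value, indicates spam, else nil."
  (and header
       (let ((case-fold-search nil))
         (string-match my-bogosity-positive-spam-regexp header))
       t))

(my-bogosity-spam-p "Spam, tests=bogofilter, spamicity=0.9056373309")
;; => t

;; A user whose bogofilter is configured with another indicator word
;; ("Junk" is made up here) only has to change the regexp:
(let ((my-bogosity-positive-spam-regexp "^Junk"))
  (my-bogosity-spam-p "Junk, tests=bogofilter, spamicity=0.97"))
;; => t
```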

> There's one thing to be decided if Robinson-Fisher is integrated
> into spam.el, though: What should be done with the 'unsure'
> messages?  Should they be moved to a separate group?

I want spam.el to remain binary (spam vs. ham).  Introducing a third
category would complicate things enormously.  ifile is the only
multi-way classifier, and only in a limited way, and I'm not sure
that's too useful.  There are three ways of dealing with 'Unsure'
messages:

- treat them as ham (the case now)

- let the user decide if they are to be ham or spam (why would the
  user ever want unsure messages as spam, though?)

- use the spamicity score for 'Unsure' messages, cutoff decided by the
  user

I think the third option is best, what do you think?

Ted
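
The third option could look roughly like the sketch below.  It is
purely illustrative of the proposal: the names and the 0.5 default
cutoff are hypothetical, and the "Unsure" header values are made up.

```elisp
;; Hypothetical user option; the 0.5 default is an arbitrary example.
(defvar my-bogosity-unsure-cutoff 0.5
  "Spamicity at or above which an Unsure message is treated as spam.")

(defun my-bogosity-classify (header)
  "Classify HEADER, an X-Bogosity value, as the symbol `spam' or `ham'."
  (let ((case-fold-search nil))
    (cond
     ((null header) 'ham)
     ;; Clear positive verdicts stay spam regardless of the score.
     ((string-match "^\\(Yes\\|Spam\\)" header) 'spam)
     ;; Unsure messages are decided by the user's cutoff.
     ((and (string-match "^Unsure" header)
           (string-match "spamicity=\\([0-9.]+\\)" header)
           (>= (string-to-number (match-string 1 header))
               my-bogosity-unsure-cutoff))
      'spam)
     ;; Everything else, including Unsure below the cutoff, is ham.
     (t 'ham))))

(my-bogosity-classify "Unsure, tests=bogofilter, spamicity=0.62")
;; => spam
(let ((my-bogosity-unsure-cutoff 0.9))
  (my-bogosity-classify "Unsure, tests=bogofilter, spamicity=0.62"))
;; => ham
```

The result stays binary, so nothing downstream needs a third category;
the cost is one more user option to tune.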




* Re: `S t' always returns nil
From: Christopher Splinter @ 2003-01-28 15:13 UTC


Ted Zlatanov <tzz@lifelogs.com> writes:

> I want spam.el to remain binary (spam vs. ham).  Introducing a third
> category would complicate things enormously.  ifile is the only
> multi-way classifier, and only in a limited way, and I'm not sure
> that's too useful.  There are three ways of dealing with 'Unsure'
> messages:
>
> - treat them as ham (the case now)
>
> - let the user decide if they are to be ham or spam (why would the
>   user ever want unsure messages as spam, though?)
>
> - use the spamicity score for 'Unsure' messages, cutoff decided by the
>   user
>
> I think the third option is best, what do you think?

As bogofilter allows the user to use RF (Robinson-Fisher) as a
binary filter too, that's somewhat redundant. Another reason why
this wouldn't be too useful is the RF algorithm itself, which
yields extreme results for messages which are easy to classify,
but rather undifferentiated ones for 'unsure' messages.

If it's arduous to support non-binary algorithms properly, I
wouldn't support them at all and would go for the first option.




* Re: `S t' always returns nil
From: Ted Zlatanov @ 2003-01-28 16:46 UTC
  Cc: ding

On Tue, 28 Jan 2003, chris@splinter.inka.de wrote:
> If it's arduous to support non-binary algorithms properly, I
> wouldn't support them at all and would go for the first option.

Cool, status quo! (obSimpsons)

Ted


