Gnus development mailing list
* displaying arbitrary headers in summary
@ 2003-11-06 12:30 Max Froumentin
  2003-11-06 18:19 ` Michael Shields
  0 siblings, 1 reply; 8+ messages in thread
From: Max Froumentin @ 2003-11-06 12:30 UTC


Hi,

Basically, I'd like to display the spamassassin score of each article
in each line of the summary buffer. It's probably not easy since the
content of the X-Spam-Status is not in the .nov file by default,
but maybe there's a way to put it in there.

Max.

PS. And if LarsMI happens to be reading this: thank you Lars,
    gmane has changed my life...




* Re: displaying arbitrary headers in summary
  2003-11-06 12:30 displaying arbitrary headers in summary Max Froumentin
@ 2003-11-06 18:19 ` Michael Shields
  2003-11-06 19:58   ` Ted Zlatanov
  2003-11-13 17:32   ` making bogofilter write spam headers? Bill White
  0 siblings, 2 replies; 8+ messages in thread
From: Michael Shields @ 2003-11-06 18:19 UTC
  Cc: ding

In message <87znf9d6i8.fsf@w3.org>,
Max Froumentin <max@lapin-bleu.net> wrote:
> Basically, I'd like to display the spamassassin score of each article
> in each line of the summary buffer. It's probably not easy since the
> content of the X-Spam-Status is not in the .nov file by default,
> but maybe there's a way to put it in there.

Here is one approach, along with a function that sorts by spam score:

(add-to-list 'nnmail-extra-headers 'X-Spam-Status)
(defun gnus-article-spamassassin-score (header)
  "Return the Spamassassin score of this article, as a string."
  (gnus-replace-in-string
   (gnus-replace-in-string
    (gnus-extra-header 'X-Spam-Status header)
    ".*hits=" "")
   " .*" ""))
(defun gnus-user-format-function-s (header)
  (gnus-article-spamassassin-score header))
(defun gnus-article-sort-by-spam-status (h1 h2)
  "Sort articles by Spamassassin score."
  (< (string-to-number (gnus-article-spamassassin-score h1))
     (string-to-number (gnus-article-spamassassin-score h2))))

Now you can use "%us" in your gnus-summary-line-format to display the
spam score.  You could add to your spamtrap folder's group parameters:

 (gnus-summary-line-format "%U%R%z%I%(%[%6us: %-23,23f%]%) %s\n")
 (gnus-article-sort-functions
  '(gnus-article-sort-by-spam-status))

This displays the score for each article, and also puts the most
likely false positives at the top.
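
For anyone adapting this, here is a minimal sketch of the score extraction
on its own, taking the header text as a plain string so it can be tried
outside Gnus.  The names my-spam-status-score and my-spam-score-less-p are
hypothetical, and accepting "score=" alongside "hits=" is an assumption
about how other SpamAssassin versions write the header, not something
confirmed in this thread.

```elisp
;; Sketch only: hypothetical helper names; "score=" support is an assumption.
(defun my-spam-status-score (status)
  "Return the numeric score found in STATUS, an X-Spam-Status value.
Return nil when STATUS contains no \"hits=\" or \"score=\" field."
  (when (and (stringp status)
             (string-match
              "\\(?:hits\\|score\\)=\\(-?[0-9]+\\(?:\\.[0-9]+\\)?\\)"
              status))
    ;; Group 1 holds only the number, so nothing after it needs trimming.
    (string-to-number (match-string 1 status))))

(defun my-spam-score-less-p (status1 status2)
  "Return non-nil when STATUS1 carries a lower score than STATUS2.
A header without a score counts as 0, like `string-to-number' on \"\"."
  (< (or (my-spam-status-score status1) 0)
     (or (my-spam-status-score status2) 0)))
```

Inside Gnus the string would presumably come from
(gnus-extra-header 'X-Spam-Status header), as in the functions above.
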
-- 
Shields.





* Re: displaying arbitrary headers in summary
  2003-11-06 18:19 ` Michael Shields
@ 2003-11-06 19:58   ` Ted Zlatanov
  2003-11-06 20:20     ` Michael Shields
  2003-11-13 17:32   ` making bogofilter write spam headers? Bill White
  1 sibling, 1 reply; 8+ messages in thread
From: Ted Zlatanov @ 2003-11-06 19:58 UTC
  Cc: Max Froumentin, ding

On Thu, 06 Nov 2003, shields@msrl.com wrote:

> In message <87znf9d6i8.fsf@w3.org>,
> Max Froumentin <max@lapin-bleu.net> wrote:
>> Basically, I'd like to display the spamassassin score of each
>> article in each line of the summary buffer. It's probably not easy
>> since the content of the X-Spam-Status is not in the .nov file by
>> default, but maybe there's a way to put it in there.
> 
> Here is one approach, along with a function that sorts by spam
> score:
> 
> (add-to-list 'nnmail-extra-headers 'X-Spam-Status)
> (defun gnus-article-spamassassin-score (header)
>   "Return the Spamassassin score of this article, as a string."
>   (gnus-replace-in-string
>    (gnus-replace-in-string
>     (gnus-extra-header 'X-Spam-Status header)
>     ".*hits=" "")
>    " .*" ""))
> (defun gnus-user-format-function-s (header)
>   (gnus-article-spamassassin-score header))
> (defun gnus-article-sort-by-spam-status (h1 h2)
>   "Sort articles by Spamassassin score."
>   (< (string-to-number (gnus-article-spamassassin-score h1))
>      (string-to-number (gnus-article-spamassassin-score h2))))

Can this go into spam.el, to be turned on by default for Bogofilter
and spam-use-regex (SpamAssassin) users if the users request it?  It
seems reasonable that the X-Spam-Status and similar headers should be
handled for summary formats, etc. by spam.el.

Maybe it can interface with the existing scoring functions for
Bogofilter.  I've been thinking about uniform weighted scoring for
spam.el - something that will always score between -1 and 1.  For
instance, SA scores, which can be between -inf. and inf. could be
weighted like so:

;;; assuming a range from -1 to 1
(defun normalize (score &optional offset)
  (let* ((score (float score))
	 (offset (if offset (float offset) 0))
	 (offset-score (- score offset)))
    (if (zerop offset-score)
	0
      (let* ((absolute-weighted-score (/ 2 offset-score))
	     (adjustment (if (> offset-score 0) 1 -1)))
	(- adjustment absolute-weighted-score)))))

What do you think?
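
Note that the function as written leaves the -1 to 1 range whenever the
offset score is closer to zero than 1: a score of 0.5 gives
1 - 2/0.5 = -3.  A sketch of one bounded alternative, assuming a simple
x / (1 + |x|) squash is acceptable.  The name my-normalize-score is
hypothetical, and this is not what spam.el does.

```elisp
;; Sketch only: a different mapping from the one above, chosen because it
;; is monotonic and stays strictly inside (-1, 1) for every finite input.
(defun my-normalize-score (score &optional offset)
  "Map SCORE, shifted by OFFSET, into the open interval (-1, 1)."
  (let ((x (- (float score) (float (or offset 0)))))
    (/ x (+ 1.0 (abs x)))))
```

With this mapping a score equal to the offset lands on 0, and scores one
unit above or below it land on 0.5 and -0.5.
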

> Now you can use "%us" in your gnus-summary-line-format to display
> the spam score.  You could add to your spamtrap folder's group
> parameters:
> 
>  (gnus-summary-line-format "%U%R%z%I%(%[%6us: %-23,23f%]%) %s\n")
>  (gnus-article-sort-functions
>   '(gnus-article-sort-by-spam-status))
> 
> This displays the score for each article, and also puts the most
> likely false positives at the top.

Looks very useful!  I don't think any of this belongs in spam.el of
course, but it should probably go in the manual if we add your
formatting and headers code above.

Ted




* Re: displaying arbitrary headers in summary
  2003-11-06 19:58   ` Ted Zlatanov
@ 2003-11-06 20:20     ` Michael Shields
  2003-11-06 21:00       ` Ted Zlatanov
  0 siblings, 1 reply; 8+ messages in thread
From: Michael Shields @ 2003-11-06 20:20 UTC
  Cc: ding

In message <4nsml12rs9.fsf@koz.bwh.harvard.edu>,
Ted Zlatanov <tzz@lifelogs.com> wrote:
> Can this go into spam.el, to be turned on by default for Bogofilter
> and spam-use-regex (SpamAssassin) users if the users request it?

If you like; I have copyright papers on file already.  We'd need to
assign another character for the spam score in summary lines, since %u
is from the reserved user space.

> Maybe it can interface with the existing scoring functions for
> Bogofilter.  I've been thinking about uniform weighted scoring for
> spam.el - something that will always score between -1 and 1.  For
> instance, SA scores, which can be between -inf. and inf. could be
> weighted like so:

It should be straightforward to create a function that will map from
one spam analyzer's scores to another, maintaining a constant
distribution; Spamassassin's distribution is well-known and I think
the pure Bayesian analyzers will have a natural bell curve.  But what
do you plan to do with this?

>> This displays the score for each article, and also puts the most
>> likely false positives at the top.
>
> Looks very useful!  I don't think any of this belongs in spam.el of
> course, but it should probably go in the manual if we add your
> formatting and headers code above.

Do you want me to write a documentation patch?
-- 
Shields.





* Re: displaying arbitrary headers in summary
  2003-11-06 20:20     ` Michael Shields
@ 2003-11-06 21:00       ` Ted Zlatanov
  0 siblings, 0 replies; 8+ messages in thread
From: Ted Zlatanov @ 2003-11-06 21:00 UTC
  Cc: Max Froumentin, ding

On Thu, 06 Nov 2003, shields@msrl.com wrote:

> In message <4nsml12rs9.fsf@koz.bwh.harvard.edu>,
> Ted Zlatanov <tzz@lifelogs.com> wrote:
>> Can this go into spam.el, to be turned on by default for Bogofilter
>> and spam-use-regex (SpamAssassin) users if the users request it?
> 
> If you like; I have copyright papers on file already.  We'd need to
> assign another character for the spam score in summary lines, since
> %u is from the reserved user space.

How would I do that for all of Gnus, considering I don't want it to
be on by default?  It seems like all that would have to be inserted
in gnus-summary-line-format-alist is:

    (?$ gnus-tmp-spam-score ?d)

but it seems only integers are supported.

Then in gnus-summary-prepare-threads I will need to add

(setq gnus-tmp-spam-score (or (cdr (assq number gnus-newsgroup-spam-scored))
				gnus-summary-default-spam-score 0))

But from that point on, I'm not sure where to add the code that would
let spam.el generate gnus-newsgroup-spam-scored.  Can I put it in the
summary entry hook, or is that too late because the lines are already
generated?
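
On the integers-only point: the stock entries in
gnus-summary-line-format-alist use ?s for string-valued variables as well
as ?d for integers, so one option might be to pre-format the score as a
string.  A sketch, where the name my-spam-score-string is hypothetical and
the one-decimal format is an arbitrary choice:

```elisp
;; Sketch only: turns a numeric score into display text so that a
;; string-typed format spec could carry a fractional value.
(defun my-spam-score-string (score)
  "Return SCORE as a string with one decimal place, or \"\" when SCORE is nil."
  (if (numberp score)
      (format "%.1f" (float score))
    ""))
```

The alist entry would then presumably read (?$ gnus-tmp-spam-score ?s),
with gnus-tmp-spam-score holding the string rather than the number.
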

>> Maybe it can interface with the existing scoring functions for
>> Bogofilter.  I've been thinking about uniform weighted scoring for
>> spam.el - something that will always score between -1 and 1.  For
>> instance, SA scores, which can be between -inf. and inf. could be
>> weighted like so:
> 
> It should be straightforward to create a function that will map from
> one spam analyzer's scores to another, maintaining a constant
> distribution; Spamassassin's distribution is well-known and I think
> the pure Bayesian analyzers will have a natural bell curve.  But
> what do you plan to do with this?

I like a consistent score.  If the SA score is 5 and Bogofilter says
2, what does that mean currently?  Well, it depends on the user's SA
and Bogofilter thresholds :)

Also, SA scores are not well-known.  The user can skew the tests up or
down, so they are not bound inside the -100 to +100 interval as it
would seem.

If I'm the only one who wants to normalize scores, then never mind,
I'll just use the raw SA or Bogofilter scores.  It's silly to
over-engineer something that needs speed more than anything (so the
summary buffer doesn't take a long time to be displayed).

>>> This displays the score for each article, and also puts the most
>>> likely false positives at the top.
>>
>> Looks very useful!  I don't think any of this belongs in spam.el of
>> course, but it should probably go in the manual if we add your
>> formatting and headers code above.
> 
> Do you want me to write a documentation patch?

Sure, if you want to write a tutorial on displaying and sorting by
the spam score.  But let's get the spam.el modifications done first,
so your tutorial doesn't have to include all that code.

Ted




* making bogofilter write spam headers?
  2003-11-06 18:19 ` Michael Shields
  2003-11-06 19:58   ` Ted Zlatanov
@ 2003-11-13 17:32   ` Bill White
  2003-11-13 18:07     ` Ted Zlatanov
  1 sibling, 1 reply; 8+ messages in thread
From: Bill White @ 2003-11-13 17:32 UTC
  Cc: ding

On Thu Nov 06 2003 at 12:19, Michael Shields <shields@msrl.com> said:

> In message <87znf9d6i8.fsf@w3.org>,
> Max Froumentin <max@lapin-bleu.net> wrote:

>> Basically, I'd like to display the spamassassin score of each
>> article in each line of the summary buffer. It's probably not easy
>> since the content of the X-Spam-Status is not in the .nov file by
>> default, but maybe there's a way to put it in there.
>
> Here is one approach, along with a function that sorts by spam score:

[...]

> Now you can use "%us" in your gnus-summary-line-format to display
> the spam score.  You could add to your spamtrap folder's group
> parameters:

[...]

> This displays the score for each article, and also puts the most
> likely false positives at the top.

I'd love to do this, but I don't have spam headers in my email (I'm
using bogofilter from inside gnus during mail splitting).  Is there a
way to get bogofilter to add its header during nnmail-split-fancy?

Cheers -

bw
-- 
Bill White . billw@wolfram.com . http://members.wri.com/billw
"No ma'am, we're musicians."





* Re: making bogofilter write spam headers?
  2003-11-13 17:32   ` making bogofilter write spam headers? Bill White
@ 2003-11-13 18:07     ` Ted Zlatanov
  2003-11-13 18:24       ` Bill White
  0 siblings, 1 reply; 8+ messages in thread
From: Ted Zlatanov @ 2003-11-13 18:07 UTC
  Cc: Michael Shields, ding

On Thu, 13 Nov 2003, billw@wolfram.com wrote:

> I'd love to do this, but I don't have spam headers in my email (I'm
> using bogofilter from inside gnus during mail splitting).  Is there
> a way to get bogofilter to add its header during nnmail-split-fancy?

I don't know of an *easy* way to modify incoming articles - maybe I
can just modify the spool buffer, but it seems like a bad idea because
of the many possible complications.

How about using the gnus-registry?  It can associate arbitrary data
with a message ID, and retrieval is very fast.  I could define the
extra data 'spam-score and keep track of it.  When the spam-score is
not found (for instance, you clear the registry or the message gets
removed from it), it can be regenerated.

The only small issue is what kind of score I am saving.  I'll
probably store cons cells like so: (bogofilter-score . 0.4) to
accommodate the various kinds of scoring systems (I decided to ditch
the universal spam score concept since no one but me was interested in
it).
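
Roughly, the lookup-or-regenerate behaviour described above could be
sketched like this.  A plain hash table stands in for the registry, and
every name here (my-spam-scores, my-spam-score-put, my-spam-score-get) is
hypothetical; none of it is the gnus-registry API.

```elisp
;; Sketch only: a hash table keyed by message ID, each value an alist of
;; (BACKEND-SYMBOL . SCORE) cells such as (bogofilter-score . 0.4).
(defvar my-spam-scores (make-hash-table :test 'equal)
  "Map message IDs to alists of (BACKEND-SYMBOL . SCORE).")

(defun my-spam-score-put (message-id backend score)
  "Record SCORE from BACKEND for MESSAGE-ID, replacing any older value."
  (let ((rest (assq-delete-all
               backend
               (copy-alist (gethash message-id my-spam-scores)))))
    (puthash message-id (cons (cons backend score) rest) my-spam-scores)))

(defun my-spam-score-get (message-id backend &optional regenerate)
  "Return the BACKEND score stored for MESSAGE-ID, or nil.
When none is stored and REGENERATE is a function, call it with
MESSAGE-ID, store the result, and return it."
  (let ((cell (assq backend (gethash message-id my-spam-scores))))
    (cond (cell (cdr cell))
          (regenerate
           (let ((score (funcall regenerate message-id)))
             (my-spam-score-put message-id backend score)
             score)))))
```

Keeping one cell per backend means a SpamAssassin score and a Bogofilter
score for the same message can sit side by side without being compared.
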

What do you think?

Thanks
Ted




* Re: making bogofilter write spam headers?
  2003-11-13 18:07     ` Ted Zlatanov
@ 2003-11-13 18:24       ` Bill White
  0 siblings, 0 replies; 8+ messages in thread
From: Bill White @ 2003-11-13 18:24 UTC
  Cc: ding, Ted Zlatanov

On Thu Nov 13 2003 at 12:07, Ted Zlatanov <tzz@lifelogs.com> said:

> On Thu, 13 Nov 2003, billw@wolfram.com wrote:
>
>> I'd love to do this, but I don't have spam headers in my email (I'm
>> using bogofilter from inside gnus during mail splitting).  Is there
>> a way to get bogofilter to add its header during
>> nnmail-split-fancy?
>
> I don't know of an *easy* way to modify incoming articles - maybe I
> can just modify the spool buffer, but it seems like a bad idea
> because of the many possible complications.

OK. (i.e., I'm clueless and I trust you :-)

I know messages can be modified with procmail, but I'd like to stay
away from procmail since I don't have the motivation to learn it, do
useful things with it and maintain a procmailrc.

> How about using the gnus-registry?  It can associate arbitrary data
> with a message ID, and retrieval is very fast.  I could define the
> extra data 'spam-score and keep track of it.  When the spam-score is
> not found (for instance, you clear the registry or the message gets
> removed from it), it can be regenerated.
>
> The only small issue is what kind of score I am saving.  I'll
> probably store cons cells like so: (bogofilter-score . 0.4) to
> accommodate the various kinds of scoring systems (I decided to ditch
> the universal spam score concept since no one but me was interested
> in it).
>
> What do you think?

Seems like that would track only the last gnus-registry-max-entries
messages.  Ah - but you say the registry entry and its spam score
could be regenerated.  That would take time, I suppose, but for use in
my one spam group I wouldn't mind waiting.

Sounds good for my setup, at least.  Thanks for doing the thinking!

Cheers -

bw
-- 
Bill White . billw@wolfram.com . http://members.wri.com/billw
"No ma'am, we're musicians."



