Spam splitting and multiple nnimap methods

Gnus development mailing list

* Spam splitting and multiple nnimap methods
@ 2004-05-17 21:50 Timothy Brown
  2004-05-18  9:53 ` Jonas Steverud
  0 siblings, 1 reply; 19+ messages in thread
From: Timothy Brown @ 2004-05-17 21:50 UTC (permalink / raw)


Hi,

I'm a little clueless about Gnus, having just started using it and Emacs a few
months ago.  The manual isn't entirely clear about how best to proceed along
this path.  Basically:

	1) I have multiple IMAP select methods.
	2) I want to do spam filtering via bogofilter and blackholes, and potentially
	   SA later.
	3) I want mail to be "autodetected" and shifted into a spam group.
	4) I will also manually train bogofilter by marking messages as spam.

I'm a little confused about how the options in .gnus and the group customize
options end up interacting.  I've spent hours poring over this and I'm still
not getting anywhere.

I've included my .gnus; I'm wondering if people could give me some pointers
on how best to get this accomplished?  You will note the relevant sections
of my configuration are commented out in order to prevent myself from losing
any more mail. :-)

Thanks,
Tim

; nil select methods

(setq gnus-select-method '(nnnil ""))

; IMAP select methods

[snip, multiple named nnimap methods, with
 -address, -directory, and stream defined for each]

[snip, additional customizations]

;(setq nnimap-split-download-body t)  ; download bodies - IMAP
;(setq spam-split-group "INBOX.spam") ; put all spam in this group
;(setq spam-use-blackholes t)         ; use blackholes
;(setq spam-use-bogofilter t)         ; use bogofilter

;(require 'spam)                      ; flip the switch marty
;(spam-initialize)                    ; flip the other switch too, doc

;(setq nnimap-split-inbox '("INBOX"))  ; split messages in INBOX

;(setq nnimap-split-rule 'nnimap-split-fancy) ; fancy splitting

; 
; fancy splitting rulesets
; 

;(setq nnimap-split-fancy '(| 
;                          (: spam-split)
;                          ;; default mailbox
;                          "INBOX"))




^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: Spam splitting and multiple nnimap methods
  2004-05-17 21:50 Spam splitting and multiple nnimap methods Timothy Brown
@ 2004-05-18  9:53 ` Jonas Steverud
  2004-05-18 12:53   ` Timothy Brown
  0 siblings, 1 reply; 19+ messages in thread
From: Jonas Steverud @ 2004-05-18  9:53 UTC (permalink / raw)

Timothy Brown <tim@tux.org> writes:

> Hi,
>
> I'm a little clueless about Gnus, having just started using it and Emacs a few
> months ago.  The manual isn't entirely clear about how best to proceed along
> this path.  Basically:
>
> 	1) I have multiple IMAP select methods.

No problem, I have two POP sources. Spam.el does not care about how
many sources you have.

> 	2) I want to do spam filtering via bogofilter and blackholes, and potentially
> 	   SA later.

(setq spam-use-bogofilter t)
(setq spam-use-blackholes t)
(setq spam-use-spamassassin t)  ; I have not used this myself; read
                                ; the docs in the info file for spam.el!

> 	3) I want mail to be "autodetected" and shifted into a spam group.
> 	4) I will also manually train bogofilter by marking messages as spam.

What you do is set up .gnus to load spam.el and tell it which filters
to use.  I use bogofilter, with BBDB as a whitelist, and have the
following setup:

-----------
(setq gnus-registry-cache-file (concat gnus-dribble-directory
				       "gnus.registry.eld")
      spam-split-group "Spam"
      spam-use-bogofilter t
      spam-use-BBDB t ;; Whitelist
      spam-log-to-registry t
      spam-mark-ham-unread-before-move-from-spam-group t
      spam-move-spam-nonspam-groups-only nil ; No moving at all.
      spam-disable-spam-split-during-ham-respool t
      )

(spam-initialize) ;; Loads the spam.el package etc.
(gnus-registry-initialize)
-----------

Using the registry (spam-log-to-registry and gnus-registry-initialize
above) is a good idea, since you can then have
(: gnus-registry-split-fancy-with-parent) in your split rules.  That
way emails are split together with their parents, which is good if you
move a discussion to a specific group, e.g. "Grandma's Birthday
Party".  The bad news is that Microsoft Outlook does not add a
References header, so those emails are not split correctly - but
Outlook Express does add it.

I've added (: spam-split) to my split rules.
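
For reference, a minimal sketch of split rules along those lines. It
assumes nnmail-style fancy splitting (as used for POP sources) and a
hypothetical catch-all group named "mail.misc". It is untested, so treat
it as a starting point only:

```elisp
;; Sketch only: "mail.misc" is a made-up fallback group name.
(setq nnmail-split-methods 'nnmail-split-fancy
      nnmail-split-fancy
      '(| ;; Follow the parent article if the registry knows where it went.
          (: gnus-registry-split-fancy-with-parent)
          ;; Returns `spam-split-group' for detected spam, nil otherwise.
          (: spam-split)
          ;; Everything else lands here.
          "mail.misc"))
```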

Now comes the second part: tell spam.el which groups contain spam and
which don't.  That is done with group parameters.

I use topics, which makes things a little easier (if you don't, you
might be interested in gnus-parameters, which is explained in the Group
Parameters node in the Gnus info file).

On the top Email topic, I have added this:
((comment
  (spam-contents gnus-group-spam-classification-ham))
 (spam-process
  ((spam spam-use-bogofilter)
   (ham spam-use-bogofilter)))
 (spam-process-destination)
 (comment
  (ham-marks
   (gnus-del-mark gnus-read-mark gnus-killed-mark gnus-kill-file-mark gnus-low-score-mark gnus-expirable-mark gnus-ancient-mark))))

It tells Gnus/spam.el to use bogofilter for both spam and ham, and that
spam-process-destination shall be nil (i.e. don't move the spam
anywhere; it will be marked as expired and deleted with the
rest of the expired emails in that group).  The first line, which says
that everything is ham, is commented out; I used it while training
Bogofilter on ham.  Together with the last line, it made all read emails
get processed as ham.

On my Spam group (nnfolder:Spam) I have
((expiry-wait . immediate)
 (ham-process-destination respool)
 (spam-contents gnus-group-spam-classification-spam)
 (ham-marks
  (gnus-ticked-mark)))

This deletes all expired emails at once, says that all ticked articles (!)
shall be considered ham and respooled (i.e. sent through the split process
again), and declares that this group consists of spam.
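
The same Spam group parameters could probably also be set from .gnus
through gnus-parameters instead of G p.  The following is an untested
sketch; it assumes that each entry is a group-name regexp followed by
ordinary group parameters, and the regexp itself is only illustrative:

```elisp
;; Sketch only: mirrors the per-group parameters above via a regexp match.
;; Note that this replaces any existing value of `gnus-parameters'.
(setq gnus-parameters
      '(("^nnfolder:Spam$"
         (expiry-wait . immediate)
         (ham-process-destination respool)
         (spam-contents gnus-group-spam-classification-spam)
         (ham-marks (gnus-ticked-mark)))))
```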

All spam I find in ham groups (i.e. everywhere except Spam) I mark
with M-d and it is sent to Bogofilter on exit and then marked as
expired.

All spam found during splitting is sent to Spam, and when I enter it,
all emails are marked with $ as spam.  Those I mark as ham (with !)
are sent to bogofilter as ham and then respooled.  The remaining
emails are sent to bogofilter to be trained as spam, and then deleted
due to the "immediate" value of expiry-wait.

The difference between this and your setup is that I do not have any
autodetection in any group, since I don't have access to News and all
my emails are filtered through (: spam-split).

HTH.

/Jonas
-- 
(        http://hem.bredband.net/steverud/        !     Wei Wu Wei     )
(        Meaning of U2 Lyrics, Roleplaying        !  To Do Without Do  )


* Re: Spam splitting and multiple nnimap methods
  2004-05-18  9:53 ` Jonas Steverud
@ 2004-05-18 12:53   ` Timothy Brown
  2004-05-18 13:50     ` Jonas Steverud
                       ` (3 more replies)
  0 siblings, 4 replies; 19+ messages in thread
From: Timothy Brown @ 2004-05-18 12:53 UTC (permalink / raw)
  Cc: ding

On Tue, May 18, 2004 at 11:53:01AM +0200, Jonas Steverud wrote:

> No problem, I have two POP sources. Spam.el does not care about how
> many sources you have.

This I keep reading, but I'm a little confused.

Using nnimap, I guess, my real challenge is understanding three things:

Where can I make a server declaration?  If I have something like:

>       spam-split-group "Spam"

Does that split to the spam group locally (nnfolder), does it split to
the IMAP group on that particular server, or does it split to an IMAP
group on a different server?  And when is split-group looked at?  Many
times my messages will say they are being "IMAP split host:INBOX:xx to INBOX"
but Gnus never sees them as part of that mailbox.  They aren't being lost,
exactly, but they do exist.  Also, how does splitting on nnimap, and spam-split,
interact?  So far I've had terrible luck with imap splitting, even with
splitting on the bodies as the manual says.

This is one area where Gnus' flexibility is giving me a huge headache - the
manual just isn't clear enough.

> I've added (: spam-split) to my split rules.

My own split rules are pretty simple - essentially split from "INBOX"
(which nnimap box is it splitting from?  From all of them?), run
: spam-split, (see question above), and then return messages not split
to "INBOX" (again, is this on the server I'm currently checking)?

> Now comes the second part; tell spam.el which groups contains spam and
> which don't. That is done with group parameters.

This part I'm sort of kind of -- well, totally lost on.

My assumption is:

	- INBOX will always contain spam.
	- I don't care about any other groups at the moment.
	- If there is spam in INBOX, move it elsewhere;
	- when I leave INBOX, process what I have marked (with S x) as
	  spam for bogofilter to train with.

The questions:

	- Do I need to do anything with ham?
	- If so, what?
	- How do I achieve the right functionality with the rules above?


* Re: Spam splitting and multiple nnimap methods
  2004-05-18 12:53   ` Timothy Brown
@ 2004-05-18 13:50     ` Jonas Steverud
  2004-05-18 14:02     ` IMAP Splitting with multiple mailboxes (was Re: Spam splitting and multiple nnimap methods) Timothy Brown
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 19+ messages in thread
From: Jonas Steverud @ 2004-05-18 13:50 UTC (permalink / raw)

Timothy Brown <tim@tux.org> writes:

> On Tue, May 18, 2004 at 11:53:01AM +0200, Jonas Steverud wrote:
>
>> No problem, I have two POP sources. Spam.el does not care about how
>> many sources you have.
>
> This I keep reading, but i'm a little confused.
>
> Using nnimap, I guess, my real challenge is understanding three things:
>
> Where can I make a server declaration?  If I have something like:
>
>>       spam-split-group "Spam"
>
> Does that split to the spam group locally (nnfolder), does it split to
> the IMAP group on that particular server, or does it split to an IMAP
> group on a different server?

On that particular server.  spam-split-group should never contain a
colon or anything before it, since the backend prefix is added
automatically.  If Gnus fetches emails from nnimap+my.isp.com, the spam
will end up in nnimap+my.isp.com:Spam.  When it fetches by POP for
nnfolder it will end up in nnfolder+pop.my.isp.com:Spam.  You will always
be able to see which backend the spam came from.

> And when is split-group looked at?  Many times my messages will say
> they are being "IMAP split host:INBOX:xx to INBOX" but Gnus never
> sees them as part of that mailbox.  They aren't being lost, exactly,
> but they do exist.  Also, how does splitting on nnimap, and
> spam-split, interact?  So far i've had terrible luck with imap
> splitting, even with splitting on the bodies as the manual says.
>
> This is one area where Gnus' flexibility is giving me a huge headache - the
> manual just isn't clear enough.

Sorry, I haven't used IMAP so I don't know.

>> I've added (: spam-split) to my split rules.
>
> My own split rules are pretty simple - essentially split from "INBOX"
> (which nnimap box is it splitting from?  From all of them?), run
> : spam-split, (see question above), and then return messages not split
> to "INBOX" (again, is this on the server i'm currently checking)?

It seems like you need to understand how splitting is done - esp. for
IMAP. I cannot help you since I have never used IMAP. You could try to
send a specific IMAP-question to the list to see if anyone responds.

>> Now comes the second part; tell spam.el which groups contains spam and
>> which don't. That is done with group parameters.
>
> This part i'm sort of kind of -- well, totally lost on.
>
> My assumption is:
>
> 	- INBOX will always contain spam.
> 	- I don't care about any other groups at the moment.
> 	- If there is spam in INBOX, move it elsewhere;

All emails that are marked as spam when you exit the group are moved to
the spam-process-destination declared by that group's parameters.

> 	- when I leave INBOX, process what I have marked (with S x) as
> 	  spam for bogofilter to train with.

Done automatically.

> The questions:
>
> 	- Do I need to do anything with ham?

If ham is in a spam-only group (like my nnfolder:Spam) you might
want to move it; otherwise it sounds like a good idea to let it
be.  You need to train bogofilter on some ham, otherwise it will
consider everything spam (since it won't have a list of legitimate words).

> 	- If so, what?
> 	- How do I achieve the right functionality with the rules above?

Try copy-and-pasting my setup and work from there.  Add one feature at a time.

1. Decide on how spam should be treated when it arrives (i.e. Gnus
   fetches it) - should it end up in your normal boxes or should it
   end up in a specific spam box? I have the latter in my case but you
   can have the former if you want to - just don't add (: spam-split).

2. On your email groups, press G p and replace the nil (or add to the
   list if you have already set some parameters) with the
   parameters I suggested.  Remove the comments around the ham-marks and
   spam-contents (contents-is-ham) lines; that way all your ham will be
   processed as ham and bogofilter will be trained on it.  When you have
   done so for a couple of hundred emails, insert the comments again - or
   remove the lines.

You may want to use G c instead of G p if you are unfamiliar with Lisp
or with editing group parameters by hand.  G c invokes the
customization engine for group parameters.

3. Add the lisp code in my last mail to your .gnus.

4. Add the autodetection feature according to the info file (you seem
   to want it, while I do not, so it is not included in my example).

-- 
(        http://hem.bredband.net/steverud/        !     Wei Wu Wei     )
(        Meaning of U2 Lyrics, Roleplaying        !  To Do Without Do  )


* IMAP Splitting with multiple mailboxes (was Re: Spam splitting and multiple nnimap methods)
  2004-05-18 12:53   ` Timothy Brown
  2004-05-18 13:50     ` Jonas Steverud
@ 2004-05-18 14:02     ` Timothy Brown
  2004-05-18 14:13       ` IMAP Splitting with multiple mailboxes Kai Grossjohann
  2004-05-18 14:13     ` Spam splitting and multiple nnimap methods Jonas Steverud
  2004-05-18 19:11     ` Ted Zlatanov
  3 siblings, 1 reply; 19+ messages in thread
From: Timothy Brown @ 2004-05-18 14:02 UTC (permalink / raw)


Can someone provide a better explanation of how IMAP spam-splitting and
standard splitting, particularly with multiple servers, interact?  I've read
the manual and the sections on virtual server spam splitting but I'm still a
little lost, particularly in combinations using (: spam-split).

In short:

 1) I am using fancy splitting;
 2) I have spam-split as part of my splitting rules;
 3) I am not specifying any nnimap+host:folder definitions. 

Do I need to be using the virtual server stuff, as 6.5.1 in the Manual
suggests?  If so, how does this work with fancy splitting?

Thanks,
Tim


> Does that split to the spam group locally (nnfolder), does it split to
> the IMAP group on that particular server, or does it split to an IMAP
> group on a different server?  And when is split-group looked at?  Many
> times my messages will say they are being "IMAP split host:INBOX:xx to INBOX"
> but Gnus never sees them as part of that mailbox.  They aren't being lost,
> exactly, but they do exist.  Also, how does splitting on nnimap, and
> spam-split, interact?  So far i've had terrible luck with imap splitting,
> even with splitting on the bodies as the manual says.
> 
> This is one area where Gnus' flexibility is giving me a huge headache - the
> manual just isn't clear enough.





* Re: IMAP Splitting with multiple mailboxes
  2004-05-18 14:02     ` IMAP Splitting with multiple mailboxes (was Re: Spam splitting and multiple nnimap methods) Timothy Brown
@ 2004-05-18 14:13       ` Kai Grossjohann
  2004-05-18 14:15         ` Timothy Brown
  0 siblings, 1 reply; 19+ messages in thread
From: Kai Grossjohann @ 2004-05-18 14:13 UTC (permalink / raw)


Timothy Brown <tim@tux.org> writes:

>  3) I am not specifying any nnimap+host:folder definitions. 

Do you have an nnimap server at all?  If you don't, you can forget
about nnimap-split-rule: it is not used.

Kai




* Re: Spam splitting and multiple nnimap methods
  2004-05-18 12:53   ` Timothy Brown
  2004-05-18 13:50     ` Jonas Steverud
  2004-05-18 14:02     ` IMAP Splitting with multiple mailboxes (was Re: Spam splitting and multiple nnimap methods) Timothy Brown
@ 2004-05-18 14:13     ` Jonas Steverud
  2004-05-18 19:11     ` Ted Zlatanov
  3 siblings, 0 replies; 19+ messages in thread
From: Jonas Steverud @ 2004-05-18 14:13 UTC (permalink / raw)


Timothy Brown <tim@tux.org> writes:

> On Tue, May 18, 2004 at 11:53:01AM +0200, Jonas Steverud wrote:
>
>> No problem, I have two POP sources. Spam.el does not care about how
>> many sources you have.
>
> This I keep reading, but i'm a little confused.

BTW, also search the archive for my discussion with Ted Zlatanov
from around March.  There might be some good answers for you there as well.

-- 
(        http://hem.bredband.net/steverud/        !     Wei Wu Wei     )
(        Meaning of U2 Lyrics, Roleplaying        !  To Do Without Do  )





* Re: IMAP Splitting with multiple mailboxes
  2004-05-18 14:13       ` IMAP Splitting with multiple mailboxes Kai Grossjohann
@ 2004-05-18 14:15         ` Timothy Brown
  2004-05-18 15:53           ` Kai Grossjohann
  0 siblings, 1 reply; 19+ messages in thread
From: Timothy Brown @ 2004-05-18 14:15 UTC (permalink / raw)
  Cc: ding

On Tue, May 18, 2004 at 04:13:21PM +0200, Kai Grossjohann wrote:
> Timothy Brown <tim@tux.org> writes:
> 
> >  3) I am not specifying any nnimap+host:folder definitions. 
> 
> Do you have an nnimap server at all?  If you don't, you can forget
> about nnimap-split-rule: it is not used.
> 
> Kai

Yes, that was the reason for my original mail. :-)

I actually have three nnimap select methods, each pointing to a different
server.  

Tim




* Re: IMAP Splitting with multiple mailboxes
  2004-05-18 14:15         ` Timothy Brown
@ 2004-05-18 15:53           ` Kai Grossjohann
  2004-05-18 15:58             ` Timothy Brown
  0 siblings, 1 reply; 19+ messages in thread
From: Kai Grossjohann @ 2004-05-18 15:53 UTC (permalink / raw)


Timothy Brown <tim@tux.org> writes:

> I actually have three nnimap select methods, each pointing to a different
> server.  

Now I'm pretty confused as to what you want.  And I can't find your
first message which describes this :-|

Sorry to be of so little help.

Kai





* Re: IMAP Splitting with multiple mailboxes
  2004-05-18 15:53           ` Kai Grossjohann
@ 2004-05-18 15:58             ` Timothy Brown
  2004-05-18 16:14               ` Kai Grossjohann
  0 siblings, 1 reply; 19+ messages in thread
From: Timothy Brown @ 2004-05-18 15:58 UTC (permalink / raw)
  Cc: ding

On Tue, May 18, 2004 at 05:53:37PM +0200, Kai Grossjohann wrote:
> Timothy Brown <tim@tux.org> writes:
> 
> > I actually have three nnimap select methods, each pointing to a different
> > server.  
> 
> Now I'm pretty confused as to what you want.  And I can't find your
> first message which describes this :-|
> 
> Sorry to be of so little help.
> 
> Kai

I have three nnimap select methods selecting from different servers.
When using nnimap splitting - fancy splitting - where do the messages
end up if there is no declaration of nnimap+host:folder, and just a folder
declaration?  Do they end up local to the nnimap server that is being split
from at the time?  Can splitting occur between servers?  Can I see messages
on one server and have them moved via nnimap to a different server using
splitting?  How does : spam-split interact with nnimap fancy splitting,
particularly in the concept of having multiple virtual servers?

Tim


* Re: IMAP Splitting with multiple mailboxes
  2004-05-18 15:58             ` Timothy Brown
@ 2004-05-18 16:14               ` Kai Grossjohann
  0 siblings, 0 replies; 19+ messages in thread
From: Kai Grossjohann @ 2004-05-18 16:14 UTC (permalink / raw)

Timothy Brown <tim@tux.org> writes:

> I have three nnimap select methods selecting from different servers.
> When using nnimap splitting - fancy splitting - where do the messages
> end up if there is no declaration of nnimap+host:folder,

I think that splitting cannot cross server boundaries.  That is, if
you specify "INBOX.foo" as the target group for a server, then it
will be on the server currently being split.

> and just a folder declaration?  Do they end up local to the nnimap
> server that is being split from at the time?

I think so.

> Can splitting occur between servers?

I think that's not possible.

> Can I see messages on one server and have them moved via nnimap to a
> different server using splitting?

Well, splitting happens before you see the messages.  So you can't
see them before splitting...  Maybe you mean something else?

> How does : spam-split interact with nnimap fancy splitting,
> particularly in the concept of having multiple virtual servers?

Huh.  Dunno.  I guess that if it says that INBOX.spam is the target,
then this will be on the server being split...

A question you didn't ask is how to have different split rules for
different servers.  The variable nnimap-split-rule appears to allow
this, but I don't see how it would work for fancy splitting.

[time passes]

Oh!  Now I see.  Here's what you do:

(setq nnimap-split-rule '(("server1" ("INBOX" tim-split-fancy-1))
                          ("server2" ("INBOX" tim-split-fancy-2))))

(setq tim-split-fancy-rule-1
      -value-that-looks-like-nnimap-split-fancy-)
(setq tim-split-fancy-rule-2
      -another-value-looking-like-nnimap-split-fancy-)

(defun tim-split-fancy-1 ()
  (let ((nnimap-split-fancy tim-split-fancy-rule-1))
    (nnimap-split-fancy)))

(defun tim-split-fancy-2 ()
  (let ((nnimap-split-fancy tim-split-fancy-rule-2))
    (nnimap-split-fancy)))

The idea is that you define two different functions to perform the
splitting, and both functions essentially do like the function
nnimap-split-fancy, but temporarily change the value of the
nnimap-split-fancy variable.

That's a common Lisp trick.

All of the above is untested.  I never tried anything that fancy with
splitting ;-)
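
To make the placeholders concrete, the two rule variables might be
filled in like this.  The group names and the address pattern are
hypothetical, and like the code above this is untested; it assumes the
nnimap-split-fancy variable and function of the nnimap backend current
at the time:

```elisp
;; Sketch only: "INBOX.ding", "INBOX.mail" and "mail" are example names.
(setq tim-split-fancy-rule-1
      '(| (: spam-split)                          ; spam goes first
          (any "ding@gnus\\.org" "INBOX.ding")    ; one list, one group
          "INBOX.mail"))                          ; fallback on server1

(setq tim-split-fancy-rule-2
      '(| (: spam-split)
          "mail"))                                ; fallback on server2
```

Each rule ends in a plain group name, so a message that matches nothing
else still has somewhere to go on the server being split.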

Kai


* Re: Spam splitting and multiple nnimap methods
  2004-05-18 12:53   ` Timothy Brown
                       ` (2 preceding siblings ...)
  2004-05-18 14:13     ` Spam splitting and multiple nnimap methods Jonas Steverud
@ 2004-05-18 19:11     ` Ted Zlatanov
  2004-05-18 22:19       ` Timothy Brown
  2004-05-20 10:27       ` Yair Friedman
  3 siblings, 2 replies; 19+ messages in thread
From: Ted Zlatanov @ 2004-05-18 19:11 UTC (permalink / raw)
  Cc: Jonas Steverud, ding

On Tue, 18 May 2004, tim@tux.org wrote:

> Where can I make a server declaration?  If I have something like:
> 
>>       spam-split-group "Spam"
> 
> Does that split to the spam group locally (nnfolder), does it split to
> the IMAP group on that particular server, or does it split to an IMAP
> group on a different server?  

spam-split-group is returned by spam-split, that's all.  So if a
valid group name can be a split target, it's valid for
spam-split-group.  This means that it really depends on the server,
but generally the group will be created as an IMAP folder on an
nnimap server, or as a directory on a nnml server for example.

I actually meant to write cross-server splitting, which would allow
"nnimap+server.com:externalgroup" but I keep forgetting about it.
No one seems to be clamoring for it, so I guess it's not that
important.

> And when is split-group looked at?  

When you invoke spam-split.

> Many times my messages will say they are being "IMAP split
> host:INBOX:xx to INBOX" but Gnus never sees them as part of that
> mailbox.  They aren't being lost, exactly, but they do exist.

I have no idea what you mean here, sorry.

> Also, how does splitting on nnimap, and spam-split, interact?  So
> far i've had terrible luck with imap splitting, even with
> splitting on the bodies as the manual says.

spam-split is just a function that can return nil (meaning "skip
this rule") or a string (meaning a group name).  It should not be
last because it could return nil and then you make Gnus unhappy.
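
In other words, a fancy split should end in a plain group name so that
something always matches.  A minimal sketch, with "spam" and "mail" as
assumed group names:

```elisp
;; Sketch only.  `spam-split' yields `spam-split-group' for spam and nil
;; for everything else, so a literal group must follow it.
(setq spam-split-group "spam")
(setq nnimap-split-fancy
      '(| (: spam-split)   ; may return nil ("skip this rule")
          "mail"))         ; fallback, so the split never comes up empty
```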

> This is one area where Gnus' flexibility is giving me a huge
> headache - the manual just isn't clear enough.

This is my fault, since I wrote most of the spam support
documentation.  I've had help from several volunteers with the
manual; if you would like to help as well that would be great.

>> I've added (: spam-split) to my split rules.
> 
> My own split rules are pretty simple - essentially split from "INBOX"
> (which nnimap box is it splitting from?  From all of them?), run
> : spam-split, (see question above), and then return messages not split
> to "INBOX" (again, is this on the server i'm currently checking)?

You should not split back into INBOX.  It's been done, but it's
unnecessary.  Make your last split group "mail" for example and
you'll be happier.

Each IMAP server with a nnimap server entry in your Gnus setup can
have its own split rules.  This is my setup, for instance:

(setq nnimap-split-rule '(
		     ("lifelogs" ("INBOX" nnimap-split-fancy))
		     ("imap" ("INBOX" nnimap-courier-split-fancy))))

as opposed to the simpler but less useful:

(setq nnimap-split-rule 'nnimap-split-fancy)

I use nnimap-courier-split-fancy as a wrapper around
nnimap-split-fancy to prepend "INBOX." to the group names, because
of Courier IMAP's particular group name prefix, but that's not
important.  What's important is that you can specify split rules
for each server, instead of one rule for them all.
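
Such a wrapper might look roughly like the following.  This is a guess
at the shape, not the actual nnimap-courier-split-fancy: it assumes
that the function nnimap-split-fancy returns a list of group-name
strings, and the my- names are made up for illustration:

```elisp
;; Hypothetical helper: prepend PREFIX to every group name in GROUPS.
(defun my-prefix-groups (prefix groups)
  "Return GROUPS with PREFIX prepended to each group name."
  (mapcar (lambda (group) (concat prefix group)) groups))

;; Hypothetical wrapper, usable as a per-server entry in `nnimap-split-rule'.
;; Assumes `nnimap-split-fancy' returns a list of strings.
(defun my-courier-split-fancy ()
  "Run `nnimap-split-fancy' and put \"INBOX.\" in front of each result."
  (my-prefix-groups "INBOX." (nnimap-split-fancy)))

;; (my-prefix-groups "INBOX." '("mail" "spam"))
;; evaluates to ("INBOX.mail" "INBOX.spam")
```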

>> Now comes the second part; tell spam.el which groups contains spam and
>> which don't. That is done with group parameters.
> 
> This part i'm sort of kind of -- well, totally lost on.

I'll write out some text here that may go in the manual, so forgive
me if I get a little too wordy.

You can see some information at http://lifelogs.com/spam, where I
try to explain the top-level ideas about spam.el.

There are only three important things in the group parameters:  1) the
group spam/ham classification, 2) the exit spam/ham processors, and
3) the exit spam/ham destination.  All of them, and the rest, can
be accessed with `G c'.

The exit spam/ham processors are applied to spam or ham mail when
you exit the group.

The exit spam/ham destination is where spam or ham is moved when
you exit the group (nil, the default, means expire; 'respool means
to respool it back through the splitting process).  You can even specify
multiple group names here, and they can be on other servers!

The group classification is important when you enter AND exit a
group.

When you enter a SPAM group, any UNREAD messages will be marked as
spam.

When you enter any other type of group (ham or unclassified), you
have to mark spam manually.

Ham is mail that matches the ham-marks parameters.  Things get
complicated here, but basically you can control that parameter to
only consider ticked (!) articles ham, for instance.  The defaults
should be OK for most users.

Spam is also identified through a parameter (spam-marks) but you
should really leave that to be just the spam article mark.  Some
people consider low-score articles spam, but I don't recommend it.

Finally, the group classification (ham/spam) matters when you exit
the group.  Ham, for instance, will get moved out of spam groups
but not out of ham groups.

> My assumption is:
> 
> 	- INBOX will always contain spam.
> 	- I don't care about any other groups at the moment.
> 	- If there is spam in INBOX, move it elsewhere;
> 	- when I leave INBOX, process what I have marked (with S x) as
> 	  spam for bogofilter to train with.

You don't say where spam that bogofilter detects should go.  I'll
assume it will go to "spam" - set the spam-split-group variable to
that.

I suggest you make "mail" your main mailbox, and leave INBOX to be
just the splitting source.  Set spam-use-bogofilter to t globally.

Use spam-split in your IMAP splitting methods, and it will send
what bogofilter thinks is spam to the spam-split-group ("spam").
Make "mail" the last entry in your splitting method, so all mail
will go there.

Make "mail" a spam group, and when you enter it all unread mail will
be marked as spam.  Set the "mail" group spam exit processor to
bogofilter.

Do the same as above for the "spam" group.

Now all spam mail will be processed and marked as expired (since
there is no spam destination for the "mail" and "spam" groups).  You
can delete it when you want, or let automatic expiry do it for you.

If it seems like "spam" and "mail" are similar, you're right.  Most
people make "mail" a ham group, and "spam" a spam group.  But you
can do it your way if you like.
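
Put together, the .gnus side of the layout described above might look
something like this.  It is an untested sketch that assumes a single
split rule shared by all nnimap servers and the group names "spam" and
"mail" from the text; the per-group classification and exit processors
would still be set with `G c':

```elisp
(require 'spam)

(setq spam-use-bogofilter t            ; global, as suggested above
      spam-split-group "spam"          ; where detected spam is sent
      nnimap-split-download-body t     ; assumption: let the filter see bodies
      nnimap-split-inbox '("INBOX")    ; INBOX is only the splitting source
      nnimap-split-rule 'nnimap-split-fancy
      nnimap-split-fancy '(| (: spam-split)
                             "mail"))  ; everything else ends up here

(spam-initialize)
```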

> The questions:
> 
>	- Do I need to do anything with ham?
>	- If so, what?

Well, if you want to train bogofilter with the ham, for every spam
group you have, set the ham exit processor to bogofilter, and the
ham destination to a group you want to hold your ham messages.

>	- How do I achieve the right functionality with the rules above?

Let me know if the information above is what you wanted, first,
before we write the code.

Ted


* Re: Spam splitting and multiple nnimap methods
  2004-05-18 19:11     ` Ted Zlatanov
@ 2004-05-18 22:19       ` Timothy Brown
  2004-05-19 11:36         ` Jonas Steverud
  2004-05-19 14:48         ` Ted Zlatanov
  2004-05-20 10:27       ` Yair Friedman
  1 sibling, 2 replies; 19+ messages in thread
From: Timothy Brown @ 2004-05-18 22:19 UTC (permalink / raw)

On Tue, May 18, 2004 at 03:11:49PM -0400, Ted Zlatanov wrote:

[snip]

> I actually meant to write cross-server splitting, which would allow
> "nnimap+server.com:externalgroup" but I keep forgetting about it.
> No one seems to be clamoring for it, so I guess it's not that
> important.

Clamor! Clamor! That would be good stuff.  You know, the reason I moved
to Gnus is it really handled multiple mailboxes and IMAP servers "well",
and it was the only client to do so and have everything displayed in a
way that made sense, other than Thunderbird which doesn't work for me
due to my reliance on text-based terminals, etc.  Cross-server splitting
would, for instance, allow me to treat all messages universally as part
of a single server, thus creating a kind of "IMAP proxy" setup.  But perhaps
kibozed groups offer me the same functionality(?) - haven't looked into
this.

> > Many times my messages will say they are being "IMAP split
> > host:INBOX:xx to INBOX" but Gnus never sees them as part of that
> > mailbox.  They aren't being lost, exactly, but they do exist.
> 
> I have no idea what you mean here, sorry.

What I meant to say was, Gnus is processing messages through the split,
leaving them where they are because INBOX is the default mailbox in
nnimap-split-fancy, but they are not showing up in Gnus yet are showing
up in, for instance, mutt when I point it at the mail server.  You talk
more about this below (and I reply...)

> > This is one area where Gnus' flexibility is giving me a huge
> > headache - the manual just isn't clear enough.
> 
> This is my fault, since I wrote most of the spam support
> documentation.  I've had help from several volunteers with the
> manual; if you would like to help as well that would be great.

I didn't mean to point fingers, but the information you've provided here
has really helped to clarify the process.  (But see below...)

> >> I've added (: spam-split) to my split rules.
> > 
> > My own split rules are pretty simple - essentially split from "INBOX"
> > (which nnimap box is it splitting from?  From all of them?), run
> > : spam-split, (see question above), and then return messages not split
> > to "INBOX" (again, is this on the server i'm currently checking)?
> 
> You should not split back into INBOX.  It's been done, but it's
> unnecessary.  Make your last split group "mail" for example and
> you'll be happier.

The behavior I really want is:

  1) Go through INBOX, detect whether mail is bogo-spammed or in a blacklist.
    a) Move this mail to the spam folder.

  2) Read the INBOX, and manually mark what bogofilter didn't clue in on as
     spam.

  3) Leave INBOX, and have the remaining mail that's there trained as ham.

You mention that I shouldn't split back into INBOX; can you explain why this
is unnecessary and/or bad?  I'm trying to figure out why it makes sense
to need a different folder as my INBOX (although I'm not against
the idea, I'd like to leave INBOX to its intended purpose and expire mail
into INBOX.mail later).

> Each IMAP server with a nnimap server entry in your Gnus setup can
> have its own split rules.  This is my setup, for instance:
> 
> (setq nnimap-split-rule '(
> 		     ("lifelogs" ("INBOX" nnimap-split-fancy))
> 		     ("imap" ("INBOX" nnimap-courier-split-fancy))))
> 
> as opposed to the simpler but less useful:
> 
> (setq nnimap-split-rule 'nnimap-split-fancy)

This sheds a ton of light on how nnimap can fancy split using individual
servers, thanks.  In these rulesets, you're only specifying folder names
and not fully qualified nn<backend>+etc. stuff, right?  This really needs
to go into the fancy splitting section or the IMAP section of the manual.

> I use nnimap-courier-split-fancy as a wrapper around
> nnimap-split-fancy to prepend "INBOX." to the group names, because
> of Courier IMAP's particular group name prefix, but that's not
> important.  What's important is that you can specify split rules
> for each server, instead of one rule for them all.
> 
> >> Now comes the second part; tell spam.el which groups contains spam and
> >> which don't. That is done with group parameters.
> > 
> > This part i'm sort of kind of -- well, totally lost on.
> 

[snip]

I follow all this.  What isn't clear is what the 'Spam Autodetection'
feature is used for, and/or if it needs to be enabled (in G c), etc.

This has all been really helpful.  To summarize and make this as clear
as possible:

  - I want to scan for spam in every IMAP mailbox I have.

  - If mail appears as spam based on what bogofilter and/or the blackholes
    rule knows, then dump it in a spam folder that is individual to that
    IMAP server.  Later SpamAssassin (via spamc) will be added to the
    mix.

  - I'll also scan through this folder after it's done, read the mail I want
    to read, mark certain things as spam, and treat everything else as ham.

  - Spam and ham will be processed on group exit.

  - It would be great if that folder was 'INBOX', but I understand if it has
    to be 'mail'.

  - I don't care about other folders at the moment.

My only real concern at this point is the weirdness I've seen with
splitting back to INBOX and never seeing the messages in Gnus, but I'll
bet that is a small problem.

Thanks again for all your help.  I think I'm almost there.

Tim

* Re: Spam splitting and multiple nnimap methods
  2004-05-18 22:19       ` Timothy Brown
@ 2004-05-19 11:36         ` Jonas Steverud
  2004-05-19 14:50           ` Ted Zlatanov
  2004-05-19 14:48         ` Ted Zlatanov
  1 sibling, 1 reply; 19+ messages in thread
From: Jonas Steverud @ 2004-05-19 11:36 UTC (permalink / raw)

Timothy Brown <tim@tux.org> writes:

[...]
> I follow all this.  What isn't clear is what the 'Spam Autodetection'
> feature is used for, and/or if it needs to be enabled (in G c), etc.

You can let spam.el detect spam in two ways: during splitting, with
(: spam-split), or when you enter a group. The latter is called
autodetection.

The reason for this is that some groups might not be fed through your
split rules; Usenet news is an example of this. :-)

For a newsgroup, you can let spam.el detect spam for you in the
group and mark it with the $ sign. To do so, turn on autodetection
for the group or groups by setting the group parameters
spam-autodetect and spam-autodetect-methods. I recommend using
G c to set them rather than G p, since I don't know which
values are valid.

From what you write above (and which I have snipped), I think that you
want autodetection turned on for your INBOX.

> This has all been really helpful.  To summarize and make this as clear
> as possible:
>
>   - I want to scan for spam in every IMAP mailbox I have.

Turn on autodetection for all your IMAP mailboxes, i.e. set the
aforementioned group parameters to sensible values by using
gnus-group-customize (G c) instead of G p.

>   - If mail appears as spam based on what bogofilter and/or the blackholes
>     rule knows, then dump it in a spam folder that is individual to that
>     IMAP server.  Later SpamAssassin (via spamc) will be added to the
>     mix.

For each group, or group of groups, set the spam-process-destination
group parameter to the group you want the spam to end up in. WARNING!
DO NOT include "nnimap+servername:" in spam-process-destination! That
will be added by spam.el.

>   - I'll also scan through this folder after it's done, read the mail I want
>     to read, mark certain things as spam, and treat everything else as ham.

If "this folder" is the spam folder, set spam-process-destination to
nil for the group (i.e. (spam-process-destination)) and mark all your
spam with M-d or S x while reading.

I would recommend setting
 (spam-process
  ((spam spam-use-bogofilter)
   (ham spam-use-bogofilter)))
in a top-level topic so you don't have to set it for each group.
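
A minimal sketch of the same parameters set from .gnus through the
gnus-parameters variable rather than through G c. The regexp and the
"INBOX.spam" group name are assumptions for illustration, and the server
prefix is left off the destination, as warned above:

```elisp
;; Sketch only: apply spam.el group parameters to every nnimap INBOX.
;; The regexp and "INBOX.spam" are hypothetical; adjust to your servers.
;; Note that setq replaces any existing value of gnus-parameters.
(setq gnus-parameters
      '(("^nnimap\\+.*:INBOX$"
         ;; where spam marked in this group is moved on group exit
         (spam-process-destination . "INBOX.spam")
         ;; train bogofilter with both spam and ham on group exit
         (spam-process ((spam spam-use-bogofilter)
                        (ham spam-use-bogofilter))))))
```

Setting the parameters on a topic, as suggested, does the same job
interactively.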

>   - Spam and ham will be processed on group exit.

That's the way spam.el is designed.

>   - It would be great if that folder was 'INBOX', but I understand if it has
>     to be 'mail'.

You're not afraid of loops?

> My only real concerns at this point about the above are the
> weirdness i've seen with splitting back to INBOX and never seeing
> the messages in Gnus, but I'll bet that is a small problem.

The reason for the problem might be that you split it back to the
source. Sounds strange to me, but I haven't looked into the IMAP stuff.

-- 
(        http://hem.bredband.net/steverud/        !     Wei Wu Wei     )
(        Meaning of U2 Lyrics, Roleplaying        !  To Do Without Do  )

* Re: Spam splitting and multiple nnimap methods
  2004-05-18 22:19       ` Timothy Brown
  2004-05-19 11:36         ` Jonas Steverud
@ 2004-05-19 14:48         ` Ted Zlatanov
  1 sibling, 0 replies; 19+ messages in thread
From: Ted Zlatanov @ 2004-05-19 14:48 UTC (permalink / raw)
  Cc: ding

On Tue, 18 May 2004, tim@tux.org wrote:

> On Tue, May 18, 2004 at 03:11:49PM -0400, Ted Zlatanov wrote:

>> I actually meant to write cross-server splitting, which would allow
>> "nnimap+server.com:externalgroup" but I keep forgetting about it.
>> No one seems to be clamoring for it, so I guess it's not that
>> important.
> 
> Clamor! Clamor! That would be good stuff.  You know, the reason I moved
> to Gnus is it really handled multiple mailboxes and IMAP servers "well",
> and it was the only client to do so and have everything displayed in a
> way that made sense, other than Thunderbird which doesn't work for me
> due to my reliance on text-based terminals, etc.  Cross-server splitting
> would, for instance, allow me to treat all messages universally as part
> of a single server, thus creating a kind of "IMAP proxy" setup.  But perhaps
> kibozed groups offer me the same functionality(?) - haven't looked into
> this.

They don't.  If anyone else would like cross-server splitting, speak
up.

>> > This is one area where Gnus' flexibility is giving me a huge
>> > headache - the manual just isn't clear enough.
>> 
>> This is my fault, since I wrote most of the spam support
>> documentation.  I've had help from several volunteers with the
>> manual; if you would like to help as well that would be great.
> 
> I didn't mean to point fingers, but the information you've provided here
> has really helped to clarify the process.  (But see below...)

I'm saying that I would appreciate any volunteer help with the
manuals, not that you are being too critical.  I often feel like I'm
too close to spam.el, so I don't realize how strange and complex its
features are to most people.

>> You should not split back into INBOX.  It's been done, but it's
>> unnecessary.  Make your last split group "mail" for example and
>> you'll be happier.
> 
> The behavior I really want is:
> 
>   1) Go through INBOX, detect whether mail is bogo-spammed or in a blacklist.
>   	a) Move this mail to the spam folder.
> 
>   2) Read the INBOX, and manually mark what bogofilter didn't clue in on as
>   	spam.
> 
>   3) Leave INBOX, and have the remaining mail that's there trained
>   as ham.

I do this, but instead of INBOX I use "mail," and I only train on ham
that's misidentified as spam.  Your approach to training and spam
detection is just as valid.

> You mention that I shouldn't split back into INBOX; can you explain why this
> is unnecessary and/or bad?  I'm trying to figure out why it makes sense
> to have to have a different folder as my INBOX (although i'm not against
> the idea, i'd like to leave INBOX to its intended purpose and expire mail
> into INBOX.mail later)

Splitting back into INBOX is, in fact, possible - Uwe Brauer reported
back in October 2003 that it works:

http://groups.google.com/groups?selm=m3ekxv2ufx.fsf%40maport01.mat.ucm.es&output=gplain

but I think you'll do yourself a disservice if you respool back into
INBOX.  At the very least, you will be respooling the same messages
again and again if you don't clear them out of INBOX.  Also, all Gnus
splitting is oriented towards splitting things out of the INBOX, not
filtering in place.

But if you want to use INBOX, you can.  Just make sure to report any
bugs that you observe in the process.

>> Each IMAP server with a nnimap server entry in your Gnus setup can
>> have its own split rules.  This is my setup, for instance:
>> 
>> (setq nnimap-split-rule '(
>> 		     ("lifelogs" ("INBOX" nnimap-split-fancy))
>> 		     ("imap" ("INBOX" nnimap-courier-split-fancy))))
>> 
>> as opposed to the simpler but less useful:
>> 
>> (setq nnimap-split-rule 'nnimap-split-fancy)
> 
> This sheds a ton of light on how nnimap can fancy split using individual
> servers, thanks.  In these rulesets, you're only specifying folder names
> and not fully qualified nn<backend>+etc. stuff, right?  This really needs
> to go into the fancy splitting section or the IMAP section of the
> manual.

I got the information from C-h v nnimap-split-rule, but it's also in
the manual.  Maybe it should be more prominent, but I don't know that
multiple IMAP servers are a very common configuration.
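
For reference, a minimal sketch of how a per-server rule and a fancy
split with (: spam-split) fit together. The server names "work" and
"home" and the group names "spam" and "mail" are assumptions, not
anyone's real setup:

```elisp
;; Sketch only: server names and group names are hypothetical.
(setq spam-use-bogofilter t)     ; backend consulted by spam-split
(setq spam-split-group "spam")   ; group that spam-split returns for spam

(require 'spam)
(spam-initialize)

;; One fancy rule shared by both servers: spam first, the rest to "mail".
(setq nnimap-split-fancy
      '(| (: spam-split)
          "mail"))

;; Only messages in INBOX are split.
(setq nnimap-split-inbox '("INBOX"))

;; Each entry is (SERVER (INBOX-REGEXP SPLIT-RULE)).  As far as this
;; thread suggests, group names in the rules are plain folder names,
;; without an "nnimap+server:" prefix.
(setq nnimap-split-rule
      '(("work" ("INBOX" nnimap-split-fancy))
        ("home" ("INBOX" nnimap-split-fancy))))
```

A server that needs different group names, such as a Courier server with
its "INBOX." prefix, would point at a different rule variable instead.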

> I follow all this.  What isn't clear is what the 'Spam Autodetection'
> feature is used for, and/or if it needs to be enabled (in G c), etc.

This was answered by Jonas Steverud.  It's also in the manual.

> This has all been really helpful.  To summarize and make this as clear
> as possible:
> 
>   - I want to scan for spam in every IMAP mailbox I have.
> 
>   - If mail appears as spam based on what bogofilter and/or the blackholes rule
>   	knows, then dump it in a spam folder that is individual to that IMAP
>   	server.  Later SpamAssassin (via spamc) will be added to the mix.
> 
>   - I'll also scan through this folder after it's done, read the mail I want to
>   	read, mark certain things as spam, and treat everything else as ham.
> 
>   - Spam and ham will be processed on group exit.
> 
>   - It would be great if that folder was 'INBOX', but I understand if it has to
>   	be 'mail'.
> 
>   - I don't care about other folders at the moment.
> 
> My only real concerns at this point about the above are the
> weirdness i've seen with splitting back to INBOX and never seeing
> the messages in Gnus, but I'll bet that is a small problem.

OK, I hope you'll come through unscathed :)

Remember you can set topic parameters that work just like group
parameters.  So you can say, for a whole topic, "the spam and ham
exit processor is bogofilter" instead of specifying it for each
group.  That simplifies things.

Make sure to keep backups of your INBOX!  You can use something like
my ifrom tool at http://lifelogs.com/source/ifrom.txt or whatever is
appropriate on the server side.  In theory no mail should be lost but
there's only you and Uwe splitting back into INBOX that I know of.

Ted



* Re: Spam splitting and multiple nnimap methods
  2004-05-19 11:36         ` Jonas Steverud
@ 2004-05-19 14:50           ` Ted Zlatanov
  0 siblings, 0 replies; 19+ messages in thread
From: Ted Zlatanov @ 2004-05-19 14:50 UTC (permalink / raw)


On Wed, 19 May 2004, tvrud@bredband.net wrote:

> From what you write above (and which I have snipped), I think that you
> want autodetection turned on for your INBOX.

That's an excellent point: Timothy doesn't even have to do nnimap
splitting.  He can just autodetect spam when he enters the group.
Neat!

Ted



* Re: Spam splitting and multiple nnimap methods
  2004-05-18 19:11     ` Ted Zlatanov
  2004-05-18 22:19       ` Timothy Brown
@ 2004-05-20 10:27       ` Yair Friedman
  2004-05-20 18:49         ` Ted Zlatanov
  1 sibling, 1 reply; 19+ messages in thread
From: Yair Friedman @ 2004-05-20 10:27 UTC (permalink / raw)


On 18 May 2004 15:11:49 -0400, 
"Ted Zlatanov" <tzz@lifelogs.com> writes:

> I actually meant to write cross-server splitting, which would allow
> "nnimap+server.com:externalgroup" but I keep forgetting about it.
> No one seems to be clamoring for it, so I guess it's not that
> important.
>


Please, a generic cross-server splitting method would be *very* useful
even when not filtering spam.

One example: IMAP servers with a quota, where you want the most important
email to stay on the server and less important email split directly to
nnmail groups.




* Re: Spam splitting and multiple nnimap methods
  2004-05-20 10:27       ` Yair Friedman
@ 2004-05-20 18:49         ` Ted Zlatanov
  2004-05-22 23:45           ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 19+ messages in thread
From: Ted Zlatanov @ 2004-05-20 18:49 UTC (permalink / raw)
  Cc: ding

On Thu, 20 May 2004, yairfr@icts-tech.com wrote:

> Please, generic cross-server splitting method is *very* useful even
> when not filtering spam.
> 
> Having quota on IMAP servers where you want most important email to
> stay and less important directly split to nnmail groups is one
> example.

OK.  Is there a hook I can use for right after all the mail splitting
has been done?  I plan to spool all the external-destination mail
in a "gnusexternalqueue" group and then move it out, and that group
name will be customizable.  I can't think of a cleaner approach that
wouldn't require too many hours of work :)

Ted

* Re: Spam splitting and multiple nnimap methods
  2004-05-20 18:49         ` Ted Zlatanov
@ 2004-05-22 23:45           ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 19+ messages in thread
From: Lars Magne Ingebrigtsen @ 2004-05-22 23:45 UTC (permalink / raw)


"Ted Zlatanov" <tzz@lifelogs.com> writes:

> OK.  Is there a hook I can use for right after all the mail splitting
> has been done? 

I think `nnmail-post-get-new-mail-hook' is the most likely
candidate... 
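
A minimal sketch of attaching a function to that hook. The function name
is hypothetical, and whether nnimap splitting actually runs this hook is
not confirmed in this thread:

```elisp
;; Sketch only: my-after-split-notice is a hypothetical placeholder for
;; whatever should happen once new mail has been fetched and split.
(defun my-after-split-notice ()
  "Report that mail fetching has finished."
  (message "New mail fetched; post-split work could run here"))

(add-hook 'nnmail-post-get-new-mail-hook #'my-after-split-notice)
```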

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




Thread overview: 19+ messages
2004-05-17 21:50 Spam splitting and multiple nnimap methods Timothy Brown
2004-05-18  9:53 ` Jonas Steverud
2004-05-18 12:53   ` Timothy Brown
2004-05-18 13:50     ` Jonas Steverud
2004-05-18 14:02     ` IMAP Splitting with multiple mailboxes (was Re: Spam splitting and multiple nnimap methods) Timothy Brown
2004-05-18 14:13       ` IMAP Splitting with multiple mailboxes Kai Grossjohann
2004-05-18 14:15         ` Timothy Brown
2004-05-18 15:53           ` Kai Grossjohann
2004-05-18 15:58             ` Timothy Brown
2004-05-18 16:14               ` Kai Grossjohann
2004-05-18 14:13     ` Spam splitting and multiple nnimap methods Jonas Steverud
2004-05-18 19:11     ` Ted Zlatanov
2004-05-18 22:19       ` Timothy Brown
2004-05-19 11:36         ` Jonas Steverud
2004-05-19 14:50           ` Ted Zlatanov
2004-05-19 14:48         ` Ted Zlatanov
2004-05-20 10:27       ` Yair Friedman
2004-05-20 18:49         ` Ted Zlatanov
2004-05-22 23:45           ` Lars Magne Ingebrigtsen
