* Re: Add hooks to Gnus on move/edit/delete?
2002-12-29 22:06 ` Kai Großjohann
@ 2002-12-29 22:53 ` Lars Magne Ingebrigtsen
2002-12-30 3:35 ` Ted Zlatanov
2002-12-30 18:49 ` Kai Großjohann
2002-12-30 3:33 ` Ted Zlatanov
2003-01-02 17:29 ` Simon Josefsson
2 siblings, 2 replies; 25+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-12-29 22:53 UTC
kai.grossjohann@uni-duisburg.de (Kai Großjohann) writes:
> I'm not sure. How do I tell spam.el which group a message is in? If
> ifile has chosen the group wrongly, how do I tell it about the error?
Quoth the manual:
-----
Gnus can learn from the spam you get. All you have to do is collect
your spam in one or more spam groups, and set the variable
`spam-junk-mailgroups' as appropriate. In these groups, all messages
are considered to be spam by default: they get the `H' mark. You must
review these messages from time to time and remove the `H' mark for
every message that is not spam after all. When you leave a spam group,
all messages that continue with the `H' mark, are passed on to the
spam-detection engine (bogofilter, ifile, and others). To remove the
`H' mark, you can use `M-u' to "unread" the article, or `d' for
declaring it read the non-spam way. When you leave a group, all `H'
marked articles, saved or unsaved, are sent to Bogofilter or ifile
(depending on `spam-use-bogofilter' and `spam-use-ifile'), which will
study them as spam samples.
-----
So I guess spam.el tells ifile these things upon group exit...
> But then, there is Gmane, which makes a lot of splitting unnecessary :-)
What's that, then? :-)
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: Add hooks to Gnus on move/edit/delete?
2002-12-29 22:53 ` Lars Magne Ingebrigtsen
@ 2002-12-30 3:35 ` Ted Zlatanov
2002-12-30 18:49 ` Kai Großjohann
1 sibling, 0 replies; 25+ messages in thread
From: Ted Zlatanov @ 2002-12-30 3:35 UTC
On Sun, 29 Dec 2002, larsi@gnus.org wrote:
> Quoth the manual:
>
> -----
> Gnus can learn from the spam you get. All you have to do is
> collect your spam in one or more spam groups, and set the
> variable `spam-junk-mailgroups' as appropriate. In these groups,
> all messages are considered to be spam by default: they get the
> `H' mark. You must review these messages from time to time and
> remove the `H' mark for every message that is not spam after all.
> When you leave a spam group, all messages that continue with the
> `H' mark, are passed on to the spam-detection engine (bogofilter,
> ifile, and others). To remove the `H' mark, you can use `M-u' to
> "unread" the article, or `d' for declaring it read the non-spam
> way. When you leave a group, all `H' marked articles, saved or
> unsaved, are sent to Bogofilter or ifile (depending on
> `spam-use-bogofilter' and `spam-use-ifile'), which will study
> them as spam samples.
> -----
>
> So I guess spam.el tells ifile these things upon group exit...
I'm afraid I gave incorrect info here: ifile does not yet get the
spam-marked articles. But I'd rather fix the code than the manual :)
See my other message to Kai.
Ted
* Re: Add hooks to Gnus on move/edit/delete?
2002-12-29 22:53 ` Lars Magne Ingebrigtsen
2002-12-30 3:35 ` Ted Zlatanov
@ 2002-12-30 18:49 ` Kai Großjohann
1 sibling, 0 replies; 25+ messages in thread
From: Kai Großjohann @ 2002-12-30 18:49 UTC
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> kai.grossjohann@uni-duisburg.de (Kai Großjohann) writes:
>
>> I'm not sure. How do I tell spam.el which group a message is in? If
>> ifile has chosen the group wrongly, how do I tell it about the error?
>
> Quoth the manual:
>[...]
> So I guess spam.el tells ifile these things upon group exit...
Hm. I'll read it again, but I thought spam.el only distinguished
between spam and ham, not between N different groups.
spam.el can use ifile as a binary classifier, but ifile-gnus.el
allows for using ifile as a multiway classifier. (Whee. Does that
make sense?)
--
Ambibibentists unite!
* Re: Add hooks to Gnus on move/edit/delete?
2002-12-29 22:06 ` Kai Großjohann
2002-12-29 22:53 ` Lars Magne Ingebrigtsen
@ 2002-12-30 3:33 ` Ted Zlatanov
2002-12-30 18:53 ` Kai Großjohann
2003-01-02 17:29 ` Simon Josefsson
2 siblings, 1 reply; 25+ messages in thread
From: Ted Zlatanov @ 2002-12-30 3:33 UTC
Cc: ding
On Sun, 29 Dec 2002, kai.grossjohann@uni-duisburg.de wrote:
> I think it might work if you use nnimap splitting. If you use Sieve
> or procmail, then it could be difficult, since the ifile database
> would be on your local host in the home dir, whereas the
> Sieve/procmail rules are on the IMAP server.
We can explicitly require that in the manual.
>> So if ifile-spam-filter could and should be replaced with another
>> ifile-gnus.el function that returns regular group names, spam.el
>> can do all the ifile classification automatically if the user just
>> sets spam-use-ifile to t.
>>
>> Does that sound reasonable?
>
> I'm not sure. How do I tell spam.el which group a message is in?
> If ifile has chosen the group wrongly, how do I tell it about the
> error?
I think it's supposed to be automatic, but currently only bogofilter
processing is done:
- when entering a spam group (a member of spam-junk-mailgroups),
  all unread messages are marked as spam
- when exiting any group:
  - spam-marked articles are processed as spam, then marked expired so
    they are not processed twice and so we don't keep them around.
  - ham-marked articles are processed as ham (non-spam); these are the
    gnus-del-mark, gnus-read-mark, gnus-killed-mark,
    gnus-kill-file-mark, and gnus-low-score-mark articles. There
    should be, but isn't, a mechanism to prevent them from being
    processed more than once. Any suggestions? Yet another cache or
    a supplemental ham-processed-mark?
When I say "processed," I mean "passed to bogofilter on the command line
with the appropriate flag (-n for ham, -s for spam)."
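A minimal sketch of that "processed" step, for illustration only. The
helper names my-bogofilter-flag and my-bogofilter-register are
hypothetical and are not functions from spam.el; the sketch also
assumes a bogofilter executable is on the PATH. Only the -s and -n
flags come from the description above.

```elisp
;; Sketch only: my-bogofilter-flag and my-bogofilter-register are
;; hypothetical names, not part of spam.el.

(defun my-bogofilter-flag (kind)
  "Return the bogofilter flag for KIND, which is `spam' or `ham'."
  (cond ((eq kind 'spam) "-s")
        ((eq kind 'ham) "-n")
        (t (error "Unknown kind: %S" kind))))

(defun my-bogofilter-register (kind article-text)
  "Pipe ARTICLE-TEXT to bogofilter, registering it as KIND.
Return the exit status of the bogofilter process."
  (with-temp-buffer
    (insert article-text)
    ;; Send the whole buffer to bogofilter on stdin and discard
    ;; any output it prints.
    (call-process-region (point-min) (point-max)
                         "bogofilter" nil nil nil
                         (my-bogofilter-flag kind))))
```

Keeping the flag lookup separate from the process call means the
mark-to-flag mapping can be checked without bogofilter installed.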
The equivalent of spam-bogofilter-register-routine for ifile needs to
be written for ifile processing on group summary exit. I haven't done
it yet, because I didn't know that ifile-gnus.el supported nnimap
already. So let me know if you think it's ready to be incorporated.
> ifile-gnus intercepts move and copy operations to see that. So
> designating a message as spam works by moving it to the spam group.
With spam.el, you designate a message as spam by marking it with the
spam mark. We could automatically move all spam articles in a group
to the spam group on summary exit when ifile is enabled, to accommodate
its operation mode. But would that be acceptable to users?
> Maybe a two-stage process is better: first decide whether it's spam
> or not, then distribute the non-spam to the right groups.
I'm not sure how that would work.
Ted
* Re: Add hooks to Gnus on move/edit/delete?
2002-12-30 3:33 ` Ted Zlatanov
@ 2002-12-30 18:53 ` Kai Großjohann
2002-12-30 22:53 ` Ted Zlatanov
0 siblings, 1 reply; 25+ messages in thread
From: Kai Großjohann @ 2002-12-30 18:53 UTC
Ted Zlatanov <tzz@lifelogs.com> writes:
> On Sun, 29 Dec 2002, kai.grossjohann@uni-duisburg.de wrote:
>
>> Maybe a two-stage process is better: first decide whether it's spam
>> or not, then distribute the non-spam to the right groups.
>
> I'm not sure how that would work.
Well, nnmail-split-methods (or nnmail-split-fancy) could use spam.el
to put spam messages into a special group. spam.el could invoke
ifile, telling it to use ~/.idata.spam. After this, `normal'
ifile-gnus processing could be done on the remaining ham.
The above is meant to be a description of the desired behavior, not a
suggestion for the implementation.
--
Ambibibentists unite!
* Re: Add hooks to Gnus on move/edit/delete?
2002-12-30 18:53 ` Kai Großjohann
@ 2002-12-30 22:53 ` Ted Zlatanov
2002-12-31 12:08 ` Kai Großjohann
0 siblings, 1 reply; 25+ messages in thread
From: Ted Zlatanov @ 2002-12-30 22:53 UTC
Cc: ding, jhbrown
On Mon, 30 Dec 2002, kai.grossjohann@uni-duisburg.de wrote:
> Well, nnmail-split-methods (or nnmail-split-fancy) could use spam.el
> to put spam messages into a special group. spam.el could invoke
> ifile, telling it to use ~/.idata.spam. After this, `normal'
> ifile-gnus processing could be done on the remaining ham.
On second thought, why not have something like
(: spam-split)
(: ifile-split)
in the split rules? It seems awkward to try to make spam.el sort mail
in a way that it wasn't intended to do. I think I'd like to keep
functions invoked by spam-split standardized, returning either
spam-split-group or nil. Currently, spam-check-ifile does that:
(let ((ifile-primary-spam-group spam-split-group))
  (ifile-spam-filter nil))
I think that standardization would make life easier in the long run.
Do you or anyone else think we should have spam.el do general mail
sorting?
Also, the following is at Jeremy Brown's ifile-gnus.el page at
http://www.ai.mit.edu/people/jhbrown/ifile-gnus.html:
"THIS IS ALPHA-QUALITY SOFTWARE. IT MAY EAT YOUR EMAIL. IT MAY CORRUPT
YOUR GNUS CONFIGURATION. IT MAY DO OTHER UNPLEASANT THINGS TO YOUR
COMPUTER. I'M NOT KIDDING."
Considering that and the lack of support for nnimap, I'm not sure how
much work I should put into ifile-gnus.el support now instead of
later, when the code is more beta- than alpha-quality. I'm not
implying that ifile-gnus.el is buggy, only that it may change
significantly with the next release. Maybe Jeremy could give us an
update :)
Ted
* Re: Add hooks to Gnus on move/edit/delete?
2002-12-30 22:53 ` Ted Zlatanov
@ 2002-12-31 12:08 ` Kai Großjohann
2002-12-31 14:32 ` ifile-gnus: " Ted Zlatanov
0 siblings, 1 reply; 25+ messages in thread
From: Kai Großjohann @ 2002-12-31 12:08 UTC
Ted Zlatanov <tzz@lifelogs.com> writes:
> On Mon, 30 Dec 2002, kai.grossjohann@uni-duisburg.de wrote:
>> Well, nnmail-split-methods (or nnmail-split-fancy) could use spam.el
>> to put spam messages into a special group. spam.el could invoke
>> ifile, telling it to use ~/.idata.spam. After this, `normal'
>> ifile-gnus processing could be done on the remaining ham.
>
> On second thought, why not have something like
>
> (: spam-split)
> (: ifile-split)
>
> in the split rules?
That might implement my suggestion. I haven't tried it :-) If it
works, it's a really good idea.
(I wonder if spam.el needs to tell ifile to use another index file.)
--
Ambibibentists unite!
* Re: ifile-gnus: Add hooks to Gnus on move/edit/delete?
2002-12-31 12:08 ` Kai Großjohann
@ 2002-12-31 14:32 ` Ted Zlatanov
2002-12-31 19:44 ` Nathan J. Williams
0 siblings, 1 reply; 25+ messages in thread
From: Ted Zlatanov @ 2002-12-31 14:32 UTC
Cc: ding, jhbrown
On Tue, 31 Dec 2002, kai.grossjohann@uni-duisburg.de wrote:
> Ted Zlatanov <tzz@lifelogs.com> writes:
>
>> On Mon, 30 Dec 2002, kai.grossjohann@uni-duisburg.de wrote:
>>> Well, nnmail-split-methods (or nnmail-split-fancy) could use
>>> spam.el to put spam messages into a special group. spam.el could
>>> invoke ifile, telling it to use ~/.idata.spam. After this,
>>> `normal' ifile-gnus processing could be done on the remaining ham.
>>
>> On second thought, why not have something like
>>
>> (: spam-split)
>> (: ifile-split)
>>
>> in the split rules?
>
> That might implement my suggestion. I haven't tried it :-) If it
> works, it's a really good idea.
I think it should; I see no reason why not. The ifile invocation inside
spam-split is binary. We invoke ifile-spam-filter with 'nil' as the
other-split parameter, and we set ifile-primary-spam-group to
spam-split-group; this means that only spam-split-group or nil will be
returned.
(defun ifile-spam-filter (other-split)
  (if (and ifile-active (equal (ifile-recommend) "spam"))
      ifile-primary-spam-group
    other-split))
So just add this (sorry, no ifile-split exists)
(: ifile-recommend)
after (: spam-split) in your nnmail split rules and turn ifile on.
The penalty is that you invoke ifile twice for each article.
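Put together, the split rules might look roughly like the fragment
below. This is an untested sketch: it assumes nnmail-split-fancy is in
use, that spam.el and ifile-gnus.el are both loaded, and that
ifile-recommend can be called with no arguments and returns a group
name (as its use inside ifile-spam-filter suggests). The fallback group
"mail.misc" is a made-up name.

```elisp
;; Sketch only: "mail.misc" is a hypothetical fallback group, and
;; ifile itself still has to be turned on separately.
(setq spam-use-ifile t)

(setq nnmail-split-methods 'nnmail-split-fancy
      nnmail-split-fancy
      '(|
        ;; Binary check first: returns spam-split-group or nil.
        (: spam-split)
        ;; If that returned nil, ask ifile for a regular group name.
        (: ifile-recommend)
        ;; Used only when neither function above returns a group.
        "mail.misc"))
```

The `|' form takes the first rule that yields a group, so the second
ifile call only happens for articles that spam-split did not claim.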
As I said, I want to wait until the ifile interface is stable before I
have spam.el support it for summary exit processing of spam. I hope
that's OK with everyone. Right now, you can manually move spam-marked
articles to your spam-split-group, and ifile-gnus.el will understand
that you want them to be considered spam.
> (I wonder if spam.el needs to tell ifile to use another index file.)
I don't think so, this should work right away.
Ted
* Re: ifile-gnus: Add hooks to Gnus on move/edit/delete?
2002-12-31 14:32 ` ifile-gnus: " Ted Zlatanov
@ 2002-12-31 19:44 ` Nathan J. Williams
2002-12-31 20:00 ` Ted Zlatanov
0 siblings, 1 reply; 25+ messages in thread
From: Nathan J. Williams @ 2002-12-31 19:44 UTC
Cc: Kai Großjohann, ding, jhbrown
Ted Zlatanov <tzz@lifelogs.com> writes:
> > (I wonder if spam.el needs to tell ifile to use another index file.)
>
> I don't think so, this should work right away.
You might want to do this for efficiency reasons; ifile gets slower as
the number of split categories increases. Having a 2-way split (spam
vs. non-spam) and then an N-way split for non-spam is quite a bit
faster than two N+1-way splits.
(and one of us ifile users should hack in something to cache the DB in
memory, so it doesn't have to read/parse/write the entire DB for every
message).
- Nathan
* Re: ifile-gnus: Add hooks to Gnus on move/edit/delete?
2002-12-31 19:44 ` Nathan J. Williams
@ 2002-12-31 20:00 ` Ted Zlatanov
0 siblings, 0 replies; 25+ messages in thread
From: Ted Zlatanov @ 2002-12-31 20:00 UTC
Cc: Kai Großjohann, ding, jhbrown
On 31 Dec 2002, nathanw@MIT.EDU wrote:
> Ted Zlatanov <tzz@lifelogs.com> writes:
>
>> > (I wonder if spam.el needs to tell ifile to use another index
>> > file.)
>>
>> I don't think so, this should work right away.
>
> You might want to do this for efficiency reasons; ifile gets slower
> as the number of split categories increases. Having a 2-way split
> (spam vs. non-spam) and then an N-way split for non-spam is quite a
> bit faster than two N+1-way splits.
I have no problem with the idea, but the optimized implementation
should really be on the ifile-gnus.el side, in the function
ifile-spam-filter. I don't think spam.el should reinvent
ifile-spam-filter, considering that function was specifically written
for spam detection.
Ted
* Re: Add hooks to Gnus on move/edit/delete?
2002-12-29 22:06 ` Kai Großjohann
2002-12-29 22:53 ` Lars Magne Ingebrigtsen
2002-12-30 3:33 ` Ted Zlatanov
@ 2003-01-02 17:29 ` Simon Josefsson
2003-01-02 21:31 ` Kai Großjohann
2 siblings, 1 reply; 25+ messages in thread
From: Simon Josefsson @ 2003-01-02 17:29 UTC
Cc: ding
kai.grossjohann@uni-duisburg.de (Kai Großjohann) writes:
> Ted Zlatanov <tzz@lifelogs.com> writes:
>
>> On Sun, 29 Dec 2002, kai.grossjohann@uni-duisburg.de wrote:
>>> First of all, ifile-gnus.el now uses a different mechanism for
>>> communicating with the backend, and the new mechanism works for more
>>> than one backend.
>>
>> Is this released, and does it work with nnimap? I haven't followed
>> ifile-gnus.el, unfortunately, due to lack of time.
>
> I think it might work if you use nnimap splitting. If you use Sieve
> or procmail, then it could be difficult, since the ifile database
> would be on your local host in the home dir, whereas the
> Sieve/procmail rules are on the IMAP server.
It might be possible for ifile-gnus to generate Sieve rules similar to
what the Gnus/Sieve stuff does now. You would need to upload the
sieve script separately though.
* Re: Add hooks to Gnus on move/edit/delete?
2003-01-02 17:29 ` Simon Josefsson
@ 2003-01-02 21:31 ` Kai Großjohann
2003-01-02 22:07 ` Simon Josefsson
0 siblings, 1 reply; 25+ messages in thread
From: Kai Großjohann @ 2003-01-02 21:31 UTC
Simon Josefsson <jas@extundo.com> writes:
> It might be possible for ifile-gnus to generate Sieve rules similar to
> what the Gnus/Sieve stuff does now. You would need to upload the
> sieve script separately though.
I think Sieve is not Turing complete and therefore a naive Bayesian
classifier cannot be implemented within the confines of Sieve.
--
Ambibibentists unite!
* Re: Add hooks to Gnus on move/edit/delete?
2003-01-02 21:31 ` Kai Großjohann
@ 2003-01-02 22:07 ` Simon Josefsson
2003-01-03 13:31 ` Kai Großjohann
0 siblings, 1 reply; 25+ messages in thread
From: Simon Josefsson @ 2003-01-02 22:07 UTC
Cc: ding
kai.grossjohann@uni-duisburg.de (Kai Großjohann) writes:
> Simon Josefsson <jas@extundo.com> writes:
>
>> It might be possible for ifile-gnus to generate Sieve rules similar to
>> what the Gnus/Sieve stuff does now. You would need to upload the
>> sieve script separately though.
>
> I think Sieve is not Turing complete and therefore a naive Bayesian
> classifier cannot be implemented within the confines of Sieve.
Right. I don't really know what ifile-gnus is. Does it generate Gnus
splitting rules? Or is it a fancy mail splitter function that is
invoked by Gnus? If the latter, forget my ramblings.
* Re: Add hooks to Gnus on move/edit/delete?
2003-01-02 22:07 ` Simon Josefsson
@ 2003-01-03 13:31 ` Kai Großjohann
0 siblings, 0 replies; 25+ messages in thread
From: Kai Großjohann @ 2003-01-03 13:31 UTC
Simon Josefsson <jas@extundo.com> writes:
> Right. I don't really know what ifile-gnus is. Does it generate Gnus
> splitting rules? Or is it a fancy mail splitter function that is
> invoked by Gnus? If the latter, forget my ramblings.
ifile-gnus invokes the external program ifile on the message. ifile
is a Naive Bayesian classifier. That means you show it some example
messages with their categories (1 group = 1 category), and from these
it learns. Afterwards you can invoke it on a new message and it will
print a category for this message, using the previously learned data.
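To make the learn-then-classify cycle concrete, here is a toy sketch in
Emacs Lisp. It is not how ifile is implemented: all my-nb-* names are
hypothetical, it counts plain lowercase words, uses add-one smoothing,
and assumes every category is equally likely beforehand.

```elisp
;; Sketch only: a toy word-count Naive Bayes classifier. All my-nb-*
;; names are hypothetical; this is not ifile's implementation.
(require 'cl-lib)

(defvar my-nb-counts (make-hash-table :test 'equal)
  "Map each category name to a hash table of word -> count.")

(defun my-nb-words (text)
  "Split TEXT into a list of lowercase words."
  (split-string (downcase text) "[^a-z0-9]+" t))

(defun my-nb-learn (category text)
  "Record the words of TEXT as an example of CATEGORY."
  (let ((table (or (gethash category my-nb-counts)
                   (puthash category (make-hash-table :test 'equal)
                            my-nb-counts))))
    (dolist (word (my-nb-words text))
      (puthash word (1+ (gethash word table 0)) table))))

(defun my-nb-score (table words vocab-size)
  "Return the summed log-likelihood of WORDS under TABLE.
Add-one smoothing is applied over VOCAB-SIZE distinct words."
  (let ((total 0))
    (maphash (lambda (_word n) (setq total (+ total n))) table)
    (cl-loop for word in words
             sum (log (/ (+ 1.0 (gethash word table 0))
                         (+ total vocab-size))))))

(defun my-nb-classify (text)
  "Return the learned category that best matches TEXT, or nil."
  (let ((words (my-nb-words text))
        (vocab (make-hash-table :test 'equal))
        best best-score)
    ;; Collect the vocabulary seen across all categories.
    (maphash (lambda (_cat table)
               (maphash (lambda (word _n) (puthash word t vocab)) table))
             my-nb-counts)
    ;; Keep the category with the highest log-likelihood.
    (maphash (lambda (cat table)
               (let ((score (my-nb-score table words
                                         (hash-table-count vocab))))
                 (when (or (null best-score) (> score best-score))
                   (setq best cat best-score score))))
             my-nb-counts)
    best))
```

After a few calls to my-nb-learn with example messages per group,
my-nb-classify on a new message returns the group name whose word
counts fit it best.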
Naive Bayes is not the best text classifier you can get. Experiments
show that good classifiers can achieve 80% accuracy (or so).
The spam identification suggestion by Paul Graham implements Naive
Bayes in Lisp.
--
Ambibibentists unite!