? build.info ? build.sh ? contrib/auto-autoloads.el cvs server: Diffing . cvs server: Diffing contrib cvs server: Diffing etc cvs server: Diffing etc/gnus cvs server: Diffing etc/smilies cvs server: Diffing lisp cvs server: Diffing texi Index: texi/gnus.texi =================================================================== RCS file: /usr/local/cvsroot/gnus/texi/gnus.texi,v retrieving revision 7.47 diff -u -r7.47 gnus.texi --- texi/gnus.texi 28 May 2004 12:41:47 -0000 7.47 +++ texi/gnus.texi 31 May 2004 13:26:17 -0000 @@ -867,9 +867,13 @@ Filtering Spam Using The Spam ELisp Package -* Spam ELisp Package Sequence of Events:: +* Introducing the Spam ELisp Package:: * Spam ELisp Package Filtering of Incoming Mail:: +* Spam ELisp Package Scanning on Group Entry:: +* Spam Elisp Package Spam Processing on Group Exit:: * Spam ELisp Package Global Variables:: +* Spam ELisp Package Group Parameters:: +* Spam ELisp Package Spam and Ham Processors:: * Spam ELisp Package Configuration Examples:: * Blacklists and Whitelists:: * BBDB Whitelists:: @@ -22570,7 +22574,7 @@ non-spam messages. First of all, you @strong{must} run the function -@code{spam-initialize} to autoload @code{spam.el} and to install the +@code{spam-initialize} to autoload @file{spam.el} and to install the @code{spam.el} hooks. There is one exception: if you use the @code{spam-use-stat} (@pxref{spam-stat spam filtering}) setting, you should turn it on before @code{spam-initialize}: @@ -22582,11 +22586,11 @@ So, what happens when you load @file{spam.el}? -First, some hooks will get installed by @code{spam-initialize}. There +First, some hooks will get installed by @code{spam-initialize}. There are some hooks for @code{spam-stat} so it can save its databases, and there are hooks so interesting things will happen when you enter and -leave a group. More on the sequence of events later (@pxref{Spam -ELisp Package Sequence of Events}). +leave a group. 
More on the sequence of events later (@pxref{Introducing +the Spam ELisp Package}). You get the following keyboard commands: @@ -22601,10 +22605,10 @@ @findex gnus-summary-mark-as-spam @code{gnus-summary-mark-as-spam}. -Mark current article as spam, showing it with the @samp{$} mark. -Whenever you see a spam article, make sure to mark its summary line -with @kbd{M-d} before leaving the group. This is done automatically -for unread articles in @emph{spam} groups. +Mark current article as spam, showing it with the @samp{$} mark. +Whenever you see a spam article, make sure to mark its summary line with +@kbd{M-d} before leaving the group. @file{spam.el} can do this +automatically for unread articles in @emph{spam} groups. @item M s t @itemx S t @@ -22624,9 +22628,13 @@ group. @menu -* Spam ELisp Package Sequence of Events:: +* Introducing the Spam ELisp Package:: * Spam ELisp Package Filtering of Incoming Mail:: +* Spam ELisp Package Scanning on Group Entry:: +* Spam Elisp Package Spam Processing on Group Exit:: * Spam ELisp Package Global Variables:: +* Spam ELisp Package Group Parameters:: +* Spam ELisp Package Spam and Ham Processors:: * Spam ELisp Package Configuration Examples:: * Blacklists and Whitelists:: * BBDB Whitelists:: @@ -22642,106 +22650,107 @@ * Extending the Spam ELisp package:: @end menu -@node Spam ELisp Package Sequence of Events -@subsubsection Spam ELisp Package Sequence of Events +@node Introducing the Spam ELisp Package +@subsubsection Introducing the Spam ELisp Package +@cindex spam introduction +@cindex spam filtering introduction @cindex spam filtering @cindex spam filtering sequence of events @cindex spam -You must read this section to understand how @code{spam.el} works. -Do not skip, speed-read, or glance through this section. - -There are two @emph{contact points}, if you will, between -@code{spam.el} and the rest of Gnus: checking new mail for spam, and -leaving a group. - -Getting new mail is done in one of two ways. 
You can either split -your incoming mail or you can classify new articles as ham or spam -when you enter the group. - -Splitting incoming mail is better suited to mail backends such as -@code{nnml} or @code{nnimap} where new mail appears in a single file -called a @dfn{Spool File}. See @xref{Spam ELisp Package Filtering of -Incoming Mail}. - -@vindex gnus-spam-autodetect -@vindex gnus-spam-autodetect-methods -For backends such as @code{nntp} there is no incoming mail spool, so -an alternate mechanism must be used. This may also happen for -backends where the server is in charge of splitting incoming mail, and -Gnus does not do further splitting. The @code{spam-autodetect} and -@code{spam-autodetect-methods} group parameters (accessible with -@kbd{G c} and @kbd{G p} as usual), and the corresponding variables -@code{gnus-spam-autodetect} and @code{gnus-spam-autodetect-methods} -(accessible with @kbd{M-x customize-variable} as usual) can help. - -When @code{spam-autodetect} is used (you can turn it on for a -group/topic or wholesale by regex, as needed), it hooks into the -process of entering a group. Thus, entering a group with unseen or -unread articles becomes the substitute for checking incoming mail. -Whether only unseen articles or all unread articles will be processed -is determined by the @code{spam-autodetect-recheck-messages}. When -set to @code{t}, unread messages will be rechecked. - -@code{spam-autodetect} grants the user at once more and less control -of spam filtering. The user will have more control over each group's -spam methods, so for instance the @samp{ding} group may have -@code{spam-use-BBDB} as the autodetection method, while the -@samp{suspect} group may have the @code{spam-use-blacklist} and -@code{spam-use-bogofilter} methods enabled. Every article detected to -be spam will be marked with the spam mark @samp{$} and processed on -exit from the group as normal spam. 
The user has less control over -the @emph{sequence} of checks, as he might with @code{spam-split}. - -When the newly split mail goes into groups, or messages are -autodetected to be ham or spam, those groups must be exited (after -entering, if needed) for further spam processing to happen. It -matters whether the group is considered a ham group, a spam group, or -is unclassified, based on its @code{spam-content} parameter -(@pxref{Spam ELisp Package Global Variables}). Spam groups have the -additional characteristic that, when entered, any unseen or unread -articles (depending on the @code{spam-mark-only-unseen-as-spam} -variable) will be marked as spam. Thus, mail split into a spam group -gets automatically marked as spam when you enter the group. - -So, when you exit a group, the @code{spam-processors} are applied, if -any are set, and the processed mail is moved to the -@code{ham-process-destination} or the @code{spam-process-destination} -depending on the article's classification. If the -@code{ham-process-destination} or the @code{spam-process-destination}, -whichever is appropriate, are @code{nil}, the article is left in the -current group. - -If a spam is found in any group (this can be changed to only non-spam -groups with @code{spam-move-spam-nonspam-groups-only}), it is -processed by the active @code{spam-processors} (@pxref{Spam ELisp -Package Global Variables}) when the group is exited. Furthermore, the -spam is moved to the @code{spam-process-destination} (@pxref{Spam -ELisp Package Global Variables}) for further training or deletion. -You have to load the @code{gnus-registry.el} package and enable the -@code{spam-log-to-registry} variable if you want spam to be processed -no more than once. Thus, spam is detected and processed everywhere, -which is what most people want. If the -@code{spam-process-destination} is @code{nil}, the spam is marked as -expired, which is usually the right thing to do. 
- -If spam can not be moved---because of a read-only backend such as -@acronym{NNTP}, for example, it will be copied. - -If a ham mail is found in a ham group, as determined by the -@code{ham-marks} parameter, it is processed as ham by the active ham -@code{spam-processor} when the group is exited. With the variables -@code{spam-process-ham-in-spam-groups} and -@code{spam-process-ham-in-nonham-groups} the behavior can be further -altered so ham found anywhere can be processed. You have to load the -@code{gnus-registry.el} package and enable the -@code{spam-log-to-registry} variable if you want ham to be processed -no more than once. Thus, ham is detected and processed only when -necessary, which is what most people want. More on this in -@xref{Spam ELisp Package Configuration Examples}. - -If ham can not be moved---because of a read-only backend such as -@acronym{NNTP}, for example, it will be copied. +You must read this section to understand how @file{spam.el} works. +We recommend that you do not skip, speed-read, or glance through this +section, as it will be very difficult to configure @file{spam.el} +appropriately without understanding this background. + +There are three primary contact points, if you will, between +@file{spam.el} and the rest of Gnus: checking new mail for spam, +entering a group, and leaving a group. + +There are two ways the checking of new mail for spam can be done, and +either or both can be used to meet your needs and mail sources. You can +split incoming spam mail into a special group, or you can scan new +articles for spam when you enter a group. + +@cindex Spam Scanning Incoming Mail +@cindex Spam Detection for NNML +Splitting incoming mail is better suited to mail sources such as +@code{nnml} or @code{nnimap} where new mail shows up and is filtered and +split to the destination groups by Gnus as part of the mail checking +process. @xref{Spam ELisp Package Filtering of Incoming Mail}. 
+ +@cindex Spam Scanning on Group Entry +@cindex Spam Detection for NNTP +Scanning new articles for spam when you enter a group is better suited +to mail or news sources such as @code{nntp}, or @code{nnimap} where the +server performs mail filtering and splitting, and Gnus simply views the +results of that process. @xref{Spam ELisp Package Scanning on Group +Entry}. + +Spam scanning can be performed on all your groups, or on any subset of +them, allowing you to mix the two methods if you have @code{nnml} mail +groups and @code{nntp} news groups, and don't wish to see spam in +either. + +Spam scanning on group entry grants the user at once more and less +control of spam filtering. When scanning on entry, each group can have +its own set of spam detection methods, allowing you to use SpamAssassin +in one group and an exclusive whitelist in another group. + +The trade-off is that you will spend longer waiting for the spam +scanning process to take effect when you enter the group, and you have +less control over the order in which the checks take place and the +disposition of the spam messages: they can be marked, but not moved +away automatically. + +@cindex Group Classification +@cindex Spam Group Classification +@cindex Spam Processing on Group Exit +@cindex Ham Processing on Group Exit +The second stage of @file{spam.el} activity, invoked when you exit a +group, depends on the classification of that group. Each group is +assigned one of three classifications: the default @emph{unknown}, +@emph{spam} (containing only spam), or @emph{ham} (containing only ham). + +When a group is classified as @emph{spam}, any unseen, or unseen and +unread, messages it contains can be marked with the spam mark @samp{$}, +tagging them for automatic processing when you have verified that they +were correctly detected as spam. For details on classifying your groups, +see @ref{Spam Elisp Package Spam Processing on Group Exit}. 
+ +When you exit any group containing articles you have reclassified, from +spam to ham or vice versa, those articles are ``unregistered'' by passing +them to the configured spam and ham processors for that group. This +ensures that learning spam classifiers become aware of their mistakes. + +Articles that have been marked as spam are also passed to the group's +spam processors, allowing them to learn from these. After that, the spam +messages are marked as expired and, for @emph{ham} and @emph{unknown} +groups, moved to the spam destination group if one is set. This is not +normally done for @emph{spam} groups, but you can configure it to happen +there as well; for details, see @ref{Spam Elisp Package Spam Processing +on Group Exit}. + +After that, articles marked as ham are passed through the ham processors +for the group, at least for groups you classified as @emph{ham}. By +default this is not done for @emph{spam} or @emph{unknown} groups, but +you can enable ham processing in either of these classes of groups. + +@c Is the registry stuff really not documented anywhere? I couldn't find +@c anything. --daniel@rimspace.net +The Gnus registry can be used to record each ham article as it is passed +through the ham processor for the group, ensuring that it will only be +processed once. Without this, all ham messages will be processed every +time you leave the group, which is normally undesirable. For more +details, see @ref{Spam Elisp Package Spam Processing on Group Exit}. + +Finally, if a ham destination group is configured, ham articles are +moved out of any @emph{spam} classified group and into that destination +group. Alternatively, they can be passed through the Gnus splitting code +with the @code{spam-split} routine disabled, allowing them to reach their +normal destination rather than the spam group. If spam cannot be +moved (because of a read-only backend such as @acronym{NNTP}, for +example), it will be copied. If all this seems confusing, don't worry. 
Soon it will be as natural as typing Lisp one-liners on a neural interface@dots{} err, sorry, that's @@ -22765,23 +22774,25 @@ @code{nnimap-split-fancy}, depending on whether you use the nnmail or nnimap back ends to retrieve your mail. -Also, @code{spam-split} will not modify incoming mail in any way. +Also, @code{spam-split} will not modify incoming mail in any way. So, +filtering through a spam processor like SpamAssassin will not add any +headers or make any subject line changes to your messages when invoked +through @code{spam-split}. The @code{spam-split} function will process incoming mail and send the -mail considered to be spam into the group name given by the variable -@code{spam-split-group}. By default that group name is @samp{spam}, -but you can customize @code{spam-split-group}. Make sure the contents -of @code{spam-split-group} are an @emph{unqualified} group name, for -instance in an @code{nnimap} server @samp{your-server} the value -@samp{spam} will turn out to be @samp{nnimap+your-server:spam}. The -value @samp{nnimap+server:spam}, therefore, is wrong and will -actually give you the group -@samp{nnimap+your-server:nnimap+server:spam} which may or may not -work depending on your server's tolerance for strange group names. - -You can also give @code{spam-split} a parameter, -e.g. @code{spam-use-regex-headers} or @code{"maybe-spam"}. Why is -this useful? +mail considered to be spam into the group on the current server given by +the variable @code{spam-split-group}; by default that is the @samp{spam} +group. + +@c REVISIT: Do we actually need to go into this here? --daniel@rimspace.net +This group name must be an @emph{unqualified} name; fancy splitting cannot +send mail to another back-end during the split process, so setting +@code{spam-split-group} to a qualified name will result in that full +string being used as the destination group. + +You can also give @code{spam-split} a parameter, e.g. +@code{spam-use-regex-headers} or @code{"maybe-spam"}. 
Why is this +useful? Take these split rules (with @code{spam-use-regex-headers} and @code{spam-use-blackholes} set): @@ -22825,12 +22836,15 @@ blackhole checks performed on them. You could also specify different spam checks for your nnmail split vs. your nnimap split. Go crazy. +@c REVISIT: this really should be more detailed about this. +@c can we either make these just work(tm), or document it without the +@c "but maybe it will work anyway" bit? You should still have specific checks such as -@code{spam-use-regex-headers} set to @code{t}, even if you -specifically invoke @code{spam-split} with the check. The reason is -that when loading @file{spam.el}, some conditional loading is done -depending on what @code{spam-use-xyz} variables you have set. This -is usually not critical, though. +@code{spam-use-regex-headers} set to @code{t}, even if you specifically +invoke @code{spam-split} with the check. The reason is that when loading +@file{spam.el}, some conditional loading is done depending on what +@code{spam-use-xyz} variables you have set. This is usually not +critical, though. @emph{Note for IMAP users} @@ -22845,9 +22859,161 @@ @xref{Splitting in IMAP}. -@emph{TODO: spam.el needs to provide a uniform way of training all the -statistical databases. Some have that functionality built-in, others -don't.} +@c TODO: spam.el needs to provide a uniform way of training all the +@c statistical databases. Some have that functionality built-in, others +@c don't. + +@node Spam ELisp Package Scanning on Group Entry +@subsubsection Spam ELisp Package Scanning on Group Entry +@cindex spam detection +@cindex spam autodetection +@cindex detect spam on group entry + +For backends such as @code{nntp}, or @code{nnimap} where server-side +splitting is used, spam checks cannot be performed by Gnus during the +delivery of new articles--it has no involvement in that process. 
+ +@vindex gnus-spam-autodetect +@vindex gnus-spam-autodetect-methods +Unlike most other spam management tools, @file{spam.el} allows you to +perform spam detection in these groups as well. This is enabled with the +@code{spam-autodetect} group parameter, and the specific set of checks +controlled with the @code{spam-autodetect-methods} group parameter. + +To enable spam detection on group entry you need to set +@code{spam-autodetect} to @code{t}, and @code{spam-autodetect-methods} +to a list of methods to use. For example: + +@example +;; @r{in the group parameters for a gmane NNTP group} +(spam-autodetect t) +(spam-autodetect-methods (spam-use-blacklist spam-use-gmane)) +@end example + +You can also use the global variables @code{gnus-spam-autodetect} and +@code{gnus-spam-autodetect-methods} as an alternative method for setting +these values. Both of these map a series of regular expressions matching +the group name to appropriate values, such as: + +@example +;; @r{in your .gnus file} +(setq gnus-spam-autodetect '((":gmane\\..*" t))) +(setq gnus-spam-autodetect-methods '((":gmane\\..*" (spam-use-blacklist + spam-use-gmane)))) +@end example + +When spam scanning on group entry is enabled, any previously unseen +articles are scanned for spam. This is usually the right thing to do as +anything marked as spam is processed and moved away when you exit the +group. + +If you do want to rescan unread articles as well as scanning the new +articles, you can set @code{spam-autodetect-recheck-messages} to +@code{t}. + +@node Spam Elisp Package Spam Processing on Group Exit +@subsubsection Spam Elisp Package Spam Processing on Group Exit + +When you exit a group, @file{spam.el} can perform processing on messages +that are marked as spam or ham, to move them to another location or to +teach a learning tool such as Bogofilter or SpamAssassin to identify +them in future. 
+ +The exact processing done depends on the classification of the group; +groups classified as containing only spam are treated differently from +groups containing only ham or unclassified groups. + +@cindex spam group classification +@cindex spam group +@cindex ham group +To classify a group, you need to set the @code{spam-contents} group +parameter to one of the following values: + +@table @code +@item gnus-group-spam-classification-spam +This group contains @emph{spam} messages only. + +@item gnus-group-spam-classification-ham +This group contains @emph{ham} messages only. + +@item nil or unset +This group is not classified, and may contain either ham or spam +articles. + +@end table + +The @code{spam-process} group parameter controls which spam and ham +processors are used in an individual group. It contains a list of article +classification and processor pairs, indicating which tools to apply for +which types of articles. For example: + +@example +(spam-process ((spam spam-use-gmane) + (spam spam-use-bogofilter) + (ham spam-use-bogofilter))) +@end example + +This group parameter instructs @file{spam.el} to report spam articles to +the Gmane spam reporting service, and to pass both spam and ham articles +to the Bogofilter learning engine. + +For a full list of spam and ham processors, see @ref{Spam ELisp Package +Spam and Ham Processors}. + +Once the configured ham or spam processors have been applied, +@file{spam.el} can move the articles to a different destination for +further processing or disposal. + +For messages with the spam mark (@samp{$}), the destination is set with +the @code{spam-process-destination} group parameter. By default, only +articles in non-spam groups (ham and unknown groups) are moved. To move +spam in spam groups as well, @code{spam-move-spam-nonspam-groups-only} +should be set to @code{nil}. + +For ham articles, @code{ham-process-destination} controls the final +destination. 
You can also have those articles marked as unread, so that +you can read them properly in their normal destination group, by setting +@code{spam-mark-ham-unread-before-move-from-spam-group} to @code{t}. +By default this is not done. + +The possible values for @code{spam-process-destination} and +@code{ham-process-destination} are: + +@table @code +@item nil +Leave the article in the current group. + +@item a string +(e.g. "mail.spam") + +Move the article to the named group on the current back-end. + +@item respool +(only for @emph{ham}) + +Respool the article using the current back-end. + +See also @code{spam-disable-spam-split-during-ham-respool} +(@pxref{Spam ELisp Package Global Variables}), which ensures that the +article is not re-identified as spam by the @code{spam-split} routine. + +@end table + +If you leave either or both of these at the default @code{nil} value, +articles are simply left in the current group. + +Finally, a caution: by default @file{spam.el} does not keep track of +which articles have been processed and which have not, so all articles +will be processed every time you exit a group. + +If you use the Gnus registry (see @file{gnus-registry.el} for +documentation and details), you can use it to keep track of this, +allowing @file{spam.el} to process each article only once. + +After you have loaded the Gnus registry, you need to set +@code{spam-log-to-registry} to @code{t}. When an article is processed as +spam or ham, the registry entry for it will then be annotated, ensuring +that it will never again be processed through the learning tools. @node Spam ELisp Package Global Variables @subsubsection Spam ELisp Package Global Variables @@ -22856,171 +23022,375 @@ @cindex spam variables @cindex spam -@vindex gnus-spam-process-newsgroups -The concepts of ham processors and spam processors are very important. 
-Ham processors and spam processors for a group can be set with the -@code{spam-process} group parameter, or the -@code{gnus-spam-process-newsgroups} variable. Ham processors take -mail known to be non-spam (@emph{ham}) and process it in some way so -that later similar mail will also be considered non-spam. Spam -processors take mail known to be spam and process it so similar spam -will be detected later. - -The format of the spam or ham processor entry used to be a symbol, -but now it is a @sc{cons} cell. See the individual spam processor entries -for more information. +There are a number of global variables that control the operation of +@file{spam.el}, allowing you to customize its behavior to your +requirements. -@vindex gnus-spam-newsgroup-contents -Gnus learns from the spam you get. You have to collect your spam in -one or more spam groups, and set or customize the variable -@code{spam-junk-mailgroups} as appropriate. You can also declare -groups to contain spam by setting their group parameter -@code{spam-contents} to @code{gnus-group-spam-classification-spam}, or -by customizing the corresponding variable -@code{gnus-spam-newsgroup-contents}. The @code{spam-contents} group -parameter and the @code{gnus-spam-newsgroup-contents} variable can -also be used to declare groups as @emph{ham} groups if you set their -classification to @code{gnus-group-spam-classification-ham}. If -groups are not classified by means of @code{spam-junk-mailgroups}, -@code{spam-contents}, or @code{gnus-spam-newsgroup-contents}, they are -considered @emph{unclassified}. All groups are unclassified by -default. +@table @code -@vindex gnus-spam-mark +@vindex spam-disable-spam-split-during-ham-respool +@item spam-disable-spam-split-during-ham-respool +If this is true, the @code{spam-split} routine will become a null +operation when @file{spam.el} is respooling ham messages after +processing. For details, see @ref{Spam Elisp Package Spam Processing on +Group Exit}. 
+ +@cindex spam process only once +@cindex spam registry +@vindex spam-log-to-registry +@item spam-log-to-registry +If true, and the Gnus registry is enabled, articles will be annotated in +the registry when they are processed as ham or spam. This annotation is +then used to prevent the same article being processed multiple times. + + Keep in mind that if you limit the number of registry entries, this +won't work as well as it does without a limit. + +@cindex spam mark @cindex $ -In spam groups, all messages are considered to be spam by default: -they get the @samp{$} mark (@code{gnus-spam-mark}) when you enter the -group. If you have seen a message, had it marked as spam, then -unmarked it, it won't be marked as spam when you enter the group -thereafter. You can disable that behavior, so all unread messages -will get the @samp{$} mark, if you set the -@code{spam-mark-only-unseen-as-spam} parameter to @code{nil}. You -should remove the @samp{$} mark when you are in the group summary -buffer for every message that is not spam after all. To remove the -@samp{$} mark, you can use @kbd{M-u} to ``unread'' the article, or -@kbd{d} for declaring it read the non-spam way. When you leave a -group, all spam-marked (@samp{$}) articles are sent to a spam -processor which will study them as spam samples. - -Messages may also be deleted in various other ways, and unless -@code{ham-marks} group parameter gets overridden below, marks @samp{R} -and @samp{r} for default read or explicit delete, marks @samp{X} and -@samp{K} for automatic or explicit kills, as well as mark @samp{Y} for -low scores, are all considered to be associated with articles which -are not spam. This assumption might be false, in particular if you -use kill files or score files as means for detecting genuine spam, you -should then adjust the @code{ham-marks} group parameter. +@vindex gnus-spam-mark +@item gnus-spam-mark +The mark applied to spam messages, by default @samp{$}. 
This is set in +two ways: automatically by Gnus when it believes a message is spam, and +manually to indicate a spam message that was not correctly detected. + +If this is applied to a non-spam message, you @emph{must} remove the +mark, or risk confusing your spam detection tools and causing further +false positives in future. -@defvar ham-marks -You can customize this group or topic parameter to be the list of -marks you want to consider ham. By default, the list contains the -deleted, read, killed, kill-filed, and low-score marks (the idea is -that these articles have been read, but are not spam). It can be -useful to also include the tick mark in the ham marks. It is not -recommended to make the unread mark a ham mark, because it normally -indicates a lack of classification. But you can do it, and we'll be -happy for you. -@end defvar - -@defvar spam-marks -You can customize this group or topic parameter to be the list of -marks you want to consider spam. By default, the list contains only -the spam mark. It is not recommended to change that, but you can if -you really want to. -@end defvar +To remove the spam mark, you can use @kbd{M-u} to mark it ``unread'', or +@kbd{d} to mark it as read and not spam. -When you leave @emph{any} group, regardless of its -@code{spam-contents} classification, all spam-marked articles are sent -to a spam processor, which will study these as spam samples. If you -explicit kill a lot, you might sometimes end up with articles marked -@samp{K} which you never saw, and which might accidentally contain -spam. Best is to make sure that real spam is marked with @samp{$}, -and nothing else. +When you exit a group, all articles marked with the +@code{gnus-spam-mark} are processed by the exit processors for that +group. -@vindex gnus-ham-process-destinations -When you leave a @emph{spam} group, all spam-marked articles are -marked as expired after processing with the spam processor. 
This is -not done for @emph{unclassified} or @emph{ham} groups. Also, any -@strong{ham} articles in a spam group will be moved to a location -determined by either the @code{ham-process-destination} group -parameter or a match in the @code{gnus-ham-process-destinations} -variable, which is a list of regular expressions matched with group -names (it's easiest to customize this variable with @kbd{M-x -customize-variable @key{RET} gnus-ham-process-destinations}). Each -group name list is a standard Lisp list, if you prefer to customize -the variable manually. If the @code{ham-process-destination} -parameter is not set, ham articles are left in place. If the -@code{spam-mark-ham-unread-before-move-from-spam-group} parameter is -set, the ham articles are marked as unread before being moved. - -If ham can not be moved---because of a read-only backend such as -@acronym{NNTP}, for example, it will be copied. - -Note that you can use multiples destinations per group or regular -expression! This enables you to send your ham to a regular mail -group and to a @emph{ham training} group. +Normally, read and deleted messages are not considered spam. If you +use kill files or score files as a means of detecting genuine spam, you +should adjust the @code{ham-marks} group parameter. + +See also @code{ham-marks} and @code{spam-marks} in @ref{Spam ELisp +Package Group Parameters}. -When you leave a @emph{ham} group, all ham-marked articles are sent to -a ham processor, which will study these as non-spam samples. @vindex spam-process-ham-in-spam-groups -By default the variable @code{spam-process-ham-in-spam-groups} is -@code{nil}. Set it to @code{t} if you want ham found in spam groups -to be processed. Normally this is not done, you are expected instead +@item spam-process-ham-in-spam-groups +Enable processing of ham in groups classified as @emph{spam}. + +Set this to @code{t} if you want to use ham processors in a group +classified as @emph{spam}. 
By default this is not done--you are expected to send your ham to a ham group and process it there. @vindex spam-process-ham-in-nonham-groups -By default the variable @code{spam-process-ham-in-nonham-groups} is -@code{nil}. Set it to @code{t} if you want ham found in non-ham (spam -or unclassified) groups to be processed. Normally this is not done, -you are expected instead to send your ham to a ham group and process -it there. - -@vindex gnus-spam-process-destinations -When you leave a @emph{ham} or @emph{unclassified} group, all -@strong{spam} articles are moved to a location determined by either -the @code{spam-process-destination} group parameter or a match in the -@code{gnus-spam-process-destinations} variable, which is a list of -regular expressions matched with group names (it's easiest to -customize this variable with @kbd{M-x customize-variable @key{RET} -gnus-spam-process-destinations}). Each group name list is a standard -Lisp list, if you prefer to customize the variable manually. If the -@code{spam-process-destination} parameter is not set, the spam -articles are only expired. The group name is fully qualified, meaning -that if you see @samp{nntp:servername} before the group name in the -group buffer then you need it here as well. - -If spam can not be moved---because of a read-only backend such as -@acronym{NNTP}, for example, it will be copied. - -Note that you can use multiples destinations per group or regular -expression! This enables you to send your spam to multiple @emph{spam -training} groups. - -@vindex spam-log-to-registry -The problem with processing ham and spam is that Gnus doesn't track -this processing by default. Enable the @code{spam-log-to-registry} -variable so @code{spam.el} will use @code{gnus-registry.el} to track -what articles have been processed, and avoid processing articles -multiple times. Keep in mind that if you limit the number of registry -entries, this won't work as well as it does without a limit. 
+@item spam-process-ham-in-nonham-groups +Enable processing of ham in groups that are classified as @emph{spam} or +not classified. + +Set it to @code{t} if you want ham found in non-ham (spam or +unclassified) groups to be processed. Normally this is not done; you are +expected instead to send your ham to a ham group and process it there. +@cindex spam autodetection +@cindex spam detection on group entry @vindex spam-mark-only-unseen-as-spam -Set this variable if you want only unseen articles in spam groups to -be marked as spam. By default, it is set. If you set it to -@code{nil}, unread articles will also be marked as spam. +@item spam-mark-only-unseen-as-spam +If true, which is the default, only unseen articles in spam groups are marked +as spam, which is usually the right thing to do. If set to @code{nil}, +unread articles will also be marked as spam. @vindex spam-mark-ham-unread-before-move-from-spam-group -Set this variable if you want ham to be unmarked before it is moved -out of the spam group. This is very useful when you use something -like the tick mark @samp{!} to mark ham---the article will be placed -in your @code{ham-process-destination}, unmarked as if it came fresh -from the mail server. +@item spam-mark-ham-unread-before-move-from-spam-group +If true, remove all marks from ham articles moved out of spam groups. + +By default any marks are preserved when moving ham articles from spam +groups. Setting this is very useful when you use something like the tick mark +(@samp{!}) to mark ham---the article will be placed in your +@code{ham-process-destination}, unmarked as if it came fresh from the +mail server. @vindex spam-autodetect-recheck-messages -When autodetecting spam, this variable tells @code{spam.el} whether -only unseen articles or all unread articles should be checked for -spam. It is recommended that you leave it off. +@item spam-autodetect-recheck-messages +If true, unread articles are also checked when you enter a spam +autodetection group.
+ +By default only unseen articles are checked, and this is usually the +best method for handling spam detection. + +For more details, see @ref{Spam ELisp Package Scanning on Group Entry}. + +@end table + + +@node Spam ELisp Package Group Parameters +@subsubsection Spam ELisp Package Group Parameters + +The @file{spam.el} package uses a number of group parameters to control +the management of spam within Gnus. + +You can also use topic parameters (@pxref{Topic Parameters}) and +@code{gnus-parameters} (@pxref{Group Parameters}) to set these for +more than one group at a time, based on your Gnus configuration. + +Finally, each group parameter has an associated global variable that +allows you to map a regular expression matching group names to a set of +values. The variable is a list of regexp, value pairs, and the first +match is used. + +For example, the @code{spam-process} parameter could be set by +specifying: + +@example +(setq gnus-spam-process-newsgroups + '((":gmane\\." (spam-use-gmane)) + (".*" (spam-use-blacklist)))) +@end example + +The group parameters, along with their associated global variables, +are: + +@table @code + +@cindex spam processors +@vindex gnus-spam-process-newsgroups +@item spam-process +@itemx gnus-spam-process-newsgroups +Configure the spam and ham processors applicable to this group. + +For details on configuring the spam and ham processors for a group, see +@ref{Spam Elisp Package Spam Processing on Group Exit}. + +@cindex spam group classification +@cindex group spam classification +@cindex group ham classification +@vindex gnus-spam-newsgroup-contents +@item spam-contents +@itemx gnus-spam-newsgroup-contents +Optionally classify the group as a spam or ham group. + +Appropriate values are: +@table @code +@item gnus-group-spam-classification-spam +This group contains spam messages only. + +@item gnus-group-spam-classification-ham +This group contains ham messages only.
+ +@end table + +It is perfectly acceptable to leave this value unset for all your +groups; @file{spam.el} will work perfectly well in such a configuration. +For more details, see @ref{Spam Elisp Package Spam Processing on Group +Exit}. + + +@cindex ham marks +@cindex spam ham marks +@item ham-marks +The list of marks that indicate an article is not spam. + +By default, the list contains the deleted, read, killed, kill-filed, and +low-score marks (the idea is that these articles have been read, but are +not spam). It can be useful to also include the tick mark in the ham +marks. + +It is not recommended to make the unread mark a ham mark, because it +normally indicates a lack of classification. But you can do it, and +we'll be happy for you. + +If you customize this value, you should use the symbolic values for the +marks (such as @code{gnus-read-mark}) rather than the characters +themselves, since that will adapt if you change the character used to +indicate a particular mark. + + +@cindex spam marks +@item spam-marks +The list of marks that indicate an article is spam, normally only the +spam mark (@samp{$}). These articles are processed as spam when you +leave the group. + +It is not recommended to change that, but you can if you really want to. +If so, be absolutely sure that you never accidentally give one of these +marks to a non-spam article. This is generally much harder than it +appears on the surface. + + +@cindex spam ham destination +@cindex ham moved after processing +@vindex gnus-ham-process-destinations +@item ham-process-destination +@itemx gnus-ham-process-destinations +The destination for ham articles to be moved to after processing. + +This can be a string, specifying a group in the current back-end, or +the symbol @code{respool} indicating that the article should be +respooled.
+ +This can also be a list of destinations, in which case each is evaluated +in turn, allowing you to send articles to a particular group for further +training as well as respool to the normal destination. + +For a detailed explanation of this option, see @ref{Spam Elisp Package +Spam Processing on Group Exit}. + +@cindex spam spam destination +@cindex spam moved after processing +@vindex gnus-spam-process-destinations +@item spam-process-destination +@itemx gnus-spam-process-destinations +The destination for spam articles to be moved to after processing. + +This is a string, indicating a group in the current back-end to move the +spam article to after passing it through the group spam processors, if +any. + +It can also be a list of destinations, in which case each element should +be a valid destination. + +For a detailed explanation of this option, see @ref{Spam Elisp Package +Spam Processing on Group Exit}. + + +@cindex spam resend +@cindex resend spam to a mailbox +@item spam-resend-to +The email address to resend spam messages to when using the +@code{spam-use-resend} reporting interface. + +@end table + +@node Spam ELisp Package Spam and Ham Processors +@subsubsection Spam ELisp Package Spam and Ham Processors + +Not all of the @file{spam.el} processors support reporting both ham and +spam; this table summarizes the supported features for group exit +processors. For details on configuring these for a group, see @ref{Spam +Elisp Package Spam Processing on Group Exit}. + +The @samp{deprecated} names are how these spam and ham processors were +traditionally referred to. These should no longer be used, and are +included only for reference. Very recent processors do not have these +long-form names. + +@multitable @columnfractions .8 .1 .1 +@item processor +@tab spam +@tab ham + +@item @code{spam-use-BBDB} + +Update whitelist information in the BBDB database. @xref{BBDB +Whitelists}. 
+ +(deprecated: @code{gnus-group-ham-exit-processor-BBDB}) + +@tab N +@tab Y + +@item @code{spam-use-blacklist} + +Update the simple @file{spam.el} blacklist. @xref{Blacklists and +Whitelists}. + +(deprecated: @code{gnus-group-spam-exit-processor-blacklist}) + +@tab Y +@tab N + +@item @code{spam-use-bogofilter} + +Train the bogofilter Bayesian engine on spam or ham. @xref{Bogofilter}. + +(deprecated: @code{gnus-group-spam-exit-processor-bogofilter}) + +@tab Y +@tab Y + +@item @code{spam-use-bsfilter} + +Train the bsfilter Bayesian engine on ham or spam. + +(deprecated: @code{gnus-group-spam-exit-processor-bsfilter}) + +@tab Y +@tab Y + +@item @code{spam-use-crm114} +@tab Y +@tab N + +@item @code{spam-use-gmane} + +Report spam through the Gmane web interface. @xref{Gmane Spam +Reporting}. + +@tab Y +@tab N + +@item @code{spam-use-ham-copy} + +Copy ham messages to another group. + +(deprecated: @code{gnus-group-ham-exit-processor-copy}) +@tab N +@tab Y + +@item @code{spam-use-ifile} + +Train the iFile text classifier on ham and spam. @xref{ifile spam +filtering}. + +(deprecated: @code{gnus-group-spam-exit-processor-ifile}) +@tab Y +@tab Y + +@item @code{spam-use-resend} + +Resend the spam article, unmodified, to an email address. Typically the +address feeds into a learning tool of some sort. + +The effect is the same as using the @code{gnus-summary-resend-message} +command manually. + +@tab Y +@tab N + +@item @code{spam-use-spamoracle} + +Train the SpamOracle Bayesian engine on ham and spam. @xref{SpamOracle}. + +(deprecated: @code{gnus-group-spam-exit-processor-spamoracle}) +@tab Y +@tab Y + +@item @code{spam-use-spamassassin} + +Train a local SpamAssassin Bayesian filter on ham and spam. +@xref{SpamAssassin backend}. + +(deprecated: @code{gnus-group-spam-exit-processor-spamassassin}) +@tab Y +@tab Y + +@item @code{spam-use-stat} + +Train the elisp spam-stat Bayesian filter on ham and spam. +@xref{spam-stat spam filtering}. 
+ +(deprecated: @code{gnus-group-spam-exit-processor-stat}) +@tab Y +@tab Y + +@item @code{spam-use-whitelist} + +Update the simple @file{spam.el} whitelist. @xref{Blacklists and +Whitelists}. + +(deprecated: @code{gnus-group-ham-exit-processor-whitelist}) +@tab N +@tab Y + +@end multitable @node Spam ELisp Package Configuration Examples @subsubsection Spam ELisp Package Configuration Examples @@ -23125,7 +23495,7 @@ options and deletes them from the @samp{training.ham} and @samp{training.spam} folders. -With the following entries in @code{gnus-parameters}, @code{spam.el} +With the following entries in @code{gnus-parameters}, @file{spam.el} does most of the job for me: @lisp cvs server: Diffing texi/etc cvs server: Diffing texi/herds cvs server: Diffing texi/misc cvs server: Diffing texi/picons cvs server: Diffing texi/ps cvs server: Diffing texi/screen cvs server: Diffing texi/smilies cvs server: Diffing texi/xface