Announcements and discussions for Gnus, the GNU Emacs Usenet newsreader
* Re: [OT] Bayesian filters and pseudo-clever spammers
From: Ted Zlatanov @ 2003-08-06 17:47 UTC (permalink / raw)
  Cc: Jesse F. Hughes

On Wed, 06 Aug 2003, jesse@phiwumbda.org wrote:
> You see that this spammer has introduced comments willy-nilly to
> obfuscate the words that occur in the spam.  I guess it's easy to
> counteract this strategy: One simply writes the filter so that it
> ignores comments when parsing the mail.
> 
> My question is this: Is this what bogofilter, et al, do?  This spam
> didn't fool bogofilter at all.  So, either bogofilter has been
> explicitly coded to avoid these spam defenses, or the defenses just
> don't work because pseudo-words like "ev", "ene", "iag" and so on
> occur *only* in spam and so drive the score up naturally.
> 
> Sorry, I know this question isn't quite on-topic, but I thought
> someone here could answer it easily enough without having to dig
> through the source code.  Thanks for satisfying my curiosity.

The answer for Bogofilter is on the bogofilter mailing lists as well:

http://news.gmane.org/thread.php?group=gmane.mail.bogofilter.general

(search for "HTML comments" for instance)

Bogofilter strips away HTML comments.  It also has many improvements
to make it a not-so-naive not-so-Bayesian filter :)
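
To illustrate why stripping comments matters, here is a minimal sketch
in Python. It is not bogofilter's actual code: the function names, the
naive tokenizer, and the sample text are all made up for illustration.
It only shows that removing comments before tokenizing rejoins the
words a spammer tried to split.

```python
import re

# Hypothetical helpers for illustration only; not bogofilter's code.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)  # non-greedy, spans newlines
WORD_RE = re.compile(r"[A-Za-z]+")


def strip_html_comments(text):
    """Remove HTML comments so words split by them are rejoined."""
    return COMMENT_RE.sub("", text)


def tokenize(text):
    """Return lowercased alphabetic tokens (a deliberately naive tokenizer)."""
    return [word.lower() for word in WORD_RE.findall(text)]


# Made-up sample: comments inserted in the middle of words.
sample = "Buy che<!-- xq -->ap pi<!-- zzt -->lls now"

raw_tokens = tokenize(sample)                         # comments left in
clean_tokens = tokenize(strip_html_comments(sample))  # comments removed first

print(raw_tokens)
print(clean_tokens)
```

Without stripping, the tokenizer sees fragments and comment filler
instead of the real words; with stripping, it sees the words themselves.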

For any other spam-detection tool, check that tool's documentation.
Generally speaking, anti-spam filters are in an arms race with
spammers, so simple tricks such as HTML comments tend to be detected
and neutralized quickly; a filter that fell for them wouldn't be very
useful.

For Gnus in particular, spam-stat.el does not do this sort of HTML
parsing AFAIK.

Ted

