9fans - fans of the OS Plan 9 from Bell Labs
* [9fans] /mail/lib/patterns
@ 2003-03-17 19:10 Steve Simon
  2003-03-17 19:32 ` Russ Cox
  0 siblings, 1 reply; 6+ messages in thread
From: Steve Simon @ 2003-03-17 19:10 UTC
  To: 9fans

Hi,

I seem to be getting more Spam per day...

Anyone have a thorough /mail/lib/patterns they would like to share?

-Steve



* Re: [9fans] /mail/lib/patterns
  2003-03-17 19:10 [9fans] /mail/lib/patterns Steve Simon
@ 2003-03-17 19:32 ` Russ Cox
  2003-03-17 19:56   ` Mark Powers
  0 siblings, 1 reply; 6+ messages in thread
From: Russ Cox @ 2003-03-17 19:32 UTC
  To: 9fans

A few of us have experimented with bayesian
spam filters a la http://www.paulgraham.com/spam.html
and they seem to work well.

I don't think we have any code worth sharing 
at the moment, though.

Russ




* Re: [9fans] /mail/lib/patterns
  2003-03-17 19:32 ` Russ Cox
@ 2003-03-17 19:56   ` Mark Powers
  2003-03-26  9:42     ` ozan s yigit
  0 siblings, 1 reply; 6+ messages in thread
From: Mark Powers @ 2003-03-17 19:56 UTC
  To: 9fans

people at panix seem happy with spamassassin. I haven't tried it, but
it's perl and shouldn't be too hard to plug in to upas.

---mp



* Re: [9fans] /mail/lib/patterns
  2003-03-17 19:56   ` Mark Powers
@ 2003-03-26  9:42     ` ozan s yigit
  2003-03-26 10:23       ` Axel Belinfante
  0 siblings, 1 reply; 6+ messages in thread
From: ozan s yigit @ 2003-03-26  9:42 UTC
  To: 9fans

markp@panix.com (Mark Powers) writes:

> people at panix seem happy with spamassassin. I haven't tried it, but
> it's perl and shouldn't be too hard to plug in to upas.

we are using it. it is remarkable. it will catch just about every single
spam this very post will generate. the catch rate is astounding; it caught
2421 spam messages since i enabled it on feb 25, but failed on two or three
messages (if memory serves) which could not be filtered from the headers,
came from apparently legitimate paths, and said something like "what do
you think about this one" (some strange, probably porn, web reference)
or "would you like to date me" or some such.

it is also great fun to read its content analysis. here is one from
last night (lightly reformatted)

Content analysis details:   (37.50 points, 5 required)
NO_REAL_NAME       (0.7 points)  From: does not include a real name
GAPPY_SUBJECT      (0.4 points)  Subject: contains G.a.p.p.y-T.e.x.t
SUBJ_HAS_SPACES    (2.0 points)  Subject contains lots of white space
REVERSE_AGING      (3.3 points)  BODY: Reverses Aging
AS_SEEN_ON         (2.1 points)  BODY: As seen on national TV!
HAIR_LOSS          (2.8 points)  BODY: Cures Baldness
BANG_OPRAH         (4.3 points)  BODY: Talks about Oprah with an Exclamation!
CLICK_BELOW_CAPS   (0.5 points)  BODY: Asks you to click below
                                 (in capital letters)
BANG_EXERCISE      (2.2 points)  BODY: Talks about exercise with
                                 an exclamation!
HTML_20_30         (1.1 points)  BODY: Message is 20% to 30% HTML
HTML_MESSAGE       (0.1 points)  BODY: HTML included in message
HTML_LINK_CLICK_CAPS (1.1 points)  BODY: HTML link text says "CLICK"
HTML_LINK_CLICK_HERE (0.1 points)  BODY: HTML link text says "click here"
HTTP_USERNAME_USED (1.3 points)  URI: Uses a username in a URL
USERPASS           (1.0 points)  URI: URL contains username and
                                 (optional) password
MSGID_OUTLOOK_TIME (4.4 points)  Message-Id is fake
                                 (in Outlook Express format)
SUBJ_HAS_UNIQ_ID   (0.8 points)  Subject contains a unique ID
RCVD_IN_NJABL      (1.0 points)  RBL: Received via a relay in dnsbl.njabl.org
RCVD_IN_OSIRUSOFT_COM (0.6 points)  RBL: Received via a relay in
                                 relays.osirusoft.com
RCVD_IN_OPM        (4.3 points)  RBL: Received via a relay in opm.blitzed.org
X_OSIRU_SPAMWARE_SITE (1.1 points)  RBL: DNSBL: sender is a Spamware site
                                 or vendor
MIME_HTML_ONLY     (0.1 points)  Message only has text/html MIME parts
FORGED_MUA_EUDORA  (2.2 points)  Forged mail pretending to be from Eudora
---

hmm, paulgraham's bayesian spam filter two or three years from now, maybe
on top of his "lisp replacement" as a god's gift to all programmers?
wow, be still my melting mailbox, i can hardly wait. :-P

[ Moderator's note:

  A bayesian mail filter was introduced as part of version 2.50
  of SpamAssassin.  If you are interested in looking at such mail
  filters, I've seen good reports of the following:

  bogofilter:
  http://bogofilter.sourceforge.net/

  bmf:
  http://sourceforge.net/projects/bmf

  SpamProbe:
  http://sourceforge.net/projects/spamprobe/ ]

oz
---
the great unheralded battle in the world is the battle between
those who have a sense of humour and those who don't. -- salman rushdie



* Re: [9fans] /mail/lib/patterns
  2003-03-26  9:42     ` ozan s yigit
@ 2003-03-26 10:23       ` Axel Belinfante
  2003-03-26 14:40         ` ozan s yigit
  0 siblings, 1 reply; 6+ messages in thread
From: Axel Belinfante @ 2003-03-26 10:23 UTC
  To: 9fans

The funny thing is that, probably because you included the
SpamAssassin report, your message got classified as spam here.
Here is the (slightly reformatted) report for your own message.

Axel.

SPAM: -------------------- Start SpamAssassin results ----------------------
SPAM: This mail is probably spam.  The original message has been altered
SPAM: so you can recognise or block similar unwanted mail in future.
SPAM: See http://spamassassin.org/tag/ for more details.
SPAM: 
SPAM: Content analysis details:   (5.8 hits, 5 required)
SPAM: PORN_10            (0.6 points)  BODY: Uses words and phrases which
                                             indicate porn (10)
SPAM: AS_SEEN_ON         (2.2 points)  BODY: As seen on national TV!
SPAM: CLICK_BELOW        (1.5 points)  BODY: Asks you to click below
SPAM: DOUBLE_CAPSWORD    (1.1 points)  BODY: A word in all caps repeated
                                             on the line
SPAM: GAPPY_TEXT         (0.4 points)  BODY: Contains 'G.a.p.p.y-T.e.x.t'
SPAM: 
SPAM: -------------------- End of SpamAssassin results ---------------------
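
The arithmetic behind such a report can be checked with a short script.
What follows is a minimal Python sketch, not SpamAssassin's own code. It
assumes rule lines look like the ones quoted in this thread (a rule name,
then "(N points)", optionally prefixed with "SPAM:"), and that a message
is flagged once the summed points reach the required score. The names
rule_scores and verdict are hypothetical, chosen for illustration only.

```python
import re

# Matches a rule line such as "SPAM: AS_SEEN_ON   (2.2 points)  BODY: ...".
# Wrapped continuation lines and the summary line do not match.
RULE_LINE = re.compile(
    r"^(?:SPAM:\s*)?([A-Z][A-Z0-9_]*)\s+\((-?\d+(?:\.\d+)?) points?\)"
)

def rule_scores(report):
    """Return {rule_name: points} for every rule line found in the report."""
    scores = {}
    for line in report.splitlines():
        m = RULE_LINE.match(line)
        if m:
            scores[m.group(1)] = float(m.group(2))
    return scores

def verdict(scores, required=5.0):
    """Sum the points; assume the message is flagged when total >= required."""
    total = round(sum(scores.values()), 2)
    return total, total >= required

# The rule lines from the report quoted above.
REPORT = """\
SPAM: PORN_10            (0.6 points)  BODY: Uses words and phrases which
                                             indicate porn (10)
SPAM: AS_SEEN_ON         (2.2 points)  BODY: As seen on national TV!
SPAM: CLICK_BELOW        (1.5 points)  BODY: Asks you to click below
SPAM: DOUBLE_CAPSWORD    (1.1 points)  BODY: A word in all caps repeated
                                             on the line
SPAM: GAPPY_TEXT         (0.4 points)  BODY: Contains 'G.a.p.p.y-T.e.x.t'
"""

if __name__ == "__main__":
    total, flagged = verdict(rule_scores(REPORT))
    print(total, "spam" if flagged else "ham")
```

Fed the five rule lines above, the sketch totals 5.8 against a threshold
of 5, which agrees with the "5.8 hits, 5 required" line in the report.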

> it is also great fun to read its content analysis. here is one from
> last night (lightly reformatted)




* Re: [9fans] /mail/lib/patterns
  2003-03-26 10:23       ` Axel Belinfante
@ 2003-03-26 14:40         ` ozan s yigit
  0 siblings, 0 replies; 6+ messages in thread
From: ozan s yigit @ 2003-03-26 14:40 UTC
  To: 9fans

Axel.Belinfante@cs.utwente.nl (Axel Belinfante) writes:

> The funny thing is that, probably because you included the
> SpamAssassin report, your message got classified as spam here.

it goes to show what kind of improvement the mailing list would get
as soon as it is spamassassin-assisted... 8-)

oz
-- 
music is the space between the notes. | www.cs.yorku.ca/~oz
                   -- claude debussy  | york u. dept of computer science 


