From mboxrd@z Thu Jan 1 00:00:00 1970
To: 9fans@cse.psu.edu
From: ozan s yigit
Message-ID:
Content-Type: text/plain; charset=us-ascii
References: , <20030317195622.GA1578@panix.com>
Subject: Re: [9fans] /mail/lib/patterns
Date: Wed, 26 Mar 2003 09:42:27 +0000
Topicbox-Message-UUID: 84d8a2c2-eacb-11e9-9e20-41e7f4b1d025

markp@panix.com (Mark Powers) writes:

> people at panix seem happy with spamassassin. I haven't tried it, but
> it's perl and shouldn't be too hard to plug in to upas.

we are using it. it is remarkable. it will catch just about every
single spam this very post will generate. the catch rate is astounding;
it has caught 2421 spam messages since i enabled it on feb 25, but
failed on two or three messages (if memory serves) which could not be
filtered from the headers, came from apparently legitimate paths, and
said something like "what do you think about this one" (some strange,
probably porn, web reference) or "would you like to date me" or some
such.

it is also great fun to read its content analysis. here is one from
last night (lightly reformatted)

Content analysis details:   (37.50 points, 5 required)
NO_REAL_NAME           (0.7 points)  From: does not include a real name
GAPPY_SUBJECT          (0.4 points)  Subject: contains G.a.p.p.y-T.e.x.t
SUBJ_HAS_SPACES        (2.0 points)  Subject contains lots of white space
REVERSE_AGING          (3.3 points)  BODY: Reverses Aging
AS_SEEN_ON             (2.1 points)  BODY: As seen on national TV!
HAIR_LOSS              (2.8 points)  BODY: Cures Baldness
BANG_OPRAH             (4.3 points)  BODY: Talks about Oprah with an Exclamation!
CLICK_BELOW_CAPS       (0.5 points)  BODY: Asks you to click below in capital letters
BANG_EXERCISE          (2.2 points)  BODY: Talks about exercise with an exclamation!
HTML_20_30             (1.1 points)  BODY: Message is 20% to 30% HTML
HTML_MESSAGE           (0.1 points)  BODY: HTML included in message
HTML_LINK_CLICK_CAPS   (1.1 points)  BODY: HTML link text says "CLICK"
HTML_LINK_CLICK_HERE   (0.1 points)  BODY: HTML link text says "click here"
HTTP_USERNAME_USED     (1.3 points)  URI: Uses a username in a URL
USERPASS               (1.0 points)  URI: URL contains username and (optional) password
MSGID_OUTLOOK_TIME     (4.4 points)  Message-Id is fake (in Outlook Express format)
SUBJ_HAS_UNIQ_ID       (0.8 points)  Subject contains a unique ID
RCVD_IN_NJABL          (1.0 points)  RBL: Received via a relay in dnsbl.njabl.org
RCVD_IN_OSIRUSOFT_COM  (0.6 points)  RBL: Received via a relay in relays.osirusoft.com
RCVD_IN_OPM            (4.3 points)  RBL: Received via a relay in opm.blitzed.org
X_OSIRU_SPAMWARE_SITE  (1.1 points)  RBL: DNSBL: sender is a Spamware site or vendor
MIME_HTML_ONLY         (0.1 points)  Message only has text/html MIME parts
FORGED_MUA_EUDORA      (2.2 points)  Forged mail pretending to be from Eudora

---

hmm, paul graham's bayesian spam filter two or three years from now,
maybe on top of his "lisp replacement" as a god's gift to all
programmers? wow, be still my melting mailbox, i can hardly wait. :-P

[ Moderator's note: A bayesian mail filter was introduced as part of
  version 2.50 of SpamAssassin. If you are interested in looking at
  such mail filters, I've seen good reports of the following:

    bogofilter: http://bogofilter.sourceforge.net/
    bmf:        http://sourceforge.net/projects/bmf
    SpamProbe:  http://sourceforge.net/projects/spamprobe/
]

oz
---
the great unheralded battle in the world is the battle between
those who have a sense of humour and those who don't. -- salman rushdie
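
The content analysis report quoted in the message is additive: every rule that fires contributes its points, and the message is flagged once the total reaches the required threshold (5 in that report). Here is a minimal sketch of that arithmetic in Python, assuming each report line has the shape `RULE_NAME (N.N points) description`. The names `parse_report`, `total_score` and `is_spam` are made up for illustration and are not SpamAssassin's API, and the `>=` comparison against the threshold is an assumption.

```python
import re
from decimal import Decimal

# Rule lines copied from the report in the message (descriptions unchanged
# apart from one stray closing parenthesis).
REPORT = """\
NO_REAL_NAME (0.7 points) From: does not include a real name
GAPPY_SUBJECT (0.4 points) Subject: contains G.a.p.p.y-T.e.x.t
SUBJ_HAS_SPACES (2.0 points) Subject contains lots of white space
REVERSE_AGING (3.3 points) BODY: Reverses Aging
AS_SEEN_ON (2.1 points) BODY: As seen on national TV!
HAIR_LOSS (2.8 points) BODY: Cures Baldness
BANG_OPRAH (4.3 points) BODY: Talks about Oprah with an Exclamation!
CLICK_BELOW_CAPS (0.5 points) BODY: Asks you to click below in capital letters
BANG_EXERCISE (2.2 points) BODY: Talks about exercise with an exclamation!
HTML_20_30 (1.1 points) BODY: Message is 20% to 30% HTML
HTML_MESSAGE (0.1 points) BODY: HTML included in message
HTML_LINK_CLICK_CAPS (1.1 points) BODY: HTML link text says "CLICK"
HTML_LINK_CLICK_HERE (0.1 points) BODY: HTML link text says "click here"
HTTP_USERNAME_USED (1.3 points) URI: Uses a username in a URL
USERPASS (1.0 points) URI: URL contains username and (optional) password
MSGID_OUTLOOK_TIME (4.4 points) Message-Id is fake (in Outlook Express format)
SUBJ_HAS_UNIQ_ID (0.8 points) Subject contains a unique ID
RCVD_IN_NJABL (1.0 points) RBL: Received via a relay in dnsbl.njabl.org
RCVD_IN_OSIRUSOFT_COM (0.6 points) RBL: Received via a relay in relays.osirusoft.com
RCVD_IN_OPM (4.3 points) RBL: Received via a relay in opm.blitzed.org
X_OSIRU_SPAMWARE_SITE (1.1 points) RBL: DNSBL: sender is a Spamware site or vendor
MIME_HTML_ONLY (0.1 points) Message only has text/html MIME parts
FORGED_MUA_EUDORA (2.2 points) Forged mail pretending to be from Eudora
"""

# Assumed line shape: RULE_NAME (N.N points) free-text description
RULE_LINE = re.compile(r"^([A-Z][A-Z0-9_]*)\s+\((-?\d+(?:\.\d+)?) points\)\s+(.*)$")


def parse_report(text):
    """Return a list of (rule, points, description) for each matching line."""
    hits = []
    for line in text.splitlines():
        m = RULE_LINE.match(line.strip())
        if m:
            # Decimal keeps the one-decimal scores exact when summed.
            hits.append((m.group(1), Decimal(m.group(2)), m.group(3)))
    return hits


def total_score(hits):
    """Sum the points of every rule that fired."""
    return sum((points for _, points, _ in hits), Decimal("0"))


def is_spam(hits, required=Decimal("5")):
    """Assumption: a message is flagged when the total reaches the threshold."""
    return total_score(hits) >= required


hits = parse_report(REPORT)
print(len(hits), total_score(hits), is_spam(hits))
```

With the 23 rules listed in the report the total comes to 37.5, which agrees with the "37.50 points" on the report's header line.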