From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/46171
Path: main.gmane.org!not-for-mail
From: Oliver Scholz
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Paul Graham on fighting SPAM
Date: Mon, 19 Aug 2002 12:50:24 +0200
Organization: Olymp
Sender: owner-ding@hpc.uh.edu
Message-ID:
References:
NNTP-Posting-Host: localhost.gmane.org
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Transfer-Encoding: 8bit
X-Trace: main.gmane.org 1029751313 10945 127.0.0.1 (19 Aug 2002 10:01:53 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Mon, 19 Aug 2002 10:01:53 +0000 (UTC)
Return-path:
Original-Received: from malifon.math.uh.edu ([129.7.128.13]) by main.gmane.org with esmtp (Exim 3.35 #1 (Debian)) id 17gjM5-0002pp-00 for ; Mon, 19 Aug 2002 12:01:49 +0200
Original-Received: from sina.hpc.uh.edu ([129.7.128.10] ident=lists) by malifon.math.uh.edu with esmtp (Exim 3.20 #1) id 17gjM0-0007lE-00; Mon, 19 Aug 2002 05:01:44 -0500
Original-Received: by sina.hpc.uh.edu (TLB v0.09a (1.20 tibbs 1996/10/09 22:03:07)); Mon, 19 Aug 2002 05:02:14 -0500 (CDT)
Original-Received: from sclp3.sclp.com (qmailr@sclp3.sclp.com [209.196.61.66]) by sina.hpc.uh.edu (8.9.3/8.9.3) with SMTP id FAA22606 for ; Mon, 19 Aug 2002 05:01:59 -0500 (CDT)
Original-Received: (qmail 7627 invoked by alias); 19 Aug 2002 10:01:18 -0000
Original-Received: (qmail 7622 invoked from network); 19 Aug 2002 10:01:18 -0000
Original-Received: from main.gmane.org (80.91.224.249) by gnus.org with SMTP; 19 Aug 2002 10:01:18 -0000
Original-Received: from root by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 17gjKR-0002Mo-00 for ; Mon, 19 Aug 2002 12:00:07 +0200
Original-To: ding@gnus.org
X-Injected-Via-Gmane: http://gmane.org/
Original-Received: from news by main.gmane.org with local (Exim 3.35 #1 (Debian)) id 17giZX-0003EP-00 for ; Mon, 19 Aug 2002 11:11:39 +0200
Original-Path: hermes!nobody
Original-Newsgroups: gmane.emacs.ding
Original-Lines: 25
Original-NNTP-Posting-Host: dialin-145-254-204-087.arcor-ip.net
Original-X-Trace: main.gmane.org 1029748299 12372 145.254.204.87 (19 Aug 2002 09:11:39 GMT)
Original-X-Complaints-To: usenet@main.gmane.org
Original-NNTP-Posting-Date: Mon, 19 Aug 2002 09:11:39 +0000 (UTC)
X-Operating-System: Linux from Scratch
X-Attribution: os
X-Face: "HgH2sgK|bfH$;PiOJI6|qUCf.ve<51_Od(%ynHr?=>znn#~#oS>",F%B8&\vus),2AsPYb
 -n>PgddtGEn}s7kH?7kH{P_~vu?]OvVN^qD(L)>G^gDCl(U9n{:d>'DkilN!_K"eNzjrtI4Ya6;Td%
 IZGMbJ{lawG+'J>QXPZD&TwWU@^~A}f^zAb[Ru;CT(UA]c&
User-Agent: Gnus/5.090008 (Oort Gnus v0.08) Emacs/21.2 (i686-pc-linux-gnu)
Cancel-Lock: sha1:+ojM4fraQx7IOXkm6eBxCdk1/HQ=
Precedence: list
X-Majordomo: 1.94.jlt7
Xref: main.gmane.org gmane.emacs.gnus.general:46171
X-Report-Spam: http://spam.gmane.org/gmane.emacs.gnus.general:46171

prj@po.cwru.edu (Paul Jarc) writes:

> Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) wrote:
>> There is a research field known as "information filtering" or
>> "(automatic) text classification" or "text categorization".  I don't
>> know the details of the theory, but folks in that community are
>> speaking of "naive Bayes classifiers" as one of the ways to do it --
>> maybe that's similar to his approach.
>
> Sounds like it.  Anyone know if this (or another) method generalizes
> to more than two categories (spam/nonspam)?  If so, it could be used
> for all mail splitting.  We wouldn't have to manually craft split
> rules; we'd just seed a new group with the mails we have so far that
> belong there, and their contents would let the computer guess which
> new mails belong with them.
[...]

Cool! I wonder if this technique could be abused to get a more
sophisticated adaptive scoring, too ...

    -- Oliver

-- 
2 Fructidor an 210 de la Révolution
Liberté, Egalité, Fraternité!
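
On the question of more than two categories: a naive Bayes classifier is not limited to spam/nonspam, since it simply keeps one set of word counts per category and picks the category with the highest score. What follows is only a minimal sketch of that idea in Python, assuming textbook multinomial naive Bayes with add-one smoothing; it is not necessarily the method from Paul Graham's article, and it is not Gnus code. The class `NaiveBayesSplitter`, the function `tokenize`, and the group names are hypothetical, chosen for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSplitter:
    """Multinomial naive Bayes over any number of groups (hypothetical helper)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # group -> word -> count
        self.mail_counts = Counter()             # group -> number of seed mails
        self.vocabulary = set()                  # every word seen in training

    def train(self, group, text):
        """Seed a group with one mail that belongs there."""
        words = tokenize(text)
        self.word_counts[group].update(words)
        self.mail_counts[group] += 1
        self.vocabulary.update(words)

    def scores(self, text):
        """Return the log score of each group for the given mail text."""
        words = tokenize(text)
        total_mails = sum(self.mail_counts.values())
        vocab_size = len(self.vocabulary)
        result = {}
        for group, n_mails in self.mail_counts.items():
            counts = self.word_counts[group]
            total_words = sum(counts.values())
            # Prior: fraction of seed mails that went to this group.
            score = math.log(n_mails / total_mails)
            for word in words:
                # Add-one smoothing so unseen words do not zero the score.
                score += math.log((counts[word] + 1) / (total_words + vocab_size))
            result[group] = score
        return result

    def classify(self, text):
        """Guess the group whose seed mails best match the text."""
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Made-up seed mails for three hypothetical groups.
splitter = NaiveBayesSplitter()
splitter.train("mail.gnus", "nnimap split rules in gnus")
splitter.train("mail.gnus", "gnus adaptive scoring question")
splitter.train("mail.work", "meeting agenda for the project budget")
splitter.train("mail.spam", "cheap pills buy now free offer")

print(splitter.classify("how do I write split rules for gnus"))
```

Seeding a new group is then just a matter of calling `train` with the mails already filed there. The per-group log scores returned by `scores` are the sort of number that might in principle feed a scoring scheme, though how that would map onto Gnus adaptive scoring is left open here.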