From: Lars Magne Ingebrigtsen
Newsgroups: gmane.emacs.gnus.general
Subject: Re: adaptive word scoring
Date: 06 Dec 1996 11:39:51 +0100
References: <199611290525.VAA00464@kim.teleport.com> <9612021508.AA23722@stud2.tuwien.ac.at>
In-Reply-To: Sean Lynch's message of 05 Dec 1996 13:21:45 -0800

Sean Lynch writes:

> I remember reading earlier in this thread about the possibility of
> rating words based on interestingness, and I think this is probably
> the way to go. The fundamental theorem of information theory tells us
> that the value of any piece of information is inversely proportional
> to its probability of occurrence. Therefore, we should keep some sort
> of history of the number of occurrences of each word in the adaptive
> scoring criteria (i.e. the subject lines) and estimate the probability
> of each word's occurrence, weighting the effect of each word on the
> final score by the inverse of the probability.

Would it suffice to calculate this on the fly (from the articles
currently in the summary buffer), or does this have to be stored in a
database?

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@ifi.uio.no * Lars Ingebrigtsen
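
To make the "on the fly" variant concrete, here is a rough sketch of the weighting described in the quoted paragraph, computed only from whatever subject lines are at hand (standing in for the summary buffer). It is Python for illustration, not Gnus code; the names `word_weights` and `score_subject`, the tokenizing regexp, and the choice to give unseen words a weight of zero are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical tokenizer: lowercase runs of letters, digits and apostrophes.
WORD_RE = re.compile(r"[a-z0-9']+")


def tokenize(subject):
    """Split a subject line into lowercase words."""
    return WORD_RE.findall(subject.lower())


def word_weights(subjects):
    """Estimate each word's probability from the given subject lines and
    return the inverse of that probability as the word's weight."""
    counts = Counter()
    for subject in subjects:
        counts.update(tokenize(subject))
    total = sum(counts.values())
    # p(word) = count / total, so 1 / p(word) = total / count.
    return {word: total / count for word, count in counts.items()}


def score_subject(subject, weights):
    """Sum the weights of the words in a subject line.
    Words with no recorded history contribute nothing (an assumption)."""
    return sum(weights.get(word, 0.0) for word in tokenize(subject))


if __name__ == "__main__":
    subjects = [
        "Re: adaptive word scoring",
        "Re: word scoring",
        "Re: nnml problem",
    ]
    weights = word_weights(subjects)
    for word, weight in sorted(weights.items(), key=lambda item: item[1]):
        print(f"{word:10s} {weight:.2f}")
```

Computed this way the counts only cover the articles currently on screen, so a word that is rare in one group visit can look common in the next; a stored history would smooth that out at the cost of keeping a database around.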