Gnus development mailing list
From: Janne Sinkkonen <janne@avocado.pc.helsinki.fi>
Cc: Lars Magne Ingebrigtsen <larsi@ifi.uio.no>, ding@ifi.uio.no
Subject: Re: adaptive word scoring
Date: 06 Dec 1996 23:02:09 +0200
Message-ID: <oak9qvgze6.fsf@avocado.pc.helsinki.fi>
In-Reply-To: Sean Lynch's message of 05 Dec 1996 13:21:45 -0800

Sean Lynch <seanl@Internex.NET> writes:

> the way to go.  The fundamental theorem of information theory tells us
> that the value of any piece of information is inversely proportional
> to its probability of occurrence.

Proportional to the negative logarithm of the probability, actually.
This holds as long as the events are independent. The occurrence of
words depends on the context, but we get an approximation anyway.
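
As a rough illustration of the logarithmic measure, here is a minimal
Python sketch. The function name word_information and the toy articles
are hypothetical, and estimating p as the fraction of articles that
contain a word is an assumption, not something fixed in this thread.

```python
import math
from collections import Counter

def word_information(docs):
    """Return {word: -log2(p)}, with p the fraction of docs containing the word."""
    n = len(docs)
    doc_freq = Counter()
    for words in docs:
        # Count each word at most once per article.
        doc_freq.update(set(words))
    return {w: -math.log2(c / n) for w, c in doc_freq.items()}

# Hypothetical toy data: four tiny "articles".
docs = [["gnus", "score"], ["gnus", "nntp"], ["gnus", "score"], ["gnus", "mail"]]
info = word_information(docs)

# If two words occurred independently, their joint information would be
# the sum of the individual values.
joint = -math.log2(0.5 * 0.25)
```

A word that appears in every article carries no information, and rarer
words carry more. The sum over several words is exact only under the
independence assumption, so for real text it is an approximation.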

> The score of the word in the database would be adjusted by adding
> (old score - new score)/c to it, where c is the speed of light.

This makes sense (given c is in appropriate units).

> C could decrease over time so that scores would stabilize, though
> this would cause scores to stop adapting eventually.

I vote against decreasing C. Instead, it should be a constant small
value, say something between 0.05 and 0.001. Reading patterns change,
so the scores should keep adapting all the time.
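
To make the difference concrete, here is a small Python sketch under
stated assumptions: the small constant is read as a step size that
multiplies the difference (playing the role of 1/c in the quoted
formula), and the sign is taken as (observed - old) so that the stored
score moves toward the new observation. The names update_score and run
and the observation sequence are hypothetical.

```python
def update_score(old, observed, rate=0.05):
    """Move the stored score a fraction `rate` of the way toward `observed`."""
    return old + rate * (observed - old)

def run(observations, rate_fn, start=0.0):
    """Apply the update for each observation; rate_fn(n) gives the n-th step size."""
    score = start
    for n, obs in enumerate(observations, start=1):
        score = update_score(score, obs, rate_fn(n))
    return score

# Hypothetical reader: likes a word for 200 articles, then dislikes it for 50.
observations = [1.0] * 200 + [-1.0] * 50

constant = run(observations, lambda n: 0.05)     # fixed small step
decaying = run(observations, lambda n: 1.0 / n)  # step shrinks over time
```

With the fixed step the score ends up clearly negative, following the
recent change in taste. With the 1/n step the score is just the running
mean of all observations and is still positive, which is the "stops
adapting" problem.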

> Obviously, there would be some sort of thresholding function to drop
> words with a large probability of occurrence.

And words with small probabilities should not be in the calculations
because the probability estimates are unstable.
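
A minimal sketch of thresholding at both ends, in Python. The function
name usable_words and the cutoffs min_count and max_prob are
hypothetical; the values are placeholders, not recommendations.

```python
def usable_words(counts, total_docs, min_count=5, max_prob=0.5):
    """Keep words that are neither too rare nor too common.

    Rare words are dropped because their probability estimates are
    unstable; very common words are dropped because they carry little
    information.
    """
    kept = {}
    for word, count in counts.items():
        p = count / total_docs
        if count >= min_count and p <= max_prob:
            kept[word] = p
    return kept

# Hypothetical document frequencies out of 1000 articles.
counts = {"the": 950, "gnus": 120, "zyzzyva": 1}
kept = usable_words(counts, total_docs=1000)
```

Here only "gnus" survives: "the" is too frequent and "zyzzyva" has been
seen too rarely for its estimate to be trusted.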

-- 
Janne Sinkkonen      <janne@iki.fi>      <URL: http://www.iki.fi/~janne/ >

