Gnus development mailing list
* adaptive word scoring
@ 1996-11-29  5:25 Felix Lee
  1996-11-29  8:09 ` Kai Grossjohann
                   ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: Felix Lee @ 1996-11-29  5:25 UTC (permalink / raw)


so after using adaptive word scoring for a while, I've
decided that it's mostly useless.

say you're an avid fan of alt.sex.pictures.emacs.  the word
"gif" is fairly common and mostly neutral: you can't tell if
an article is interesting based on the word "gif".

however, adaptive scoring treats "gif" as significant in an
odd way.  if you kill a massive series of "vi pinup gif"s,
then adaptive scoring is going to reduce the score of "gif"
by an amount proportional to the number of articles you've
killed.  this significantly affects the score of those
really sexy emacs gifs.
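
The effect described above can be reproduced with a small sketch. This is a toy model of count-based adaptation, not Gnus's actual code: the function names and the per-mark adjustments (`KILL_DELTA`, `READ_DELTA`) are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical per-mark adjustments; the real values are configurable in Gnus.
KILL_DELTA = -1
READ_DELTA = +1

def count_based_scores(marked_articles):
    """Adjust every word of every marked article by a fixed amount."""
    scores = defaultdict(int)
    for words, was_read in marked_articles:
        delta = READ_DELTA if was_read else KILL_DELTA
        for word in set(words):
            scores[word] += delta
    return scores

def article_score(words, scores):
    """Sum the scores of the distinct words in an article's subject."""
    return sum(scores.get(word, 0) for word in set(words))

# Kill five "vi pinup gif" articles in a row.
history = [(["vi", "pinup", "gif"], False)] * 5
word_scores = count_based_scores(history)

# A new article that merely shares the neutral word "gif" is dragged down.
new_article_score = article_score(["emacs", "gif"], word_scores)  # -5
```

In this toy model "gif" ends up at -5 purely because of how many articles were killed, so an unrelated article containing "gif" inherits the penalty.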

ok, you could add "gif" to the ignored-word list, but this
is just one instance of a more general problem.

my current thoughts are:

- adaptive scoring should try to discover _useful_
  discriminants by comparing interesting v. uninteresting
  articles.  the ignored-word list should be unnecessary.

- rather than adjusting score by N for every article marked,
  marked articles should be assigned a score target, and
  adaptive-scoring elements should be adjusted to try to hit
  the target.

comments?  I'm not sure how to implement this, yet.
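
One possible reading of the score-target idea, sketched under assumptions: each marked article gets a target score, and the word scores are nudged repeatedly so that each article's total moves toward its target. The update rule below is a simple error-correcting step of my choosing (the names `adjust_toward_targets`, `rate`, and `passes` are hypothetical), not something proposed in the thread or implemented in Gnus.

```python
from collections import defaultdict

def adjust_toward_targets(articles, rate=0.5, passes=50):
    """articles is a list of (words, target_score) pairs.

    On each pass, compare an article's current total with its target and
    spread a fraction of the difference evenly over its distinct words.
    """
    scores = defaultdict(float)
    for _ in range(passes):
        for words, target in articles:
            unique = set(words)
            if not unique:
                continue
            current = sum(scores[w] for w in unique)
            step = rate * (target - current) / len(unique)
            for w in unique:
                scores[w] += step
    return dict(scores)

# Assumed targets: -100 for killed articles, +100 for read ones.
articles = [
    (["vi", "pinup", "gif"], -100.0),
    (["vi", "pinup", "gif"], -100.0),
    (["vi", "pinup", "gif"], -100.0),
    (["emacs", "pinup", "gif"], 100.0),
]
scores = adjust_toward_targets(articles)
```

On this toy data, words shared by both kinds of article ("gif", "pinup") settle near zero, while "vi" and "emacs" carry the difference, and killing more copies of the same article does not push "gif" any lower. Whether this behaves well on real newsgroup data is an open question.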
--


* Adaptive word scoring
@ 1996-10-31  1:34 Sten Drescher
  1996-11-05 15:51 ` Robert Bihlmeyer
  1996-11-05 21:25 ` Lars Magne Ingebrigtsen
  0 siblings, 2 replies; 30+ messages in thread
From: Sten Drescher @ 1996-10-31  1:34 UTC (permalink / raw)



	I've been using adaptive word scoring, and I've noticed two
problems:

	1) You can't do a setq or defvar of the
gnus-default-adaptive-word-score-alist in .gnus, using the mark names
as shown on the info page, because they haven't been defined yet.  You
can use the numeric values resulting from setting the variable after
starting Gnus, but I'm not really comfortable doing that.

	2) All other adaptive scoring stops when you do word scoring.
Yes, I want the words adapted, but I still want the authors and
followups adapted as well.  Is there any way this can be done?


-- 
+----------------------  Tivoli Customer Support  ----------------------+
|   Sten Drescher                     Tivoli Systems, Inc               |
|   email: sten.drescher@tivoli.com   9442 Capital of Texas Hwy North   |
|   phone: (512) 794-9070             Arboretum Plaza One, Suite 500    |
|   fax  : (512) 345-2784             Austin, Texas 78759               |
+-----------------------------------------------------------------------+


* Adaptive word scoring
@ 1996-08-04  2:57 Lars Magne Ingebrigtsen
  1996-08-04 17:19 ` François Pinard
  0 siblings, 1 reply; 30+ messages in thread
From: Lars Magne Ingebrigtsen @ 1996-08-04  2:57 UTC (permalink / raw)



I want to implement adaptive scoring on words.  Since this will
generate a *lot* of score rules, I think I have to write a new match
method -- `w'.  One could use string search (too inaccurate), regexp
search with "\bword\b" (too slow), split all subjects into words and
put them in a list and use `member' (too slow), intern them and use
`memq' (better, but too slow), split into words and put in a buffer
and use search for "\nword\n" (slowish).  So I think I'll split into
words and use a hash table, which seems to be the fastest way.  This
means that the words'll have to be downcased, so one can't then have
case-sensitive word matches.
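
A minimal sketch of the hash-table approach described above, written in Python rather than Emacs Lisp and using hypothetical names (`split_words`, `word_score`, `word_table`). The word-splitting rule and the choice to count each word once per subject are assumptions, not details taken from Gnus.

```python
import re

# Assumed definition of a "word": a run of letters or digits.
WORD_RE = re.compile(r"[a-z0-9]+")

def split_words(subject):
    """Downcase the subject and split it into words."""
    return WORD_RE.findall(subject.lower())

def word_score(subject, word_table):
    """Look each distinct word up in a hash table and sum the scores.

    Each lookup is a constant-time dict access, so the cost grows with the
    length of the subject rather than with the number of score rules.
    """
    return sum(word_table.get(word, 0) for word in set(split_words(subject)))

word_table = {"emacs": 30, "vi": -20}
total = word_score("Re: Emacs vs. VI (EMACS wins)", word_table)  # 10
```

Because the keys are downcased before lookup, "Emacs" and "EMACS" hit the same entry, which is the loss of case-sensitive matching mentioned above.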

Yes.  Does anybody have a list of common English "small words" --
"and", "the", etc., that should be excluded from adaptation?  This
should be configurable, of course.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@ifi.uio.no * Lars Ingebrigtsen




Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
1996-11-29  5:25 adaptive word scoring Felix Lee
1996-11-29  8:09 ` Kai Grossjohann
1996-11-29 22:48   ` Felix Lee
1996-11-30 13:18     ` Lars Magne Ingebrigtsen
1996-12-01  8:39       ` Felix Lee
1996-11-29 15:45 ` Jan Vroonhof
1996-11-30  2:28   ` Felix Lee
1996-12-02  9:37   ` Steinar Bang
1996-12-02  9:40 ` Wesley.Hardaker
1996-12-05 18:49   ` Lars Magne Ingebrigtsen
1996-12-06  8:18     ` Wesley.Hardaker
1996-12-02 11:46 ` Hans de Graaff
1996-12-02 15:08   ` Robert Bihlmeyer
1996-12-05 18:50     ` Lars Magne Ingebrigtsen
1996-12-05 21:21       ` Sean Lynch
1996-12-06 10:39         ` Lars Magne Ingebrigtsen
1996-12-08 22:19           ` Sean Lynch
1996-12-11  0:44             ` Lars Magne Ingebrigtsen
1996-12-06 21:02         ` Janne Sinkkonen
1996-12-08 22:48           ` Sean Lynch
1996-12-10 22:25             ` nnspool virtual server shows funny numbers of articles C. R. Oldham
1996-12-11  0:42               ` Lars Magne Ingebrigtsen
     [not found]   ` <vcn2vvixpz.fsf@totally-fudged-out-message-id>
1996-12-03 13:51     ` adaptive word scoring Holger Franz
  -- strict thread matches above, loose matches on Subject: below --
1996-10-31  1:34 Adaptive " Sten Drescher
1996-11-05 15:51 ` Robert Bihlmeyer
1996-11-05 17:16   ` Per Abrahamsen
1996-11-05 21:24   ` Lars Magne Ingebrigtsen
1996-11-05 21:25 ` Lars Magne Ingebrigtsen
1996-08-04  2:57 Lars Magne Ingebrigtsen
1996-08-04 17:19 ` François Pinard
