Gnus development mailing list
* adaptive word score, and numbers
@ 1996-11-18 12:00 Steinar Bang
  1996-11-18 15:04 ` Lars Balker Rasmussen
  1996-11-18 17:56 ` Lars Magne Ingebrigtsen
  0 siblings, 2 replies; 14+ messages in thread
From: Steinar Bang @ 1996-11-18 12:00 UTC


One high-volume group in which I've just tried adaptive word scoring
(or whatever it's called) is rec.aviation.military.  Other groups
where I would eventually like to try it are the motorcycle groups.

A common factor for these groups is subjects containing letter
combinations followed by numbers, and these combinations are relevant
to what I want the score to adapt to.  E.g. in rec.av.mil, I would
like it to increase the score for F-104 (and also for CF-104, F104,
and other interesting combinations), while I would like to lower the
score for anything with "TWA 800" in the subject.  In rec.moto I would
like to increase the score on "VFR750F" (Yeah!  Go little Viffer! Go! Go!
GO! :-), and lower it on everything with HD in it (though that's not
really relevant...)

Is this possible without diving into the source code?  As I'm never
sure which version of the Gnus info file I'm reading when I do
C-h i, I haven't even looked there yet (I do stuff like this while
compiling and linking, so my time is a bit broken up, and it's
hard to concentrate enough to understand the source code).


- Steinar



* Re: adaptive word score, and numbers
  1996-11-18 12:00 adaptive word score, and numbers Steinar Bang
@ 1996-11-18 15:04 ` Lars Balker Rasmussen
  1996-11-18 16:06   ` Sudish Joseph
  1996-11-18 17:56 ` Lars Magne Ingebrigtsen
  1 sibling, 1 reply; 14+ messages in thread
From: Lars Balker Rasmussen @ 1996-11-18 15:04 UTC


Steinar Bang <sb@metis.no> writes:
> [...] As I'm never sure which version of the Gnus info file I'm
> reading when I do C-h i [...]

I have a couple of handy lines in my .gnus
;; (info "~gnort/source/gnus/red-gnus/texi/gnus")
;; (info "~gnort/source/gnus/red-gnus/texi/message")

Just C-x C-e at the end of the right line, and you'll get the right
file!  Amazing, innit?

(red-gnus is a link which I have to update far too often, but hey... ;) )
-- 
Lars Balker Rasmussen                                              - Duck!
<URL:http://www.daimi.aau.dk/~gnort/>                              - Where!?!



* Re: adaptive word score, and numbers
  1996-11-18 15:04 ` Lars Balker Rasmussen
@ 1996-11-18 16:06   ` Sudish Joseph
  1996-11-18 18:07     ` David Moore
  0 siblings, 1 reply; 14+ messages in thread
From: Sudish Joseph @ 1996-11-18 16:06 UTC


In article <0fenhr5tsi.fsf@fraxinus.daimi.aau.dk>,
Lars Balker Rasmussen <gnort@daimi.aau.dk> writes:
> Steinar Bang <sb@metis.no> writes:
> I have a couple of handy lines in my .gnus
> ;; (info "~gnort/source/gnus/red-gnus/texi/gnus")
> ;; (info "~gnort/source/gnus/red-gnus/texi/message")

> Just C-x C-e at the end of the right line, and you'll get the right
> file!  Amazing, innit?

> (red-gnus is a link which I have to update far too often, but hey... ;) )

I maintain a link for loading Gnus, too.  However, I've never had a
problem with getting either of (X)Emacs to respect
Info-default-directory-list.  From an init file:

(setq Info-default-directory-list
      `(,(expand-file-name "~/xemacs/site-lisp/ding/texi/")
	,(expand-file-name "~/xemacs/site-lisp/info/")
	,@Info-default-directory-list
	"/usr/info/"
	"/usr/local/info/"))

There're three symlinks on the way to .../texi. :-)

-Sudish



* Re: adaptive word score, and numbers
  1996-11-18 12:00 adaptive word score, and numbers Steinar Bang
  1996-11-18 15:04 ` Lars Balker Rasmussen
@ 1996-11-18 17:56 ` Lars Magne Ingebrigtsen
  1996-11-19  7:02   ` Steinar Bang
  1 sibling, 1 reply; 14+ messages in thread
From: Lars Magne Ingebrigtsen @ 1996-11-18 17:56 UTC


Steinar Bang <sb@metis.no> writes:

> A common factor for these groups is subjects containing letter
> combinations followed by numbers, and these combinations are relevant
> to what I want the score to adapt to.  E.g. in rec.av.mil, I would
> like it to increase the score for F-104 (and also for CF-104, F104,
> and other interesting combinations), while I would like to lower the
> score for anything with "TWA 800" in the subject.  In rec.moto I would
> like to increase the score on "VFR750F" (Yeah!  Go little Viffer! Go! Go!
> GO! :-), and lower it on everything with HD in it (though that's not
> really relevant...)

Well, I think this is just what adaptive word scoring will get you.
If you don't read articles with "TWA 800" in the subjects, the words
"TWA" and "800" will get their scores lowered.  

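For reference, a minimal sketch of switching on word-based adaptive
scoring.  The variable names and the example values are as I remember
them from the Gnus manual, not taken from this thread, so treat them
as assumptions and check them against your own Gnus version:

```elisp
;; Assumption: gnus-sum defines the mark variables used below and
;; gnus-score defines the adaptive scoring variables.
(require 'gnus-sum)
(require 'gnus-score)

;; Adapt on individual subject words rather than on whole lines.
(setq gnus-use-adaptive-scoring '(word))

;; Illustrative values only: how much each word in a subject gains or
;; loses depending on the mark the article ended up with.
(setq gnus-default-adaptive-word-score-alist
      `((,gnus-read-mark . 30)
        (,gnus-catchup-mark . -10)
        (,gnus-killed-mark . -20)
        (,gnus-del-mark . -15)))
```
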
If the words you want to score on contain "-", you should modify the
syntax entry for that character to be word-constituent:

(setq gnus-adaptive-word-syntax-table 
      (copy-syntax-table (standard-syntax-table)))
(modify-syntax-entry ?- "w" gnus-adaptive-word-syntax-table)

(By the way, by default, numbers are considered to be white space when
doing word scoring.)
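
A small sketch of what the two tables do to a subject line.  The helper
and the two table variables below are hypothetical names made up for
illustration, and the splitting is only an approximation of whatever
Gnus does internally.  One thing it suggests: a plain copy of
(standard-syntax-table) already treats digits as word characters, so
the setq above would presumably bring numbers back as a side effect.

```elisp
;; Hypothetical helper, not part of Gnus.
(defun my-words-under-table (string table)
  "Return the runs of word characters in STRING, judged by syntax TABLE."
  (with-temp-buffer
    (set-syntax-table table)
    (insert string)
    (goto-char (point-min))
    (let (words)
      ;; \w matches whatever the buffer's syntax table calls a word char.
      (while (re-search-forward "\\w+" nil t)
        (push (match-string 0) words))
      (nreverse words))))

;; Digits as whitespace, which this thread says is the Gnus default.
(defvar my-digits-as-space-table
  (let ((table (copy-syntax-table (standard-syntax-table))))
    (dotimes (i 10)
      (modify-syntax-entry (+ ?0 i) " " table))
    table))

;; A plain copy of the standard table, plus "-" as a word character.
(defvar my-hyphen-table
  (let ((table (copy-syntax-table (standard-syntax-table))))
    (modify-syntax-entry ?- "w" table)
    table))

(my-words-under-table "TWA 800 and the CF-104" my-digits-as-space-table)
;; should give ("TWA" "and" "the" "CF")
(my-words-under-table "TWA 800 and the CF-104" my-hyphen-table)
;; should give ("TWA" "800" "and" "the" "CF-104")
```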

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@ifi.uio.no * Lars Ingebrigtsen



* Re: adaptive word score, and numbers
  1996-11-18 16:06   ` Sudish Joseph
@ 1996-11-18 18:07     ` David Moore
  0 siblings, 0 replies; 14+ messages in thread
From: David Moore @ 1996-11-18 18:07 UTC


Sudish Joseph <sudish@mindspring.com> writes:

> In article <0fenhr5tsi.fsf@fraxinus.daimi.aau.dk>,
> Lars Balker Rasmussen <gnort@daimi.aau.dk> writes:
> > Steinar Bang <sb@metis.no> writes:
> > I have a couple of handy lines in my .gnus
> > ;; (info "~gnort/source/gnus/red-gnus/texi/gnus")
> > ;; (info "~gnort/source/gnus/red-gnus/texi/message")
> 
> > Just C-x C-e at the end of the right line, and you'll get the right
> > file!  Amazing, innit?
> 
> > (red-gnus is a link which I have to update far too often, but hey... ;) )
> 
> I maintain a link for loading Gnus, too.  However, I've never had a
> problem with getting either of (X)Emacs to respect
> Info-default-directory-list.  From an init file:

	I've never had problems with Info-default-directory-list
either.  Currently, I don't go with the symlink approach, but have a
variable in my emacs init file for finding the version I want:

(defvar dmoore::rgnus-version "/export/tmp/rgnus-0.63/")

;;; for some suitable definition of add-to-load-path. ;-)
(mapcar 'add-to-load-path
	(list "~/emacs/bbdb-1.51" "~/emacs" "~/emacs/dmoore"
	      (concat dmoore::rgnus-version "lisp")))

(setq Info-default-directory-list
      (cons (concat dmoore::rgnus-version "texi")
	    Info-default-directory-list))
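
One possible "suitable definition", purely as a sketch: the message
does not show the real add-to-load-path, so the body below is an
assumption about what it might do.

```elisp
;; Hypothetical definition; the actual one is not shown in the thread.
(defun add-to-load-path (dir)
  "Put DIR, expanded to an absolute name, at the front of `load-path'.
Does nothing if the expanded directory is already in the list."
  (add-to-list 'load-path (expand-file-name dir)))

;; Usage, with a directory taken from the message above:
(add-to-load-path "~/emacs/bbdb-1.51")
```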

-- 
David Moore <dmoore@ucsd.edu>       | Computer Systems Lab      __o
UCSD Dept. Computer Science - 0114  | Work: (619) 534-8604    _ \<,_
La Jolla, CA 92093-0114             | Fax:  (619) 534-1445   (_)/ (_)
<URL:http://oj.egbt.org/dmoore/>    | Solo Furnace Creek 508 -- 1996!



* Re: adaptive word score, and numbers
  1996-11-18 17:56 ` Lars Magne Ingebrigtsen
@ 1996-11-19  7:02   ` Steinar Bang
  1996-11-20 14:50     ` Steinar Bang
  0 siblings, 1 reply; 14+ messages in thread
From: Steinar Bang @ 1996-11-19  7:02 UTC


>>>>> Lars Magne Ingebrigtsen <larsi@ifi.uio.no>:

> Well, I think this is just what adaptive word scoring will get you.
> If you don't read articles with "TWA 800" in the subjects, the words
> "TWA" and "800" will get their scores lowered.  

Not "800", no (at least, no such numbers are in rec.aviation.military.ADAPT)

> If the words you want to score on contain "-", you should modify the
> syntax entry for that character to be word-constituent:

> (setq gnus-adaptive-word-syntax-table 
>       (copy-syntax-table (standard-syntax-table)))
> (modify-syntax-entry ?- "w" gnus-adaptive-word-syntax-table)

> (By the way, by default, numbers are considered to be white space when
> doing word scoring.)

Yup!  I'll look into changing it, and what effects it would have.



* Re: adaptive word score, and numbers
  1996-11-19  7:02   ` Steinar Bang
@ 1996-11-20 14:50     ` Steinar Bang
  1996-11-20 15:26       ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 14+ messages in thread
From: Steinar Bang @ 1996-11-20 14:50 UTC


>>>>> Steinar Bang <sb@metis.no>:

>> (setq gnus-adaptive-word-syntax-table 
>>       (copy-syntax-table (standard-syntax-table)))
>> (modify-syntax-entry ?- "w" gnus-adaptive-word-syntax-table)

>> (By the way, by default, numbers are considered to be white space when
>> doing word scoring.)

> Yup!  I'll look into changing it, and what effects it would have.

OK.  So now I've checked with gnus-score.el, and if I've understood
this correctly, the default is standard-syntax-table, explicitly minus
numbers...?

Hmm... what's the reason for not counting numbers as words?  Since
you've explicitly removed them, I assume this is deliberate, but I
can't immediately see the reason for it...?
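
A quick way to check this without reading gnus-score.el, as a sketch:
it assumes the variable is defined once gnus-score is loaded, and it
uses with-syntax-table, which may not exist in older Emacsen.

```elisp
(require 'gnus-score)

;; char-syntax consults the current syntax table, so bind the
;; adaptive one temporarily and ask about a few characters.
(with-syntax-table gnus-adaptive-word-syntax-table
  (list (string (char-syntax ?8))    ; " " would mean whitespace
        (string (char-syntax ?a))    ; "w" would mean word constituent
        (string (char-syntax ?-))))  ; anything but "w" splits F-104
```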



* Re: adaptive word score, and numbers
  1996-11-20 14:50     ` Steinar Bang
@ 1996-11-20 15:26       ` Lars Magne Ingebrigtsen
  1996-11-22  8:46         ` Wesley.Hardaker
  0 siblings, 1 reply; 14+ messages in thread
From: Lars Magne Ingebrigtsen @ 1996-11-20 15:26 UTC


Steinar Bang <sb@metis.no> writes:

> OK.  So now I've checked with gnus-score.el, and if I've understood
> this correctly, the default is standard-syntax-table, explicitly minus
> numbers...?
> 
> Hmm... what's the reason for not counting numbers as words?  Since
> you've explicitly removed them, I assume this is deliberate, but I
> can't immediately see the reason for it...?

Uhm...  I can't remember either why I did it.  Hm, well, I also
exclude all "common" words, so I guess it was just because most
numbers are rather common?  There are so many numbers that scoring on
them is mostly pointless?  Or something?

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@ifi.uio.no * Lars Ingebrigtsen



* Re: adaptive word score, and numbers
  1996-11-20 15:26       ` Lars Magne Ingebrigtsen
@ 1996-11-22  8:46         ` Wesley.Hardaker
  1996-11-23  3:54           ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 14+ messages in thread
From: Wesley.Hardaker @ 1996-11-22  8:46 UTC


Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:

> Uhm...  I can't remember either why I did it.  Hm, well, I also
> exclude all "common" words, so I guess it was just because most
> numbers are rather common?  There are so many numbers that scoring on
> them is mostly pointless?  Or something?

Well, I've always thought they should be scored myself...  However, I
don't use word scoring all that much, so I'm not an expert.  I have it
turned on in a few high-volume groups, but because I read so few
articles in them, sooner or later they reduce themselves to very
low-volume groups as word scoring pushes most articles below my
expunge level.  I have a feeling that word scoring would work best in
groups where you read 30% of the articles or so, and not something
like 0.1% to 0.5% like I do.

Anyway, the problem with numbers is that typically when you want to
score them it's in a version number string:

Red Gnus 0.68

for instance.  Technically you want to keep the 0. with the 68, as it's
really one word in that context (unlike words, where red.gnus would
most likely be two).  That way it would work in groups where you ended up
with discussions like "68 ways to clean a monkey's mouth with topic
mode".
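
A rough sketch of the "keep 0. with 68" idea, done with a regexp
rather than a syntax table.  The function name is hypothetical and
nothing like it is claimed to exist in Gnus; it only shows that a
dotted number can be picked out as one token while a bare 68 is left
alone.

```elisp
;; Hypothetical helper, not part of Gnus.
(defun my-version-tokens (subject)
  "Return the dotted numbers in SUBJECT, such as \"0.68\", as strings."
  (let ((start 0)
        tokens)
    ;; One or more digits followed by at least one ".digits" group.
    (while (string-match "[0-9]+\\(?:\\.[0-9]+\\)+" subject start)
      (push (match-string 0 subject) tokens)
      (setq start (match-end 0)))
    (nreverse tokens)))

(my-version-tokens "Red Gnus 0.68 and 68 ways to use topic mode")
;; should give ("0.68")
```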

...
more random comments, because, well...  that's what I do!
Wes



* Re: adaptive word score, and numbers
  1996-11-22  8:46         ` Wesley.Hardaker
@ 1996-11-23  3:54           ` Lars Magne Ingebrigtsen
  1996-11-23 14:18             ` Per Abrahamsen
  0 siblings, 1 reply; 14+ messages in thread
From: Lars Magne Ingebrigtsen @ 1996-11-23  3:54 UTC


Wesley.Hardaker@sphys.unil.ch writes:

> Anyway, the problem with numbers is that typically when you want to
> score them it's in a version number string:
> 
> Red Gnus 0.68
> 
> for instance.  Technically you want to keep the 0. with the 68, as it's
> really one word in that context (unlike words, where red.gnus would
> most likely be two).  That way it would work in groups where you ended up
> with discussions like "68 ways to clean a monkey's mouth with topic
> mode".

Yup.  Which is a good argument for ignoring numbers by default.
Numbers are too, uhm, non-distinctive.  While words like "quisquose"
aren't likely to appear in subject headers that don't deal with
quisquosian things, the number 42 is as likely to appear in something
that talks about the cosmic constant as it is in articles about speed
limits.  Or something.

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@ifi.uio.no * Lars Ingebrigtsen



* Re: adaptive word score, and numbers
  1996-11-23  3:54           ` Lars Magne Ingebrigtsen
@ 1996-11-23 14:18             ` Per Abrahamsen
  1996-11-25  3:39               ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 14+ messages in thread
From: Per Abrahamsen @ 1996-11-23 14:18 UTC



Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:

> quisquosian things, the number 42 is as likely to appear in something
> that talks about the cosmic constant as it is in articles about speed
> limits.  Or something.

If you read rec.cats.pets and see 42 in the subject of a thread,
*someone* will make the Douglas Adams reference.  This can be a reason
to avoid or enter the thread, depending on your preferences.

I would also guess that the number 666 increases the chance of an
article being entertaining, while 900 decreases it.



* Re: adaptive word score, and numbers
  1996-11-23 14:18             ` Per Abrahamsen
@ 1996-11-25  3:39               ` Lars Magne Ingebrigtsen
  1996-11-25  8:39                 ` Wesley.Hardaker
  0 siblings, 1 reply; 14+ messages in thread
From: Lars Magne Ingebrigtsen @ 1996-11-25  3:39 UTC


Per Abrahamsen <abraham@dina.kvl.dk> writes:

> Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:
> 
> > quisquosian things, the number 42 is as likely to appear in something
> > that talks about the cosmic constant as it is in articles about speed
> > limits.  Or something.
> 
> If you read rec.cats.pets and see 42 in the subject of a thread,
> *someone* will make the Douglas Adams reference.  This can be a reason
> to avoid or enter the thread, depending on your preferences.
> 
> I would also guess that the number 666 increases the chance of an
> article being entertaining, while 900 decreases it.

:-)  Well, should I start treating numbers as word-constituent
characters when doing word scores, then?  Will this yield better or
worse results?  I'm mostly afraid that all those different numbers
will just clutter up the ADAPT files without doing much good...

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@ifi.uio.no * Lars Ingebrigtsen



* Re: adaptive word score, and numbers
  1996-11-25  3:39               ` Lars Magne Ingebrigtsen
@ 1996-11-25  8:39                 ` Wesley.Hardaker
  1996-11-25 11:13                   ` Steinar Bang
  0 siblings, 1 reply; 14+ messages in thread
From: Wesley.Hardaker @ 1996-11-25  8:39 UTC


Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:

> :-)  Well, should I start treating numbers as word-constituent
> characters when doing word scores, then?  Will this yield better or
> worse results?  I'm mostly afraid that all those different numbers
> will just clutter up the ADAPT files without doing much good...

*sniff* *sniff*...  Hmmm..   Smells like...  _A new variable!_

gnus-adaptive-include-numerics

???
Wes
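
A sketch of how such a switch could work.  The variable is only
proposed in this thread, so everything below is hypothetical,
including the builder function's name and the choice of nil as the
default:

```elisp
;; Hypothetical: this variable was only suggested here, never defined.
(defvar gnus-adaptive-include-numerics nil
  "If non-nil, treat digits as word constituents in adaptive word scoring.")

;; Hypothetical helper that builds the table from the switch above.
(defun my-make-adaptive-word-syntax-table ()
  "Return a word-scoring syntax table honouring the numerics switch."
  (let ((table (copy-syntax-table (standard-syntax-table))))
    (dotimes (i 10)
      ;; "w" keeps digits inside words; " " turns them into whitespace.
      (modify-syntax-entry (+ ?0 i)
                           (if gnus-adaptive-include-numerics "w" " ")
                           table))
    table))

(setq gnus-adaptive-word-syntax-table
      (my-make-adaptive-word-syntax-table))
```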



* Re: adaptive word score, and numbers
  1996-11-25  8:39                 ` Wesley.Hardaker
@ 1996-11-25 11:13                   ` Steinar Bang
  0 siblings, 0 replies; 14+ messages in thread
From: Steinar Bang @ 1996-11-25 11:13 UTC


>>>>> Wesley.Hardaker@sphys.unil.ch:

> Lars Magne Ingebrigtsen <larsi@ifi.uio.no> writes:
>> :-)  Well, should I start treating numbers as word-constituent
>> characters when doing word scores, then?  Will this yield better or
>> worse results?  I'm mostly afraid that all those different numbers
>> will just clutter up the ADAPT files without doing much good...

> *sniff* *sniff*...  Hmmm..   Smells like...  _A new variable!_

> gnus-adaptive-include-numerics

I didn't want to be the one to suggest it...


