* Wish: "rate of new info" for scoring
@ 2004-04-29 12:03 Pavel Janík
2004-04-29 14:07 ` Wes Hardaker
2004-04-29 14:27 ` Jesper Harder
0 siblings, 2 replies; 6+ messages in thread
From: Pavel Janík @ 2004-04-29 12:03 UTC (permalink / raw)
Hi,
I'd like to be able to automatically score down articles that are written
using the well-known style of Outlook users, i.e.:
--- cut here ---
You're right. <<< This is new info
> ... <<< This is the original e-mail, quoted
> ... many lines ...
> ...
--- cut here ---
To cope with this, we could:
- count lines/characters that are new (i.e. not quoted) (/n/)
- count lines/characters that are quoted (/q/)
So the "rate of new info" is q/n: the higher the value, the less new
information the article contains. We can then score using this value and
mark articles as useless to read etc.
What do you think about it?
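The counting proposed above can be sketched in a few lines. This is a standalone Python illustration, not Gnus code: the function names are hypothetical, a line is assumed to be quoted when its first non-blank character is `>`, and blank lines are ignored. The ratio follows the q/n definition given above.

```python
def count_new_and_quoted(body: str) -> tuple[int, int]:
    """Return (n, q): the number of new and of quoted non-blank lines."""
    n = q = 0
    for line in body.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines count as neither new nor quoted
        if stripped.startswith(">"):
            q += 1  # assumption: '>' is the only quote marker
        else:
            n += 1
    return n, q


def quoted_to_new_ratio(body: str) -> float:
    """Return q/n; infinite when the article contains only quoted lines."""
    n, q = count_new_and_quoted(body)
    if n == 0:
        return float("inf") if q else 0.0
    return q / n
```

A message with one new line followed by three quoted lines gives n = 1, q = 3 and a ratio of 3.0. An article with nothing but quoted text gets an infinite ratio, which a scoring rule would need to handle explicitly.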
--
Pavel Janík
I'm glad that Emacs is bigger and more open than some of the people who use
it.
-- Tony Reed in gnu.emacs.help
* Re: Wish: "rate of new info" for scoring
2004-04-29 12:03 Wish: "rate of new info" for scoring Pavel Janík
@ 2004-04-29 14:07 ` Wes Hardaker
2004-04-29 19:53 ` Pavel Janík
2004-04-29 14:27 ` Jesper Harder
1 sibling, 1 reply; 6+ messages in thread
From: Wes Hardaker @ 2004-04-29 14:07 UTC (permalink / raw)
Cc: ding
>>>>> On Thu, 29 Apr 2004 14:03:46 +0200, Pavel@Janik.cz (Pavel Janík) said:
Pavel> To cope with this, we could:
Pavel> - count lines/characters that are new (ie. not quoted). (/n/)
Pavel> - count lines/characters that are quoted (/q/)
Pavel> So the "rate of new info" is q/n. We can then score using this
Pavel> value and mark articles as useless to read etc.
Hmm... thought I saw that done somewhere. If it's not in the gnus
manual now, it must be in the procmailex manual page...
--
"In the bathtub of history the truth is harder to hold than the soap,
and much more difficult to find." -- Terry Pratchett
* Re: Wish: "rate of new info" for scoring
2004-04-29 12:03 Wish: "rate of new info" for scoring Pavel Janík
2004-04-29 14:07 ` Wes Hardaker
@ 2004-04-29 14:27 ` Jesper Harder
2004-04-29 19:52 ` Pavel Janík
1 sibling, 1 reply; 6+ messages in thread
From: Jesper Harder @ 2004-04-29 14:27 UTC (permalink / raw)
Pavel@Janik.cz (Pavel Janík) writes:
> I'd like to be able to automatically score down articles that are
> written using the well known style of Outlook users, ie:
I use a scoring rule like this for scoring down top posters:
("body"
("\\`\\(^[^>].*\n\\)\\{4,\\}\\(\\(^>.*\n\\)\\{4,\\}\\)[ \n]*\\'" -1000 nil r))
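For readers less used to Emacs regexp syntax, the rule reads roughly as: from the start of the body, four or more lines whose first character is not `>`, then four or more lines starting with `>`, then only spaces and newlines until the end. Below is a rough Python transliteration of that pattern. The function name is hypothetical, and since the Emacs and Python regexp engines differ in details, it should be treated as an approximation and not as what Gnus actually executes.

```python
import re

# Transliteration: \` -> \A, \' -> \Z, \( \) -> (?: ), \{4,\} -> {4,}
TOP_POST_RE = re.compile(
    r"\A(?:^[^>].*\n){4,}"  # four or more leading lines not starting with '>'
    r"(?:^>.*\n){4,}"       # then four or more quoted lines
    r"[ \n]*\Z",            # then only spaces and newlines up to the end
    re.MULTILINE,
)


def looks_top_posted(body: str) -> bool:
    """True when the whole body is new text followed by a trailing quote block."""
    return TOP_POST_RE.search(body) is not None
```

With this transliteration, a body of four new lines followed by four quoted lines matches, while the same lines in the opposite order do not. A one-line reply above a long quote does not match either, because the pattern asks for at least four leading lines.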
--
Jesper Harder <http://purl.org/harder/>
* Re: Wish: "rate of new info" for scoring
2004-04-29 14:27 ` Jesper Harder
@ 2004-04-29 19:52 ` Pavel Janík
2004-05-16 11:56 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 6+ messages in thread
From: Pavel Janík @ 2004-04-29 19:52 UTC (permalink / raw)
From: Jesper Harder <harder@ifa.au.dk>
Date: Thu, 29 Apr 2004 16:27:32 +0200
> > I'd like to be able to automatically score down articles that are
> > written using the well known style of Outlook users, ie:
>
> I use a scoring rule like this for scoring down top posters:
Maybe I was not precise enough. I do not want to score down top posters. I
want to score down based on the number of quoted lines from the original mail
compared to new (i.e. not quoted) lines. Both top and bottom posters can quote
in a *bad* way:
A bad top poster will send this:
--- cut here ---
Yes
>
> ... here is the complete original mail ...
>
--- cut here ---
A bad bottom poster will send this:
--- cut here ---
>
> ... here is the complete original mail ...
>
Yes
--- cut here ---
Both are BAD and should be scored down ;-)
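Since only the counts matter, a check based on them treats both layouts the same. Here is a minimal Python sketch, assuming `>` marks a quoted line. The function name, the threshold of 3 quoted lines per new line, and the penalty of -1000 are made-up values for illustration, and nothing here is tied to the Gnus scoring code.

```python
def quote_penalty(body: str, max_ratio: float = 3.0, penalty: int = -1000) -> int:
    """Return `penalty` if quoted lines exceed max_ratio times the new lines, else 0."""
    lines = [ln.strip() for ln in body.splitlines() if ln.strip()]
    quoted = sum(1 for ln in lines if ln.startswith(">"))
    new = len(lines) - quoted
    # Multiply instead of dividing, so a body with no new lines needs no special case.
    return penalty if quoted > max_ratio * new else 0
```

A "Yes" placed above four quoted lines and a "Yes" placed below them both get the penalty, while a reply with as many new lines as quoted ones gets 0.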
--
Pavel Janík
The crawl of today is a rather rapid one by 6502 standards ;)
-- Linus Torvalds in LKML
* Re: Wish: "rate of new info" for scoring
2004-04-29 14:07 ` Wes Hardaker
@ 2004-04-29 19:53 ` Pavel Janík
0 siblings, 0 replies; 6+ messages in thread
From: Pavel Janík @ 2004-04-29 19:53 UTC (permalink / raw)
From: Wes Hardaker <wes@hardakers.net>
Date: Thu, 29 Apr 2004 07:07:40 -0700
> Hmm... thought I saw that done somewhere. If it's not in the gnus
> manual now, it must be in the procmailex manual page...
Could you be more specific? I have been looking for a solution for about two
days now and have read all the procmail man pages and a (possibly old) Gnus
manual.
--
Pavel Janík
So anybody who depends on "dump" getting backups right is already playing
russian rulette with their backups.
-- Linus Torvalds in linux-kernel
* Re: Wish: "rate of new info" for scoring
2004-04-29 19:52 ` Pavel Janík
@ 2004-05-16 11:56 ` Lars Magne Ingebrigtsen
0 siblings, 0 replies; 6+ messages in thread
From: Lars Magne Ingebrigtsen @ 2004-05-16 11:56 UTC (permalink / raw)
Pavel@Janik.cz (Pavel Janík) writes:
> Maybe I was not too exact. I do not want to score down top posters. I want
> to score down on the amount of quote lines from original mail compared to
> new (ie not quoted) lines.
Well, it's easy enough to write something like that, but it'll be
*s*l*o*w*. I mean, it'll have to look at the body of each message.
But if you think that it'll be used, I can hack it up...
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen