Gnus development mailing list
* Wish: "rate of new info" for scoring
@ 2004-04-29 12:03 Pavel Janík
  2004-04-29 14:07 ` Wes Hardaker
  2004-04-29 14:27 ` Jesper Harder
  0 siblings, 2 replies; 6+ messages in thread
From: Pavel Janík @ 2004-04-29 12:03 UTC


Hi,

I'd like to be able to automatically score down articles that are written
using the well-known style of Outlook users, i.e.:

--- cut here ---
You're right.        <<< This is new info

> ...                <<< This is the original e-mail, quoted
> ... many lines ...
> ...
--- cut here ---

To cope with this, we could:

- count lines/characters that are new (i.e. not quoted) (/n/)
- count lines/characters that are quoted (/q/)

So the "rate of new info" is q/n. We can then score on this value and, for
example, mark articles with a high q/n as not worth reading.
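
A minimal sketch of the counting described above, written in Python purely
for illustration; it is not Gnus code. The function name `quote_ratio`, the
rule that a line counts as quoted when it starts with ">", the skipping of
blank lines, and the handling of n = 0 are all assumptions, not part of the
proposal.

```python
import math


def quote_ratio(body: str) -> float:
    """Return q/n for a message body, counting non-blank lines.

    Assumptions (not from the proposal): a line is "quoted" when it
    starts with ">" after optional leading whitespace, blank lines are
    ignored, and a body with quoted lines but no new lines gets an
    infinite ratio.
    """
    quoted = 0  # q: quoted lines
    new = 0     # n: new (not quoted) lines
    for line in body.splitlines():
        if not line.strip():
            continue  # blank lines count as neither new nor quoted
        if line.lstrip().startswith(">"):
            quoted += 1
        else:
            new += 1
    if new == 0:
        return math.inf if quoted else 0.0
    return quoted / new


example = (
    "You're right.\n"
    "\n"
    "> first quoted line\n"
    "> second quoted line\n"
    "> third quoted line\n"
)
print(quote_ratio(example))  # 3 quoted lines over 1 new line
```

Note that q/n grows as the share of new text shrinks, so a scoring rule built
on it would lower the score when the value is high.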

What do you think about it?
-- 
Pavel Janík

I'm glad that Emacs is bigger and more open than some of the people who use
it.
                  -- Tony Reed in gnu.emacs.help




* Re: Wish: "rate of new info" for scoring
  2004-04-29 12:03 Wish: "rate of new info" for scoring Pavel Janík
@ 2004-04-29 14:07 ` Wes Hardaker
  2004-04-29 19:53   ` Pavel Janík
  2004-04-29 14:27 ` Jesper Harder
  1 sibling, 1 reply; 6+ messages in thread
From: Wes Hardaker @ 2004-04-29 14:07 UTC
  Cc: ding

>>>>> On Thu, 29 Apr 2004 14:03:46 +0200, Pavel@Janik.cz (Pavel Janík) said:

Pavel> To cope with this, we could:

Pavel> - count lines/characters that are new (i.e. not quoted) (/n/)
Pavel> - count lines/characters that are quoted (/q/)

Pavel> So the "rate of new info" is q/n. We can then score on this
Pavel> value and, for example, mark articles with a high q/n as not
Pavel> worth reading.

Hmm...  thought I saw that done somewhere.  If it's not in the gnus
manual now, it must be in the procmailex manual page...

-- 
"In the bathtub of history the truth is harder to hold than the soap,
 and much more difficult to find."  -- Terry Pratchett




* Re: Wish: "rate of new info" for scoring
  2004-04-29 12:03 Wish: "rate of new info" for scoring Pavel Janík
  2004-04-29 14:07 ` Wes Hardaker
@ 2004-04-29 14:27 ` Jesper Harder
  2004-04-29 19:52   ` Pavel Janík
  1 sibling, 1 reply; 6+ messages in thread
From: Jesper Harder @ 2004-04-29 14:27 UTC


Pavel@Janik.cz (Pavel Janík) writes:

> I'd like to be able to automatically score down articles that are
> written using the well-known style of Outlook users, i.e.:

I use a scoring rule like this for scoring down top posters:


 ("body"
  ("\\`\\(^[^>].*\n\\)\\{4,\\}\\(\\(^>.*\n\\)\\{4,\\}\\)[ \n]*\\'" -1000 nil r))
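
For readers who want to try the pattern outside Gnus, the following is a
rough Python translation of the regexp above, offered as a sketch only. The
names `TOP_POST` and `looks_like_top_post` are hypothetical. Emacs's \` and
\' become \A and \Z, and re.MULTILINE is needed so that ^ matches at the
start of each line. Roughly, the pattern reads: at least four leading lines
that do not start with ">", then at least four lines that do, then nothing
but spaces and newlines.

```python
import re

# Rough Python translation of the Emacs regexp in the score rule above.
TOP_POST = re.compile(
    r"\A(^[^>].*\n){4,}"  # roughly: 4+ leading lines not starting with ">"
    r"((^>.*\n){4,})"     # then 4+ lines starting with ">"
    r"[ \n]*\Z",          # then only spaces and newlines until the end
    re.MULTILINE,
)


def looks_like_top_post(body: str) -> bool:
    """Return True when the whole body has the new-text-then-quote shape."""
    return TOP_POST.search(body) is not None


top_post = (
    "Hi,\nI agree.\nThanks,\nBob wrote:\n"
    "> one\n> two\n> three\n> four\n"
)
interleaved = "> one\nYes.\n> two\nNo.\n"

print(looks_like_top_post(top_post))
print(looks_like_top_post(interleaved))
```

In this translation, at least, a one-line reply above a long quote does not
match, because the pattern asks for four or more leading non-quoted lines.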

-- 
Jesper Harder                                <http://purl.org/harder/>




* Re: Wish: "rate of new info" for scoring
  2004-04-29 14:27 ` Jesper Harder
@ 2004-04-29 19:52   ` Pavel Janík
  2004-05-16 11:56     ` Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 6+ messages in thread
From: Pavel Janík @ 2004-04-29 19:52 UTC


   From: Jesper Harder <harder@ifa.au.dk>
   Date: Thu, 29 Apr 2004 16:27:32 +0200

   > > I'd like to be able to automatically score down articles that are
   > > written using the well-known style of Outlook users, i.e.:
   > 
   > I use a scoring rule like this for scoring down top posters:

Maybe I was not exact enough. I do not want to score down top posters. I want
to score down based on the number of quoted lines from the original mail
compared to new (i.e. not quoted) lines. Both top and bottom posters can quote
in a *bad* way:

A bad top poster will send this:

--- cut here ---
Yes

>
> ... here is the complete original mail ...
>

--- cut here ---

A bad bottom poster will send this:

--- cut here ---
>
> ... here is the complete original mail ...
>

Yes
--- cut here ---

Both are BAD and should be scored down ;-)
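
Because the measure only counts lines, it does not depend on where the quote
sits. A small Python sketch (illustrative only, not Gnus code) makes that
concrete. The names `quoted_fraction` and `score_adjustment`, the 0.8
threshold, and the -1000 penalty are hypothetical choices, and the fraction
q/(q+n) is used here in place of q/n only to avoid dividing by zero.

```python
def quoted_fraction(body: str) -> float:
    """Fraction of non-blank lines that start with ">", i.e. q / (q + n)."""
    lines = [ln for ln in body.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    quoted = sum(1 for ln in lines if ln.startswith(">"))
    return quoted / len(lines)


def score_adjustment(body: str, threshold: float = 0.8, penalty: int = -1000) -> int:
    """Return the penalty when the quoted fraction reaches the threshold."""
    return penalty if quoted_fraction(body) >= threshold else 0


quote = "".join(f"> line {i} of the original mail\n" for i in range(1, 10))
top_post = "Yes\n\n" + quote
bottom_post = quote + "\nYes\n"

# Both layouts hold 9 quoted lines and 1 new line, so they are treated alike.
print(score_adjustment(top_post), score_adjustment(bottom_post))
```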
-- 
Pavel Janík

The crawl of today is a rather rapid one by 6502 standards ;)
                  -- Linus Torvalds in LKML




* Re: Wish: "rate of new info" for scoring
  2004-04-29 14:07 ` Wes Hardaker
@ 2004-04-29 19:53   ` Pavel Janík
  0 siblings, 0 replies; 6+ messages in thread
From: Pavel Janík @ 2004-04-29 19:53 UTC


   From: Wes Hardaker <wes@hardakers.net>
   Date: Thu, 29 Apr 2004 07:07:40 -0700

   > Hmm...  thought I saw that done somewhere.  If it's not in the gnus
   > manual now, it must be in the procmailex manual page...

Could you be more specific? I've been looking for a solution for about two
days now and have read all the procmail man pages and a (possibly old) Gnus
manual.
-- 
Pavel Janík

So anybody who depends on "dump" getting backups right is already playing
russian rulette with their backups.
                  -- Linus Torvalds in linux-kernel




* Re: Wish: "rate of new info" for scoring
  2004-04-29 19:52   ` Pavel Janík
@ 2004-05-16 11:56     ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 6+ messages in thread
From: Lars Magne Ingebrigtsen @ 2004-05-16 11:56 UTC


Pavel@Janik.cz (Pavel Janík) writes:

> Maybe I was not exact enough. I do not want to score down top posters. I want
> to score down based on the number of quoted lines from the original mail
> compared to new (i.e. not quoted) lines.

Well, it's easy enough to write something like that, but it'll be
*s*l*o*w*.  I mean, it'll have to look at the body of each message.

But if you think that it'll be used, I can hack it up... 

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





