Gnus development mailing list
* Scoring problem for short messages
@ 1999-02-23  2:35 François Pinard
  1999-02-26  8:17 ` Lars Magne Ingebrigtsen
From: François Pinard @ 1999-02-23  2:35 UTC (permalink / raw)



Hi, people.  Maybe someone will be kind enough to help me on this.

I receive many short messages which are created automatically on many
different systems, and filter them through a score file roughly looking like:


(
 ((and ("lines" 2 =)
       (or ;; [...]
           ("body" "rapquot" s)
           ;; [...]
           ))
  -1000)
 )


The problem is that, depending on the distribution of cosmic rays along
Earth orbit, and probably other factors as well, I get a variable amount
of spurious white lines at end of messages, and even, sometimes, before
the first line.

The ideal would be that I go to each machine, recreate the exact conditions
of the invoice, try to reproduce the problem, understand it, and repair
all occurrences.  But I really do not have the time to do that now, and
I would like some other solution in the meantime.

I could use some bigger number of lines and use an inequality, but then,
the score might be decreased if a message happens to contain `rapquot'
together with something else which then interests me.  I would like the
score to be decreased /only/ if the message contains `rapquot' and nothing
else than white lines, say.  Do you have an idea how I could manage this?

I also do not know when and how lines are counted.  Is there a way to
arrange so that count excludes prior and subsequent white lines?  I guess
this approach might also solve my little problem.

-- 
François Pinard                            mailto:pinard@iro.umontreal.ca
Join the free Translation Project!    http://www.iro.umontreal.ca/~pinard




* Re: Scoring problem for short messages
  1999-02-23  2:35 Scoring problem for short messages François Pinard
@ 1999-02-26  8:17 ` Lars Magne Ingebrigtsen
From: Lars Magne Ingebrigtsen @ 1999-02-26  8:17 UTC (permalink / raw)


François Pinard <pinard@iro.umontreal.ca> writes:

> I could use some bigger number of lines and use an inequality, but then,
> the score might be decreased if a message happens to contain `rapquot'
> together with something else which then interests me.  I would like the
> score to be decreased /only/ if the message contains `rapquot' and nothing
> else than white lines, say.  Do you have an idea how I could manage this?

You could score on "body" and, er, blankness, but that would take a
while. 

> I also do not know when and how lines are counted.  Is there a way to
> arrange so that count excludes prior and subsequent white lines?

Gnus uses the contents of the Lines header.  So for mail you could
have a function, run from `nnmail-prepare-incoming-message-hook',
remove all leading and trailing blank lines, and then the Lines header
should be correct, I think.
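
A minimal sketch of such a hook function follows.  The function name
`my-strip-outer-blank-lines' is hypothetical.  It assumes the hook
runs with the buffer narrowed to a single message, that the headers
end at the first empty line, and that the mail back end computes the
Lines header after the hook has run.  It is untested against a real
spool, so treat it as a starting point only.

```elisp
(defun my-strip-outer-blank-lines ()
  "Delete blank lines at the start and end of the message body.
Hypothetical helper: assumes the buffer is narrowed to one message
and that the headers end at the first empty line."
  (save-excursion
    (goto-char (point-min))
    ;; Point lands on the first character of the body.
    (when (search-forward "\n\n" nil t)
      (let ((body-start (point)))
        ;; Leading blank lines: skip whitespace, then back up to the
        ;; start of the first non-blank line so its indentation stays.
        (skip-chars-forward " \t\n")
        (forward-line 0)
        (delete-region body-start (point))
        ;; Trailing blank lines (and trailing whitespace on the last
        ;; line): delete them, keeping a single final newline.
        (goto-char (point-max))
        (skip-chars-backward " \t\n" body-start)
        (delete-region (point) (point-max))
        (unless (bolp)
          (insert "\n"))))))

(add-hook 'nnmail-prepare-incoming-message-hook
          #'my-strip-outer-blank-lines)
```

This only affects mail that arrives after the hook is installed.
Messages already stored keep whatever Lines header they have.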

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen


