Gnus development mailing list
* Profiling
@ 2003-12-30  3:58 Lars Magne Ingebrigtsen
  2003-12-30  5:06 ` Profiling Kevin Greiner
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Lars Magne Ingebrigtsen @ 2003-12-30  3:58 UTC


Gnus was feeling a bit more sluggish than usual, so I've spent a
couple of hours profiling and fixing obvious things.  (I've cached
some of the server-to-method calls and stuff like that.)

The main remaining suspicious function now is
`gnus-agent-possibly-alter-active', which is called for every
(covered) group.  Is that really necessary?  It takes about one
quarter of the total time spent when pressing `g'...
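A measurement like this can be reproduced with the `elp' profiler that
ships with Emacs.  The sketch below is only an illustration:
`my-slow-lookup' is a hypothetical stand-in for an expensive function,
not anything from Gnus.  For Gnus itself one would presumably
instrument the whole package with `(elp-instrument-package "gnus")'
before pressing `g'.

```elisp
(require 'elp)

(defun my-slow-lookup (n)
  "Hypothetical stand-in for an expensive function: sum 0..N-1."
  (let ((sum 0))
    (dotimes (i n)
      (setq sum (+ sum i)))
    sum))

;; Wrap the function so elp records call counts and elapsed time.
(elp-instrument-function 'my-slow-lookup)

;; Exercise the instrumented function.
(dotimes (_ 100)
  (my-slow-lookup 1000))

;; Show the per-function call count and timing table.
(elp-results)

;; Remove the instrumentation again.
(elp-restore-all)
```

The results table lists call count, total time, and average time per
function, which is how a single function's share of the total can be
read off.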
 
-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Profiling
  2003-12-30  3:58 Profiling Lars Magne Ingebrigtsen
@ 2003-12-30  5:06 ` Kevin Greiner
  2003-12-31  1:39   ` Profiling Lars Magne Ingebrigtsen
  2003-12-31  1:49 ` Profiling Michael Cook
  2004-01-01 13:51 ` Spam processing slows down group exit (was: Profiling) Reiner Steib
  2 siblings, 1 reply; 10+ messages in thread
From: Kevin Greiner @ 2003-12-30  5:06 UTC


Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Gnus was feeling a bit more sluggish than usual, so I've spent a
> couple of hours profiling and fixing obvious things.  (I've cached
> some of the server-to-method calls and stuff like that.)
>
> The main remaining suspicious function now is
> `gnus-agent-possibly-alter-active', which is called for every
> (covered) group.  Is that really necessary?  It takes about one
> quarter of the total time spent when pressing `g'...

Necessary yes.  The server for the gnus.ding group recently changed
its active range to start at 51065.  When this happened, every
agentized article below that cut-off (for me, about 30K articles)
disappeared.

The gnus-agent-possibly-alter-active function has to be called where it
is to correctly set the number of unread articles in the Group buffer.  What
can be improved is the implementation.  Right now, it opens each
group's alist file.  In my original implementation, I wanted to use
the agent's private copy of the server's active file.  That way agent
files are opened once per server rather than once per group.  I
abandoned that implementation as it required that I add code to
maintain the agent's active file as articles are added/removed from
the alist.  While it may be more complex to design, its implementation
should not be too difficult, and the performance should be much
better.  I'll look into it tomorrow.
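The "once per server rather than once per group" idea can be sketched
as a small cache keyed by server.  This is a minimal illustration
under stated assumptions, not the agent's actual code: every `my-'
name is hypothetical, and the data returned by `my-agent-load-active'
is made up to stand in for reading a file from disk.

```elisp
(defvar my-agent-active-cache (make-hash-table :test 'equal)
  "Hypothetical cache mapping a server name to an alist of
\(GROUP . (MIN . MAX)) entries.")

(defvar my-agent-load-count 0
  "Number of times the simulated per-server file was read.")

(defun my-agent-load-active (server)
  "Stand-in for reading SERVER's agent active file from disk.
The returned data is invented for illustration."
  (setq my-agent-load-count (1+ my-agent-load-count))
  (list (cons (concat server ":group.a") '(1 . 100))
        (cons (concat server ":group.b") '(50 . 75))))

(defun my-agent-group-range (server group)
  "Return GROUP's (MIN . MAX) on SERVER.
The per-server data is loaded at most once per SERVER."
  (let ((table (gethash server my-agent-active-cache 'missing)))
    (when (eq table 'missing)
      (setq table (my-agent-load-active server))
      (puthash server table my-agent-active-cache))
    (cdr (assoc group table))))

;; Two groups on the same server trigger only one load.
(my-agent-group-range "news" "news:group.a")
(my-agent-group-range "news" "news:group.b")
```

The cost noted above still applies: the cached copy has to be updated
or dropped whenever articles are added to or removed from the alist.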

On a related topic (scalability rather than performance), my primary
nntp server was recently replaced.  As part of the changeover, the
implementor set the article numbers about 6M above each group's active
range (I suspect so that they didn't have to worry about the number of
articles being posted while they performed the changeover).  The
gnus-agent-possibly-alter-active function overrode the new active
range to produce a range spanning the old and new article ranges
(about 14M).  This killed gnus with memory exhausted in
gnus-list-of-unread-articles.  I could not even run catchup as it also
calls gnus-list-of-unread-articles.  I've written two fixes so far
(but have not checked them in due to a lack of testing).

1) Added gnus-sequence-of-unread-articles to return a compressed
   sequence rather than an uncompressed list.  I then modified
   gnus-group-catchup to use it.  This means that you can catch up any
   group regardless of size (it's also a good deal faster).

2) I modified gnus-agent-possibly-alter-active to assume that all
   articles between the end of the alist and the start of the new real
   active range are read.  That's not technically true.  However,
   these articles are no longer available on the server and they are
   not stored in the agent so they are lost (hence read).  This change
   partly solved my memory exhausted problem as the number of unread
   articles is a more manageable 8M (Foolish server admins, I've never
   seen more than 100K articles due to retention limits).
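The difference between a compressed sequence and an uncompressed list
can be sketched as follows.  Both functions are hypothetical helpers
written for this illustration (they are not the checked-in fixes), and
the sketch assumes a range is a list whose elements are either a single
article number or a cons (LOW . HIGH).

```elisp
(defun my-range-length (range)
  "Count the articles in RANGE without expanding it.
RANGE is a list whose elements are either an article number or a
cons (LOW . HIGH) covering LOW through HIGH inclusive."
  (let ((total 0))
    (dolist (item range)
      (setq total (+ total
                     (if (consp item)
                         (1+ (- (cdr item) (car item)))
                       1))))
    total))

(defun my-compress-sorted (numbers)
  "Compress the sorted, duplicate-free list NUMBERS into a range list."
  (let (result low high)
    (dolist (n numbers)
      (cond ((null low) (setq low n high n))        ; first number
            ((= n (1+ high)) (setq high n))         ; extends current run
            (t (push (if (= low high) low (cons low high)) result)
               (setq low n high n))))               ; start a new run
    (when low
      (push (if (= low high) low (cons low high)) result))
    (nreverse result)))

;; A 14M-article span is one cons cell, not 14M list elements.
(my-range-length '((1 . 14000000)))
```

With this representation the memory needed depends on the number of
gaps in the sequence rather than on the number of articles.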

Any suggestions as to what else may be done?  On a long-term
perspective, should we consider making gnus-newsgroup-unreads a
sequence rather than a list?

Kevin






* Re: Profiling
  2003-12-30  5:06 ` Profiling Kevin Greiner
@ 2003-12-31  1:39   ` Lars Magne Ingebrigtsen
  2004-01-04 15:37     ` Profiling Per Abrahamsen
  0 siblings, 1 reply; 10+ messages in thread
From: Lars Magne Ingebrigtsen @ 2003-12-31  1:39 UTC


Kevin Greiner <kgreiner@xpediantsolutions.com> writes:

> While it may be more complex to design, its implementation should
> not be too difficult, and the performance should be much better.
> I'll look into it tomorrow.

Cool.

> Any suggestions as to what else may be done?  On a long-term
> perspective, should we consider making gnus-newsgroup-unreads a
> sequence rather than a list?

I think this would probably be the way to go in the longer term;
yes.  (The same goes for all the article lists -- none of them should
be de/compressed.)  However, fixing this is probably a big task.
Or perhaps not -- the places where articles are added to the lists
are actually somewhat containable...

I'd also be worried about performance, but even that is something
that I haven't looked into.

What's faster -- `(memq article gnus-newsgroup-unreads)' or (the
imaginary) `(gnus-member-of-range article gnus-newsgroup-unread-range)'?
It isn't immediately obvious which one of these would win...
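One way to find out is to measure.  The sketch below assumes a range is
a list of article numbers and (LOW . HIGH) conses; `my-member-of-range'
is a hypothetical Lisp implementation written for this comparison, and
the test data is synthetic.  No timings are quoted here because they
depend on the machine and on how fragmented the range is.

```elisp
(require 'benchmark)

(defun my-member-of-range (number range)
  "Return non-nil if NUMBER is covered by RANGE.
RANGE is a list whose elements are either an article number or a
cons (LOW . HIGH) covering LOW through HIGH inclusive."
  (let (found)
    (while (and range (not found))
      (let ((item (car range)))
        (setq found (if (consp item)
                        (and (>= number (car item))
                             (<= number (cdr item)))
                      (= number item))
              range (cdr range))))
    found))

;; Compare a near-worst-case `memq' on an expanded list with the same
;; lookup on the equivalent one-element range.
(let* ((unread-list (number-sequence 1 100000))
       (unread-range '((1 . 100000)))
       (list-time (car (benchmark-run 100
                         (memq 99999 unread-list))))
       (range-time (car (benchmark-run 100
                          (my-member-of-range 99999 unread-range)))))
  (message "memq: %.4fs  range: %.4fs" list-time range-time))
```

The comparison cuts both ways: `memq' is a C primitive walking a long
list, while the range lookup runs in Lisp over a structure whose length
depends on the number of gaps.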

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen





* Re: Profiling
  2003-12-30  3:58 Profiling Lars Magne Ingebrigtsen
  2003-12-30  5:06 ` Profiling Kevin Greiner
@ 2003-12-31  1:49 ` Michael Cook
  2004-01-03 16:56   ` Profiling Robert Marshall
  2004-01-01 13:51 ` Spam processing slows down group exit (was: Profiling) Reiner Steib
  2 siblings, 1 reply; 10+ messages in thread
From: Michael Cook @ 2003-12-31  1:49 UTC


Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> Gnus was feeling a bit more sluggish than usual, so I've spent a
> couple of hours profiling and fixing obvious things.

i find it significantly faster.  feels like a factor of 10 faster.
super!




* Spam processing slows down group exit (was: Profiling)
  2003-12-30  3:58 Profiling Lars Magne Ingebrigtsen
  2003-12-30  5:06 ` Profiling Kevin Greiner
  2003-12-31  1:49 ` Profiling Michael Cook
@ 2004-01-01 13:51 ` Reiner Steib
  2004-01-02 16:08   ` Spam processing slows down group exit Ted Zlatanov
  2 siblings, 1 reply; 10+ messages in thread
From: Reiner Steib @ 2004-01-01 13:51 UTC


On Tue, Dec 30 2003, Lars Magne Ingebrigtsen wrote:

> Gnus was feeling a bit more sluggish than usual, so I've spent a
> couple of hours profiling and fixing obvious things.

I noticed that leaving large nntp or nnml groups became kinda slow (I
didn't profile it, though).

I found that the group and topic parameters (spam/ham) are
calculated[1] for *every article* rather than once for each group
exit.  I have set up spam processing for some IMAP groups and all
Gmane groups[2], but not for nnml (and other nntp groups).

I added the following debug statement in `spam-group-ham-mark-p'...

  (gnus-message 9 "DEBUG: spam-group-ham-mark-p %s %s %s" group mark spam)

... and got the following output after leaving a group with 10
articles (entered with `10 RET'):

,----
| Exiting summary buffer and applying spam rules
| DEBUG: spam-group-ham-mark-p nnml+foo:bar 69 t [9 times]
| DEBUG: spam-group-ham-mark-p nnml+foo:bar 33 t
| DEBUG: spam-group-ham-mark-p nnml+foo:bar 69 nil [9 times]
| DEBUG: spam-group-ham-mark-p nnml+foo:bar 33 nil
| Marking spam as expired without moving it
`----

The function seems to be called from `spam-ham-copy-or-move-routine':

    (dolist (article articles)
      (when (spam-group-ham-mark-p gnus-newsgroup-name
				   (gnus-summary-article-mark article))
	(push article todo)))

Couldn't each possible mark be checked _once_ per group exit instead
of once for every article?  (==> O(1) instead of O(n), AFAIKS.)

Why is `spam-ham-copy-or-move-routine' called in the first place?  I
have not requested any spam processing in this group.

Bye, Reiner.

[1] Using C-g after toggle-debug-on-quit, I saw that Gnus was doing
    this most of the time.

[2] See also <news:v94qw3y46d.fsf@marauder.physik.uni-ulm.de>:

   ("nnimap:spam\\.detected"
    (gnus-article-sort-functions '(gnus-article-sort-by-chars))
    (ham-process-destination "nnimap:INBOX" "nnimap:training.ham")
    (spam-contents gnus-group-spam-classification-spam))
   ("nnimap:\\(INBOX\\|other-folders\\)"
    (spam-process-destination . "nnimap:training.spam")
    (spam-contents gnus-group-spam-classification-ham))

   ("^gmane\\."
    (spam-process (gnus-group-spam-exit-processor-report-gmane)))
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo--- PGP key available via WWW   http://rsteib.home.pages.de/





* Re: Spam processing slows down group exit
  2004-01-01 13:51 ` Spam processing slows down group exit (was: Profiling) Reiner Steib
@ 2004-01-02 16:08   ` Ted Zlatanov
  2004-01-02 18:40     ` Reiner Steib
  0 siblings, 1 reply; 10+ messages in thread
From: Ted Zlatanov @ 2004-01-02 16:08 UTC


On Thu, 01 Jan 2004, 4.uce.03.r.s@nurfuerspam.de wrote:

> I found that the group and topic parameters (spam/ham) are
> calculated[1] for *every article* rather than once for each group
> exit.  I have set up spam processing for some IMAP groups and all
> Gmane groups[2], but not for nnml (and other nntp groups).
[...]
> Couldn't each possible mark be checked _once_ per group exit instead
> of once for every article?  (==> O(1) instead of O(n), AFAIKS.)

There were two problems here: first, I had forgotten to use the new
spam-list-articles function, which encapsulates that check; second, I
had forgotten to optimize spam-list-articles as I should have.

OK, that's three problems, if you count my forgetfulness :)

Fixed in CVS, take a look.  I didn't bother with a global cache,
because it would be a lot of optimizing for a small gain, plus the
customization functions would need triggers to set the caches to nil,
plus if a user modified the ham-marks or spam-marks manually we're in
trouble anyhow.  Instead the caches of "yes" and "no" matches
(spam-list-articles can take a classification, so "yes" the mark
matches the 'spam or 'ham classification, or "no" it doesn't) are
rebuilt every time the function spam-list-articles is run.
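A rough sketch of the per-call "yes"/"no" caches, under loud
assumptions: this is not the code in CVS.  All `my-' names are
hypothetical, articles are modelled as (NUMBER . MARK) pairs instead of
being read from the summary buffer, and the marks treated as ham or
spam in `my-expensive-mark-p' are picked only for illustration.

```elisp
(defvar my-mark-check-count 0
  "How many times the expensive check below has run.")

(defun my-expensive-mark-p (mark classification)
  "Stand-in for a costly group/topic parameter lookup.
Return non-nil if MARK matches CLASSIFICATION (`ham' or `spam').
The mark sets used here are illustrative assumptions."
  (setq my-mark-check-count (1+ my-mark-check-count))
  (if (eq classification 'ham)
      (memq mark '(?R ?r))
    (memq mark '(?$))))

(defun my-list-articles (articles classification)
  "Return the numbers in ARTICLES whose mark matches CLASSIFICATION.
ARTICLES is a list of (NUMBER . MARK) pairs.  The expensive check
runs once per distinct mark; its answers are cached in local lists
that are rebuilt on every call."
  (let (yes no result)
    (dolist (article articles)
      (let ((mark (cdr article)))
        (cond ((memq mark yes) (push (car article) result))
              ((memq mark no))          ; known non-match, skip
              ((my-expensive-mark-p mark classification)
               (push mark yes)
               (push (car article) result))
              (t (push mark no)))))
    (nreverse result)))

;; Six articles, three distinct marks: three expensive checks.
(my-list-articles '((1 . ?E) (2 . ?E) (3 . ?R)
                    (4 . ?!) (5 . ?R) (6 . ?E))
                  'ham)
```

Because the caches are local to the call, nothing has to be invalidated
when the user changes the ham or spam marks between group exits.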

> Why is `spam-ham-copy-or-move-routine' called in the first place?  I
> have not requested any spam processing in this group.

It's called from spam-summary-prepare-exit, can you check why it's
being invoked in your case in particular?  It should only be called
here (through the spam-ham-{copy,move}-routine proxies):

    (when (spam-group-ham-processor-copy-p gnus-newsgroup-name)
      (gnus-message 5 "Copying ham")
      (spam-ham-copy-routine
       (gnus-parameter-ham-process-destination gnus-newsgroup-name)))

    ;; now move all ham articles out of spam groups
    (when (spam-group-spam-contents-p gnus-newsgroup-name)
      (gnus-message 5 "Moving ham messages from spam group")
      (spam-ham-move-routine
       (gnus-parameter-ham-process-destination gnus-newsgroup-name))))

Thanks
Ted




* Re: Spam processing slows down group exit
  2004-01-02 16:08   ` Spam processing slows down group exit Ted Zlatanov
@ 2004-01-02 18:40     ` Reiner Steib
  0 siblings, 0 replies; 10+ messages in thread
From: Reiner Steib @ 2004-01-02 18:40 UTC


On Fri, Jan 02 2004, Ted Zlatanov wrote:

> On Thu, 01 Jan 2004, 4.uce.03.r.s@nurfuerspam.de wrote:
>
>> I found that the group and topic parameters (spam/ham) are
>> calculated[1] for *every article* rather than once for each group
>> exit.
[...]
> Fixed in CVS, take a look.

Thanks.  It's fine now.

> I didn't bother with a global cache, because it would be a lot of
> optimizing for a small gain,

Yes, I agree.

>> Why is `spam-ham-copy-or-move-routine' called in the first place?  I
>> have not requested any spam processing in this group.
>
> It's called from spam-summary-prepare-exit, can you check why it's
> being invoked in your case in particular?  It should only be called
> here (through the spam-ham-{copy,move}-routine proxies):

It seems it isn't called anymore with your change.

Bye, Reiner.
-- 
       ,,,
      (o o)
---ooO-(_)-Ooo--- PGP key available via WWW   http://rsteib.home.pages.de/





* Re: Profiling
  2003-12-31  1:49 ` Profiling Michael Cook
@ 2004-01-03 16:56   ` Robert Marshall
  0 siblings, 0 replies; 10+ messages in thread
From: Robert Marshall @ 2004-01-03 16:56 UTC


On Tue, 30 Dec 2003, Michael Cook wrote:

> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> 
>> Gnus was feeling a bit more sluggish than usual, so I've spent a
>> couple of hours profiling and fixing obvious things.
> 
> i find it significantly faster.  feels like a factor of 10 faster.
> super!

I'd agree with that - I hardly see any delay when doing a get-new-news.

It also fixes a problem that I thought was something to do with my
setup - gnus waited until ppp was up before starting (I use dial on
demand) in spite of my only using localhost as a newsserver - many thanks!

Robert
-- 
La grenouille songe..dans son château d'eau




* Re: Profiling
  2003-12-31  1:39   ` Profiling Lars Magne Ingebrigtsen
@ 2004-01-04 15:37     ` Per Abrahamsen
  2004-01-04 20:41       ` Profiling Lars Magne Ingebrigtsen
  0 siblings, 1 reply; 10+ messages in thread
From: Per Abrahamsen @ 2004-01-04 15:37 UTC


Lars Magne Ingebrigtsen <larsi@gnus.org> writes:

> What's faster -- `(memq article gnus-newsgroup-unreads)' or (the
> imaginary) `(gnus-member-of-range article gnus-newsgroup-unread-range)'?
> It isn't immediately obvious which one of these would win...

I'm not sure it is relevant here, but rewriting some of the simplest
and most commonly used functions in C may be an option.  Customize
first became usably fast when Hrvoje reimplemented widget-get in C.

Basically, calling built-in functions is way faster than calling Emacs
Lisp functions.

I'm pretty sure RMS would accept including a C implementation of
gnus-member-of-range if it gave a significant performance boost.

Of course, you need to be sure the interface is not going to change
more, before doing that.





* Re: Profiling
  2004-01-04 15:37     ` Profiling Per Abrahamsen
@ 2004-01-04 20:41       ` Lars Magne Ingebrigtsen
  0 siblings, 0 replies; 10+ messages in thread
From: Lars Magne Ingebrigtsen @ 2004-01-04 20:41 UTC


Per Abrahamsen <abraham@dina.kvl.dk> writes:

> I'm pretty sure RMS would accept including a C implementation of
> gnus-member-of-range if it gave a significant performance boost.

Yes, that's true.  So going from lists to ranges will probably be a
performance gain.  (A C version of `member-of-range' should be much
faster than using `memq' on the (much bigger) lists we use now.)
In addition, we wouldn't have to convert between lists and ranges all
the time, so there's another speed and space saving...

-- 
(domestic pets only, the antidote for overdose, milk.)
  larsi@gnus.org * Lars Magne Ingebrigtsen




