* Re: db-backed mail back end
2002-01-23 0:36 ` Lars Magne Ingebrigtsen
@ 2002-01-23 0:49 ` Henrik Enberg
2002-01-23 1:36 ` Jorge Godoy
2002-01-23 1:34 ` Jorge Godoy
` (7 subsequent siblings)
8 siblings, 1 reply; 102+ messages in thread
From: Henrik Enberg @ 2002-01-23 0:49 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> If we were to design a mail back end that's supposed to scale to
> several hundred thousands of messages in thousands of groups -- how
> would we do that? Perhaps somebody has pondered this question
> before. :-)
Virtual subgroups. Suppose I have foo with 10000 messages in it then I
could create the virtual groups foo1 with messages 0 - 1000, foo2 with
1001 - 2000 and so on (with ^ being able to find parents in other
subgroups).
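
A minimal sketch of that numbering, in Python purely for illustration (this is not Gnus code; the function name and the `size` parameter are assumptions). It follows the ranges given above, so foo1 holds 0 - 1000 and foo2 holds 1001 - 2000:

```python
def virtual_subgroup(group: str, article: int, size: int = 1000) -> str:
    """Return the name of the virtual subgroup that holds `article`."""
    if article < 0:
        raise ValueError("article numbers are non-negative")
    # Articles 0..size land in subgroup 1, size+1..2*size in subgroup 2, etc.
    index = max(0, (article - 1) // size) + 1
    return f"{group}{index}"
```

With the defaults, `virtual_subgroup("foo", 1001)` gives "foo2".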
Henrik
--
There's not going to be enough people in the system to take advantage
of people like me.
-- George W. Bush
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-23 0:36 ` Lars Magne Ingebrigtsen
2002-01-23 0:49 ` Henrik Enberg
@ 2002-01-23 1:34 ` Jorge Godoy
2002-01-23 2:39 ` John S. J. Anderson
` (6 subsequent siblings)
8 siblings, 0 replies; 102+ messages in thread
From: Jorge Godoy @ 2002-01-23 1:34 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> If we were to design a mail back end that's supposed to scale to
> several hundred thousands of messages in thousands of groups -- how
> would we do that? Perhaps somebody has pondered this question
> before. :-)
The biggest problem to me seems to be disk access and memory usage
(this is in fact an Emacs problem and it's not restricted to
Gnus...). I'd go for some optimization in the way information is stored
and retrieved, trying to minimize I/O in big groups (I feel the
performance problem with groups in the 40K+ messages range...).
Storing the marks on a separate file and using indices for messages
was indeed a good idea. We should try separating all the information
of a group into its own subdirectory (I use nnml).
Implementing Maildir support would also be interesting, since it would
prevent Gnus from copying all the messages to a temporary file before
splitting them.
--
Godoy. <godoy@conectiva.com>
Escritório de Projetos -- Conectiva S.A.
Projects Office -- Conectiva Inc.
* Re: db-backed mail back end
2002-01-23 0:36 ` Lars Magne Ingebrigtsen
2002-01-23 0:49 ` Henrik Enberg
2002-01-23 1:34 ` Jorge Godoy
@ 2002-01-23 2:39 ` John S. J. Anderson
2002-01-24 1:18 ` Lars Magne Ingebrigtsen
2002-01-23 3:46 ` Daniel Pittman
` (5 subsequent siblings)
8 siblings, 1 reply; 102+ messages in thread
From: John S. J. Anderson @ 2002-01-23 2:39 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> If we were to design a mail back end that's supposed to scale to
> several hundred thousands of messages in thousands of groups -- how
> would we do that? Perhaps somebody has pondered this question
> before. :-)
I offer a semi-serious answer to your joking question, not because I
think it needs one, but because I'd really like to see this happen and
I don't have the skills to do it myself[1]. 8^)=
There are a couple of emailers out there that exist basically as thin
layers over a DB backend, so it might be informative to pull them
apart and see how they work. Pronto is one, I think.
I would contend (after thinking about it for, oh, 30 seconds) that the
Right Way(tm) would be to have one large table for the mails (with
obvious things normalized out to lookup tables, of course), with a
field or fields allowing arbitrary groupings ("Mailing List:", etc.)
Then "groups" as they currently exist in Gnus become views, searches
become much easier to do, etc.
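
A rough sketch of the "one big table, groups as views" idea, using Python's sqlite3 module as a stand-in database. Everything here (table, column and view names, the "ding" list value) is an assumption made up for illustration, not an existing Gnus schema:

```python
import sqlite3

SCHEMA = """
CREATE TABLE senders (id INTEGER PRIMARY KEY, address TEXT UNIQUE);
CREATE TABLE mails (
    id INTEGER PRIMARY KEY,
    sender_id INTEGER REFERENCES senders(id),  -- normalized out to a lookup table
    mailing_list TEXT,                         -- arbitrary grouping field
    subject TEXT,
    body TEXT
);
-- A "group" is just a view over the single mails table.
CREATE VIEW group_ding AS
    SELECT m.id, s.address, m.subject
    FROM mails m JOIN senders s ON s.id = m.sender_id
    WHERE m.mailing_list = 'ding';
"""

def build_store() -> sqlite3.Connection:
    """Create an in-memory store with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def add_mail(conn, sender, mailing_list, subject, body) -> int:
    """Insert one mail, creating the sender row if needed; return the mail id."""
    conn.execute("INSERT OR IGNORE INTO senders(address) VALUES (?)", (sender,))
    sender_id = conn.execute(
        "SELECT id FROM senders WHERE address = ?", (sender,)
    ).fetchone()[0]
    cur = conn.execute(
        "INSERT INTO mails(sender_id, mailing_list, subject, body) "
        "VALUES (?, ?, ?, ?)",
        (sender_id, mailing_list, subject, body),
    )
    return cur.lastrowid
```

Entering the group then amounts to `SELECT * FROM group_ding`, and a search is one more WHERE clause on the same table.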
Apologies if this is all so obvious as to not be worth mentioning...
Oh, and because this seems like a good time to mention it: something
like <http://www.mozilla.org/blue-sky/misc/199805/intertwingle.html>
on a Gnus skeleton would be _*awesome*_.
john.
Footnotes:
[1] And I could acquire them, true, but I'd like it sooner than
that. So...
--
It takes an uncommon mind to think of these things.
--- Calvin
* Re: db-backed mail back end
2002-01-23 2:39 ` John S. J. Anderson
@ 2002-01-24 1:18 ` Lars Magne Ingebrigtsen
2002-01-25 14:37 ` Randal L. Schwartz
0 siblings, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-24 1:18 UTC (permalink / raw)
jacobs@genehack.org (John S. J. Anderson) writes:
>> If we were to design a mail back end that's supposed to scale to
>> several hundred thousands of messages in thousands of groups -- how
>> would we do that? Perhaps somebody has pondered this question
>> before. :-)
>
> I offer a semi-serious answer to your joking question, not because I
> think it needs one, but because I'd really like to see this happen and
> I don't have the skills to do it myself[1]. 8^)=
I'm serious as mud. :-)
> Oh, and because this seems like a good time to mention it: something
> like <http://www.mozilla.org/blue-sky/misc/199805/intertwingle.html>
> on a Gnus skeleton would be _*awesome*_.
That's quite interesting...
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: db-backed mail back end
2002-01-24 1:18 ` Lars Magne Ingebrigtsen
@ 2002-01-25 14:37 ` Randal L. Schwartz
0 siblings, 0 replies; 102+ messages in thread
From: Randal L. Schwartz @ 2002-01-25 14:37 UTC (permalink / raw)
>>>>> "Lars" == Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
Lars> jacobs@genehack.org (John S. J. Anderson) writes:
>>> If we were to design a mail back end that's supposed to scale to
>>> several hundred thousands of messages in thousands of groups -- how
>>> would we do that? Perhaps somebody has pondered this question
>>> before. :-)
>>
>> I offer a semi-serious answer to your joking question, not because I
>> think it needs one, but because I'd really like to see this happen and
>> I don't have the skills to do it myself[1]. 8^)=
Lars> I'm serious as mud. :-)
Heh. I was just about to unmark my !-marked article from this list
from Nov 1999 about this very subject. Irony of that. Maybe it'll
finally happen.
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
* Re: db-backed mail back end
2002-01-23 0:36 ` Lars Magne Ingebrigtsen
` (2 preceding siblings ...)
2002-01-23 2:39 ` John S. J. Anderson
@ 2002-01-23 3:46 ` Daniel Pittman
2002-01-24 0:51 ` Russ Allbery
2002-01-23 8:52 ` Simon Josefsson
` (4 subsequent siblings)
8 siblings, 1 reply; 102+ messages in thread
From: Daniel Pittman @ 2002-01-23 3:46 UTC (permalink / raw)
On Wed, 23 Jan 2002, Lars Magne Ingebrigtsen wrote:
> Daniel Pittman <daniel@rimspace.net> writes:
>
>> Gain? For me, the biggest pain with Gnus right now is that there are
>> some scalability issues. I got sick of losing information that had
>> been on mailing lists, so I stopped expiring mail a couple of years
>> ago.
>
> A db doesn't necessarily give you any performance gain.
No, it doesn't.
> If you're not careful with how you set things up, you can get
> arbitrarily awful db performance (and I should know, since I've been
> Oracling a lot these past years. :-))
*grin* We recently moved from storing our database information, for the
product I work on, from badly designed XML files (ick) to badly designed
SQL databases, resulting in a net performance loss... Heh.
Ahem. Anyway, adding a database is no assurance of better performance,
no. I suspect that you might gain more from using GDBM to hash some of
the overview and marks stuff than using SQL for data storage.
Also, it might be worth hashing the actual article storage in an NNML
group, from one directory to one-plus-n-by-m directories. :)
...but I can't say that I know any of this will work because I am still
too lazy to build the prototype, performance test it, then build the
real thing.
>> Disk space is sufficiently cheap, even for a laptop, that I don't
>> have space issues with this. Gnus starts to feel a little pain,
>> sometimes, though when dealing with groups in the > 75,000 messages
>> range (with NNML.)
>
> Perhaps the question should be restated as such:
>
> If we were to design a mail back end that's supposed to scale to
> several hundred thousands of messages in thousands of groups -- how
> would we do that? Perhaps somebody has pondered this question
> before. :-)
That's a better question, I think. I never resolved it to my
satisfaction.
Daniel
--
I am, as I said, inspired by the biological phenomena in which chemical
forces are used in repetitious fashion to produce all kinds of weird
effects (one of which is the author).
-- Richard Feynman, _There's Plenty of Room at the Bottom_
* Re: db-backed mail back end
2002-01-23 0:36 ` Lars Magne Ingebrigtsen
` (3 preceding siblings ...)
2002-01-23 3:46 ` Daniel Pittman
@ 2002-01-23 8:52 ` Simon Josefsson
2002-01-26 20:55 ` Steinar Bang
2002-01-23 11:45 ` Per Abrahamsen
` (3 subsequent siblings)
8 siblings, 1 reply; 102+ messages in thread
From: Simon Josefsson @ 2002-01-23 8:52 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Daniel Pittman <daniel@rimspace.net> writes:
>
>> Gain? For me, the biggest pain with Gnus right now is that there are
>> some scalability issues. I got sick of losing information that had been
>> on mailing lists, so I stopped expiring mail a couple of years ago.
>
> A db doesn't necessarily give you any performance gain. If you're not
> careful with how you set things up, you can get arbitrarily awful db
> performance (and I should know, since I've been Oracling a lot these
> past years. :-))
I agree -- a db backend will most likely be slower than nnml. NNML is
a database specialized for its purpose, a generic database will not be
faster.
...unless the backend interface design changes in the process, which
would make some speed improvements possible.
> If we were to design a mail back end that's supposed to scale to
> several hundred thousands of messages in thousands of groups -- how
> would we do that? Perhaps somebody has pondered this question
> before. :-)
Don't compute things that you don't need. Entering a group should be
a call to `(switch-to-buffer (generate-new-buffer (format "*Summary
%s*" group)))'. Populating the view should be incremental and
asynchronous. Threading should optionally be pushed down into the
backend (IMAP supports server-side threading, a database backend could
as well), with utility threading functionality in Gnus. Etc.
Same goes for the group buffer, btw. It should be displayed
immediately, and updating status on groups should happen
asynchronously.
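
One way to picture the "incremental" part, sketched in Python rather than Lisp: hand the summary lines to the display code in small batches instead of all at once. The function name and the batch size are assumptions; how each batch gets scheduled (timer, idle hook, process filter) is left open:

```python
from typing import Iterable, Iterator, List

def summary_batches(headers: Iterable[str], batch_size: int = 200) -> Iterator[List[str]]:
    """Yield header lines in batches so the view can be filled bit by bit."""
    batch: List[str] = []
    for line in headers:
        batch.append(line)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Hand over whatever is left at the end.
        yield batch
```

The buffer can be shown as soon as it exists; each batch is inserted when it arrives, and nothing is computed for lines that are never requested.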
* Re: db-backed mail back end
2002-01-23 8:52 ` Simon Josefsson
@ 2002-01-26 20:55 ` Steinar Bang
0 siblings, 0 replies; 102+ messages in thread
From: Steinar Bang @ 2002-01-26 20:55 UTC (permalink / raw)
>>>>> Simon Josefsson <jas@extundo.com>:
> Don't compute things that you don't need. Entering a group should
> be a call to `(switch-to-buffer (generate-new-buffer (format
> "*Summary %s*" group)))'. Populating the view should be incremental
> and asynchronous. Threading should optionally be pushed down into
> the backend (IMAP supports server-side threading, a database backend
> could as well), with utility threading functionality in Gnus. Etc.
You don't really need threading for this in Gnus/Emacs. You just need
something with the conceptual model (_not_ the actual code), of W3C
libwww.
<http://www.w3.org/Library/>
A short explanation of this model:
- you create request objects for the URLs you wish to access
and give them to the libwww. These request objects provide a
context for the entire transaction
- in the libwww the requests are put in a queue
- the queue is handled by a timer: requests are picked up, and turned
into actual network requests
- at some point in time a request causes a response, ie. data start
arriving on the wire. To handle this, a "stream" object for
handling the response is created, and libwww stuffs the arriving
data down this stream
- when the request completes, the request object is created
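
The model above can be sketched roughly as follows. This is Python for illustration, not the libwww API: the class names, the `fetch` callable and the `tick` method are all assumptions, and for brevity a whole response is delivered within one tick instead of as data trickles in:

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Iterable

@dataclass
class Request:
    """Context for one transaction."""
    url: str
    on_data: Callable[[str], None]  # plays the role of the "stream" object
    on_done: Callable[[], None]     # called when the request completes

class RequestQueue:
    def __init__(self, fetch: Callable[[str], Iterable[str]]) -> None:
        self._pending: Deque[Request] = deque()
        self._fetch = fetch  # hypothetical function returning response chunks

    def submit(self, request: Request) -> None:
        """Queue a request; nothing happens until the next tick."""
        self._pending.append(request)

    def tick(self) -> bool:
        """Handle one queued request; meant to be driven by a timer."""
        if not self._pending:
            return False
        request = self._pending.popleft()
        for chunk in self._fetch(request.url):
            request.on_data(chunk)  # stuff arriving data down the stream
        request.on_done()
        return True
```

An event loop (or an Emacs timer) would call `tick` repeatedly, so everything runs in a single thread.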
I've used libwww in a GUI-application, and the GUI event loop was the
"engine" for the timer, and was handling input on open network
connections.
It all worked like a multithreaded application, but everything was
happening inside a single thread.
> Same goes for the group buffer, btw. It should be displayed
> immediately, and updating status on groups should happen
> asynchronously.
Yup.
* Re: db-backed mail back end
2002-01-23 0:36 ` Lars Magne Ingebrigtsen
` (4 preceding siblings ...)
2002-01-23 8:52 ` Simon Josefsson
@ 2002-01-23 11:45 ` Per Abrahamsen
2002-01-23 14:11 ` Kai Großjohann
2002-01-23 14:26 ` Mark Milhollan
` (2 subsequent siblings)
8 siblings, 1 reply; 102+ messages in thread
From: Per Abrahamsen @ 2002-01-23 11:45 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> If we were to design a mail back end that's supposed to scale to
> several hundred thousands of messages in thousands of groups -- how
> would we do that? Perhaps somebody has pondered this question
> before. :-)
I'm pretty sure both Google and DejaNews before them have pondered the
question carefully.
Seriously, I suspect the core functionality would be the plain text
searches, with the more structured queries being second.
* Re: db-backed mail back end
2002-01-23 0:36 ` Lars Magne Ingebrigtsen
` (5 preceding siblings ...)
2002-01-23 11:45 ` Per Abrahamsen
@ 2002-01-23 14:26 ` Mark Milhollan
2002-01-24 1:19 ` Lars Magne Ingebrigtsen
2002-01-23 17:50 ` Paul Jarc
2002-01-24 0:50 ` Russ Allbery
8 siblings, 1 reply; 102+ messages in thread
From: Mark Milhollan @ 2002-01-23 14:26 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
>Perhaps the question should be restated as such:
>
>If we were to design a mail back end that's supposed to scale to
>several hundred thousands of messages in thousands of groups -- how
>would we do that? Perhaps somebody has pondered this question
>before. :-)
(This smells like bait, but I'll rise anyway.)
Something already exists -- a news server. Tens of thousands of groups:
check. Millions (or even billions) of messages: check. Existing Gnus
back-end: check. Searching and indexing facilities: limited in most
servers, though generally easy to extend externally.
/mark
* Re: db-backed mail back end
2002-01-23 14:26 ` Mark Milhollan
@ 2002-01-24 1:19 ` Lars Magne Ingebrigtsen
0 siblings, 0 replies; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-24 1:19 UTC (permalink / raw)
Mark Milhollan <mlm@mlm.ath.cx> writes:
> Something already exists -- a news server. Tens of thousands of groups:
> check. Millions (or even billions) of messages: check. Existing Gnus
> back-end: check. Searching and indexing facilities: limited in most
> servers, though generally easy to extend externally.
A news server is a wonderful thing, but putting your mail into it is
painful.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: db-backed mail back end
2002-01-23 0:36 ` Lars Magne Ingebrigtsen
` (6 preceding siblings ...)
2002-01-23 14:26 ` Mark Milhollan
@ 2002-01-23 17:50 ` Paul Jarc
2002-01-24 0:50 ` Russ Allbery
8 siblings, 0 replies; 102+ messages in thread
From: Paul Jarc @ 2002-01-23 17:50 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
> Perhaps the question should be restated as such:
>
> If we were to design a mail back end that's supposed to scale to
> several hundred thousands of messages in thousands of groups -- how
> would we do that? Perhaps somebody has pondered this question
> before. :-)
Unless the mail backend will directly access a raw disk device, I
think that's still not quite the right question. The backend will use
some particular software interface for storage - maybe the filesystem,
maybe SQL, whatever. Making mail fast means making the implementation
behind that storage interface fast. This is not a Gnus problem, or at
least not entirely a Gnus problem. Just because Gnus seems slow, it
doesn't mean it's Gnus's fault or that Gnus is the right place to try
to fix it. If *your* filesystem is slow, *you* should switch to a
different filesystem. (E.g., ReiserFS is designed to deal with large
directories. Maybe XFS or JFS would also do well; I dunno.)
Gnus has a choice of interfaces, but not of implementations behind
them. The best we can do is to choose an interface that *allows* an
implementation meeting certain performance criteria, or possibly to
choose an interface where there are known to be widely-available
implementations meeting certain criteria.
paul
* Re: db-backed mail back end
2002-01-23 0:36 ` Lars Magne Ingebrigtsen
` (7 preceding siblings ...)
2002-01-23 17:50 ` Paul Jarc
@ 2002-01-24 0:50 ` Russ Allbery
2002-01-24 1:24 ` Lars Magne Ingebrigtsen
2002-01-24 9:14 ` db-backed mail back end Sean Neakums
8 siblings, 2 replies; 102+ messages in thread
From: Russ Allbery @ 2002-01-24 0:50 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> If we were to design a mail back end that's supposed to scale to several
> hundred thousands of messages in thousands of groups -- how would we do
> that? Perhaps somebody has pondered this question before. :-)
Don't store all the articles in one directory; the file system
implementation will kill you.
Preindex the stuff that you usually want to search for. .overview is sort
of a basic step in that direction, but for truly vast archives you need
something like Glimpse or htdig or whatnot. You probably also don't want
just one flat index; you want one index on sender, maybe an index on
recipient, a text index of the message, maybe an index on subject
depending on how you use subject, etc. And you want to limit searches by
a date range.
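
A toy illustration of "several indexes plus a date limit", in Python. It keeps everything in memory and splits text on whitespace, so it is nowhere near Glimpse or htdig; the class and method names are assumptions:

```python
from collections import defaultdict
from datetime import date
from typing import Dict, List, Optional, Set

class MailIndex:
    def __init__(self) -> None:
        self._by_sender: Dict[str, Set[int]] = defaultdict(set)  # sender index
        self._by_word: Dict[str, Set[int]] = defaultdict(set)    # text index
        self._dates: Dict[int, date] = {}                        # message dates

    def add(self, msg_id: int, sender: str, body: str, sent: date) -> None:
        """Index one message at delivery time."""
        self._by_sender[sender.lower()].add(msg_id)
        for word in body.lower().split():
            self._by_word[word].add(msg_id)
        self._dates[msg_id] = sent

    def search(self, sender: Optional[str] = None, word: Optional[str] = None,
               start: Optional[date] = None, end: Optional[date] = None) -> List[int]:
        """Intersect the requested indexes, then limit by date range."""
        hits = set(self._dates)
        if sender is not None:
            hits &= self._by_sender.get(sender.lower(), set())
        if word is not None:
            hits &= self._by_word.get(word.lower(), set())
        return sorted(
            m for m in hits
            if (start is None or self._dates[m] >= start)
            and (end is None or self._dates[m] <= end)
        )
```

The point is only that each query touches the small index it needs, never the message files themselves.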
--
Russ Allbery (rra@stanford.edu) <http://www.eyrie.org/~eagle/>
* Re: db-backed mail back end
2002-01-24 0:50 ` Russ Allbery
@ 2002-01-24 1:24 ` Lars Magne Ingebrigtsen
2002-01-24 1:49 ` Daniel Pittman
2002-01-24 17:11 ` Paul Jarc
2002-01-24 9:14 ` db-backed mail back end Sean Neakums
1 sibling, 2 replies; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-24 1:24 UTC (permalink / raw)
Russ Allbery <rra@stanford.edu> writes:
> Don't store all the articles in one directory; the file system
> implementation will kill you.
Yup. The obvious solution to that would be using some sort of
hierarchy -- using (say) pairs of digits of the article number as
directory names. Or something.
"d45/d34/d43/4" for the message "4534434".
Or perhaps the other way around:
"45/34/43/a4" for the message "4534434".
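
Both layouts can be produced by a few lines like these (Python for illustration; the function name and the `prefix_dirs` flag are assumptions). Presumably the "d" or "a" prefix keeps a directory and an article with the same digits from colliding:

```python
def hashed_path(article: int, prefix_dirs: bool = True) -> str:
    """Split the article number into pairs of digits and build a path."""
    digits = str(article)
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    *dirs, leaf = pairs  # the last pair (or lone digit) names the file
    if prefix_dirs:
        # First layout: directories are marked with "d".
        return "/".join([f"d{d}" for d in dirs] + [leaf])
    # Second layout: the article file is marked with "a".
    return "/".join(dirs + [f"a{leaf}"])
```

For 4534434 this yields "d45/d34/d43/4" and, with `prefix_dirs=False`, "45/34/43/a4".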
> Preindex the stuff that you usually want to search for. .overview
> is sort of a basic step in that direction, but for truly vast
> archives you need something like Glimpse or htdig or whatnot.
Speaking of .overview files -- putting a 20kline .overview file into
a buffer and then looking through that buffer for the 2 lines we want
is s-l-o-w. I see two obvious solutions.
1) Split the .overview file into an .overview directory with several
1kline .overview files in it
2) Implement the basic .overview "find-the-right-line" thing in C.
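
For what it's worth, the "find-the-right-line" step does not have to scan every line. A sketch in Python, under two assumptions: each overview line starts with the article number followed by a tab, and the lines are sorted by article number. The function name is made up:

```python
from typing import List, Optional

def find_overview_line(lines: List[str], article: int) -> Optional[str]:
    """Binary-search sorted overview lines for the given article number."""
    lo, hi = 0, len(lines)
    while lo < hi:
        mid = (lo + hi) // 2
        # Only the probed line is parsed, not the whole file.
        number = int(lines[mid].split("\t", 1)[0])
        if number < article:
            lo = mid + 1
        elif number > article:
            hi = mid
        else:
            return lines[mid]
    return None
```

On a 20kline file that is about 15 probes per lookup instead of a linear search.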
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: db-backed mail back end
2002-01-24 1:24 ` Lars Magne Ingebrigtsen
@ 2002-01-24 1:49 ` Daniel Pittman
2002-01-24 8:36 ` Lars Magne Ingebrigtsen
2002-01-24 17:11 ` Paul Jarc
1 sibling, 1 reply; 102+ messages in thread
From: Daniel Pittman @ 2002-01-24 1:49 UTC (permalink / raw)
On Thu, 24 Jan 2002, Lars Magne Ingebrigtsen wrote:
> Russ Allbery <rra@stanford.edu> writes:
[...]
>> Preindex the stuff that you usually want to search for. .overview
>> is sort of a basic step in that direction, but for truly vast
>> archives you need something like Glimpse or htdig or whatnot.
>
> Speaking of .overview files -- putting a 20kline .overview file into
> a buffer and then looking through that buffer for the 2 lines we want
> is s-l-o-w. I see two obvious solutions.
>
> 1) Split the .overview file into an .overview directory with several
> 1kline .overview files in it
> 2) Implement the basic .overview "find-the-right-line" thing in C.
3) Implement .overview.db, which allows you to find the overview line,
in Lisp `read' format, given the article number. Or whatever.
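
A sketch of what such an .overview.db could look like, using Python's standard dbm module as a stand-in for whatever keyed store Emacs would use. The function names are assumptions, and the value stored here is the raw overview line, not a Lisp form:

```python
import dbm
from typing import Dict, Optional

def store_overview(db_path: str, entries: Dict[int, str]) -> None:
    """Write overview lines keyed by article number."""
    with dbm.open(db_path, "c") as db:  # "c": create the file if missing
        for number, line in entries.items():
            db[str(number).encode("ascii")] = line.encode("utf-8")

def lookup_overview(db_path: str, number: int) -> Optional[str]:
    """Fetch one overview line without reading the others."""
    with dbm.open(db_path, "r") as db:
        value = db.get(str(number).encode("ascii"))
    return value.decode("utf-8") if value is not None else None
```

A lookup is then a single keyed fetch, whatever the size of the group.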
Daniel
--
One day the white men arrived in ships with wings, which shone in the sun like
knives. They fought hard battles with the Ngola and spat fire at him. They
conquered his salt-pans and the Ngola fled inland to the Lukala river..
-- Pende oral tradition
* Re: db-backed mail back end
2002-01-24 1:49 ` Daniel Pittman
@ 2002-01-24 8:36 ` Lars Magne Ingebrigtsen
2002-01-24 8:58 ` Kai Großjohann
0 siblings, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-24 8:36 UTC (permalink / raw)
Daniel Pittman <daniel@rimspace.net> writes:
> 3) Implement .overview.db, which allows you to find the overview line,
> in Lisp `read' format, given the article number. Or whatever.
Or perhaps stick the overview data into a Berkeley DB? I haven't
actually played with bdb (or any of its brethren) at all -- would
asking a bdb thing for overview files (indexed on article numbers) be
very fast? And does Emacs have native bdb support?
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: db-backed mail back end
2002-01-24 8:36 ` Lars Magne Ingebrigtsen
@ 2002-01-24 8:58 ` Kai Großjohann
2002-01-24 9:28 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 102+ messages in thread
From: Kai Großjohann @ 2002-01-24 8:58 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Daniel Pittman <daniel@rimspace.net> writes:
>
>> 3) Implement .overview.db, which allows you to find the overview line,
>> in Lisp `read' format, given the article number. Or whatever.
>
> Or perhaps stick the overview data into a Berkeley DB?
I think that NNDB and NNML (not nnml.el) do this. These are Perl
packages. Hm. Ah, googling finds NNDB at least.
kai
--
Simplification good! Oversimplification bad! (Larry Wall)
* Re: db-backed mail back end
2002-01-24 8:58 ` Kai Großjohann
@ 2002-01-24 9:28 ` Lars Magne Ingebrigtsen
2002-01-24 10:37 ` Kai Großjohann
0 siblings, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-24 9:28 UTC (permalink / raw)
Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:
> I think that NNDB and NNML (not nnml.el) do this. These are Perl
> packages. Hm. Ah, googling finds NNDB at least.
There is a backend called nndb.el... Does that have anything to do
with anything? :-)
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: db-backed mail back end
2002-01-24 9:28 ` Lars Magne Ingebrigtsen
@ 2002-01-24 10:37 ` Kai Großjohann
0 siblings, 0 replies; 102+ messages in thread
From: Kai Großjohann @ 2002-01-24 10:37 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:
>
>> I think that NNDB and NNML (not nnml.el) do this. These are Perl
>> packages. Hm. Ah, googling finds NNDB at least.
>
> There is a backend called nndb.el... Does that have anything to do
> with anything? :-)
It's for the NNDB Perl program.
kai
--
Simplification good! Oversimplification bad! (Larry Wall)
* Re: db-backed mail back end
2002-01-24 1:24 ` Lars Magne Ingebrigtsen
2002-01-24 1:49 ` Daniel Pittman
@ 2002-01-24 17:11 ` Paul Jarc
2002-01-24 17:58 ` nnmaildir (was: db-backed mail back end) Josh Huber
1 sibling, 1 reply; 102+ messages in thread
From: Paul Jarc @ 2002-01-24 17:11 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> wrote:
> Speaking of .overview files -- putting a 20kline .overview file into
> a buffer and then looking through that buffer for the 2 lines we want
> is s-l-o-w.
nnmaildir stores NOV information in one file per message. Finding an
article's NOV information is as fast as finding the article itself.
paul
* nnmaildir (was: db-backed mail back end)
2002-01-24 17:11 ` Paul Jarc
@ 2002-01-24 17:58 ` Josh Huber
2002-01-24 18:14 ` Harry Putnam
2002-01-24 18:39 ` nnmaildir Paul Jarc
0 siblings, 2 replies; 102+ messages in thread
From: Josh Huber @ 2002-01-24 17:58 UTC (permalink / raw)
prj@po.cwru.edu (Paul Jarc) writes:
> nnmaildir stores NOV information in one file per message. Finding
> an article's NOV information is as fast as finding the article
> itself.
Indeed, I've been meaning to try nnmaildir out. Would it be very
painful to migrate from nnml to maildir? Perhaps I'll try it for just
a couple groups to start with...
ttyl,
--
Josh Huber
* Re: nnmaildir (was: db-backed mail back end)
2002-01-24 17:58 ` nnmaildir (was: db-backed mail back end) Josh Huber
@ 2002-01-24 18:14 ` Harry Putnam
2002-01-24 18:43 ` Paul Jarc
2002-01-24 18:39 ` nnmaildir Paul Jarc
1 sibling, 1 reply; 102+ messages in thread
From: Harry Putnam @ 2002-01-24 18:14 UTC (permalink / raw)
Josh Huber <huber@alum.wpi.edu> writes:
> prj@po.cwru.edu (Paul Jarc) writes:
>
>> nnmaildir stores NOV information in one file per message. Finding
>> an article's NOV information is as fast as finding the article
>> itself.
>
> Indeed, I've been meaning to try nnmaildir out. Would it be very
> painful to migrate from nnml to maildir? Perhaps I'll try it for just
> a couple groups to start with...
I'd be interested in hearing your results... I had lots of trouble
trying to run both nnml and nnmaildir, but since then Paul J. has
fixed lots of stuff. I haven't tried running both since before the
fixes though. Still using nnmaildir alone on a secondary machine, but
my main mail/news, on the main machine, is still non-nnmaildir and
mostly nnml.
I think, if you have some semi-large to large groups you may notice
some significant slowness in initial startup as nnmaildir processes
everything, but after starting, you may find group entry to be
noticeably faster. I don't know how much of that applies since Paul's
new release and other fixes.
One thing attractive about nnmaildir to me is that it doesn't care
about adding and subtracting from outside gnus. It just finds and
processes what is there (in `new') on entry. Also plays ball nicely
with procmail (like nnml does) since procmail has a builtin
recognition of maildir setup, and knows how to write to nnmaildir
groups.
* Re: nnmaildir (was: db-backed mail back end)
2002-01-24 18:14 ` Harry Putnam
@ 2002-01-24 18:43 ` Paul Jarc
2002-01-24 22:05 ` Harry Putnam
0 siblings, 1 reply; 102+ messages in thread
From: Paul Jarc @ 2002-01-24 18:43 UTC (permalink / raw)
Harry Putnam <reader@newsguy.com> wrote:
> I had lots of trouble trying to run both nnml and nnmaildir, but
> since then Paul J. has fixed lots of stuff.
I think all your troubles were strictly nnmaildir-related, or at least
not related to running nnml as well.
> I think, if you have some semi-large to large groups you may notice
> some significant slowness in initial startup as nnmaildir processes
> everything, but after starting, you may find group entry to be
> noticeably faster. I don't know how much of that applies since Paul's
> new release and other fixes.
I haven't made any changes in that area.
paul
* Re: nnmaildir (was: db-backed mail back end)
2002-01-24 18:43 ` Paul Jarc
@ 2002-01-24 22:05 ` Harry Putnam
2002-01-24 22:51 ` nnmaildir Paul Jarc
0 siblings, 1 reply; 102+ messages in thread
From: Harry Putnam @ 2002-01-24 22:05 UTC (permalink / raw)
prj@po.cwru.edu (Paul Jarc) writes:
> Harry Putnam <reader@newsguy.com> wrote:
>> I had lots of trouble trying to run both nnml and nnmaildir, but
>> since then Paul J. has fixed lots of stuff.
>
> I think all your troubles were strictly nnmaildir-related, or at least
> not related to running nnml as well.
>
>> I think, if you have some semi-large to large groups you may notice
>> some significant slowness in initial startup as nnmaildir processes
>> everything, but after starting, you may find group entry to be
>> noticeably faster. I don't know how much of that applies since Paul's
>> new release and other fixes.
>
> I haven't made any changes in that area.
I wasn't sure because I'm using the most recent version, but not
really using it hard like before. I gave up trying to convert news to
mail that way, so I have no large groups to test it on. My giving up
was not nnmaildir-related. It just seemed like too much complication
using suck, procmail and then Gnus to get the results I wanted. Somewhere
in that chain of events I was generating lots of duplicates.
Not related to the apps, I don't think, but to my processing scripts.
There is one small thing that I can think of that might be improved
somehow. Those filenames are pretty ridiculous if you do anything
with those files outside Gnus. Seems like a simple incremental
numbering scheme once the messages are brought from `new' would work.
These are a tad unwieldy:
1008622921.3916_0.sol.local.lan:2,
1011293422.3800_0.sol.local.lan:2,
* Re: nnmaildir
2002-01-24 22:05 ` Harry Putnam
@ 2002-01-24 22:51 ` Paul Jarc
0 siblings, 0 replies; 102+ messages in thread
From: Paul Jarc @ 2002-01-24 22:51 UTC (permalink / raw)
Harry Putnam <reader@newsguy.com> wrote:
> There is one small thing that I can think of that might be improved
> somehow. Those filenames are pretty ridiculous if you do anything
> with those files outside Gnus. Seems like a simple incremental
> numbering scheme once the messages are brought from `new' would work.
It works as long as you never have two processes moving mail from new/
to cur/ at the same time. Otherwise, you still need an uncoordinated
way to ensure filenames don't collide.
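
To make "uncoordinated" concrete: a name built from the time, the process id, a per-process counter and the host name needs no shared state between deliverers. The sketch below (Python; the function name and parameters are assumptions) is patterned on the filenames quoted earlier in this thread, minus the ":2," suffix:

```python
import os
import socket
import time
from typing import Optional

def unique_name(counter: int, now: Optional[int] = None,
                pid: Optional[int] = None, host: Optional[str] = None) -> str:
    """Build a filename that two independent processes will not share."""
    now = int(time.time()) if now is None else now
    pid = os.getpid() if pid is None else pid
    host = socket.gethostname() if host is None else host
    # Two processes on one host differ in pid; one process bumps its counter.
    return f"{now}.{pid}_{counter}.{host}"
```

A plain incrementing number, by contrast, requires every writer to agree on who owns the next value.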
paul
* Re: nnmaildir
2002-01-24 17:58 ` nnmaildir (was: db-backed mail back end) Josh Huber
2002-01-24 18:14 ` Harry Putnam
@ 2002-01-24 18:39 ` Paul Jarc
1 sibling, 0 replies; 102+ messages in thread
From: Paul Jarc @ 2002-01-24 18:39 UTC (permalink / raw)
Josh Huber <huber@alum.wpi.edu> wrote:
> prj@po.cwru.edu (Paul Jarc) writes:
>> nnmaildir stores NOV information in one file per message. Finding
>> an article's NOV information is as fast as finding the article
>> itself.
>
> Indeed, I've been meaning to try nnmaildir out.
Note that finding the NOV information is also just as *slow* as
finding the article. :) It's up to you to either keep your groups
small or use a filesystem that can deal with large directories.
> Would it be very painful to migrate from nnml to maildir?
Setting up the server should be pretty straightforward. Copying the
articles is as easy as B c. The group parameters for expiry will be
different from what you're used to, but hopefully not too troublesome.
paul
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-24 0:50 ` Russ Allbery
2002-01-24 1:24 ` Lars Magne Ingebrigtsen
@ 2002-01-24 9:14 ` Sean Neakums
2002-01-24 9:59 ` Per Abrahamsen
1 sibling, 1 reply; 102+ messages in thread
From: Sean Neakums @ 2002-01-24 9:14 UTC (permalink / raw)
begin Russ Allbery quotation:
> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>> If we were to design a mail back end that's supposed to scale to
>> several hundred thousands of messages in thousands of groups -- how
>> would we do that? Perhaps somebody has pondered this question
>> before. :-)
>
> Don't store all the articles in one directory; the file system
> implementation will kill you.
Not all filesystems handle large directories poorly. Both XFS and
ReiserFS on Linux systems handle them well, for example. If you do
add code to make these directories into some kind of hierarchical
hashed layout, please make it an option.
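For readers unfamiliar with the idea, here is a minimal Python sketch of a hierarchical hashed layout that can be switched off. The function `article_path`, its `hashed` and `levels` parameters, and the choice of MD5 are assumptions made for illustration. Nothing like this exists in Gnus as far as this thread says.

```python
import hashlib
from pathlib import PurePosixPath


def article_path(group_dir, article_name, hashed=True, levels=2):
    """Map an article name to a path below group_dir.

    With hashed=False the article sits directly in the group directory.
    With hashed=True it is placed under `levels` subdirectories named
    after successive two-hex-digit slices of the name's MD5 digest.
    """
    if not hashed:
        return PurePosixPath(group_dir) / article_name
    digest = hashlib.md5(article_name.encode("utf-8")).hexdigest()
    parts = [digest[2 * i:2 * i + 2] for i in range(levels)]
    return PurePosixPath(group_dir, *parts, article_name)
```

With two levels, each directory fans out into at most 256 subdirectories, so a group with 200K articles averages about three files per leaf directory. The `hashed` flag is the "make it an option" part: a filesystem that copes with large directories can keep the flat layout.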
--
///////////////// | | The spark of a pin
<sneakums@zork.net> | (require 'gnu) | dropping, falling feather-like.
\\\\\\\\\\\\\\\\\ | | There is too much noise.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-24 9:14 ` db-backed mail back end Sean Neakums
@ 2002-01-24 9:59 ` Per Abrahamsen
2002-01-24 10:03 ` Lars Magne Ingebrigtsen
` (3 more replies)
0 siblings, 4 replies; 102+ messages in thread
From: Per Abrahamsen @ 2002-01-24 9:59 UTC (permalink / raw)
Sean Neakums <sneakums@zork.net> writes:
> If you do add code to make these directories into some kind of
> hierarchical hashed layout, please make it an option.
Why? Do ReiserFS and XFS perform poorly on small directories?
Gnus will be less robust if it has to support multiple filesystem
layouts, so there has to be some real advantage to make up for that.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-24 9:59 ` Per Abrahamsen
@ 2002-01-24 10:03 ` Lars Magne Ingebrigtsen
2002-01-24 10:24 ` Sean Neakums
` (2 subsequent siblings)
3 siblings, 0 replies; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-24 10:03 UTC (permalink / raw)
Per Abrahamsen <abraham@dina.kvl.dk> writes:
>> If you do add code to make these directories into some kind of
>> hierarchical hashed layout, please make it an option.
>
> Why? Do ReiserFS and XFS perform poorly on small directories?
>
> Gnus will be less robust if it has to support multiple filesystem
> layouts, so there has to be some real advantage to make up for that.
On the other hand, this is Gnus. Options'R'Us. :-)
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-24 9:59 ` Per Abrahamsen
2002-01-24 10:03 ` Lars Magne Ingebrigtsen
@ 2002-01-24 10:24 ` Sean Neakums
2002-01-24 11:49 ` Jorge Godoy
2002-01-24 11:32 ` Simon Josefsson
2002-01-24 17:14 ` Paul Jarc
3 siblings, 1 reply; 102+ messages in thread
From: Sean Neakums @ 2002-01-24 10:24 UTC (permalink / raw)
begin Per Abrahamsen quotation:
> Sean Neakums <sneakums@zork.net> writes:
>> If you do add code to make these directories into some kind of
>> hierarchical hashed layout, please make it an option.
>
> Why? Do ReiserFS and XFS perform poorly on small directories?
No. But since they handle large ones well, I don't want my Gnus
jumping through unnecessary hoops. And having mail groups be a tree
of directories whose names are hex numbers offends me to some degree.
--
///////////////// | | The spark of a pin
<sneakums@zork.net> | (require 'gnu) | dropping, falling feather-like.
\\\\\\\\\\\\\\\\\ | | There is too much noise.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-24 9:59 ` Per Abrahamsen
2002-01-24 10:03 ` Lars Magne Ingebrigtsen
2002-01-24 10:24 ` Sean Neakums
@ 2002-01-24 11:32 ` Simon Josefsson
2002-01-24 11:58 ` Karl Kleinpaste
2002-01-24 17:14 ` Paul Jarc
3 siblings, 1 reply; 102+ messages in thread
From: Simon Josefsson @ 2002-01-24 11:32 UTC (permalink / raw)
Cc: ding
On Thu, 24 Jan 2002, Per Abrahamsen wrote:
> Sean Neakums <sneakums@zork.net> writes:
>
> > If you do add code to make these directories into some kind of
> > hierarchical hashed layout, please make it an option.
>
> Why? Do ReiserFS and XFS perform poorly on small directories?
>
> Gnus will be less robust if it has to support multiple filesystem
> layouts, so there has to be some real advantage to make up for that.
Are the problems in designing large backends really in the backend? Most
problems with large backends are due to the design of Gnus, not the
backends. NNML scales linearly with the number of messages in the group,
does it not? Gnus definitely doesn't scale linearly, so redesigning
something to get support for large mail/news backends should go into Gnus,
I think.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-24 11:32 ` Simon Josefsson
@ 2002-01-24 11:58 ` Karl Kleinpaste
2002-01-24 12:11 ` Lars Magne Ingebrigtsen
` (3 more replies)
0 siblings, 4 replies; 102+ messages in thread
From: Karl Kleinpaste @ 2002-01-24 11:58 UTC (permalink / raw)
Simon Josefsson <jas@extundo.com> writes:
> NNML scales linearly with the number of messages in the group, does
> it not? Gnus definitely doesn't scale linearly, so redesigning
> something to get support for large mail/news backends should go into
> Gnus, I think.
I have been wondering about this since this discussion started. The
slowness of entering a large group of 10K messages, or 100K, has very
little to do with getting at either the overview data or the message
files. It has to do with threading, scoring, and sorting. I don't
know what Gnus' threading algorithm is, but I suspect its performance
is at least as poor as O(n log n) and may even be as bad as O(n^2).
It is not clear to me that it is possible to do better than O(n log n)
in the first place, though I haven't contemplated the matter very long.
These are not problems that are addressed at all by the manner in
which articles and indices (overviews or otherwise) are stored, but
rather in how threading is computed, and how scoring is managed.
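As a point of reference for the complexity question, the following Python sketch links articles to their parents with a hash table. It is not Gnus' threading algorithm, which the thread does not describe. The function `build_threads` and its input format are hypothetical. It assumes each article carries at most one parent Message-ID, for example the last entry of its References header.

```python
def build_threads(articles):
    """Group articles into threads.

    articles: iterable of (message_id, parent_id_or_None) pairs.
    Returns (roots, children), where roots lists the Message-IDs that
    start a thread and children maps each Message-ID to its direct replies.
    """
    articles = list(articles)
    known = {message_id for message_id, _ in articles}
    children = {message_id: [] for message_id in known}
    roots = []
    for message_id, parent_id in articles:
        if parent_id in known and parent_id != message_id:
            # Parent is present in this group: attach as a reply.
            children[parent_id].append(message_id)
        else:
            # No parent, or the parent is missing here: start a thread.
            roots.append(message_id)
    return roots, children
```

Each article costs one set lookup and one list append, so the linking step is linear in the number of articles on average. Sorting the resulting threads by date or number is still a comparison sort, which is where an O(n log n) term would come from in this simplified picture.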
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-24 11:58 ` Karl Kleinpaste
@ 2002-01-24 12:11 ` Lars Magne Ingebrigtsen
2002-01-24 12:15 ` Lars Magne Ingebrigtsen
2002-01-24 12:29 ` Simon Josefsson
` (2 subsequent siblings)
3 siblings, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-24 12:11 UTC (permalink / raw)
Karl Kleinpaste <karl@charcoal.com> writes:
> These are not problems that are addressed at all by the manner in
> which articles and indices (overviews or otherwise) are stored, but
> rather in how threading is computed, and how scoring is managed.
Try entering a group with an .overview file containing 100000 lines
compared to one that contains 100 lines. The difference is
noticeable.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-24 12:11 ` Lars Magne Ingebrigtsen
@ 2002-01-24 12:15 ` Lars Magne Ingebrigtsen
2002-01-24 12:54 ` Karl Kleinpaste
0 siblings, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-24 12:15 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Try entering a group with an .overview file containing 100000 lines
> compared to one that contains 100 lines. The difference is
> noticeable.
I just compared a 20K group with a 100 group, and the 20K group took
three times as long to enter as the 100 group. (And I selected the
four most recent articles in each.)
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-24 12:15 ` Lars Magne Ingebrigtsen
@ 2002-01-24 12:54 ` Karl Kleinpaste
2002-01-24 15:05 ` Lars Magne Ingebrigtsen
2002-01-24 16:13 ` Frank Schmitt
0 siblings, 2 replies; 102+ messages in thread
From: Karl Kleinpaste @ 2002-01-24 12:54 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> I just compared a 20K group with a 100 group, and the 20K group took
> three times as long to enter as the 100 group. (And I selected the
> four most recent articles in each.)
You and Simon have both made this observation. I just don't see the
same performance difference. Perhaps my blobs of articles aren't big
enough. That's by design, at least in part, as a choice for mental
discipline: I don't want groups larger than I can reasonably think
about. (I happen to generate per-year archive groups every Jan 1.)
The largest nnml group I have has 4900 articles in it, with a 620K
.overview. I'm operating on a none-too-recent PII-450 under Linux
with RAID1 (mirrored) IDEs.
When I select only the last 4 articles from it, the time between
hitting RET and seeing a displayed summary is too short to measure by
eyeball against a clock -- less than a second. How do you measure
4-article entry to a 100-article group, that you can say that the time
varies by a factor of 3? What are the timings?
When I select the whole group, Gnus says "scoring" within 3 or 4
seconds of beginning entry, which then takes 28 seconds. Then Gnus
says "Generating summary", which takes well over a minute before the
summary is displayed. Total time is about 1:43. That's the sort of
time occupation that worries me.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-24 12:54 ` Karl Kleinpaste
@ 2002-01-24 15:05 ` Lars Magne Ingebrigtsen
2002-01-24 20:40 ` Karl Kleinpaste
2002-01-24 16:13 ` Frank Schmitt
1 sibling, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-24 15:05 UTC (permalink / raw)
Karl Kleinpaste <karl@charcoal.com> writes:
> When I select only the last 4 articles from it, the time between
> hitting RET and seeing a displayed summary is too short to measure by
> eyeball against a clock -- less than a second. How do you measure
> 4-article entry to a 100-article group, that you can say that the time
> varies by a factor of 3? What are the timings?
On my new machine it's 0.25 seconds vs. 0.75 seconds. :-)
But I suspect it's somewhat slower on less spiffy machines.
> When I select the whole group, Gnus says "scoring" within 3 or 4
> seconds of beginning entry, which then takes 28 seconds. Then Gnus
> says "Generating summary", which takes well over a minute before the
> summary is displayed. Total time is about 1:43. That's the sort of
> time occupation that worries me.
Yes, entering a biiig group takes a long, long time, but that's harder
to do something about. :-)
But it isn't just the group entry time that might make it worthwhile
to do something about the storage -- it's also the mail splitting.
(And since the Agent uses nnml for its storage, it also affects
article fetching.) Doing a fetch in an Agent that hasn't been expired
in a few months is slow -- on the order of two or three times slower
than doing it in a freshly expired Agent.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-24 15:05 ` Lars Magne Ingebrigtsen
@ 2002-01-24 20:40 ` Karl Kleinpaste
2002-01-25 1:28 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 102+ messages in thread
From: Karl Kleinpaste @ 2002-01-24 20:40 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> On my new machine it's 0.25 seconds vs. 0.75 seconds. :-)
Let me get this straight -- you're worried about the "slowness" of
something that, even in your own worst case, takes noticeably less
than a second?
> But I suspect it's somewhat slower on less spiffy machines.
Yeah, on my older machine, it might take all of 3 seconds. A
horrifying and crushing waste of time, to be sure. I might have long
enough in there to blink 2 or 3 times for no good reason.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-24 20:40 ` Karl Kleinpaste
@ 2002-01-25 1:28 ` Lars Magne Ingebrigtsen
2002-01-25 2:17 ` Karl Kleinpaste
0 siblings, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-25 1:28 UTC (permalink / raw)
Karl Kleinpaste <karl@charcoal.com> writes:
> Let me get this straight -- you're worried about the "slowness" of
> something that, even in your own worst case, takes noticeably less
> than a second?
That was a 20K group. I'm looking for 200K group solutions.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-25 1:28 ` Lars Magne Ingebrigtsen
@ 2002-01-25 2:17 ` Karl Kleinpaste
2002-01-25 2:42 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 102+ messages in thread
From: Karl Kleinpaste @ 2002-01-25 2:17 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> That was a 20K group. I'm looking for 200K group solutions.
Do you have reason to believe that scaling 20K->200K (10x) will be
substantially worse than scaling 100->20K (200x)?
You only had a 3x time factor in the case already seen, for a 200x
content increase. Why is there reason to believe that scaling up
again by a mere 10x will be substantially worse? If by some fluke you
should find a full 10x time cost for the 10x size increase of 20K->200K,
you will encounter a 4-article entry to a 200K group costing a
stunning...7.5 seconds (based on your 0.25 and 0.75 timings).
That's about long enough to take a leisurely sip of <insert beverage>.
Maybe 10x is a reasonable scaling expectation in 20K->200K, maybe not.
Maybe filesystem overhead, or the system's control of Emacs' virtual
memory behavior, will make it much worse. I don't know. Maybe it'll
only cost 3x time to scale size by 10x. It only cost 3x time for a
200x size before.
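The arithmetic behind these two scenarios can be written out as a short Python sketch. The 0.25 and 0.75 second timings and the group sizes come from the thread. The power-law model (time proportional to size raised to a fixed exponent) is an assumption added here to express "it only cost 3x for 200x" as a number, and both function names are hypothetical.

```python
import math


def scaling_exponent(time_small, time_large, size_small, size_large):
    """Exponent k such that time grows like size**k between two measurements."""
    return math.log(time_large / time_small) / math.log(size_large / size_small)


def projected_time(base_seconds, size_factor, exponent):
    """Entry time after the group grows by size_factor, given exponent k."""
    return base_seconds * size_factor ** exponent


# Measurements quoted in the thread: 100 articles -> 0.25 s, 20K -> 0.75 s.
k = scaling_exponent(0.25, 0.75, 100, 20_000)

# Pessimistic case from the message: time grows linearly from 20K to 200K.
linear_case = projected_time(0.75, 10, 1.0)

# Alternative case: the growth rate observed between 100 and 20K persists.
same_rate_case = projected_time(0.75, 10, k)
```

The linear case reproduces the 7.5 seconds quoted above. If the observed rate held instead, the exponent would be roughly 0.2 and the projected entry time would stay under two seconds. Two data points cannot say which case applies, which is the uncertainty the message describes.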
I'm not trying to be difficult or argumentative. I'm just trying to
figure out what problem you're really trying to solve. So far, I see
some nebulous idea that "it's slow," without being able to qualify,
much less quantify, where the problem lies.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-25 2:17 ` Karl Kleinpaste
@ 2002-01-25 2:42 ` Lars Magne Ingebrigtsen
2002-01-25 3:23 ` Karl Kleinpaste
0 siblings, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-25 2:42 UTC (permalink / raw)
Karl Kleinpaste <karl@charcoal.com> writes:
> You only had a 3x time factor in the case already seen, for a 200x
> content increase. Why is there reason to believe that scaling up
> again by a mere 10x will be substantially worse? If by some fluke you
> should find a full 10x time cost for the 10x size increase of 20K->200K,
> you will encounter a 4-article entry to a 200K group costing a
> stunning...7.5 seconds (based on your 0.25 and 0.75 timings).
Yes. And that's not acceptable. If it takes 7.5 seconds to enter a
group to look at the one new article, then that's too slow. Much,
much, much too slow.
> I'm not trying to be difficult or argumentative. I'm just trying to
> figure out what problem you're really trying to solve. So far, I see
> some nebulous idea that "it's slow," without being able to qualify,
> much less quantify, where the problem lies.
Fact is that it's too slow. And finding out possible solutions to the
problem is the reason for this thread.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-25 2:42 ` Lars Magne Ingebrigtsen
@ 2002-01-25 3:23 ` Karl Kleinpaste
2002-01-25 3:34 ` Lars Magne Ingebrigtsen
` (3 more replies)
0 siblings, 4 replies; 102+ messages in thread
From: Karl Kleinpaste @ 2002-01-25 3:23 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Yes. And that's not acceptable. If it takes 7.5 seconds to enter a
> group to look at the one new article, then that's too slow. Much,
> much, much too slow.
Sorry, no, I don't buy it.
Every day, people use Gnus to enter NNTP groups with 100, 500, 1000,
or 2000 new articles to scan, sort, and maybe even read a few. Every
time they do this, they spend tens of seconds while threading,
scoring, sorting, and summary generation occur. This sort of use is
of the all-day-every-day kind.
Yet you want to optimize the case that's so far out to the edge, I
doubt there's anyone that even actually _has_ a single group with 200K
messages in it, against which to test the planned optimization. (Does
anyone? Really, right now? Does anyone have a group, or does anyone
plan to create a group, that goes beyond, say, 30K messages?) You're
optimizing this bizarre, unusual, never-seen case, while the case that
I've run into at least a dozen times today alone haunts us, where
getting into a big, busy, NNTP group costs me minutes of computation
time while *Summary* is generated.
There's certainly no harm to making Gnus able to handle the storage
needs of gargantuan archives. Sure, knock yourself out. But that's
not a problem currently being faced by more than maybe 5 people on the
planet, whereas every single Gnus user has to worry over the entry
time cost of a busy group with 900 new messages.
Ohwell, attitudes vary, that's all.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-25 3:23 ` Karl Kleinpaste
@ 2002-01-25 3:34 ` Lars Magne Ingebrigtsen
2002-01-25 3:37 ` Daniel Pittman
` (2 subsequent siblings)
3 siblings, 0 replies; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-25 3:34 UTC (permalink / raw)
Karl Kleinpaste <karl@charcoal.com> writes:
>> Yes. And that's not acceptable. If it takes 7.5 seconds to enter a
>> group to look at the one new article, then that's too slow. Much,
>> much, much too slow.
>
> Sorry, no, I don't buy it.
>
> Every day, people use Gnus to enter NNTP groups with 100, 500, 1000,
> or 2000 new articles to scan, sort, and maybe even read a few. Every
> time they do this, they spend tens of seconds while threading,
> scoring, sorting, and summary generation occur. This sort of use is
> of the all-day-every-day kind.
>
> Yet you want to optimize the case that's so far out to the edge, I
> doubt there's anyone that even actually _has_ a single group with 200K
> messages in it, against which to test the planned optimization.
I get the feeling that you think that if we make it possible to have
big groups, that'll make the rest of Gnus slower. Or that if we spend
time making one part of Gnus faster, that we can't spend time making
other parts of Gnus faster. That's a kinda odd thing to focus on.
I've just re-implemented `gnus-agent-get-undownloaded-list', so that
Gnus now spends 0.04 seconds in that function instead of 2.3 seconds
when entering a big group.
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-25 3:23 ` Karl Kleinpaste
2002-01-25 3:34 ` Lars Magne Ingebrigtsen
@ 2002-01-25 3:37 ` Daniel Pittman
2002-01-25 4:19 ` Karl Kleinpaste
2002-01-25 4:29 ` Lars Magne Ingebrigtsen
2002-01-25 7:05 ` Justin Sheehy
2002-01-28 12:56 ` Jorge Godoy
3 siblings, 2 replies; 102+ messages in thread
From: Daniel Pittman @ 2002-01-25 3:37 UTC (permalink / raw)
On Thu, 24 Jan 2002, Karl Kleinpaste wrote:
> Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
>> Yes. And that's not acceptable. If it takes 7.5 seconds to enter a
>> group to look at the one new article, then that's too slow. Much,
>> much, much too slow.
>
> Sorry, no, I don't buy it.
Lucky you. :)
> Every day, people use Gnus to enter NNTP groups with 100, 500, 1000,
> or 2000 new articles to scan, sort, and maybe even read a few.
Sure. I also enter some groups with, say, two messages to read, like the
ding list just now... that take seven to ten seconds to enter.
> Every time they do this, they spend tens of seconds while threading,
> scoring, sorting, and summary generation occur. This sort of use is of
> the all-day-every-day kind.
It would be /great/ for it to be faster, especially in groups where
there are ticked, visible messages from a long time ago.
> Yet you want to optimize the case that's so far out to the edge, I
> doubt there's anyone that even actually _has_ a single group with 200K
> messages in it, against which to test the planned optimization. (Does
> anyone? Really, right now?
Nope. I peak at > 90,000 messages at the moment, with the ten most
active groups running down from there to ~ 12,000 messages.
> Does anyone have a group, or does anyone plan to create a group, that
> goes beyond, say, 30K messages?)
Yes.
[...]
> There's certainly no harm to making Gnus able to handle the storage
> needs of gargantuan archives. Sure, knock yourself out. But that's
> not a problem currently being faced by more than maybe 5 people on the
> planet, whereas every single Gnus user has to worry over the entry
> time cost of a busy group with 900 new messages.
That would be something that would be good to address first. Don't
mistake it for the only problem, though.
Daniel
--
Sticks and stones are hard on bones.
Aimed with angry art, Words can sting like anything.
But silence breaks the heart.
-- Phyllis McGinley
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-25 3:37 ` Daniel Pittman
@ 2002-01-25 4:19 ` Karl Kleinpaste
2002-01-25 4:47 ` Lars Magne Ingebrigtsen
2002-01-25 5:32 ` Daniel Pittman
2002-01-25 4:29 ` Lars Magne Ingebrigtsen
1 sibling, 2 replies; 102+ messages in thread
From: Karl Kleinpaste @ 2002-01-25 4:19 UTC (permalink / raw)
Daniel Pittman <daniel@rimspace.net> writes:
> I peak at > 90,000 messages at the moment, with the ten most
> active groups running down from there to ~ 12,000 messages.
When you have a few spare moments, it would be instructive to see your
timings for a 4-article entry to a group with 100, 20K, and 90K messages.
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> I get the feeling that you think that if we make it possible to have
> big groups, that'll make the rest of Gnus slower. Or that if we spend
> time making one part of Gnus faster, that we can't spend time making
> other parts of Gnus faster.
No, that's not even remotely what I'm thinking, but if I'm being that
badly interpreted, I think I had best find something else to do.
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-25 4:19 ` Karl Kleinpaste
@ 2002-01-25 4:47 ` Lars Magne Ingebrigtsen
2002-01-25 9:23 ` Kai Großjohann
2002-01-25 5:32 ` Daniel Pittman
1 sibling, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-25 4:47 UTC (permalink / raw)
Karl Kleinpaste <karl@charcoal.com> writes:
>> I get the feeling that you think that if we make it possible to have
>> big groups, that'll make the rest of Gnus slower. Or that if we spend
>> time making one part of Gnus faster, that we can't spend time making
>> other parts of Gnus faster.
>
> No, that's not even remotely what I'm thinking, but if I'm being that
> badly interpreted, I think I had best find something else to do.
I was kinda hoping you'd instead explain to me what I'm
misunderstanding. :-/
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-25 4:47 ` Lars Magne Ingebrigtsen
@ 2002-01-25 9:23 ` Kai Großjohann
2002-01-25 9:30 ` Kai Großjohann
0 siblings, 1 reply; 102+ messages in thread
From: Kai Großjohann @ 2002-01-25 9:23 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Karl Kleinpaste <karl@charcoal.com> writes:
>
>>> I get the feeling that you think that if we make it possible to have
>>> big groups, that'll make the rest of Gnus slower. Or that if we spend
>>> time making one part of Gnus faster, that we can't spend time making
>>> other parts of Gnus faster.
>>
>> No, that's not even remotely what I'm thinking, but if I'm being that
>> badly interpreted, I think I had best find something else to do.
>
> I was kinda hoping you'd instead explain to me what I'm
> misunderstanding. :-/
I think he wants you to speed up "Generating summary buffer". In my
normal Gnus usage, this is also the most time-consuming part. But it
has already improved as opposed to 3 or 4 years ago.
kai
--
Simplification good! Oversimplification bad! (Larry Wall)
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-25 9:23 ` Kai Großjohann
@ 2002-01-25 9:30 ` Kai Großjohann
2002-01-25 9:35 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 102+ messages in thread
From: Kai Großjohann @ 2002-01-25 9:30 UTC (permalink / raw)
Kai.Grossjohann@cs.uni-dortmund.de (Kai Großjohann) writes:
> I think he wants you to speed up "Generating summary buffer". In my
> normal Gnus usage, this is also the most time-consuming part.
I got to take that back. It seems I'm not yet used to the spiffy 1.4
GHz Athlon thingy that I've got under my desk now. "Generating
summary buffer" is not faster than "Fetching headers" at the moment.
(I tried entering the nnimap:INBOX.auto.gnus and gnu.emacs.help
groups. With C-u RET RET. Both ended up with approx 1800 lines.)
I can imagine that people with slower machines will really notice
that part. I guess that these people will see that scoring and
threading are quick, but generating the summary buffer is pretty
slow. Is there anything that can be done about this?
> But it has already improved as opposed to 3 or 4 years ago.
Could be this is because of the new machine...
kai
--
Simplification good! Oversimplification bad! (Larry Wall)
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-25 9:30 ` Kai Großjohann
@ 2002-01-25 9:35 ` Lars Magne Ingebrigtsen
2002-01-25 9:58 ` Kai Großjohann
0 siblings, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-25 9:35 UTC (permalink / raw)
Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:
> I can imagine that people with slower machines will really notice
> that part. I guess that these people will see that scoring and
> threading are quick, but generating the summary buffer is pretty
> slow. Is there anything that can be done about this?
Are you sure that it's the summary buffer generation that's slow? Try
ELPing and post the results...
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
^ permalink raw reply [flat|nested] 102+ messages in thread
* Re: db-backed mail back end
2002-01-25 9:35 ` Lars Magne Ingebrigtsen
@ 2002-01-25 9:58 ` Kai Großjohann
2002-01-25 10:04 ` Kai Großjohann
0 siblings, 1 reply; 102+ messages in thread
From: Kai Großjohann @ 2002-01-25 9:58 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Are you sure that it's the summary buffer generation that's slow? Try
> ELPing and post the results...
I've now entered nnimap:INBOX.auto.gnus and here are the results.
Hm. What do they tell us? Maybe that the summary buffer generation
isn't so slow after all? Hm.
/----
| Function Name Call Count Elapsed Time Average Time
| ====================================================== ========== ============ ============
| gnus-retrieve-headers 2 64.930867 32.4654335
| gnus-topic-select-group 1 49.303124 49.303124
| gnus-group-select-group 1 49.303064 49.303064
| gnus-group-read-group 1 49.303052 49.303052
| gnus-summary-read-group 1 49.303004 49.303004
| gnus-summary-read-group-1 1 49.30299 49.30299
| gnus-select-newsgroup 1 35.786196 35.786196
| gnus-fetch-headers 1 33.866834 33.866834
| gnus-cache-retrieve-headers 1 32.465541 32.465541
| nnimap-retrieve-headers 1 32.465148 32.465148
| gnus-summary-prepare 1 12.402957 12.402957
| gnus-sort-threads-1 1052 12.283426 0.0116762604
| gnus-summary-prepare-threads 1 11.147104 11.147104
| gnus-simplify-subject-fuzzy 4061 3.3280859999 0.0008195237
| gnus-dd-mmm 1785 2.2941050000 0.0012852128
| gnus-get-newsgroup-headers-xover 1 1.40016 1.40016
| gnus-summary-from-or-to-or-newsgroups 1785 1.1426499999 0.0006401400
| gnus-sort-threads 1 1.037662 1.037662
| gnus-articles-to-read 1 1.024266 1.024266
| gnus-thread-sort-by-most-recent-number 3015 1.0074100000 0.0003341326
| gnus-thread-highest-number 6030 0.8809909999 0.0001461013
| gnus-build-sparse-threads 1 0.741911 0.741911
| gnus-set-work-buffer 4063 0.6677200000 0.0001643416
| gnus-run-hooks 1792 0.3971829999 0.0002216422
| gnus-summary-highlight-line 1783 0.3743940000 0.0002099798
| gnus-build-old-threads 1 0.3319879999 0.3319879999
| gnus-build-get-header 237 0.3241400000 0.0013676793
| gnus-possibly-score-headers 1 0.288972 0.288972
| gnus-score-headers 1 0.28717 0.28717
| gnus-request-group 1 0.27203 0.27203
| nnimap-request-group 1 0.271948 0.271948
| nnimap-request-update-info-internal 1 0.270044 0.270044
| gnus-score-string 1 0.258248 0.258248
| nnimap-retrieve-which-headers 1 0.19694 0.19694
| gnus-correct-substring 716 0.1953960000 0.0002728994
| gnus-summary-limit-children 2282 0.1902700000 8.337...e-05
| gnus-gather-threads-by-subject 1 0.121053 0.121053
| gnus-extract-address-components 1785 0.1111819999 6.228...e-05
| gnus-put-text-property 3570 0.1019420000 2.855...e-05
| gnus-summary-initial-limit 1 0.072339 0.072339
| gnus-set-difference 4 0.070964 0.017741
| gnus-make-threads 1 0.056748 0.056748
| gnus-thread-loop-p 1814 0.0432050000 2.381...e-05
| gnus-update-missing-marks 1 0.042941 0.042941
| gnus-cut-threads 1 0.039511 0.039511
| gnus-member-of-range 1620 0.0352010000 2.172...e-05
| gnus-put-text-property-excluding-characters-with-faces 1783 0.0317080000 1.778...e-05
| gnus-char-width 11456 0.0247400000 2.159...e-06
| nnimap-retrieve-headers-from-file 1 0.023779 0.023779
| gnus-score-string< 8729 0.0174639999 2.000...e-06
| gnus-compress-sequence 8 0.012047 0.001505875
| gnus-sorted-complement 2 0.009498 0.004749
| gnus-uncompress-range 8 0.00879 0.00109875
| gnus-point-at-eol 1814 0.0069329999 3.821...e-06
| nnimap-possibly-change-group 3 0.005339 0.0017796666
| gnus-summary-setup-buffer 1 0.005138 0.005138
| gnus-killed-articles 1 0.004538 0.004538
| gnus-sorted-intersection 1 0.003625 0.003625
| gnus-group-find-parameter 8 0.003045 0.000380625
| gnus-message 13 0.0030139999 0.0002318461
| gnus-group-topic-parameters 8 0.002541 0.000317625
| gnus-summary-mode 1 0.002367 0.002367
| gnus-adjust-marked-articles 1 0.002138 0.002138
| gnus-group-fast-parameter 6 0.00207 0.000345
| gnus-summary-auto-select-subject 1 0.00199 0.00199
| gnus-group-decoded-name 6 0.001987 0.0003311666
| gnus-summary-first-unread-subject 1 0.00198 0.00198
| gnus-update-summary-mark-positions 2 0.001857 0.0009285
| gnus-summary-first-subject 2 0.001839 0.0009195
| gnus-topic-hierarchical-parameters 8 0.001836 0.0002295
| gnus-all-score-files 1 0.001781 0.001781
| gnus-summary-insert-line 2 0.001654 0.000827
| gnus-uncompress-sequence 1 0.00163 0.00163
| gnus-group-name-decode 6 0.0014379999 0.0002396666
| gnus-topic-parent-topic 152 0.0013550000 8.914...e-06
| nnimap-group-overview-filename 3 0.001226 0.0004086666
| gnus-current-topics 8 0.001199 0.000149875
| gnus-remassoc 41 0.00101 2.463...e-05
| nnimap-dont-use-nov-p 1 0.000932 0.000932
| gnus-set-sorted-intersection 1 0.000907 0.000907
| gnus-summary-setup-default-charset 1 0.000849 0.000849
| gnus-score-load-files 1 0.000774 0.000774
| gnus-score-load-file 8 0.0007219999 9.024...e-05
| gnus-update-format-specifications 2 0.000662 0.000331
| gnus-point-at-bol 190 0.0006359999 3.347...e-06
| gnus-score-find-hierarchical 1 0.000608 0.000608
| gnus-find-method-for-group 13 0.000502 3.861...e-05
| gnus-last-element 1 0.000496 0.000496
| gnus-set-mode-line 1 0.00049 0.00049
| gnus-topic-parameters 16 0.0004779999 2.987...e-05
| gnus-continuum-version 4 0.000464 0.000116
| gnus-parameter-charset 1 0.000462 0.000462
| gnus-group-name-charset 6 0.000447 7.450...e-05
| gnus-group-auto-expirable-p 1 0.000446 0.000446
| gnus-topic-find-topology 56 0.0004439999 7.928...e-06
| gnus-summary-set-local-parameters 1 0.000432 0.000432
| gnus-set-global-variables 4 0.000402 0.0001005
| gnus-parameter-ignored-charsets 1 0.000364 0.000364
| gnus-score-file-name 8 0.0003459999 4.324...e-05
| gnus-sort-gathered-threads 1 0.000343 0.000343
| gnus-configure-windows 1 0.000338 0.000338
| nnimap-mark-permanent-p 14 0.000336 2.399...e-05
| gnus-update-alist-soft 6 0.0003279999 5.466...e-05
| gnus-summary-buffer-name 1 0.000314 0.000314
| nnimap-mark-to-flag 27 0.0002650000 9.814...e-06
| gnus-update-read-articles 1 0.000265 0.000265
| gnus-apply-kill-file 1 0.000258 0.000258
| gnus-cache-articles-in-group 1 0.000251 0.000251
| gnus-make-sort-function 1 0.000249 0.000249
| gnus-current-topic 8 0.000239 2.9875e-05
| gnus-goto-colon 6 0.000233 3.883...e-05
| gnus-byte-compile 1 0.000226 0.000226
| gnus-summary-position-point 5 0.000221 4.420...e-05
| gnus-group-prefixed-name 4 0.00022 5.5e-05
| gnus-score-load-score-alist 4 0.000188 4.7e-05
| gnus-configure-frame 2 0.000178 8.9e-05
| nnmail-message-id 4 0.000164 4.1e-05
| gnus-group-goto-group 8 0.0001609999 2.012...e-05
| gnus-cache-file-name 1 0.000159 0.000159
| gnus-group-prev-group 1 0.000148 0.000148
| nnheader-nov-delete-outside-range 1 0.000136 0.000136
| gnus-server-to-method 9 0.0001309999 1.455...e-05
| gnus-newsgroup-kill-file 2 0.0001309999 6.549...e-05
| gnus-group-next-unread-group 1 0.000127 0.000127
| gnus-group-search-forward 1 0.00011 0.00011
| nnheader-find-nov-line 2 0.0001099999 5.499...e-05
| gnus-make-hashtable 2 0.000109 5.45e-05
| gnus-check-server 1 0.000105 0.000105
| gnus-group-get-parameter 12 0.0001030000 8.583...e-06
| gnus-get-buffer-create 3 0.0001029999 3.433...e-05
| gnus-summary-make-local-variables 2 0.0001009999 5.049...e-05
| gnus-server-opened 1 9.6e-05 9.6e-05
| gnus-parameters-get-parameter 2 9.5e-05 4.75e-05
| gnus-remove-from-range 2 8.999...e-05 4.499...e-05
| gnus-summary-set-display-table 1 8.8e-05 8.8e-05
| gnus-make-directory 1 8.8e-05 8.8e-05
| gnus-agent-mode 1 8.7e-05 8.7e-05
| gnus-agent-get-undownloaded-list 1 8.3e-05 8.3e-05
| gnus-buffer-live-p 8 8.1e-05 1.0125e-05
| gnus-all-windows-visible-p 1 7.9e-05 7.9e-05
| gnus-summary-show-thread 1 7.3e-05 7.3e-05
| gnus-copy-sequence 9 7.000...e-05 7.777...e-06
| gnus-group-parameter-value 15 6.900...e-05 4.600...e-06
| gnus-methods-using 4 6.6e-05 1.65e-05
| gnus-virtual-group-p 1 6.1e-05 6.1e-05
| gnus-group-position-point 1 6e-05 6e-05
| gnus-mailing-list-mode 1 6e-05 6e-05
| gnus-newsgroup-savable-name 7 5.9e-05 8.428...e-06
| nnimap-mark-to-flag-1 27 5.800...e-05 2.148...e-06
| gnus-short-group-name 1 5.5e-05 5.5e-05
| nnimap-server-opened 1 5e-05 5e-05
| gnus-get-buffer-window 1 4.8e-05 4.8e-05
| gnus-server-equal 4 4.5e-05 1.125e-05
| gnus-thread-sort-by-number 15 4.399...e-05 2.933...e-06
| nnheader-set-temp-buffer 1 4e-05 4e-05
| gnus-group-topic-p 1 3.9e-05 3.9e-05
| gnus-article-mark-to-type 21 3.800...e-05 1.809...e-06
| nnheader-translate-file-chars 23 3.6e-05 1.565...e-06
| gnus-summary-make-menu-bar 1 3.2e-05 3.2e-05
| gnus-group-topic-name 1 3e-05 3e-05
| gnus-agent-get-function 3 2.9e-05 9.666...e-06
| gnus-add-minor-mode 1 2.6e-05 2.6e-05
| gnus-get-function 1 2.5e-05 2.5e-05
| gnus-undo-register 1 2.5e-05 2.5e-05
| gnus-turn-off-edit-menu 1 2.3e-05 2.3e-05
| gnus-group-quit-config 1 2.3e-05 2.3e-05
| gnus-list-of-unread-articles 1 2e-05 2e-05
| gnus-visual-p 6 1.999...e-05 3.333...e-06
| gnus-info-set-entry 6 1.999...e-05 3.333...e-06
| gnus-get-unread-articles-in-group 1 1.8e-05 1.8e-05
| gnus-undo-boundary 5 1.7e-05 3.4e-06
| gnus-group-group-name 1 1.7e-05 1.7e-05
| gnus-group-real-prefix 1 1.7e-05 1.7e-05
| gnus-use-long-file-name 8 1.600...e-05 2.000...e-06
| gnus-undo-register-1 1 1.6e-05 1.6e-05
| gnus-mode-line-buffer-identification 1 1.6e-05 1.6e-05
| nnimap-mark-to-predicate 6 1.499...e-05 2.499...e-06
| gnus-create-hash-size 2 1.4e-05 7e-06
| gnus-online 5 1.399...e-05 2.799...e-06
| nnoo-server-opened 1 1.3e-05 1.3e-05
| gnus-frames-on-display-list 1 1.2e-05 1.2e-05
| nnimap-possibly-change-server 3 1.1e-05 3.666...e-06
| gnus-simplify-mode-line 1 8e-06 8e-06
| gnus-expand-group-parameters 2 8e-06 4e-06
| gnus-score-find-alist 1 6e-06 6e-06
| gnus-home-score-file 2 6e-06 3e-06
| gnus-agent-method-p 1 5e-06 5e-06
| gnus-window-to-buffer-helper 2 4.999...e-06 2.499...e-06
| gnus-windows-old-to-new 1 4e-06 4e-06
| gnus-make-sort-function-1 1 4e-06 4e-06
| gnus-delete-if 1 4e-06 4e-06
| gnus-summary-make-tool-bar 1 3e-06 3e-06
| gnus-server-status 1 3e-06 3e-06
| gnus-summary-maybe-hide-threads 1 3e-06 3e-06
| gnus-make-thread-indent-array 1 3e-06 3e-06
| nnoo-current-server 1 3e-06 3e-06
| gnus-agent-summary-make-menu-bar 1 2e-06 2e-06
| gnus-mailing-list-make-menu-bar 1 2e-06 2e-06
| gnus-set-default-directory 1 2e-06 2e-06
| nnimap-before-find-minmax-bugworkaround 1 2e-06 2e-06
\----
kai
--
Simplification good! Oversimplification bad! (Larry Wall)
* Re: db-backed mail back end
2002-01-25 9:58 ` Kai Großjohann
@ 2002-01-25 10:04 ` Kai Großjohann
2002-01-25 10:15 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 102+ messages in thread
From: Kai Großjohann @ 2002-01-25 10:04 UTC
Kai.Grossjohann@cs.uni-dortmund.de (Kai Großjohann) writes:
> I've now entered nnimap:INBOX.auto.gnus and here are the results.
Here are the results for gnu.emacs.help (with C-u RET RET):
/----
| Function Name Call Count Elapsed Time Average Time
| ====================================================== ========== ============ ============
| gnus-retrieve-headers 3 364.20543699 121.40181233
| gnus-topic-select-group 1 206.129288 206.129288
| gnus-group-select-group 1 206.129225 206.129225
| gnus-group-read-group 1 206.129214 206.129214
| gnus-summary-read-group 1 206.129169 206.129169
| gnus-summary-read-group-1 1 206.129154 206.129154
| gnus-select-newsgroup 1 192.549928 192.549928
| gnus-fetch-headers 1 186.317282 186.317282
| gnus-cache-retrieve-headers 1 182.575129 182.575129
| gnus-agent-retrieve-headers 1 180.577227 180.577227
| gnus-agent-save-alist 1 157.608388 157.608388
| gnus-set-difference 3 20.2174 6.7391333333
| gnus-summary-prepare 1 11.530258 11.530258
| gnus-summary-prepare-threads 1 9.125797 9.125797
| gnus-get-newsgroup-headers-xover 1 3.741023 3.741023
| gnus-build-old-threads 1 3.2381729999 3.2381729999
| gnus-build-get-header 151 3.2306279999 0.0213948874
| gnus-group-jump-to-group 1 2.617927 2.617927
| gnus-dd-mmm 1768 2.5379900000 0.0014355147
| gnus-cache-braid-nov 1 1.892906 1.892906
| gnus-simplify-subject-fuzzy 4181 1.8558889999 0.0004438863
| gnus-sort-threads-1 902 1.7841550000 0.0019779988
| gnus-sort-threads 1 1.72618 1.72618
| gnus-thread-sort-by-most-recent-number 4627 1.6714319999 0.0003612344
| nntp-accept-process-output 130 1.309013 0.0100693307
| gnus-possibly-score-headers 1 1.3010519999 1.3010519999
| gnus-articles-to-read 1 1.300275 1.300275
| gnus-score-headers 1 1.2994379999 1.2994379999
| gnus-score-string 1 1.148072 1.148072
| gnus-thread-highest-number 9254 1.0903829999 0.0001178282
| nntp-retrieve-headers 1 1.052888 1.052888
| nntp-retrieve-headers-with-xover 1 1.052777 1.052777
| nntp-send-xover-command 1 1.040614 1.040614
| nntp-send-command-nodelete 1 1.040569 1.040569
| gnus-agent-braid-nov 1 0.970894 0.970894
| gnus-summary-from-or-to-or-newsgroups 1768 0.7955190000 0.0004499541
| gnus-run-hooks 1775 0.7471350000 0.0004209211
| gnus-summary-highlight-line 1766 0.7037719999 0.0003985118
| gnus-build-sparse-threads 1 0.6824209999 0.6824209999
| gnus-extract-address-components 1768 0.6443449999 0.0003644485
| nnheader-insert-file-contents 4 0.621835 0.15545875
| gnus-member-of-range 1688 0.5000059999 0.0002962120
| gnus-make-threads 1 0.498406 0.498406
| gnus-make-hashtable 2 0.4010209999 0.2005104999
| gnus-check-server 1 0.269944 0.269944
| gnus-open-server 1 0.268662 0.268662
| nntp-open-server 1 0.268606 0.268606
| nntp-open-connection 1 0.266418 0.266418
| gnus-update-missing-marks 1 0.218805 0.218805
| gnus-gather-threads-by-subject 1 0.15905 0.15905
| gnus-set-work-buffer 4183 0.1484209999 3.548...e-05
| gnus-agent-load-alist 1 0.122237 0.122237
| gnus-agent-read-file 1 0.122044 0.122044
| gnus-summary-limit-children 1986 0.1206469999 6.074...e-05
| gnus-sorted-complement 3 0.1122239999 0.037408
| nnheader-find-nov-line 1 0.08668 0.08668
| gnus-sorted-intersection 2 0.064246 0.032123
| gnus-correct-substring 336 0.0623740000 0.0001856369
| gnus-summary-initial-limit 1 0.056863 0.056863
| gnus-score-string< 14155 0.0487400000 3.443...e-06
| gnus-request-group 1 0.030841 0.030841
| nntp-request-group 1 0.030745 0.030745
| gnus-killed-articles 1 0.02524 0.02524
| gnus-agent-save-history 1 0.023906 0.023906
| gnus-put-text-property-excluding-characters-with-faces 1766 0.0202220000 1.145...e-05
| gnus-cut-threads 1 0.019795 0.019795
| gnus-put-text-property 3536 0.0175920000 4.975...e-06
| gnus-point-at-eol 1864 0.0174220000 9.346...e-06
| gnus-agent-open-history 1 0.015814 0.015814
| gnus-uncompress-range 6 0.014541 0.0024235
| gnus-thread-loop-p 1387 0.0143829999 1.036...e-05
| nntp-decode-text 4 0.009171 0.00229275
| nntp-open-network-stream 1 0.006867 0.006867
| gnus-char-width 5376 0.0059320000 1.103...e-06
| gnus-message 13 0.0054610000 0.0004200769
| gnus-adjust-marked-articles 1 0.003488 0.003488
| nnheader-message 121 0.0034460000 2.847...e-05
| gnus-group-find-parameter 8 0.003428 0.0004285
| gnus-summary-setup-buffer 1 0.003197 0.003197
| nntp-send-mode-reader 1 0.003043 0.003043
| gnus-group-topic-parameters 8 0.0029519999 0.0003689999
| nntp-send-authinfo 1 0.002534 0.002534
| gnus-parse-netrc 1 0.002478 0.002478
| gnus-group-fast-parameter 6 0.002298 0.0003830000
| gnus-summary-mode 1 0.002208 0.002208
| gnus-summary-auto-select-subject 1 0.001927 0.001927
| gnus-group-decoded-name 6 0.001923 0.0003205
| gnus-summary-first-unread-subject 1 0.001917 0.001917
| nntp-kill-buffer 1 0.001916 0.001916
| gnus-topic-hierarchical-parameters 8 0.001882 0.00023525
| gnus-summary-first-subject 2 0.001782 0.000891
| gnus-update-summary-mark-positions 2 0.001683 0.0008415
| gnus-cache-articles-in-group 1 0.001584 0.001584
| gnus-all-score-files 1 0.001579 0.001579
| gnus-group-name-decode 6 0.001533 0.0002555000
| gnus-summary-insert-line 2 0.001459 0.0007295
| gnus-topic-parent-topic 152 0.0013930000 9.164...e-06
| gnus-current-topics 8 0.0012380000 0.0001547500
| gnus-set-sorted-intersection 1 0.00094 0.00094
| nntp-find-connection-buffer 131 0.0008610000 6.572...e-06
| gnus-summary-setup-default-charset 1 0.000818 0.000818
| gnus-score-load-files 1 0.000773 0.000773
| gnus-score-load-file 8 0.000706 8.825e-05
| gnus-update-format-specifications 2 0.000694 0.000347
| gnus-point-at-bol 167 0.0006179999 3.700...e-06
| gnus-current-topic 8 0.000608 7.6e-05
| gnus-group-auto-expirable-p 1 0.000576 0.000576
| gnus-agent-article-name 3 0.0005729999 0.0001909999
| gnus-sort-gathered-threads 1 0.000498 0.000498
| gnus-continuum-version 4 0.0004930000 0.0001232500
| gnus-set-mode-line 1 0.000471 0.000471
| gnus-last-element 1 0.00047 0.00047
| gnus-topic-parameters 16 0.0004689999 2.931...e-05
| gnus-score-find-hierarchical 1 0.000462 0.000462
| gnus-topic-find-topology 56 0.0004489999 8.017...e-06
| gnus-summary-set-local-parameters 1 0.000447 0.000447
| gnus-parameter-charset 1 0.000439 0.000439
| gnus-group-goto-group 9 0.000395 4.388...e-05
| gnus-score-file-name 8 0.0003760000 4.700...e-05
| gnus-parameter-ignored-charsets 1 0.000354 0.000354
| gnus-configure-windows 1 0.000346 0.000346
| gnus-agent-group-path 3 0.0003050000 0.0001016666
| gnus-goto-colon 6 0.000296 4.933...e-05
| gnus-group-name-charset 6 0.000287 4.783...e-05
| gnus-make-directory 2 0.0002839999 0.0001419999
| gnus-cache-file-name 3 0.000281 9.366...e-05
| gnus-apply-kill-file 1 0.000277 0.000277
| gnus-summary-buffer-name 1 0.000274 0.000274
| gnus-set-global-variables 3 0.000249 8.3e-05
| gnus-make-sort-function 1 0.000248 0.000248
| gnus-byte-compile 1 0.000225 0.000225
| gnus-summary-position-point 5 0.000221 4.420...e-05
| gnus-get-buffer-create 5 0.0002199999 4.4e-05
| gnus-configure-frame 2 0.000187 9.35e-05
| gnus-score-load-score-alist 4 0.00018 4.5e-05
| gnus-newsgroup-kill-file 2 0.000147 7.35e-05
| gnus-agent-enter-history 1 0.000146 0.000146
| gnus-update-read-articles 1 0.00014 0.00014
| nntp-read-server-type 1 0.000131 0.000131
| nnoo-change-server 1 0.00013 0.00013
| nnheader-init-server-buffer 2 0.000116 5.8e-05
| gnus-find-method-for-group 14 0.000115 8.214...e-06
| gnus-summary-make-local-variables 2 0.000115 5.75e-05
| gnus-group-position-point 1 0.000112 0.000112
| nntp-possibly-change-group 2 0.0001099999 5.499...e-05
| gnus-agent-mode 1 0.000109 0.000109
| gnus-agent-lib-file 1 9.3e-05 9.3e-05
| gnus-summary-set-display-table 1 9e-05 9e-05
| nntp-server-opened 4 8.8e-05 2.2e-05
| gnus-buffer-live-p 12 8.7e-05 7.25e-06
| gnus-group-get-parameter 10 8.000...e-05 8.000...e-06
| gnus-all-windows-visible-p 1 8e-05 8e-05
| gnus-thread-sort-by-number 28 7.899...e-05 2.821...e-06
| nnheader-translate-file-chars 19 7.699...e-05 4.052...e-06
| gnus-summary-show-thread 1 7.3e-05 7.3e-05
| nntp-make-process-buffer 1 7.1e-05 7.1e-05
| gnus-parameters-get-parameter 2 7e-05 3.5e-05
| gnus-get-function 2 6.8e-05 3.4e-05
| gnus-agent-get-function 4 6.500...e-05 1.625...e-05
| nnheader-concat 4 6.3e-05 1.575e-05
| nnoo-push-server 1 6.3e-05 6.3e-05
| nnheader-set-temp-buffer 1 6.2e-05 6.2e-05
| gnus-article-mark-to-type 21 5.999...e-05 2.857...e-06
| gnus-server-opened 1 5.8e-05 5.8e-05
| gnus-newsgroup-savable-name 7 5.6e-05 8e-06
| gnus-short-group-name 1 5.2e-05 5.2e-05
| gnus-online 6 4.999...e-05 8.333...e-06
| gnus-get-buffer-window 1 4.9e-05 4.9e-05
| nnheader-replace-duplicate-chars-in-string 3 4.700...e-05 1.566...e-05
| gnus-group-topic-p 1 4.1e-05 4.1e-05
| gnus-agent-get-undownloaded-list 1 4.1e-05 4.1e-05
| gnus-ephemeral-group-p 1 3.9e-05 3.9e-05
| gnus-group-parameter-value 8 3.4e-05 4.25e-06
| gnus-summary-make-menu-bar 1 3.2e-05 3.2e-05
| gnus-group-topic-name 1 3.1e-05 3.1e-05
| gnus-undo-register 1 2.6e-05 2.6e-05
| gnus-use-long-file-name 10 2.5e-05 2.5e-06
| gnus-turn-off-edit-menu 1 2.4e-05 2.4e-05
| gnus-group-quit-config 1 2.1e-05 2.1e-05
| gnus-list-of-unread-articles 1 2e-05 2e-05
| gnus-mode-line-buffer-identification 1 1.9e-05 1.9e-05
| gnus-visual-p 5 1.899...e-05 3.799...e-06
| gnus-create-hash-size 2 1.8e-05 9e-06
| gnus-group-real-prefix 1 1.8e-05 1.8e-05
| gnus-group-group-name 1 1.7e-05 1.7e-05
| gnus-virtual-group-p 1 1.7e-05 1.7e-05
| gnus-undo-register-1 1 1.7e-05 1.7e-05
| gnus-agent-create-buffer 1 1.6e-05 1.6e-05
| gnus-agent-history-buffer 1 1.4e-05 1.4e-05
| gnus-get-unread-articles-in-group 1 1.4e-05 1.4e-05
| gnus-undo-boundary 4 1.399...e-05 3.499...e-06
| nnoo-current-server 4 1.299...e-05 3.249...e-06
| gnus-netrc-machine 1 1.2e-05 1.2e-05
| gnus-frames-on-display-list 1 1.1e-05 1.1e-05
| gnus-score-find-alist 1 1.1e-05 1.1e-05
| gnus-simplify-mode-line 1 9e-06 9e-06
| gnus-cache-update-active 2 9e-06 4.5e-06
| nntp-find-connection 2 8e-06 4e-06
| gnus-home-score-file 2 7e-06 3.5e-06
| gnus-netrc-get 3 7e-06 2.333...e-06
| gnus-server-status 1 7e-06 7e-06
| nnoo-variables 2 6e-06 3e-06
| gnus-read-active-file-p 1 5e-06 5e-06
| gnus-agent-method-p 1 5e-06 5e-06
| gnus-make-sort-function-1 1 5e-06 5e-06
| gnus-window-to-buffer-helper 2 4.999...e-06 2.499...e-06
| gnus-windows-old-to-new 1 4e-06 4e-06
| gnus-cache-save-buffers 2 4e-06 2e-06
| gnus-summary-make-tool-bar 1 3e-06 3e-06
| gnus-copy-sequence 1 3e-06 3e-06
| gnus-delete-if 1 3e-06 3e-06
| gnus-make-thread-indent-array 1 3e-06 3e-06
| nnoo-parents 1 3e-06 3e-06
| gnus-agent-summary-make-menu-bar 1 2e-06 2e-06
| gnus-set-default-directory 1 2e-06 2e-06
| gnus-summary-maybe-hide-threads 1 2e-06 2e-06
\----
kai
--
Simplification good! Oversimplification bad! (Larry Wall)
* Re: db-backed mail back end
2002-01-25 10:04 ` Kai Großjohann
@ 2002-01-25 10:15 ` Lars Magne Ingebrigtsen
2002-01-25 12:12 ` Kai Großjohann
0 siblings, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-25 10:15 UTC
Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:
> | gnus-retrieve-headers 3 364.20543699 121.40181233
This is misleading -- ELP doesn't get recursive calls right, so this
probably uses one third of the time shown here.
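A minimal sketch of how this kind of overcounting can arise, written in Python rather than Emacs Lisp. It assumes a profiler that simply adds up start-to-finish time for every call, nested or not. The names `NaiveProfiler`, `FakeClock` and `retrieve` are hypothetical, and this is not a description of how ELP is actually implemented.

```python
class FakeClock:
    """Deterministic stand-in for a wall clock (assumption, for illustration)."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class NaiveProfiler:
    """Adds up start-to-finish time for every call, including nested ones."""

    def __init__(self, clock):
        self.clock = clock
        self.calls = 0
        self.elapsed = 0.0

    def wrap(self, func):
        def wrapper(*args, **kwargs):
            start = self.clock()
            try:
                return func(*args, **kwargs)
            finally:
                self.calls += 1
                self.elapsed += self.clock() - start
        return wrapper


clock = FakeClock()
profiler = NaiveProfiler(clock)


def retrieve(depth):
    # Each level does one "second" of its own work, then calls itself.
    clock.advance(1.0)
    if depth > 1:
        retrieve(depth - 1)  # resolves to the instrumented wrapper below


retrieve = profiler.wrap(retrieve)
retrieve(3)

print("calls:", profiler.calls)
print("summed elapsed:", profiler.elapsed)
print("real elapsed:", clock.now)
```

The outer call's interval already contains the inner ones, so adding all three intervals together counts the innermost work three times and the middle work twice. The summed figure ends up larger than the real time that passed on the clock.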
> | gnus-cache-retrieve-headers 1 182.575129 182.575129
Right. This is the actual time spent fetching. (The name is
misleading -- it will (most likely) just call `gnus-retrieve-headers'
and let it do its thing.)
> | nntp-retrieve-headers 1 1.052888 1.052888
But this is incredible. Or do you really have all the articles in the
cache?
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: db-backed mail back end
2002-01-25 10:15 ` Lars Magne Ingebrigtsen
@ 2002-01-25 12:12 ` Kai Großjohann
2002-01-25 12:19 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 102+ messages in thread
From: Kai Großjohann @ 2002-01-25 12:12 UTC
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:
>
>> | nntp-retrieve-headers 1 1.052888 1.052888
>
> But this is incredible. Or do you really have all the articles in the
> cache?
I have a lot of ticked and dormant messages in there which are in the
cache (because of gnus-use-cache being t). Let's see, how many articles
does the group actually have?
Ah:
/----
| grossjoh@lucy> telnet fbi-news 119
| Trying 129.217.4.45...
| Connected to fbi-news.
| Escape character is '^]'.
| 200 Informatik.Uni-Dortmund.DE InterNetNews NNRP server INN 1.7.2 08-Dec-1997 ready (posting ok).
| group gnu.emacs.help
| 211 1007 97846 98858 gnu.emacs.help
| quit
| 205 .
| Connection closed by foreign host.
\----
Hm. 1,000 messages. These should have been fetched from the
server. Do you think it's reasonable that these can be fetched in
one second? The connection between here and the server is a LAN
(same building).
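For reference, the `211` line in the session above can be read mechanically. The sketch below, in Python, assumes the RFC 977 layout of a GROUP reply (status code, estimated article count, first article number, last article number, group name). `GroupStatus` and `parse_group_response` are hypothetical helper names, not part of Gnus or any NNTP library.

```python
from typing import NamedTuple


class GroupStatus(NamedTuple):
    count: int   # estimated number of articles, as reported by the server
    first: int   # lowest article number
    last: int    # highest article number
    name: str    # group name


def parse_group_response(line: str) -> GroupStatus:
    """Parse a '211 count first last group' reply to the NNTP GROUP command."""
    code, count, first, last, name = line.split()[:5]
    if code != "211":
        raise ValueError(f"not a successful GROUP reply: {line!r}")
    return GroupStatus(int(count), int(first), int(last), name)


status = parse_group_response("211 1007 97846 98858 gnu.emacs.help")
span = status.last - status.first + 1

print(status)
print("article number span:", span)
```

The count field is only the server's estimate, and the span between the first and last article numbers can be larger than the count when articles in between have been cancelled or expired.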
kai
--
Simplification good! Oversimplification bad! (Larry Wall)
* Re: db-backed mail back end
2002-01-25 12:12 ` Kai Großjohann
@ 2002-01-25 12:19 ` Lars Magne Ingebrigtsen
2002-01-25 13:15 ` Kai Großjohann
0 siblings, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-25 12:19 UTC
Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:
> Hm. 1,000 messages. These should have been fetched from the
> server. Do you think it's reasonable that these can be fetched in
> one second? The connection between here and the server is a LAN
> (same building).
That's reasonable, but it took 364 seconds to enter that group. How
many articles are there in the cache?
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: db-backed mail back end
2002-01-25 12:19 ` Lars Magne Ingebrigtsen
@ 2002-01-25 13:15 ` Kai Großjohann
2002-01-25 13:59 ` Lars Magne Ingebrigtsen
0 siblings, 1 reply; 102+ messages in thread
From: Kai Großjohann @ 2002-01-25 13:15 UTC
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> That's reasonable, but it took 364 seconds to enter that group. How
> many articles are there in the cache?
/----
| grossjoh@lucy> ls ~/News/cache/gnu.emacs.help/ | wc -l
| 184
\----
So... is there anything else I can do to find out where Gnus spends
the time? The articles are ticked. (In case that makes a
difference.)
Maybe I should try to mark them as dormant and then see what happens.
kai
--
Simplification good! Oversimplification bad! (Larry Wall)
* Re: db-backed mail back end
2002-01-25 13:15 ` Kai Großjohann
@ 2002-01-25 13:59 ` Lars Magne Ingebrigtsen
2002-01-25 15:43 ` Kai Großjohann
0 siblings, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-25 13:59 UTC
Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:
> So... is there anything else I can do to find out where Gnus spends
> the time? The articles are ticked. (In case that makes a
> difference.)
>
> Maybe I should try to mark them as dormant and then see what happens.
Dormant and ticked articles are very similar, so that probably won't
make much difference.
But I see that I missed the fact that the group is agentized. How
many articles were displayed when you entered the group?
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: db-backed mail back end
2002-01-25 13:59 ` Lars Magne Ingebrigtsen
@ 2002-01-25 15:43 ` Kai Großjohann
[not found] ` <m3y9immn0r.fsf@quimbies.gnus.org>
0 siblings, 1 reply; 102+ messages in thread
From: Kai Großjohann @ 2002-01-25 15:43 UTC
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai Großjohann) writes:
>
>> So... is there anything else I can do to find out where Gnus spends
>> the time? The articles are ticked. (In case that makes a
>> difference.)
>>
>> Maybe I should try to mark them as dormant and then see what happens.
>
> Dormant and ticked articles are very similar, so that probably won't
> make much difference.
I see.
> But I see that I missed the fact that the group is agentized. How
> many articles were displayed when you entered the group?
About 1,800 (1,772 to be exact, but in the meantime some articles
might have arrived or expired).
While the group is agentized, I haven't done an agent fetch in a
really long time, and I was in plugged mode.
Hm. Ah. But I have 2210 articles in the agent cache directory. Let
me delete those to see what happens (they are really old anyway).
Okay, now I did a cvs update, started a fresh Gnus, did M-x
gnus-agent-expire RET, entered the group, elp-instrumented the
packages "gnus" and "nn", entered the group again, and here are the
results:
/----
| Function Name Call Count Elapsed Time Average Time
| ====================================================== ========== ============ ============
| gnus-retrieve-headers 3 193.845265 64.615088333
| gnus-topic-select-group 1 108.034995 108.034995
| gnus-group-select-group 1 108.034931 108.034931
| gnus-group-read-group 1 108.034919 108.034919
| gnus-summary-read-group 1 108.034872 108.034872
| gnus-summary-read-group-1 1 108.034857 108.034857
| gnus-select-newsgroup 1 99.801361 99.801361
| gnus-fetch-headers 1 97.991543 97.991543
| gnus-cache-retrieve-headers 1 97.007807 97.007807
| gnus-agent-retrieve-headers 1 96.319221 96.319221
| gnus-agent-save-alist 1 90.824356 90.824356
| gnus-summary-prepare 2 7.5673759999 3.7836879999
| gnus-summary-prepare-threads 2 6.356983 3.1784915
| gnus-run-hooks 2598 4.4140439999 0.0016990161
| gnus-set-difference 3 4.2837419999 1.4279139999
| gnus-summary-limit 1 3.703272 3.703272
| gnus-dd-mmm 2590 2.1757540000 0.0008400594
| gnus-simplify-subject-fuzzy 6054 1.3627560000 0.0002251000
| gnus-sort-threads-1 1360 1.1647499999 0.0008564338
| gnus-sort-threads 2 1.079927 0.5399635
| gnus-articles-to-read 1 1.021403 1.021403
| gnus-get-newsgroup-headers-xover 1 0.982616 0.982616
| gnus-thread-sort-by-most-recent-number 5882 0.6088250000 0.0001035064
| nntp-retrieve-headers 1 0.518039 0.518039
| nntp-accept-process-output 187 0.5179440000 0.0027697540
| nntp-retrieve-headers-with-xover 1 0.517936 0.517936
| nntp-send-xover-command 1 0.50554 0.50554
| nntp-send-command-nodelete 1 0.505498 0.505498
| gnus-build-old-threads 1 0.4732420000 0.4732420000
| gnus-build-get-header 40 0.4697150000 0.0117428750
| gnus-cache-braid-nov 1 0.450159 0.450159
| gnus-summary-from-or-to-or-newsgroups 2590 0.3935580000 0.0001519528
| gnus-thread-highest-number 11764 0.3495069999 2.970...e-05
| gnus-possibly-score-headers 1 0.343014 0.343014
| gnus-set-work-buffer 6056 0.3429149999 5.662...e-05
| gnus-score-headers 1 0.341492 0.341492
| gnus-agent-braid-nov 1 0.323557 0.323557
| gnus-extract-address-components 2590 0.3215299999 0.0001241428
| gnus-build-sparse-threads 1 0.294881 0.294881
| gnus-score-string 1 0.285263 0.285263
| gnus-sorted-complement 3 0.267874 0.0892913333
| gnus-summary-highlight-line 2588 0.2619589999 0.0001012206
| gnus-update-missing-marks 1 0.238278 0.238278
| gnus-agent-load-alist 1 0.2322109999 0.2322109999
| gnus-agent-read-file 1 0.23202 0.23202
| nnheader-insert-file-contents 4 0.141754 0.0354385
| gnus-gather-threads-by-subject 2 0.088504 0.044252
| gnus-correct-substring 504 0.0659760000 0.0001309047
| gnus-summary-limit-children 1468 0.0504970000 3.439...e-05
| gnus-sorted-intersection 2 0.044898 0.022449
| gnus-put-text-property-excluding-characters-with-faces 2588 0.0329380000 1.272...e-05
| gnus-request-group 1 0.032676 0.032676
| nntp-request-group 1 0.032633 0.032633
| gnus-put-text-property 5180 0.0260950000 5.037...e-06
| gnus-make-threads 2 0.025193 0.0125965
| gnus-killed-articles 1 0.024643 0.024643
| nnheader-message 179 0.0224710000 0.0001255363
| gnus-score-string< 10065 0.0220029999 2.186...e-06
| gnus-summary-initial-limit 1 0.017684 0.017684
| gnus-uncompress-range 7 0.013732 0.0019617142
| gnus-cut-threads 2 0.012557 0.0062785
| nntp-decode-text 2 0.009412 0.004706
| gnus-char-width 8064 0.0090279999 1.119...e-06
| gnus-thread-loop-p 2134 0.0067940000 3.183...e-06
| gnus-point-at-eol 1393 0.0050470000 3.623...e-06
| gnus-summary-setup-buffer 1 0.004155 0.004155
| gnus-adjust-marked-articles 1 0.00345 0.00345
| gnus-group-find-parameter 8 0.00334 0.0004175
| gnus-group-topic-parameters 8 0.002868 0.0003585
| gnus-message 15 0.002525 0.0001683333
| gnus-group-fast-parameter 6 0.002288 0.0003813333
| gnus-summary-mode 1 0.002218 0.002218
| gnus-group-decoded-name 7 0.002157 0.0003081428
| gnus-topic-hierarchical-parameters 8 0.001909 0.000238625
| gnus-group-name-decode 7 0.0016810000 0.0002401428
| gnus-update-summary-mark-positions 2 0.0016740000 0.0008370000
| gnus-summary-auto-select-subject 1 0.001672 0.001672
| gnus-summary-first-unread-subject 1 0.001661 0.001661
| gnus-cache-articles-in-group 1 0.001636 0.001636
| gnus-compute-unseen-list 1 0.001536 0.001536
| gnus-summary-first-subject 2 0.0015279999 0.0007639999
| gnus-inverse-list-range-intersection 1 0.001526 0.001526
| gnus-all-score-files 1 0.001501 0.001501
| gnus-summary-insert-line 2 0.001447 0.0007235
| gnus-topic-parent-topic 152 0.0014120000 9.289...e-06
| gnus-current-topics 8 0.001263 0.000157875
| gnus-set-mode-line 2 0.000982 0.000491
| gnus-summary-setup-default-charset 1 0.000861 0.000861
| nntp-find-connection-buffer 188 0.0008529999 4.537...e-06
| gnus-sort-gathered-threads 2 0.000807 0.0004035
| gnus-score-load-files 1 0.000758 0.000758
| gnus-update-format-specifications 2 0.000721 0.0003605
| gnus-score-load-file 8 0.0007059999 8.824...e-05
| gnus-set-sorted-intersection 1 0.000679 0.000679
| gnus-agent-article-name 3 0.000551 0.0001836666
| gnus-make-hashtable 3 0.000545 0.0001816666
| gnus-agent-open-history 1 0.000542 0.000542
| gnus-continuum-version 4 0.000506 0.0001265
| gnus-group-auto-expirable-p 1 0.0005 0.0005
| gnus-current-topic 8 0.000498 6.225e-05
| gnus-topic-parameters 16 0.000483 3.01875e-05
| gnus-make-sort-function 2 0.000475 0.0002375
| gnus-parameter-charset 1 0.000472 0.000472
| gnus-agent-save-history 1 0.000462 0.000462
| gnus-summary-set-local-parameters 1 0.000452 0.000452
| gnus-topic-find-topology 56 0.0004449999 7.946...e-06
| gnus-score-find-hierarchical 1 0.000439 0.000439
| gnus-byte-compile 2 0.000428 0.000214
| gnus-last-element 1 0.000397 0.000397
| gnus-set-global-variables 4 0.0003909999 9.774...e-05
| gnus-parameter-ignored-charsets 1 0.000365 0.000365
| gnus-goto-colon 8 0.0003549999 4.437...e-05
| gnus-group-name-charset 7 0.000352 5.028...e-05
| gnus-score-file-name 8 0.000349 4.3625e-05
| gnus-summary-position-point 7 0.000346 4.942...e-05
| gnus-point-at-bol 125 0.0003369999 2.695...e-06
| gnus-cache-file-name 3 0.000327 0.0001089999
| gnus-configure-windows 1 0.000315 0.000315
| gnus-agent-group-path 3 0.0002780000 9.266...e-05
| gnus-make-directory 2 0.000255 0.0001275
| gnus-summary-buffer-name 1 0.000242 0.000242
| gnus-apply-kill-file 1 0.000241 0.000241
| gnus-summary-goto-subject 1 0.000223 0.000223
| gnus-score-load-score-alist 4 0.00018 4.5e-05
| gnus-buffer-live-p 12 0.000178 1.483...e-05
| gnus-agent-enter-history 1 0.00016 0.00016
| gnus-group-goto-group 8 0.000157 1.9625e-05
| gnus-group-prev-group 1 0.000156 0.000156
| gnus-configure-frame 2 0.000154 7.7e-05
| gnus-find-method-for-group 15 0.000145 9.666...e-06
| gnus-get-buffer-create 5 0.000142 2.840...e-05
| gnus-group-next-unread-group 1 0.000136 0.000136
| gnus-update-read-articles 1 0.000127 0.000127
| gnus-group-search-forward 1 0.000117 0.000117
| gnus-newsgroup-kill-file 2 0.000117 5.85e-05
| gnus-thread-sort-by-number 44 0.0001149999 2.613...e-06
| gnus-short-group-name 2 0.000104 5.2e-05
| gnus-summary-make-local-variables 2 9.9e-05 4.95e-05
| nntp-possibly-change-group 2 9.8e-05 4.9e-05
| nntp-server-opened 3 8.9e-05 2.966...e-05
| gnus-agent-mode 1 8.7e-05 8.7e-05
| gnus-check-server 1 8.7e-05 8.7e-05
| gnus-all-windows-visible-p 1 8.7e-05 8.7e-05
| gnus-summary-set-display-table 1 8.6e-05 8.6e-05
| gnus-server-opened 1 7.8e-05 7.8e-05
| gnus-agent-lib-file 1 7.4e-05 7.4e-05
| gnus-summary-show-thread 1 7.1e-05 7.1e-05
| gnus-article-mark-to-type 22 6.8e-05 3.090...e-06
| nnheader-find-nov-line 1 6.7e-05 6.7e-05
| gnus-group-position-point 1 6.6e-05 6.6e-05
| gnus-parameters-get-parameter 2 6.3e-05 3.15e-05
| nnheader-concat 4 5.8e-05 1.45e-05
| gnus-newsgroup-savable-name 7 5.499...e-05 7.857...e-06
| gnus-group-get-parameter 9 5.299...e-05 5.888...e-06
| gnus-get-buffer-window 1 4.9e-05 4.9e-05
| nnheader-replace-duplicate-chars-in-string 3 4.9e-05 1.633...e-05
| gnus-agent-get-function 3 4.3e-05 1.433...e-05
| gnus-group-topic-p 1 4.2e-05 4.2e-05
| gnus-agent-get-undownloaded-list 1 4.1e-05 4.1e-05
| gnus-online 5 3.999...e-05 8e-06
| gnus-get-function 1 3.7e-05 3.7e-05
| nnheader-set-temp-buffer 1 3.7e-05 3.7e-05
| nnheader-translate-file-chars 19 3.599...e-05 1.894...e-06
| gnus-mode-line-buffer-identification 2 3.5e-05 1.75e-05
| gnus-summary-make-menu-bar 1 3.4e-05 3.4e-05
| gnus-group-topic-name 1 3e-05 3e-05
| gnus-group-parameter-value 7 2.999...e-05 4.285...e-06
| gnus-group-quit-config 1 2.7e-05 2.7e-05
| gnus-turn-off-edit-menu 1 2.6e-05 2.6e-05
| gnus-create-hash-size 3 2.5e-05 8.333...e-06
| gnus-visual-p 6 2.2e-05 3.666...e-06
| gnus-use-long-file-name 10 2.2e-05 2.2e-06
| gnus-summary-set-article-display-arrow 1 2.1e-05 2.1e-05
| gnus-group-group-name 1 1.9e-05 1.9e-05
| gnus-list-of-unread-articles 1 1.9e-05 1.9e-05
| gnus-group-real-prefix 1 1.8e-05 1.8e-05
| gnus-virtual-group-p 1 1.7e-05 1.7e-05
| gnus-agent-create-buffer 1 1.6e-05 1.6e-05
| gnus-get-unread-articles-in-group 1 1.6e-05 1.6e-05
| gnus-undo-register 1 1.4e-05 1.4e-05
| gnus-frames-on-display-list 1 1.3e-05 1.3e-05
| gnus-undo-boundary 4 1.3e-05 3.25e-06
| gnus-cache-update-active 2 9.999...e-06 4.999...e-06
| gnus-make-sort-function-1 2 9e-06 4.5e-06
| nnoo-current-server 3 8.999...e-06 2.999...e-06
| gnus-simplify-mode-line 1 8e-06 8e-06
| gnus-server-status 1 8e-06 8e-06
| nntp-find-connection 2 8e-06 4e-06
| gnus-home-score-file 2 7e-06 3.5e-06
| gnus-set-default-directory 1 6e-06 6e-06
| gnus-summary-maybe-hide-threads 2 6e-06 3e-06
| gnus-agent-method-p 1 5e-06 5e-06
| gnus-agent-history-buffer 1 5e-06 5e-06
| gnus-undo-register-1 1 5e-06 5e-06
| gnus-window-to-buffer-helper 2 4e-06 2e-06
| gnus-windows-old-to-new 1 4e-06 4e-06
| gnus-summary-make-tool-bar 1 4e-06 4e-06
| gnus-score-find-alist 1 4e-06 4e-06
| gnus-cache-save-buffers 2 4e-06 2e-06
| gnus-copy-sequence 1 4e-06 4e-06
| gnus-id-to-thread 1 3e-06 3e-06
| gnus-delete-if 1 3e-06 3e-06
| gnus-make-thread-indent-array 1 3e-06 3e-06
| gnus-agent-summary-make-menu-bar 1 2e-06 2e-06
\----
All group entries were done with `- ='.
kai
--
Simplification good! Oversimplification bad! (Larry Wall)
* Re: db-backed mail back end
2002-01-25 4:19 ` Karl Kleinpaste
2002-01-25 4:47 ` Lars Magne Ingebrigtsen
@ 2002-01-25 5:32 ` Daniel Pittman
1 sibling, 0 replies; 102+ messages in thread
From: Daniel Pittman @ 2002-01-25 5:32 UTC (permalink / raw)
On Thu, 24 Jan 2002, Karl Kleinpaste wrote:
> Daniel Pittman <daniel@rimspace.net> writes:
>> I peak at > 90,000 messages at the moment, with the ten most
>> active groups running down from there to ~ 12,000 messages.
>
> When you have a few spare moments, it would be instructive to see your
> timings for a 4-article entry to a group with 100, 20K, and 90K
> messages.
P-II 400, 288MB RAM
100: ~ 0.75 seconds
1000: ~ 1.25 seconds
15000: ~ 2 seconds
25000 (or 54000): ~ 7 seconds
17710: ~ 15 seconds.
There are two anomalous figures here. The first, the 25K or 54K, is my
working linux.kernel archive. *Groups* claims that 25K articles exist,
overview has 54K lines -- and that's around the right number.
Also, the 90K group, which I wanted to test, isn't in active any more,
nor does its overview exist. It still has (some, possibly not all of) its
messages, though. I will probably respool soon and see what's up.
I think that I may have had some, er, damage done to active in the past,
though, which is a pain. I seem to have lost most of 'nnfolder:archive'
as well, which /really/ sucks. Er, though its active has the right
groups listed. How very odd.
The second anomaly is my archive of the Gnus list. It's /very/ slow to
enter, because it has a whole set of ticked articles in it. Doing this
makes things /very/ slow, which is annoying.
The oldest article ticked is number '7062', with 36 total marked that
way.
24469 is the highest article number in that group; 6760 the oldest
mentioned in .overview.
Daniel
--
Regard all art critics as useless and dangerous.
-- Manifesto of the Futurists
* Re: db-backed mail back end
2002-01-25 3:37 ` Daniel Pittman
2002-01-25 4:19 ` Karl Kleinpaste
@ 2002-01-25 4:29 ` Lars Magne Ingebrigtsen
2002-01-25 5:16 ` Lars Magne Ingebrigtsen
1 sibling, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-25 4:29 UTC (permalink / raw)
Daniel Pittman <daniel@rimspace.net> writes:
>> There's certainly no harm to making Gnus able to handle the storage
>> needs of gargantuan archives. Sure, knock yourself out. But that's
>> not a problem currently being faced by more than maybe 5 people on the
>> planet, whereas every single Gnus user has to worry over the entry
>> time cost of a busy group with 900 new messages.
>
> That would be something that would be good to address first. Don't
> mistake it for the only problem, though.
It's not really an area that's been addressed before, either, so it's
an area where something can be done. Scoring and threading have
already been optimized several times.
There are lots of new things in Gnus that haven't been particularly
optimized. For instance, I just did a new batch of elping, and saw
that the `seen' code called `gnus-member-of-range' once per article
displayed, which is a high linear cost. I separated that thing out
into its own function and saw that the `gnus-newsgroup-unseen'
computation (in a 1000 article summary buffer) took 1 second. So I
wrote `gnus-inverse-list-range-intersection', and the computation now
takes 0.001426 seconds, says elp.
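A sketch of why a single pass helps: a per-article membership test has to walk the range list from the start on each call, whereas walking a sorted article list and the sorted ranges together touches each element once. The function below is a hypothetical illustration (`my-list-outside-ranges` is an invented name, not the code in Gnus), assuming ranges are a sorted list whose elements are either a number or a `(LOW . HIGH)` pair:

```elisp
;; Hypothetical helper for illustration; not part of Gnus.
(defun my-list-outside-ranges (numbers ranges)
  "Return the members of NUMBERS that fall outside every element of RANGES.
NUMBERS is a list of integers sorted ascending.  RANGES is a sorted list
whose elements are either an integer or a cons (LOW . HIGH)."
  (let (result)
    (dolist (n numbers)
      ;; Drop ranges that end before N; they cannot match later numbers.
      (while (and ranges
                  (< (if (consp (car ranges)) (cdar ranges) (car ranges)) n))
        (setq ranges (cdr ranges)))
      ;; N is outside unless the first remaining range starts at or before it.
      (unless (and ranges
                   (>= n (if (consp (car ranges)) (caar ranges) (car ranges))))
        (push n result)))
    (nreverse result)))
```

For example, `(my-list-outside-ranges '(1 2 3 6 7 8 11 20) '((1 . 2) 7 (10 . 12)))` returns `(3 6 8 20)`. Both lists are consumed front to back, so the cost is proportional to their combined length rather than to their product.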
There's a lot of computational complexity in Gnus that needs more
eyes. I think it's a safe bet that normal group entry can easily be
made a few orders of magnitude faster. (And the
view-one-article-in-a-200K-group case can be made hundreds of orders of
magnitude faster.)
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: db-backed mail back end
2002-01-25 4:29 ` Lars Magne Ingebrigtsen
@ 2002-01-25 5:16 ` Lars Magne Ingebrigtsen
2002-01-25 5:29 ` Lars Magne Ingebrigtsen
2002-01-25 5:39 ` Daniel Pittman
0 siblings, 2 replies; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-25 5:16 UTC (permalink / raw)
Is there a directive I can give to the byte-compiler to make all the
defsubsts into defuns and all the `inline's into `progn's? I want to
have a closer look at function timings without having to alter the
source...
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: db-backed mail back end
2002-01-25 5:16 ` Lars Magne Ingebrigtsen
@ 2002-01-25 5:29 ` Lars Magne Ingebrigtsen
2002-01-25 5:39 ` Daniel Pittman
1 sibling, 0 replies; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-25 5:29 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> Is there a directive I can give to the byte-compiler to make all the
> defsubsts into defuns and all the `inline's into `progn's? I want to
> have a closer look at function timings without having to alter the
> source...
I had a peek around, and the following seems to do the trick, I think:
;; To avoid having defsubsts and inlines happen.
(defun byte-optimize-inline-handler (form)
  "byte-optimize-handler for the `inline' special-form."
  (cons 'progn (cdr form)))
(defalias 'byte-compile-file-form-defsubst 'byte-compile-file-form-defun)
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: db-backed mail back end
2002-01-25 5:16 ` Lars Magne Ingebrigtsen
2002-01-25 5:29 ` Lars Magne Ingebrigtsen
@ 2002-01-25 5:39 ` Daniel Pittman
2002-01-25 5:48 ` Lars Magne Ingebrigtsen
1 sibling, 1 reply; 102+ messages in thread
From: Daniel Pittman @ 2002-01-25 5:39 UTC (permalink / raw)
On Fri, 25 Jan 2002, Lars Magne Ingebrigtsen wrote:
> Is there a directive I can give to the byte-compiler to make all the
> defsubsts into defuns and all the `inline's into `progn's? I want to
> have a closer look at function timings without having to alter the
> source...
Sadly, either hacking the byte compiler or mapping through the sources
and doing `proclaim-notinline' on each `defsubst' seems to be needed.
That is a sad thing. :(
Er, under a fairly current XEmacs 21.5 beta, that is. Oh, and `inline'
/is/ `progn' here.
Daniel
--
We're marketers, so truth is kind of a nebulous concept to us.
-- Ray Scott, DEC field marketing rep
* Re: db-backed mail back end
2002-01-25 3:23 ` Karl Kleinpaste
2002-01-25 3:34 ` Lars Magne Ingebrigtsen
2002-01-25 3:37 ` Daniel Pittman
@ 2002-01-25 7:05 ` Justin Sheehy
2002-01-28 12:56 ` Jorge Godoy
3 siblings, 0 replies; 102+ messages in thread
From: Justin Sheehy @ 2002-01-25 7:05 UTC (permalink / raw)
Karl Kleinpaste <karl@charcoal.com> writes:
> Does anyone have a group, or does anyone plan to create a group,
> that goes beyond, say, 30K messages?
Until fairly recently, I was heavily using a group that has [checks]
59837 messages in it.
So, yes.
-Justin
* Re: db-backed mail back end
2002-01-25 3:23 ` Karl Kleinpaste
` (2 preceding siblings ...)
2002-01-25 7:05 ` Justin Sheehy
@ 2002-01-28 12:56 ` Jorge Godoy
3 siblings, 0 replies; 102+ messages in thread
From: Jorge Godoy @ 2002-01-28 12:56 UTC (permalink / raw)
Cc: ding
Karl Kleinpaste <karl@charcoal.com> writes:
> Yet you want to optimize the case that's so far out to the edge, I
> doubt there's anyone that even actually _has_ a single group with 200K
> messages in it, against which to test the planned optimization. (Does
> anyone? Really, right now? Does anyone have a group, or does anyone
> plan to create a group, that goes beyond, say, 30K messages?) You're
> optimizing this bizarre, unusual, never-seen case, while the case that
> I've run into at least a dozen times today alone haunts us, where
> getting into a big, busy, NNTP group costs me minutes of computation
> time while *Summary* is generated.
I have groups bigger than 30K messages. The average size is 50K
messages when I back them (and other groups) up to a CD. It takes 3 or
4 months to get back to that.
See you,
--
Godoy. <godoy@conectiva.com>
Escritório de Projetos -- Conectiva S.A.
Projects Office -- Conectiva Inc.
* Re: db-backed mail back end
2002-01-24 12:54 ` Karl Kleinpaste
2002-01-24 15:05 ` Lars Magne Ingebrigtsen
@ 2002-01-24 16:13 ` Frank Schmitt
2002-01-24 18:36 ` Simon Josefsson
1 sibling, 1 reply; 102+ messages in thread
From: Frank Schmitt @ 2002-01-24 16:13 UTC (permalink / raw)
Karl Kleinpaste <karl@charcoal.com> writes:
>When I select the whole group, Gnus says "scoring" within 3 or 4
>seconds of beginning entry, which then takes 28 seconds. Then Gnus
>says "Generating summary", which takes well over a minute before the
>summary is displayed. Total time is about 1:43. That's the sort of
>time occupation that worries me.
Absolutely my point of view, too. I have a local archive newsserver with
groups with more than 50k messages in it. When I try to enter those
groups with Gnus I can have a cigarette before Gnus is ready.
Slrn or XNews are *much* faster.
--
One Ring to rule them all, One Ring to find them,
One Ring to bring them all and in the darkness bind them
In the Land of Mordor where the Shadows lie.
19. Dezember 2001
* Re: db-backed mail back end
2002-01-24 11:58 ` Karl Kleinpaste
2002-01-24 12:11 ` Lars Magne Ingebrigtsen
@ 2002-01-24 12:29 ` Simon Josefsson
2002-01-25 14:40 ` Randal L. Schwartz
2002-01-24 13:50 ` Zlatko Calusic
2002-01-24 14:51 ` Kai Großjohann
3 siblings, 1 reply; 102+ messages in thread
From: Simon Josefsson @ 2002-01-24 12:29 UTC (permalink / raw)
Cc: ding
On Thu, 24 Jan 2002, Karl Kleinpaste wrote:
> I have been wondering about this since this discussion started. The
> slowness of entering a large group of 10K messages, or 100K, has very
> little to do with getting at either the overview data or the message
> files. It has to do with threading, scoring, and sorting. I don't
> know what Gnus' threading algorithm is, but I suspect its performance
> is at least as poor as O(n log n) and may even be as bad as O(n^2).
> It is not clear to me that it is possible to do better than O(n log n)
> in the first place, though I haven't contemplated the matter very long.
It is possible to do it in O(1) from Gnus's point of view: Push that
functionality down into the backend. :-)
Caching threading information is another way.
(Another thing that takes time, besides the ones you mention, is
inserting the entry for each message in the summary buffer, with the
proper MIME decoding and stuff.)
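On the complexity question quoted above: linking each article to its parent does not by itself require comparing every pair of articles. A minimal sketch, assuming each header has been reduced to an `(ID . PARENT-ID)` pair (a simplification for illustration; `my-link-threads` is a hypothetical name and this is not how Gnus represents headers), builds the child lists in one pass with a hash table:

```elisp
;; Hypothetical helper for illustration; not part of Gnus.
(defun my-link-threads (headers)
  "Return a hash table mapping each parent Message-ID to a list of child IDs.
HEADERS is a list of (ID . PARENT-ID) pairs; PARENT-ID is nil for roots."
  (let ((children (make-hash-table :test 'equal)))
    (dolist (h headers)
      (let ((parent (cdr h)))
        ;; Roots have no parent and need no entry of their own.
        (when parent
          (puthash parent
                   (cons (car h) (gethash parent children))
                   children))))
    children))
```

Each article costs one hash lookup and one insertion, so the linking step is roughly linear in the number of articles; sorting the resulting threads would still cost O(n log n).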
* Re: db-backed mail back end
2002-01-24 12:29 ` Simon Josefsson
@ 2002-01-25 14:40 ` Randal L. Schwartz
0 siblings, 0 replies; 102+ messages in thread
From: Randal L. Schwartz @ 2002-01-25 14:40 UTC (permalink / raw)
Cc: Karl Kleinpaste, ding
>>>>> "Simon" == Simon Josefsson <jas@extundo.com> writes:
Simon> It is possible to do it in O(1) from Gnus's point of view: Push that
Simon> functionality down into the backend. :-)
"Any calculation can be made in constant time, if the constant is big enough".
:-)
--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
* Re: db-backed mail back end
2002-01-24 11:58 ` Karl Kleinpaste
2002-01-24 12:11 ` Lars Magne Ingebrigtsen
2002-01-24 12:29 ` Simon Josefsson
@ 2002-01-24 13:50 ` Zlatko Calusic
2002-01-24 22:27 ` Zlatko Calusic
2002-01-25 2:57 ` Lars Magne Ingebrigtsen
2002-01-24 14:51 ` Kai Großjohann
3 siblings, 2 replies; 102+ messages in thread
From: Zlatko Calusic @ 2002-01-24 13:50 UTC (permalink / raw)
Cc: ding
Karl Kleinpaste <karl@charcoal.com> writes:
> Simon Josefsson <jas@extundo.com> writes:
>> NNML scales linearly with the number of messages in the group, does
>> it not? Gnus definitely doesn't scale linearly, so redesigning
>> something to get support for large mail/news backends should go into
>> Gnus, I think.
>
> I have been wondering about this since this discussion started. The
> slowness of entering a large group of 10K messages, or 100K, has very
> little to do with getting at either the overview data or the message
> files. It has to do with threading, scoring, and sorting. I don't
> know what Gnus' threading algorithm is, but I suspect its performance
> is at least as poor as O(n log n) and may even be as bad as O(n^2).
It is definitely O(n^2), I was measuring it recently. I have an nnml
group (ok, linux-kernel list, why keep it secret :)) which hovers
around 15000 unread mails (12k - 17k typically, over the last few
months). It takes minutes! to enter that group on a quite fast machine
with gobs of memory. It's no mystery I'm not eager to start reading
that list, and tomorrow things only get worse. :)
My observations are that threading and sorting take a lot of time (not
scoring!). And I found it is exactly O(n^2). If I read, kill,
whatever... 30% of the articles in the group, the time it takes to
enter the group is cut in half. I was deliberately measuring it to
see what the upper limit of usability is. Quite lowish, if I may add.
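As a quick check of the arithmetic behind the halving observation, assume entry time is dominated by a term T(n) = c n^2, where c is an unknown constant (an assumption for illustration, not a measured fit). Removing 30% of the articles then gives:

```latex
\[
  \frac{T(0.7\,n)}{T(n)}
    = \frac{c\,(0.7\,n)^{2}}{c\,n^{2}}
    = 0.7^{2}
    = 0.49 \approx \tfrac{1}{2}
\]
```

Under a purely linear cost the same reduction would only bring the time down to 0.7 of the original, so a halving is what a quadratic term would predict.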
--
Zlatko
* Re: db-backed mail back end
2002-01-24 13:50 ` Zlatko Calusic
@ 2002-01-24 22:27 ` Zlatko Calusic
2002-01-25 2:57 ` Lars Magne Ingebrigtsen
1 sibling, 0 replies; 102+ messages in thread
From: Zlatko Calusic @ 2002-01-24 22:27 UTC (permalink / raw)
Zlatko Calusic <zlatko.calusic@iskon.hr> writes:
> Karl Kleinpaste <karl@charcoal.com> writes:
>
>> Simon Josefsson <jas@extundo.com> writes:
>>> NNML scales linearly with the number of messages in the group, does
>>> it not? Gnus definitely doesn't scale linearly, so redesigning
>>> something to get support for large mail/news backends should go into
>>> Gnus, I think.
>>
>> I have been wondering about this since this discussion started. The
>> slowness of entering a large group of 10K messages, or 100K, has very
>> little to do with getting at either the overview data or the message
>> files. It has to do with threading, scoring, and sorting. I don't
>> know what Gnus' threading algorithm is, but I suspect its performance
>> is at least as poor as O(n log n) and may even be as bad as O(n^2).
>
> It is definitely O(n^2), I was measuring it recently. I have an nnml
> group (ok, linux-kernel list, why keep it secret :)) which hovers
> around 15000 unread mails (12k - 17k typically, over the last few
> months). It takes minutes! to enter that group on a quite fast machine
> with gobs of memory. It's no mystery I'm not eager to start reading
> that list, and tomorrow things only get worse. :)
>
Correct numbers:
nnml group, 18294 articles, entering takes ~8 minutes on PIII1000
--
Zlatko
* Re: db-backed mail back end
2002-01-24 13:50 ` Zlatko Calusic
2002-01-24 22:27 ` Zlatko Calusic
@ 2002-01-25 2:57 ` Lars Magne Ingebrigtsen
2002-01-25 4:42 ` Lars Magne Ingebrigtsen
1 sibling, 1 reply; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-25 2:57 UTC (permalink / raw)
Zlatko Calusic <zlatko.calusic@iskon.hr> writes:
> It is definitely O(n^2), I was measuring it recently.
Are you sure? I ran the following loop on an Agentized group (which
means that it's really nnml):
(dotimes (i 9)
  (push (list (* (1+ i) 1000)
              (benchmark 1
                (gnus-group-select-group (* (1+ i) 1000))))
        times)
  (gnus-summary-exit-no-update))
times =>
((8000 24.422008991241455)
(7000 20.992663979530334)
(6000 17.149681091308594)
(5000 14.887889981269836)
(4000 13.466870069503784)
(3000 9.70430600643158)
(2000 7.009864926338196)
(1000 3.862096071243286))
So we see that group entry time is actually linear, which surprised me
a lot.
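One way to read the table is to normalize each row to seconds per 1000 articles. The helper below is a hypothetical illustration (`my-seconds-per-thousand` is an invented name), fed with the timings quoted above rounded to three decimals:

```elisp
;; Hypothetical helper for illustration; the figures are the timings
;; quoted above, rounded to three decimals.
(defun my-seconds-per-thousand (times)
  "Map each (ARTICLES SECONDS) entry in TIMES to (ARTICLES SECONDS-PER-1000)."
  (mapcar (lambda (entry)
            (list (car entry)
                  (/ (* 1000.0 (cadr entry)) (car entry))))
          times))

(my-seconds-per-thousand
 '((8000 24.422) (7000 20.993) (6000 17.150) (5000 14.888)
   (4000 13.467) (3000 9.704) (2000 7.010) (1000 3.862)))
```

The per-1000 figures stay between roughly 2.9 and 3.9 seconds and do not grow with the group size; under a quadratic cost they would grow about eightfold between the 1000 and 8000 rows.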
(Athlon XP 1900+, DDR RAM, Emacs 21.1.)
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: db-backed mail back end
2002-01-25 2:57 ` Lars Magne Ingebrigtsen
@ 2002-01-25 4:42 ` Lars Magne Ingebrigtsen
0 siblings, 0 replies; 102+ messages in thread
From: Lars Magne Ingebrigtsen @ 2002-01-25 4:42 UTC (permalink / raw)
Lars Magne Ingebrigtsen <larsi@gnus.org> writes:
> times =>
> ((8000 24.422008991241455)
> (7000 20.992663979530334)
> (6000 17.149681091308594)
> (5000 14.887889981269836)
> (4000 13.466870069503784)
> (3000 9.70430600643158)
> (2000 7.009864926338196)
> (1000 3.862096071243286))
And now it's:
times =>
((8000 11.500802993774414)
(7000 9.747804045677185)
(6000 8.098258018493652)
(5000 6.130577087402344)
(4000 4.86285400390625)
(3000 3.723906993865967)
(2000 2.5801719427108765)
(1000 1.667241096496582))
Let's hope I didn't break too much. :-)
--
(domestic pets only, the antidote for overdose, milk.)
larsi@gnus.org * Lars Magne Ingebrigtsen
* Re: db-backed mail back end
2002-01-24 11:58 ` Karl Kleinpaste
` (2 preceding siblings ...)
2002-01-24 13:50 ` Zlatko Calusic
@ 2002-01-24 14:51 ` Kai Großjohann
3 siblings, 0 replies; 102+ messages in thread
From: Kai Großjohann @ 2002-01-24 14:51 UTC (permalink / raw)
Cc: ding
Karl Kleinpaste <karl@charcoal.com> writes:
> I have been wondering about this since this discussion started. The
> slowness of entering a large group of 10K messages, or 100K, has very
> little to do with getting at either the overview data or the message
> files. It has to do with threading, scoring, and sorting.
I suggest that you (setq gnus-verbose 10) and then you will see. For
me, it's "Fetching headers" and "Generating Summary Buffer" which
take long, whereas "Threading" and "Scoring" are comparatively
quick. Hm. I think it's especially "Generating summary buffer".
But maybe I'm using a stupid gnus-summary-line-format.
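For a per-function breakdown to go with the progress messages, the elp profiler that ships with Emacs can be wrapped around a group entry. A minimal sketch, assuming Gnus is already loaded and that the overhead of instrumenting everything with the `gnus-` prefix is acceptable:

```elisp
(require 'elp)

;; Show detailed progress messages while entering the group.
(setq gnus-verbose 10)

;; Instrument every function whose name starts with "gnus-".
(elp-instrument-package "gnus-")

;; ... enter the slow group by hand here ...

;; Then display the timing table and remove the instrumentation.
(elp-results)
(elp-restore-all)
```

Functions defined with `defsubst' are inlined by the byte compiler and so will not show up as separate rows, which is the limitation raised elsewhere in this thread.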
kai
--
Simplification good! Oversimplification bad! (Larry Wall)
* Re: db-backed mail back end
2002-01-24 9:59 ` Per Abrahamsen
` (2 preceding siblings ...)
2002-01-24 11:32 ` Simon Josefsson
@ 2002-01-24 17:14 ` Paul Jarc
2002-01-24 17:52 ` Per Abrahamsen
3 siblings, 1 reply; 102+ messages in thread
From: Paul Jarc @ 2002-01-24 17:14 UTC (permalink / raw)
Per Abrahamsen <abraham@dina.kvl.dk> wrote:
> Gnus will be less robust if it has to support multiple filesystem
> layouts, so there has to be some real advantage to make up for that.
Gnus already supports multiple filesystem layouts via different
backends. I don't think that has made it less robust.
paul
* Re: db-backed mail back end
2002-01-24 17:14 ` Paul Jarc
@ 2002-01-24 17:52 ` Per Abrahamsen
0 siblings, 0 replies; 102+ messages in thread
From: Per Abrahamsen @ 2002-01-24 17:52 UTC (permalink / raw)
prj@po.cwru.edu (Paul Jarc) writes:
> Gnus already supports multiple filesystem layouts via different
> backends. I don't think that has made it less robust.
I'm certain that it has. Bugs go unfixed longer because the developers use
different backends. More relevantly, I have in the past had many
problems related to having a non-default setting of
'gnus-use-long-file-name', which is similar to the option that is
currently being discussed.
But the existing backends each have real benefits. Disabling a
workaround on platforms where it is not needed, with the sole
justification that "it is not needed" has no real benefits. It just
makes the code less robust.