Gnus development mailing list
 help / color / mirror / Atom feed
From: Dan Christensen <jdc@uwo.ca>
To: ding@gnus.org
Subject: Re: faster gnus-thread-latest-date
Date: Mon, 07 Dec 2009 22:35:16 -0500	[thread overview]
Message-ID: <87hbs2yw7f.fsf@uwo.ca> (raw)
In-Reply-To: <87my1uiien.fsf@uwo.ca>

Dan Christensen <jdc@uwo.ca> writes:

> In my tests, it is noticeable faster.  Can it be improved further?

Actually, in my tests it is only slightly faster.  Still, the code is
much simpler, so unless someone can think of a reason not to, I think
the change should be made (after someone else checks it over and tests
it).

However, I think there's an orthogonal change which could greatly
improve the sorting speed.  For a group I have with about 6000 messages
and maybe 1000 threads (?), gnus-thread-latest-date is called 31028
times, taking about 10.5 seconds in total.  If the output from
gnus-thread-latest-date was cached, that would reduce the number of
computations of this date by at least a factor of 5, and so would
greatly reduce the time needed to enter the summary buffer.

This is the classic situation where decorate-sort-undecorate is a big
win.  The comparison function is slow here and gets called O(n log n)
times, but if you make one pass through the list of threads ahead of
time and compute the information that needs to be compared, you only
need to do the computation O(n) times.

Also, because the sub-threads are recursively sorted, each article
participates in many sorts.

In a perfect world, we would change Gnus to do a
decorate-sort-undecorate for all sorts.  But this would probably require
a change to the user interface for sorting.  The user would supply a
list of functions that each take a single thread and produce a single
output of a standard type (string, int, float, etc), and then Gnus would
do a decorate step that precomputes the list of those values.  The
sorting step would sort based on the usual order for the precomputed
values.  But this is a big change, so it's not ideal.

Another option would be to cache the slow computations with the thread
while doing the sort.  I wasn't sure of the best way to do this, so I
made up another extra-header to temporarily store the date converted
to a float.  And the speed-up is amazing.  The sorting time goes from
about 10 seconds to under 1 second!  Here's the code I used:

The lambda in the original code has been factored out into a separate
function which checks whether the data has been cached:

(defun gnus-article-float-date (header)
  (let ((extra-header (mail-header-extra header)))
    (or (cdr (assq 'Float-Date extra-header))
	(let ((float-date (condition-case ()
			      (gnus-float-time (mail-header-parse-date
						(mail-header-date header)))
			    (error 0))))
	  (mail-header-set-extra header 
				 (cons `(Float-Date . ,float-date) extra-header))
	  float-date))))

(defun gnus-thread-latest-date (thread)
  "Return the highest article date in THREAD."
  (apply 'max
	 (mapcar 'gnus-article-float-date
		 (message-flatten-list thread))))

Can anyone see any problems with caching the data like this?

A more intrusive but more logical change would be to change the header
array to have one more entry, specifically for cached data.  But that's
more than I'd like to tackle.

Dan




  reply	other threads:[~2009-12-08  3:35 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-12-07 21:27 Dan Christensen
2009-12-08  3:35 ` Dan Christensen [this message]
2009-12-08  7:07   ` Katsumi Yamaoka
2009-12-08 19:14   ` Ted Zlatanov
2009-12-09  0:33     ` Dan Christensen
2009-12-10  3:03       ` Dan Christensen
2009-12-11  3:01         ` Dan Christensen
2009-12-11 22:03           ` Dan Christensen
2009-12-13 23:25             ` Dan Christensen
2009-12-14 16:18               ` Dave Love
2009-12-13 23:56           ` Dan Christensen
2010-01-02  2:09             ` Dan Christensen
2010-01-12 17:17               ` Reiner Steib
2010-06-09 13:42             ` Dan Christensen
2010-06-10  0:32               ` Katsumi Yamaoka
2010-06-10  8:42                 ` Gnus new commits (was: faster gnus-thread-latest-date) Ted Zlatanov
2009-12-13 10:29   ` faster gnus-thread-latest-date Daniel Pittman
2009-12-13 23:38     ` Dan Christensen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87hbs2yw7f.fsf@uwo.ca \
    --to=jdc@uwo.ca \
    --cc=ding@gnus.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).