From mboxrd@z Thu Jan 1 00:00:00 1970 X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/69267 Path: news.gmane.org!not-for-mail From: Dan Christensen Newsgroups: gmane.emacs.gnus.general Subject: Re: faster gnus-thread-latest-date Date: Mon, 07 Dec 2009 22:35:16 -0500 Message-ID: <87hbs2yw7f.fsf@uwo.ca> References: <87my1uiien.fsf@uwo.ca> NNTP-Posting-Host: lo.gmane.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Trace: ger.gmane.org 1260243417 29906 80.91.229.12 (8 Dec 2009 03:36:57 GMT) X-Complaints-To: usenet@ger.gmane.org NNTP-Posting-Date: Tue, 8 Dec 2009 03:36:57 +0000 (UTC) To: ding@gnus.org Original-X-From: ding-owner+M17672@lists.math.uh.edu Tue Dec 08 04:36:50 2009 Return-path: Envelope-to: ding-account@gmane.org Original-Received: from util0.math.uh.edu ([129.7.128.18]) by lo.gmane.org with esmtp (Exim 4.50) id 1NHqsY-0004KP-U7 for ding-account@gmane.org; Tue, 08 Dec 2009 04:36:47 +0100 Original-Received: from localhost ([127.0.0.1] helo=lists.math.uh.edu) by util0.math.uh.edu with smtp (Exim 4.63) (envelope-from ) id 1NHqrg-0005lU-Vh; Mon, 07 Dec 2009 21:35:53 -0600 Original-Received: from mx2.math.uh.edu ([129.7.128.33]) by util0.math.uh.edu with esmtps (TLSv1:AES256-SHA:256) (Exim 4.63) (envelope-from ) id 1NHqrf-0005lI-95 for ding@lists.math.uh.edu; Mon, 07 Dec 2009 21:35:51 -0600 Original-Received: from quimby.gnus.org ([80.91.231.51]) by mx2.math.uh.edu with esmtp (Exim 4.69) (envelope-from ) id 1NHqrd-0005e3-62 for ding@lists.math.uh.edu; Mon, 07 Dec 2009 21:35:50 -0600 Original-Received: from lo.gmane.org ([80.91.229.12]) by quimby.gnus.org with esmtp (Exim 3.36 #1 (Debian)) id 1NHqrc-0007VB-00 for ; Tue, 08 Dec 2009 04:35:48 +0100 Original-Received: from list by lo.gmane.org with local (Exim 4.50) id 1NHqrb-00047d-Ly for ding@gnus.org; Tue, 08 Dec 2009 04:35:47 +0100 Original-Received: from bas3-london14-1096790638.dsl.bell.ca ([65.95.178.110]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 08 Dec 2009 04:35:47 +0100 Original-Received: from jdc by bas3-london14-1096790638.dsl.bell.ca with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 08 Dec 2009 04:35:47 +0100 X-Injected-Via-Gmane: http://gmane.org/ Original-Lines: 68 Original-X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: bas3-london14-1096790638.dsl.bell.ca User-Agent: Gnus/5.110011 (No Gnus v0.11) Emacs/23.1.50 (gnu/linux) Cancel-Lock: sha1:TPwIOV1KgNv+IWqyCmeAg6RnLKU= X-Spam-Score: -2.6 (--) List-ID: Precedence: bulk Xref: news.gmane.org gmane.emacs.gnus.general:69267 Archived-At: Dan Christensen writes: > In my tests, it is noticeable faster. Can it be improved further? Actually, in my tests it is only slightly faster. Still, the code is much simpler, so unless someone can think of a reason not to, I think the change should be made (after someone else checks it over and tests it). However, I think there's an orthogonal change which could greatly improve the sorting speed. For a group I have with about 6000 messages and maybe 1000 threads (?), gnus-thread-latest-date is called 31028 times, taking about 10.5 seconds in total. If the output from gnus-thread-latest-date was cached, that would reduce the number of computations of this date by at least a factor of 5, and so would greatly reduce the time needed to enter the summary buffer. This is the classic situation where decorate-sort-undecorate is a big win. The comparison function is slow here and gets called O(n log n) times, but if you make one pass through the list of threads ahead of time and compute the information that needs to be compared, you only need to do the computation O(n) times. Also, because the sub-threads are recursively sorted, each article participates in many sorts. In a perfect world, we would change Gnus to do a decorate-sort-undecorate for all sorts. But this would probably require a change to the user interface for sorting. The user would supply a list of functions that each take a single thread and produce a single output of a standard type (string, int, float, etc), and then Gnus would do a decorate step that precomputes the list of those values. The sorting step would sort based on the usual order for the precomputed values. But this is a big change, so it's not ideal. Another option would be to cache the slow computations with the thread while doing the sort. I wasn't sure of the best way to do this, so I made up another extra-header to temporarily store the date converted to a float. And the speed-up is amazing. The sorting time goes from about 10 seconds to under 1 second! Here's the code I used: The lambda in the original code has been factored out into a separate function which checks whether the data has been cached: (defun gnus-article-float-date (header) (let ((extra-header (mail-header-extra header))) (or (cdr (assq 'Float-Date extra-header)) (let ((float-date (condition-case () (gnus-float-time (mail-header-parse-date (mail-header-date header))) (error 0)))) (mail-header-set-extra header (cons `(Float-Date . ,float-date) extra-header)) float-date)))) (defun gnus-thread-latest-date (thread) "Return the highest article date in THREAD." (apply 'max (mapcar 'gnus-article-float-date (message-flatten-list thread)))) Can anyone see any problems with caching the data like this? A more intrusive but more logical change would be to change the header array to have one more entry, specifically for cached data. But that's more than I'd like to tackle. Dan