Gnus development mailing list
* Re: searching local mail stores
@ 2016-09-27 17:29 myglc2
From: myglc2 @ 2016-09-27 17:29 UTC
  To: ding

I received many helpful comments on my earlier post under this
subject. I have since spent a few months off-and-on trying various mail
search setups. Along the way I performed a couple of benchmarks that might
be of interest to list readers, so I am reporting them below.

First, to get a handle on comments to the effect that nnmaildir is slow
when there are many messages, I put ~132,000 messages into 9 Maildir
sub-directories occupying 2.1G. I configured gnus to treat the top-level
maildir as a single store.  On a 3.4 GHz machine with an SSD, gnus took
~25 seconds to open the maildir. Initial notmuch indexing took a few
minutes. Search performance depended on the number of search hits
generated (see discussion below).  For comparison, I deleted messages
until the maildir had ~15,000 messages in 5 sub-directories occupying
1.1G, at which point gnus opened the maildir in ~5 seconds.
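
For anyone who wants to check message counts and sizes like those quoted
above on their own mail store, here is a minimal sketch in Python. It is
not part of the original benchmark. The function name maildir_stats is
made up for illustration, and it assumes each folder follows the usual
Maildir layout with cur/ and new/ subdirectories.

```python
import os

def maildir_stats(top):
    """Return {folder_path: (message_count, total_bytes)} for each Maildir under top."""
    stats = {}
    for dirpath, dirnames, _ in os.walk(top):
        # Assumption: a Maildir folder is any directory holding cur/ and new/.
        if "cur" in dirnames and "new" in dirnames:
            count = 0
            size = 0
            for sub in ("cur", "new"):
                with os.scandir(os.path.join(dirpath, sub)) as entries:
                    for entry in entries:
                        if entry.is_file():
                            count += 1
                            size += entry.stat().st_size
            stats[dirpath] = (count, size)
        # Do not descend into the message directories themselves.
        dirnames[:] = [d for d in dirnames if d not in ("cur", "new", "tmp")]
    return stats
```

Calling maildir_stats on the top-level maildir and summing the counts
gives the total number of messages. Messages still in tmp/ are ignored on
purpose, since they have not been delivered yet.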

Second, I wanted to compare gnus/notmuch with mu4e. notmuch and mu4e
both use the xapian search/index engine and have emacs mail search UIs
that are independent of gnus. I expected these to perform similarly and
in casual comparison this seemed to be the case, so I did not compare
the mu4e and notmuch UIs any further.

However, notmuch also supports the gnus nnir search interface, which
allows a gnus 'G G' search to deliver notmuch search results to gnus
summary buffers. To me, the appeal of notmuch was the possibility of
fast search while otherwise continuing to read messages in gnus.  So I
compared gnus/notmuch with the mu4e UI, as shown in Table 1, below.

Table 1: seconds to operate on a maildir directory containing 15,000 messages

| operation        | gnus/notmuch |       mu4e | gnus/notmuch |  mu4e (1) |
|                  |   first line | first line |    All lines | All lines |
|------------------+--------------+------------+--------------+-----------|
| open             |            5 |          1 |            5 |        10 |
| (re)sort date    |            1 |          1 |            1 |        10 |
| (re)sort subject |            1 |          1 |            1 |        10 |

Note: mu4e normally limits display to, at most, the first 500 search
hits. In the "mu4e (1) All lines" results above, mu4e was forced to
display all search hits by running 'M-x mu4e-headers-toggle-full-search'.

So, to generalize, mu4e is snappier than gnus/notmuch, unless we force
mu4e to display a lot of lines.  The biggest difference is the "open"
time, which, as demonstrated above, becomes significant when the maildir
contains a lot of messages.

SEARCH PERFORMANCE:

When searching, if the number of search hits is modest (<100), the
search/display time is similar for gnus/notmuch and mu4e. However, when
a search produces many hits (e.g., ~15,000), gnus takes about 10 seconds
to display all of the results and mu4e takes 23 seconds (once again with
M-x mu4e-headers-toggle-full-search in effect).
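
The timings above were taken inside Emacs and include the time to display
the results. To separate out the index lookup itself, one could time the
command-line tools directly. The following Python sketch is an
illustration only, not the method used for the numbers in this post. The
helper name time_command and the example query are assumptions.

```python
import subprocess
import time

def time_command(argv, repeats=3):
    """Run argv `repeats` times; return (fastest wall-clock seconds, last exit code)."""
    best = float("inf")
    returncode = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = subprocess.run(
            argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        best = min(best, time.perf_counter() - start)
        returncode = result.returncode
    return best, returncode

# Example calls (the query "from:example" is a placeholder):
#   time_command(["notmuch", "search", "--output=messages", "from:example"])
#   time_command(["mu", "find", "from:example"])
```

The exit code is returned rather than checked because a search tool may
exit non-zero when a query matches nothing. Taking the fastest of several
runs reduces the effect of a cold disk cache, but it says nothing about
how long gnus or mu4e take to render the hits.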

So the most noticeable overall difference between gnus/notmuch and mu4e
is the time taken to initially "open" the maildir.

Regarding setup, I found mu4e easier with its single point of
configuration. In comparison, gnus/notmuch requires a compatible
"parallel" configuration of notmuch and gnus.

Bottom line:

1) gnus/notmuch/maildir works pretty well (at least on my computer)
   for 15,000 or so messages.

2) At 100,000 or so messages, the gnus maildir startup delay is so tiresome
   that mu4e becomes truly compelling.

FWIW, I am currently a happy user of both mu4e and gnus/notmuch. I use
mu4e+mbsync to read multiple Gmail accounts. I use gnus/notmuch to
search and read mailing list archives that I have mirrored locally into
maildir.

- George




* searching local mail stores
@ 2016-01-02  5:30 myglc2
From: myglc2 @ 2016-01-02  5:30 UTC
  To: ding

I would like to index and search across my local mail stores, and
eventually all my other files. gnus is my primary mail and news
reader. I have used ...

gmail -- nnimap -- gnus

... for some years. I now plan to move all my mail to my local server so
that I can index and search it more quickly. I have recently set up ...

gmail -- mbsync -- dovecot+lucene -- nnimap -- gnus 

gmail -- mbsync -- Maildir -- mu4e

I am using mu4e (part of mu) as a stalking horse + benchmark because it
is new and actively developed. mu uses the xapian search engine, which
could be a good search solution for my other files since it handles many
document types. mu4e's primary downside (for me) is that it is not
integrated with gnus.

I want to compare the performance of mu4e with some of the gnus search
schemes.  Looking at gnus INFO ...

8 Searching
8.1 nnir Searching with various engines.
8.1.3 Setting up nnir
8.1.3.2 The imap Engine
8.1.3.3 The gmane Engine
8.1.3.4 The swish++ Engine
8.1.3.5 The swish-e Engine
8.1.3.6 The namazu Engine
8.1.3.7 The notmuch Engine
8.2 nnmairix

... it was not obvious which I should try. So I searched the gmane
gnus.user and gnus.general news groups. I found these hit counts and
dates:

| search  | gnus.user | gnus.user | gnus.general | gnus.general | total |
| term    |      hits |     dates |         hits |        dates |       |
|---------+-----------+-----------+--------------+--------------+-------|
| swish++ |        11 |     02-06 |           73 |        01-13 |    84 |
| swishe  |        33 |     02-10 |           40 |        97-14 |    73 |
| namazu  |        56 |     02-15 |          196 |        01-15 |   252 |
| notmuch |        33 |     09-15 |          101 |        10-15 |   134 |
| mairix  |       100 |     04-12 |          140 |        06-15 |   240 |
| mu4e    |         4 |     12-15 |            6 |        12-15 |    10 |
|---------+-----------+-----------+--------------+--------------+-------|

I plan to set aside notmuch because it seems pretty similar to mu4e. 

It looks like mairix and namazu are in the lead and the swishes are
pretty far behind. Or maybe the swishes work flawlessly:)

So maybe I should get namazu and/or mairix working?

Would anyone care to comment on which of these I should try and why?

- george




