From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <52cfc303b267b7d8c9019ae8fec6ff5d@vitanuova.com>
To: 9fans@cse.psu.edu
Subject: Re: [9fans] Scaleable mail repositories.
Date: Tue, 1 Nov 2005 19:56:30 +0000
From: rog@vitanuova.com
In-Reply-To: <20051031123213.0E9A51AE726@dexter-peak.quanstro.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
Content-Transfer-Encoding: 7bit
Topicbox-Message-UUID: a372901c-ead0-11e9-9d60-3106f5b1d025

> on the other hand, what is the downside of keeping one message
> per file? the upside is that no indexing is required.

i'd say that an advantage of going for an indexed scheme is that one could potentially index attributes other than message number.

i've never got around to biting the bullet on this, but i've long thought that it would be very nice to have a version of upas/fs which could offer different views onto the same mailbox. one could implement a clone-file style filesystem where each line directory holds some subset of the messages in the overall mailbox, determined by writing a control message, e.g. a regexp restriction on a given header line. suitable indexing, and a little extra acme support could make this a smooth experience.

i keep many of my old mail messages around, and it's painful to search through them - i usually end up using grep -n, and plumbing the mailbox file into acme, which has at least the advantage that it doesn't use up all my memory. however it's not a particularly pleasant experience, and i'd love to see something better.

BTW, one advantage of a file-per-message format is that it enables straightforward annotation of messages without relying on mailbox-to-index-file consistency. i don't know how others use mail, but i'd find some sort of annotation useful (e.g. read/unread, intent to reply), and maybe this is a possible reason for changing the storage format. i'm not sure though.
reading many files and directories will inevitably slow things down (a quick estimate on my current 23MB mbox shows that it would take just over 4 times as many 9P transactions to read the whole thing if each message were stored as a separate file).
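
to make the "views" idea from earlier in this message a bit more concrete, here is a minimal sketch in python. it is not upas/fs code and not a file server: it only models the bookkeeping, where opening a clone file hands out a new numbered view and a control message narrows it. the names Mailbox, clone, ctl and ls, and the "restrict <header> <regexp>" control syntax, are all assumptions made up for illustration.

```python
import re
from email import message_from_string


class Mailbox:
    """One set of messages plus numbered views, each a subset of it."""

    def __init__(self, raw_messages):
        # raw_messages: list of strings, each one complete message (headers + body)
        self.messages = [message_from_string(m) for m in raw_messages]
        self.views = {}

    def clone(self):
        """Allocate a new view, as opening a clone file would; it starts unrestricted."""
        n = len(self.views)
        self.views[n] = list(range(len(self.messages)))
        return n

    def ctl(self, view, command):
        """Apply a control message of the (hypothetical) form 'restrict <header> <regexp>'."""
        verb, header, pattern = command.split(None, 2)
        if verb != "restrict":
            raise ValueError("unknown control message: " + verb)
        rx = re.compile(pattern)
        # keep only messages already in the view whose header matches;
        # header lookup is case-insensitive, a missing header counts as empty
        self.views[view] = [
            i for i in self.views[view]
            if rx.search(str(self.messages[i].get(header, "")))
        ]

    def ls(self, view):
        """Message numbers (1-based, as in the full mailbox) visible in this view."""
        return [i + 1 for i in self.views[view]]
```

a session would look like v = box.clone(), then box.ctl(v, r"restrict subject \[9fans\]"), then box.ls(v). successive restrictions on the same view intersect, and other views are unaffected. this sketch scans every message on each restriction; the indexing discussed above is what would make that cheap on a large mailbox.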