From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 9 Nov 2005 09:19:19 -0500
From: Sam
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>
Subject: Re: [9fans] Scaleable mail repositories.
In-Reply-To: <57471c9f2e6b9a4c77886fffb87d244d@terzarima.net>
Message-ID:
References: <57471c9f2e6b9a4c77886fffb87d244d@terzarima.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Topicbox-Message-UUID: a98cc4a4-ead0-11e9-9d60-3106f5b1d025

In the not-so-distant past I was part of a three-man effort to
write a web site indexer / search engine generator. My job was
to take the indexed files / URLs (they sucked them down with Java)
and create a suffix tree database that could be searched via CGI.

I don't have any specific numbers, but it was quite fast. This
was when Google was just becoming known, and once we realized we
could point Google at a website, the project was abandoned.

The whole point of using suffix trees is linear-time search wrt
the size of the search string (note: not the size of the searched
text). Seems like it's a good candidate for this task.

Sam
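
To make the "linear in the search string" point concrete, here is a minimal sketch. It is not the code from the project described above, and it swaps in a simpler structure: a naive, uncompressed suffix trie built in quadratic time and space, rather than a real compressed suffix tree (which would normally be built with something like Ukkonen's algorithm). The class name SuffixTrie and its contains method are made up for illustration. What carries over is the lookup: it follows one child link per character of the pattern, so its cost depends on the pattern length and not on the length of the indexed text.

```python
class SuffixTrie:
    """Naive, uncompressed suffix trie (illustrative name, not a library API).

    Every suffix of the text is inserted character by character, so
    construction is O(n^2) in time and space. A real suffix tree
    compresses single-child paths and can be built in O(n).
    """

    def __init__(self, text):
        self.root = {}
        for start in range(len(text)):
            node = self.root
            # Insert the suffix text[start:], reusing shared prefixes.
            for ch in text[start:]:
                node = node.setdefault(ch, {})

    def contains(self, pattern):
        """Return True if pattern occurs anywhere in the indexed text.

        The loop runs once per pattern character, each step being one
        dict lookup, independent of how long the indexed text is.
        """
        node = self.root
        for ch in pattern:
            node = node.get(ch)
            if node is None:
                return False
        return True


if __name__ == "__main__":
    trie = SuffixTrie("scaleable mail repositories")
    print(trie.contains("mail"))
    print(trie.contains("maildir"))
```

Every substring of the text is a prefix of some suffix, which is why a walk from the root is enough to answer the query. Listing all match positions, as a search engine would need, takes extra bookkeeping at the nodes that this sketch leaves out.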