From mboxrd@z Thu Jan 1 00:00:00 1970
X-Msuck: nntp://news.gmane.io/gmane.emacs.gnus.general/43148
Path: main.gmane.org!not-for-mail
From: Kai.Grossjohann@CS.Uni-Dortmund.DE (Kai =?iso-8859-1?q?Gro=DFjohann?=)
Newsgroups: gmane.emacs.gnus.general
Subject: Re: Searching a news server
Date: Sun, 17 Feb 2002 11:03:16 +0100
Sender: owner-ding@hpc.uh.edu
Message-ID:
References:
NNTP-Posting-Host: coloc-standby.netfonds.no
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Trace: main.gmane.org 1035178289 16001 80.91.224.250 (21 Oct 2002 05:31:29 GMT)
X-Complaints-To: usenet@main.gmane.org
NNTP-Posting-Date: Mon, 21 Oct 2002 05:31:29 +0000 (UTC)
Cc: ding@gnus.org
Return-Path:
Original-Received: (qmail 2146 invoked from network); 17 Feb 2002 10:05:39 -0000
Original-Received: from malifon.math.uh.edu (mail@129.7.128.13) by mastaler.com with SMTP; 17 Feb 2002 10:05:39 -0000
Original-Received: from sina.hpc.uh.edu ([129.7.128.10] ident=lists) by malifon.math.uh.edu with esmtp (Exim 3.20 #1) id 16cOBE-00026Y-00; Sun, 17 Feb 2002 04:04:24 -0600
Original-Received: by sina.hpc.uh.edu (TLB v0.09a (1.20 tibbs 1996/10/09 22:03:07)); Sun, 17 Feb 2002 04:04:20 -0600 (CST)
Original-Received: from sclp3.sclp.com (qmailr@sclp3.sclp.com [209.196.61.66]) by sina.hpc.uh.edu (8.9.3/8.9.3) with SMTP id EAA24756 for ; Sun, 17 Feb 2002 04:04:00 -0600 (CST)
Original-Received: (qmail 2121 invoked by alias); 17 Feb 2002 10:03:57 -0000
Original-Received: (qmail 2116 invoked from network); 17 Feb 2002 10:03:57 -0000
Original-Received: from waldorf.cs.uni-dortmund.de (129.217.4.42) by gnus.org with SMTP; 17 Feb 2002 10:03:57 -0000
Original-Received: from lothlorien.cs.uni-dortmund.de (lothlorien [129.217.19.67]) by waldorf.cs.uni-dortmund.de with ESMTP id g1HA3Lb02223; Sun, 17 Feb 2002 11:03:22 +0100 (MET)
Original-Received: from lucy.cs.uni-dortmund.de (lucy [129.217.19.80]) by lothlorien.cs.uni-dortmund.de id LAA03782; Sun, 17 Feb 2002 11:03:16 +0100 (MET)
Original-Received: by lucy.cs.uni-dortmund.de (Postfix, from userid 6104) id 7BAF93AFCD; Sun, 17 Feb 2002 11:03:16 +0100 (CET)
Original-To: Russ Allbery
In-Reply-To: (Russ Allbery's message of "Sat, 16 Feb 2002 13:16:37 -0800")
Original-Lines: 23
User-Agent: Gnus/5.090006 (Oort Gnus v0.06) Emacs/21.2.50 (i686-pc-linux-gnu)
Precedence: list
X-Majordomo: 1.94.jlt7
Xref: main.gmane.org gmane.emacs.gnus.general:43148
X-Report-Spam: http://spam.gmane.org/gmane.emacs.gnus.general:43148

Russ Allbery writes:

> I'll note from experience with things like this that the hard part in
> making full-text searching acceptably fast will be incremental indexing of
> each article as it comes in.  Many of the existing search engines suck at
> incremental indexing.

Most of the weighting functions used seem to use normalization of some
kind, so the indexing weight for a given term in a given document
depends on the complete set of documents.  So adding a document means
that, strictly speaking, you have to reindex the whole collection.
Hmpf.

There is a hack for freeWAIS-sf which allows you to add N documents
incrementally, with skew of weights.  After more than N documents
have been added, it reindexes the whole collection.  Maybe that's a
suitable workaround.

freeWAIS-sf seems to be a bear to build...

kai
-- 
~/.signature is: umop 3p!sdn    (Frank Nobis)
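
To illustrate the point above about normalized weights, here is a minimal sketch in Python. It assumes a smoothed tf-idf weight, which is only one common normalization and not necessarily what freeWAIS-sf uses. The names `idf`, `weight`, `LazyIndex` and `max_pending` are made up for this example; `LazyIndex` only imitates the "add N documents, then reindex" idea and is not the actual freeWAIS-sf hack.

```python
import math

def idf(term, docs):
    """Smoothed inverse document frequency over the collection `docs`."""
    df = sum(1 for d in docs if term in d)      # documents containing term
    return math.log((1 + len(docs)) / (1 + df)) + 1.0

def weight(term, doc, docs):
    """tf-idf weight of `term` in `doc`; depends on the whole collection."""
    return doc.count(term) * idf(term, docs)

class LazyIndex:
    """Hypothetical index: tolerate up to `max_pending` unindexed additions."""

    def __init__(self, max_pending):
        self.max_pending = max_pending
        self.docs = []       # every document added so far
        self.snapshot = []   # collection the statistics were last built from
        self.rebuilds = 0

    def add(self, doc):
        self.docs.append(doc)
        pending = len(self.docs) - len(self.snapshot)
        if pending > self.max_pending:
            # More than max_pending additions: rebuild from the full collection.
            self.snapshot = list(self.docs)
            self.rebuilds += 1

    def weight(self, term, doc):
        # Uses the possibly stale snapshot, hence the skew of weights.
        return weight(term, doc, self.snapshot)

# Adding a document changes the weight of a term in an *existing* document.
a = ["gnus", "mail"]
docs = [a, ["gnus", "news"]]
before = weight("gnus", a, docs)
after = weight("gnus", a, docs + [["news", "server"]])
```

Here `before` and `after` differ even though document `a` itself did not change, which is why a strict indexer has to recompute every weight on each addition. `LazyIndex` trades that cost for temporarily stale statistics between rebuilds.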