From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id D61C6BB9C for ; Thu, 17 Nov 2005 13:39:46 +0100 (CET) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id jAHCdkuw030666 for ; Thu, 17 Nov 2005 13:39:46 +0100 Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id NAA16507 for ; Thu, 17 Nov 2005 13:39:45 +0100 (MET) Received: from mail.enyo.de (mail.enyo.de [212.9.189.167]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id jAHCdjMw014647 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NO) for ; Thu, 17 Nov 2005 13:39:45 +0100 Received: from deneb.vpn.enyo.de ([212.9.189.177] helo=deneb.enyo.de) by albireo.enyo.de with esmtp id 1Ecj3I-0007wO-MR; Thu, 17 Nov 2005 13:39:44 +0100 Received: from fw by deneb.enyo.de with local (Exim 4.54) id 1Ecj3H-0003l6-2j; Thu, 17 Nov 2005 13:39:43 +0100 From: Florian Weimer To: Oliver Bandel Cc: caml-list@inria.fr Subject: Re: [Caml-list] [1/2 OT] Indexing (and mergeable Index-algorithms) References: <20051116234238.GA5741@first.in-berlin.de> <437C40EE.7040005@bik-gmbh.de> <20051117092430.GA521@first.in-berlin.de> Date: Thu, 17 Nov 2005 13:39:43 +0100 In-Reply-To: <20051117092430.GA521@first.in-berlin.de> (Oliver Bandel's message of "Thu, 17 Nov 2005 10:24:30 +0100") Message-ID: <87u0ebih2o.fsf@mid.deneb.enyo.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Miltered: at nez-perce with ID 437C7A12.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at concorde with ID 437C7A11.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 indexing:01 oliver:01 bandel:01 indexing:01 inverted:01 reuse:01 iirc:01 full-text:98 full-text:98 incremental:01 florian:02 hints:03 probably:05 although:08 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.3 * Oliver Bandel: > I found an interesting paper there, about using updatable indexing > ("Fast Incremental Indexing for Full-Text Information Retrieval" > from Brown/Callen/Croft.) They talked about "inverted lists", and > this together with other hints from this list may be a good starter. If you need a full-text search capability on natural language documents, there are various libraries you could reuse: Xapian, Lucene (although this is one is writtein Java, IIRC), Estraier, OpenFTS, a MySQL component, and probably many more.