From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id AAC2DBBCA for ; Wed, 2 Apr 2008 15:30:22 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AjIBADcn80fPn3g4omdsb2JhbACBWo91AQEBAQIHBQcJFppg X-IronPort-AV: E=Sophos;i="4.25,593,1199660400"; d="scan'208";a="24504281" Received: from nn2.excitenetwork.com (HELO excite.com) ([207.159.120.56]) by mail4-smtp-sop.national.inria.fr with ESMTP; 02 Apr 2008 15:30:22 +0200 Received: by xprdmxin.myway.com (Postfix, from userid 110) id 7ADAF8B31E; Wed, 2 Apr 2008 09:30:25 -0400 (EDT) To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] More efficient implementation of intersection of sets? Received: from [132.230.166.192] by xprdmailfe21.nwk.excite.com via HTTP; Wed, 02 Apr 2008 09:30:25 EST X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: ID = 8a8a82c06e1c8d4366ddc3f449c5aa61 Reply-To: sasha.mal@excite.com From: "sasha mal" MIME-Version: 1.0 X-Sender: sasha.mal@excite.com X-Mailer: PHP Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Cc: jon@ffconsultancy.com Message-Id: <20080402133025.7ADAF8B31E@xprdmxin.myway.com> Date: Wed, 2 Apr 2008 09:30:25 -0400 (EDT) X-Spam: no; 0.00; intersection:01 ocaml:01 runtime:01 pointer:01 intersection:01 ocaml:01 nodes:01 nodes:01 pointers:01 constructors:01 excite:98 excite:98 wrote:01 recursively:01 caml-list:01 > On Tuesday 01 April 2008 16:55:24 sasha mal wrote: > > Dear OCaml users! > > > > Currently, > > > > Set.inter x y > > > > splits y into two trees, one containing elements that are bigger and the > > other containing elements that are smaller than the top of x, then applies > > the procedure recursively. What is the exact runtime of the algorithm? > > We discussed this before on this list and the result was inconclusive. Suffice > to say, it is very fast! If could give a pointer to the discussion, that would be great! > > Is there a better one for the intersection for OCaml sets? > Not likely. I wouldn't be so sure. I'm thinking of an alternative approach which inserts keys one at a time. Basically, we go through both trees simultaneously, starting with smallest nodes of both trees. If two nodes with the same key are seen, the are inserted into a new tree that will hold the intersection; then the pair of nodes with next larger keys is taken. Otherwise, if, say, key1=key2 inside the first tree. If k=key2, we add k into the new tree and proceed to the next pair. If k>key2, we proceed with the pair (k, key2) instead of (key1, key2). For key1>key2, prcoeed analogously, i.e. search for the smallest key k>=key2 in the second tree. If at some point of time no "larger" key is found in one of the two original trees, the new tree where we insert elements to contains exactly the intersection. The search in the trees is done by using "pointers" which are lists of trees contained in one another. E.g. [a,b,c] means that c is the root, b is one of the children of c, a is one of the children of b. The advantage compared to the current implementation is that portions of one tree which are irrelevant for the intersection might be skipped. In the current implementation, they are rebalanced. The used time is at most O((t1+t2)log(t1 intersection t2)) if constructors need O(1) time. A better estimation is welcome. Could anyone compare this approach to the current implementation? _______________________________________________ Join Excite! - http://www.excite.com The most personalized portal on the Web!