From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id A345ABBCA for ; Wed, 2 Apr 2008 16:01:24 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AjIBAAIu80fPn3g4omdsb2JhbACBWo91AQEBAQIHBQcJFppf X-IronPort-AV: E=Sophos;i="4.25,593,1199660400"; d="scan'208";a="24505809" Received: from nn2.excitenetwork.com (HELO excite.com) ([207.159.120.56]) by mail4-smtp-sop.national.inria.fr with ESMTP; 02 Apr 2008 16:01:23 +0200 Received: by xprdmxin.myway.com (Postfix, from userid 110) id 055C08B314; Wed, 2 Apr 2008 10:01:27 -0400 (EDT) To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] More efficient implementation of intersection of sets? Received: from [132.230.166.192] by xprdmailfe21.nwk.excite.com via HTTP; Wed, 02 Apr 2008 10:01:27 EST X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: ID = 8a8a82c06e1c8d4366ddc3f449c5aa61 Reply-To: sasha.mal@excite.com From: "sasha mal" MIME-Version: 1.0 X-Sender: sasha.mal@excite.com X-Mailer: PHP Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Cc: jon@ffconsultancy.com Message-Id: <20080402140127.055C08B314@xprdmxin.myway.com> Date: Wed, 2 Apr 2008 10:01:27 -0400 (EDT) X-Spam: no; 0.00; intersection:01 nodes:01 nodes:01 intersection:01 excite:98 excite:98 caml-list:01 pair:01 pair:01 corrections:01 implemented:02 tree:02 tree:02 efficient:07 approach:08 > I'm thinking of an alternative approach which inserts keys one at a time. > Basically, we go through both trees simultaneously, starting with smallest > nodes of both trees. If two nodes with the same key are seen, the are inserted > into a new tree that will contain the intersection; then the pair of nodes with > next larger keys is taken. Otherwise, if, say, key1 the smallest key k>=key2 inside the first tree. If k=key2, we add k into the > new tree and proceed to the next pair. If k>key2, we proceed with the pair > (k, key2) instead of (key1, key2). For key1>key2, proceed analogously, i.e. > search for the smallest key k>=key1 in the second tree. If at some point of > time no "larger" key is found in one of the two original trees, the new tree > where we insert elements to contains exactly the intersection. [Small corrections.] By the way, the current implemented procedure always splits the second tree, regardless of the sizes of the trees. Imagine that we have two trees of very different sizes (e.g. with 5 and 10^5 elements). What tree should be split then? _______________________________________________ Join Excite! - http://www.excite.com The most personalized portal on the Web!