From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <sasha.mal@excite.com>
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled 
	version=3.1.3
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105])
	by yquem.inria.fr (Postfix) with ESMTP id A345ABBCA
	for <caml-list@yquem.inria.fr>; Wed,  2 Apr 2008 16:01:24 +0200 (CEST)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AjIBAAIu80fPn3g4omdsb2JhbACBWo91AQEBAQIHBQcJFppf
X-IronPort-AV: E=Sophos;i="4.25,593,1199660400"; 
   d="scan'208";a="24505809"
Received: from nn2.excitenetwork.com (HELO excite.com) ([207.159.120.56])
  by mail4-smtp-sop.national.inria.fr with ESMTP; 02 Apr 2008 16:01:23 +0200
Received: by xprdmxin.myway.com (Postfix, from userid 110)
	id 055C08B314; Wed,  2 Apr 2008 10:01:27 -0400 (EDT)
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] More efficient implementation of intersection of sets?
Received: from [132.230.166.192] by xprdmailfe21.nwk.excite.com via HTTP; Wed, 02 Apr 2008 10:01:27 EST
X-AntiAbuse: This header was added to track abuse, please include it with any abuse report
X-AntiAbuse: ID = 8a8a82c06e1c8d4366ddc3f449c5aa61
Reply-To: sasha.mal@excite.com
From: "sasha mal" <sasha.mal@excite.com>
MIME-Version: 1.0
X-Sender: sasha.mal@excite.com
X-Mailer: PHP
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Cc: jon@ffconsultancy.com
Message-Id: <20080402140127.055C08B314@xprdmxin.myway.com>
Date: Wed,  2 Apr 2008 10:01:27 -0400 (EDT)
X-Spam: no; 0.00; intersection:01 nodes:01 nodes:01 intersection:01 excite:98 excite:98 caml-list:01 pair:01 pair:01 corrections:01 implemented:02 tree:02 tree:02 efficient:07 approach:08 


> I'm thinking of an alternative approach which inserts keys one at a time.

> Basically, we go through both trees simultaneously, starting with smallest

> nodes of both trees. If two nodes with the same key are seen, the are inserted

> into a new tree that will contain the intersection; then the pair of nodes with

> next larger keys is taken. Otherwise, if, say, key1<key2, then we search for

> the smallest key k>=key2 inside the first tree. If k=key2, we add k into the

> new tree and proceed to the next pair. If k>key2, we proceed with the pair 

> (k, key2) instead of (key1, key2). For key1>key2, proceed analogously, i.e.

> search for the smallest key k>=key1 in the second tree. If at some point of

> time no "larger" key is found in one of the two original trees, the new tree

> where we insert elements to contains exactly the intersection.


[Small corrections.]


By the way, the current implemented procedure always splits the second tree, regardless of the sizes of the trees. Imagine that we have two trees of very different sizes (e.g. with 5 and 10^5 elements). What tree should be split then?

_______________________________________________
Join Excite! - http://www.excite.com
The most personalized portal on the Web!