From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id EEA85BC75 for ; Fri, 25 Feb 2005 22:50:26 +0100 (CET) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.195]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j1PLoQV7015218 for ; Fri, 25 Feb 2005 22:50:26 +0100 Received: by rproxy.gmail.com with SMTP id a36so154629rnf for ; Fri, 25 Feb 2005 13:50:25 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:references; b=TrL9/0OIRi2CltvSIiWiSCT4ACSYQXzUHL9EVPikzgaewmPyFB+nkzThYfooTwBsNqkJgYua4dN1QJAX8arY50XUQdD49MyQUU73RQrW9a8LmKairIGDGSFeDdgwZ1YE+Wg094cSK9mjWlmqjml/FkOfgbqt4M9FuxIN7Gy9j2o= Received: by 10.38.65.21 with SMTP id n21mr55261rna; Fri, 25 Feb 2005 13:50:25 -0800 (PST) Received: by 10.38.65.58 with HTTP; Fri, 25 Feb 2005 13:50:25 -0800 (PST) Message-ID: <7f8e92aa0502251350202ec368@mail.gmail.com> Date: Fri, 25 Feb 2005 23:50:25 +0200 From: Radu Grigore Reply-To: Radu Grigore To: Jon Harrop Subject: Re: [Caml-list] Complexity of Set.union Cc: caml-list@yquem.inria.fr In-Reply-To: <200502251947.46657.jon@jdh30.plus.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit References: <200502240920.04902.jon@jdh30.plus.com> <200502251730.13003.jon@jdh30.plus.com> <20050225174853.GA25527@yquem.inria.fr> <200502251947.46657.jon@jdh30.plus.com> X-Miltered: at nez-perce with ID 421F9DA2.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 wrote:01 stl:01 inserting:01 ocaml:01 sequences:01 achieves:01 inputs:01 lib:01 ...:98 height:98 height:98 node:01 constructor:01 linear:02 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr X-Spam-Status: No, score=0.0 required=5.0 tests=RCVD_BY_IP autolearn=disabled version=3.0.2 X-Spam-Level: On Fri, 25 Feb 2005 19:47:45 +0000, Jon Harrop wrote: > I ask this because the STL set_union is probably O(n+N) (inserting an already > sorted range into a set is apparently linear) which is worse than the O((n+N) > log(n+N)) which you've suggested for OCaml. The complexity of set_union is indeed O(n+N), see [0]. It is basically a merge of sorted _sequences_ [1]. I assume n is the size of the small set and N is the size of the small set and the heights are h=O(lg n), H=O(lg N). With this the complexity of Set.union is more like O(n lg(n+N)), at least when all elements in one set are smaller than the elements of the other set. > I see. This could be improved in the unsymmetric case, by adding elements from > the smaller set to the larger set. But the size of the set isn't stored so > you'd have to make do with adding elements from the shallower set to the > deeper set. I've no idea what the complexity of that would be... That is how it works now. As Xavier said the trickiest part is split. > > Did you mean "of two equal height sets such that all elements of the > > first set are smaller than all elements of the second set"? > > Yes, that's what I meant. :-) In that case the current Set.union simply adds elements repeatedly from the set with small height to the set with big height. > > That > > could indeed run in constant time (just join the two sets with a > > "Node" constructor), but I doubt the current implementation achieves > > this because of the repeated splitting. What splitting? I see none in this case. > Having said that, wouldn't it take the Set.union function O(log n + log N) > time to prove that the inputs are non-overlapping, because it would have to > traverse to the min/max elements of both sets? I agree. Also, such a check looks ugly to me (for a standard library). -- regards, radu http://rgrig.blogspot.com/ [0] http://library.n0i.net/programming/c/cp-iso/lib-algorithms.html#lib.set.union [1] http://rgrig.blogspot.com/2004/11/merging-lists.html