From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 395CEBBCA for ; Wed, 2 Apr 2008 01:51:31 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AoMBAEJn8kfUnw7Vb2dsb2JhbACCOI8SAQoHAgUHGJps X-IronPort-AV: E=Sophos;i="4.25,589,1199660400"; d="scan'208";a="10954746" Received: from ptb-relay02.plus.net ([212.159.14.213]) by mail3-smtp-sop.national.inria.fr with ESMTP; 02 Apr 2008 01:51:30 +0200 Received: from [80.229.56.224] (helo=beast.local) by ptb-relay02.plus.net with esmtp (Exim) id 1JgqGI-0004rz-2A for caml-list@yquem.inria.fr; Wed, 02 Apr 2008 00:51:30 +0100 From: Jon Harrop Organization: Flying Frog Consultancy Ltd. To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] More efficient implementation of intersection of sets? Date: Wed, 2 Apr 2008 00:42:01 +0100 User-Agent: KMail/1.9.7 References: <20080401155524.F1CE08B34A@xprdmxin.myway.com> In-Reply-To: <20080401155524.F1CE08B34A@xprdmxin.myway.com> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200804020042.02016.jon@ffconsultancy.com> X-Plusnet-Relay: f798380aef1a2bf8d90e02e798eb92d8 X-Spam: no; 0.00; intersection:01 ocaml:01 runtime:01 intersection:01 ocaml:01 ocaml's:01 vastly:01 gava:01 node:01 nodes:01 nodes:01 mutable:01 30%.:98 ditch:98 frog:98 On Tuesday 01 April 2008 16:55:24 sasha mal wrote: > Dear OCaml users! > > Currently, > > Set.inter x y > > splits y into two trees, one containing elements that are bigger and the > other containing elements that are smaller than the top of x, then applies > the procedure recursively. What is the exact runtime of the algorithm? We discussed this before on this list and the result was inconclusive. Suffice to say, it is very fast! > Is there a better one for the intersection for OCaml sets? Not likely. OCaml's implementation is already vastly more efficient than any other language I have ever seen (e.g. C++). Your next best bet is probably to parallelize the algorithm to improve the performance but that is extremely difficult to do without a concurrent GC. Frederic Gava did some work on this in OCaml. I am working on the same problem in F#. Failing that, you might want to apply some of the stock optimizations to the Set module, such as a Node1 type constructor for nodes with a value but no child nodes. That can improve performance by 30%. Alternatively, you may prefer to ditch immutable structures and opt for a hashset, which can be many times faster but is much more difficult to use because it is mutable. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. http://www.ffconsultancy.com/products/?e