From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.4 required=5.0 tests=AWL,MISSING_HEADERS autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id 33332BBCA for ; Wed, 2 Apr 2008 16:05:33 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AswBAC8v80fB/BYfa2dsb2JhbACRTw0FAgUHFppa X-IronPort-AV: E=Sophos;i="4.25,593,1199660400"; d="scan'208";a="24506038" Received: from smtp20.orange.fr ([193.252.22.31]) by mail4-smtp-sop.national.inria.fr with ESMTP; 02 Apr 2008 16:05:33 +0200 Received: from me-wanadoo.net (localhost [127.0.0.1]) by mwinf2019.orange.fr (SMTP Server) with ESMTP id 98C661C000B0 for ; Wed, 2 Apr 2008 16:05:32 +0200 (CEST) Received: from [192.168.0.50] (ARouen-156-1-72-162.w90-8.abo.wanadoo.fr [90.8.95.162]) by mwinf2019.orange.fr (SMTP Server) with ESMTP id 698EA1C000A0 for ; Wed, 2 Apr 2008 16:05:32 +0200 (CEST) X-ME-UUID: 20080402140532432.698EA1C000A0@mwinf2019.orange.fr Message-ID: <47F392B8.4040009@univ-paris12.fr> Date: Wed, 02 Apr 2008 16:05:44 +0200 From: =?ISO-8859-1?Q?Fr=E9d=E9ric_Gava?= Reply-To: gava@univ-paris12.fr Organization: =?ISO-8859-1?Q?Universit=E9_de_Paris_12=2C_LACL?= User-Agent: Thunderbird 1.5.0.13 (X11/20070824) MIME-Version: 1.0 Cc: caml-list@yquem.inria.fr Subject: Re: [Caml-list] More efficient implementation of intersection of sets? References: <20080401155524.F1CE08B34A@xprdmxin.myway.com> <200804020042.02016.jon@ffconsultancy.com> In-Reply-To: <200804020042.02016.jon@ffconsultancy.com> Content-Type: text/plain; charset=iso-8859-1; format=flowed Content-Transfer-Encoding: 8bit X-Spam: no; 0.00; gava:01 gava:01 intersection:01 ocaml's:01 vastly:01 ocaml:01 subset:01 hash:01 ocaml:01 lacl:01 12.:98 12.:98 caml-list:01 caml-list:01 algorithm:01 Dear John, Sasha and Caml-list > Not likely. OCaml's implementation is already vastly more efficient than any > other language I have ever seen (e.g. C++). Your next best bet is probably to > parallelize the algorithm to improve the performance but that is extremely > difficult to do without a concurrent GC. Frederic Gava did some work on this > in OCaml. I am working on the same problem in F#. You can have parallel sets without a concurrent GC : each processor has a subset of your initial set and you can distribute the elements using a hash function from element to the number of processor "p" (there is so "p" ocaml programs that runs and thus "p" GC). A random function can be used in general and generate a quick good load balancing. You can have more information here : http://lacl.univ-paris12.fr//gava/papers/gava_ppl_2008.pdf Note, that I used our "under development library" but this work can be done using OCaml-MPI Frédéric G.