Date: Sat, 13 Mar 2010 13:56:23 +0000
From: Richard Jones
To: Hugo Ferreira
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Shared memory parallel application: kernel threads
Message-ID: <20100313135623.GA17113@annexia.org>
In-Reply-To: <4B9A2BCB.3040607@inescporto.pt>

To add my 2 cents here:

At least look at OCaml Ancient for sharing the data. You may not end up using it, but it was designed pretty much for what you have in mind. The README should be informative:

http://merjis.com/developers/ancient
http://merjis.com/_file/ancient-readme.txt

(You should also look at the API and source.)

You didn't mention how large your read-only data set is, but OCaml Ancient should be able to handle 100s of gigabytes, assuming a 64 bit machine. We used to use it with data sets of 10s of gigabytes without any issues.

As others have said, don't use threads to launch your jobs.
Look at one of the fork-based libraries. In addition to the ones mentioned, take a look at PreludeML, which is internally very simple and should give you a good start if you decide to write your own:

http://github.com/kig/preludeml

If you want to spread the jobs over multiple machines, then OCaml MPI is probably the way to go.

Rich.

-- 
Richard Jones
Red Hat
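
For anyone wondering what "write your own" might look like for the fork-based approach, here is a minimal sketch. It is not PreludeML's code or the API of any library mentioned above; `parallel_map` is a hypothetical name, and it assumes only the standard Unix and Marshal modules. Each job runs in its own forked child and sends its result back to the parent over a pipe.

```ocaml
(* Hypothetical helper for illustration only; link against the unix library. *)

(* Apply [f] to each element of [xs] in a separate forked process and
   return the results in the original order. Results travel back over a
   pipe via Marshal, so they must not contain closures. *)
let parallel_map (f : 'a -> 'b) (xs : 'a list) : 'b list =
  let spawn x =
    let (rd, wr) = Unix.pipe () in
    (* Flush buffered output so the child does not print it a second time. *)
    flush_all ();
    match Unix.fork () with
    | 0 ->
        (* Child: compute the result, write it to the pipe, then exit. *)
        Unix.close rd;
        let status =
          try
            let oc = Unix.out_channel_of_descr wr in
            Marshal.to_channel oc (f x) [];
            close_out oc;
            0
          with _ -> 1
        in
        exit status
    | pid ->
        (* Parent: keep only the read end of the pipe. *)
        Unix.close wr;
        (pid, Unix.in_channel_of_descr rd)
  in
  (* Start every child first, then read the results back one by one. *)
  let children = List.map spawn xs in
  let collect (pid, ic) =
    (* Raises End_of_file if the child failed before writing a result. *)
    let (v : 'b) = Marshal.from_channel ic in
    close_in ic;
    ignore (Unix.waitpid [] pid);
    v
  in
  List.map collect children

let () =
  let squares = parallel_map (fun x -> x * x) [1; 2; 3; 4] in
  List.iter (fun n -> Printf.printf "%d\n" n) squares
```

This starts one process per list element, which is fine for a handful of large jobs; a real library would cap the number of children at roughly the number of cores. Because the children are forked, they see the parent's read-only data without copying it up front.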