From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id p3K80JRt009493 for ; Wed, 20 Apr 2011 10:00:19 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AlYBANyRrk3VhjEVmWdsb2JhbACEUJJwjXwUAQEBAQEICwsHFCWIb60QkSqBKYNOegSSIQc X-IronPort-AV: E=Sophos;i="4.64,245,1301868000"; d="scan'208";a="81398016" Received: from ihsmtp01cons.lis.interhost.com ([213.134.49.21]) by mail3-smtp-sop.national.inria.fr with ESMTP; 20 Apr 2011 10:00:13 +0200 Received: from [192.168.1.64] ([178.166.9.223]) by ihsmtp01cons.lis.interhost.com with Microsoft SMTPSVC(6.0.3790.3959); Wed, 20 Apr 2011 09:00:03 +0100 Message-ID: <4DAE9278.4050701@inescporto.pt> Date: Wed, 20 Apr 2011 08:59:52 +0100 From: Hugo Ferreira User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.14) Gecko/20110223 Thunderbird/3.1.8 MIME-Version: 1.0 To: Gerd Stolpmann CC: Eray Ozkural , Martin Jambon , caml-list@inria.fr References: <2054357367.219171.1300974318806.JavaMail.root@zmbs4.inria.fr> <4D8BD02D.1010505@inria.fr> <4D8C73C8.6020801@inescporto.pt> <1301055903.8429.314.camel@thinkpad> <341494683.237537.1301057887481.JavaMail.root@zmbs4.inria.fr> <4D8C944A.9060601@inria.fr> <4D8CB859.9040709@inescporto.pt> <4D8CDDCC.4010000@ens-lyon.org> <4D8CEAA4.2030403@inescporto.pt> <1303244809.8429.1272.camel@thinkpad> In-Reply-To: <1303244809.8429.1272.camel@thinkpad> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-OriginalArrivalTime: 20 Apr 2011 08:00:03.0678 (UTC) FILETIME=[F48FF7E0:01CBFF30] Subject: Re: [Caml-list] Efficient OCaml multicore -- roadmap? On 04/19/2011 09:26 PM, Gerd Stolpmann wrote: > Am Dienstag, den 19.04.2011, 12:57 +0300 schrieb Eray Ozkural: >> >> >> On Fri, Mar 25, 2011 at 9:19 PM, Hugo Ferreira >> wrote: >> On 03/25/2011 06:24 PM, Martin Jambon wrote: >> On 03/25/2011 01:10 PM, Fabrice Le Fessant >> wrote: >> Of course, sharing structured >> mutable data between threads will not >> be >> possible, but actually, it is a good >> thing if you want to write correct >> programs ;-) >> >> On 03/25/11 08:44, Hugo Ferreira replied: >> I'll stick to my guns here. It simply makes >> solving certain problem >> unfeasible. Point in case: I work on machine >> learning algorithms. I >> use large data-structures that must be >> processed (altered) >> in order to learn. Because these >> data-structures are large it become >> impractical to copy this to a process every >> time I start off a new >> "thread". >> >> The solution would be to use get/set via a >> message-passing interface. >> >> >> >> Cannot see how this works. Say I want to share a balanced >> binary tree. >> Several processes/threads each take this tree and alter it by >> adding and >> deleting elements. Each (new) tree is then further processed >> by other >> processes/threads. >> >> How can get/set be used in this scenario? >> >> >> >> >> I think it won't have good performance and it won't scale, and it will >> fail for truly delicate shared memory architectures of the future with >> thousands of cores.... >> >> >> And neither will it support on-chip message passing facilities of >> those future processors. >> >> >> The shared memory message passing never worked too well, anyway, too >> many redundant copies. Not fitting for high performance computing. >> >> >> No need at all except for embarrassingly parallel applications. I >> suppose that's the target, right? > > I did experiment a bit in the meantime with Netmulticore [1], my > implementation of multi-processing with shared memory. Netmulticore > provides both alterable data structures and a quite efficient message > passing interface. The experience so far: Message passing wins if you do > it the right way. Avoid ping-pong games like get/set - shared memory is > far better for this kind of operation. The better topology for message > passing are pipelines where data flows only into one direction. > > I don't know how this translates to Hugo's machine learning problem. I > could imagine a shared data structure is good here for providing the > starting point for learning. If you run several learning steps in > parallel, you want to avoid that these steps lock out each other, i.e. > try to ensure that they affect distinct parts of the matrix. These > updates would be sent (using message passing) to an update manager, > which would apply the learning results to the matrix, and compute the > next version, for which a number of new learning steps would be started. > I'm just guessing here how it could be done. In my imagination a clever > combination of both (alterable) shared memory and message passing is the > way to go. > Agreed. Currently I am using messaging via sexplib to send data-sets to slaves for processing and returns results to a master process the same way. But my objective is to share data structures and return partial results to a master process. Returning results requires messaging. > Hugo, I'd like to do some more experiments into this direction. Is there > a simple version of machine learning algorithm I could try to > parallelize? > Unfortunately not. The "system" currently distributes the experiments to 8 machines with 8 CPU's each connected in a cluster. The algorithms themselves, used in each experiment, do _not_ use parallel processing. I am now working on this (deal with issue of noisy data). My last attempt tried to use the Ancient module but failed because I needed to alter complex data-structures. Once I get the basic algorithm down, I will try to use parallel processing again. Lot of work until then 8-(. Note: a quick look at Netmulticore seems to indicate that I will still have to jump some hoops. Hugo > Gerd > > [1] Netmulticore: now available in ocamlnet-3.3.0test1: > http://blog.camlcity.org/blog/multicore2.html > >> >> >> Best, >> >> -- >> Eray Ozkural, PhD candidate. Comp. Sci. Dept., Bilkent University, >> Ankara >> http://groups.yahoo.com/group/ai-philosophy >> http://myspace.com/arizanesil http://myspace.com/malfunct >> > >