caml-list - the Caml user's mailing list
From: Hugo Ferreira <hmf@inescporto.pt>
To: Gerd Stolpmann <info@gerd-stolpmann.de>
Cc: Eray Ozkural <examachine@gmail.com>,
	Martin Jambon <martin.jambon@ens-lyon.org>,
	caml-list@inria.fr
Subject: Re: [Caml-list] Efficient OCaml multicore -- roadmap?
Date: Wed, 20 Apr 2011 08:59:52 +0100	[thread overview]
Message-ID: <4DAE9278.4050701@inescporto.pt> (raw)
In-Reply-To: <1303244809.8429.1272.camel@thinkpad>

On 04/19/2011 09:26 PM, Gerd Stolpmann wrote:
> On Tuesday, 19.04.2011, at 12:57 +0300, Eray Ozkural wrote:
>>
>> On Fri, Mar 25, 2011 at 9:19 PM, Hugo Ferreira <hmf@inescporto.pt> wrote:
>>
>>     On 03/25/2011 06:24 PM, Martin Jambon wrote:
>>
>>         On 03/25/2011 01:10 PM, Fabrice Le Fessant wrote:
>>
>>             Of course, sharing structured mutable data between
>>             threads will not be possible, but actually, it is a
>>             good thing if you want to write correct programs ;-)
>>
>>         On 03/25/11 08:44, Hugo Ferreira replied:
>>
>>             I'll stick to my guns here. It simply makes solving
>>             certain problems infeasible. Case in point: I work on
>>             machine learning algorithms. I use large data-structures
>>             that must be processed (altered) in order to learn.
>>             Because these data-structures are large, it becomes
>>             impractical to copy them to a process every time I start
>>             off a new "thread".
>>
>>         The solution would be to use get/set via a message-passing
>>         interface.
>>
>>     Cannot see how this works. Say I want to share a balanced binary
>>     tree. Several processes/threads each take this tree and alter it
>>     by adding and deleting elements. Each (new) tree is then further
>>     processed by other processes/threads.
>>
>>     How can get/set be used in this scenario?
>>
>> I think it won't have good performance and it won't scale, and it will
>> fail for truly delicate shared memory architectures of the future with
>> thousands of cores...
>>
>> Neither will it support the on-chip message-passing facilities of
>> those future processors.
>>
>> Shared-memory message passing never worked too well anyway: too many
>> redundant copies. Not a fit for high-performance computing.
>>
>> No need at all except for embarrassingly parallel applications. I
>> suppose that's the target, right?
>
> I did experiment a bit in the meantime with Netmulticore [1], my
> implementation of multi-processing with shared memory. Netmulticore
> provides both mutable shared data structures and a quite efficient
> message-passing interface. The experience so far: message passing wins
> if you do it the right way. Avoid ping-pong games like get/set; shared
> memory is far better for that kind of operation. The better topologies
> for message passing are pipelines, where data flows in only one
> direction.
>
> I don't know how this translates to Hugo's machine learning problem. I
> could imagine that a shared data structure is good here for providing
> the starting point for learning. If you run several learning steps in
> parallel, you want to avoid these steps locking each other out, i.e.
> try to ensure that they affect distinct parts of the matrix. The
> updates would be sent (using message passing) to an update manager,
> which would apply the learning results to the matrix and compute the
> next version, for which a number of new learning steps would be
> started. I'm just guessing here how it could be done. In my
> imagination, a clever combination of both (mutable) shared memory and
> message passing is the way to go.
>

Agreed. Currently I am using messaging via sexplib to send data-sets to
slaves for processing and to return results to a master process the same
way. But my objective is to share the data structures themselves and only
return partial results to the master process; returning those results
still requires messaging.
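
For reference, the current exchange boils down to something like the toy
below (the float-array "data-set", the mean "result" and all the names are
invented for illustration; the real code ships far larger structures, one
worker per CPU):

  (* Toy master/worker exchange: the data-set goes out as an s-expression
     over a pipe and the partial result comes back the same way.
     Needs the sexplib and unix libraries. *)
  open Sexplib.Std

  let write_sexp oc sexp =
    output_string oc (Sexplib.Sexp.to_string sexp);
    output_char oc '\n';
    flush oc

  let read_sexp ic = Sexplib.Sexp.of_string (input_line ic)

  let () =
    let to_worker_rd, to_worker_wr = Unix.pipe () in
    let from_worker_rd, from_worker_wr = Unix.pipe () in
    match Unix.fork () with
    | 0 ->
        (* worker: read the data-set, compute a partial result, send it back *)
        let ic = Unix.in_channel_of_descr to_worker_rd in
        let oc = Unix.out_channel_of_descr from_worker_wr in
        let data = array_of_sexp float_of_sexp (read_sexp ic) in
        let mean =
          Array.fold_left (+.) 0.0 data /. float (Array.length data) in
        write_sexp oc (sexp_of_float mean);
        exit 0
    | _pid ->
        (* master: ship the data-set out, collect the partial result *)
        let oc = Unix.out_channel_of_descr to_worker_wr in
        let ic = Unix.in_channel_of_descr from_worker_rd in
        write_sexp oc (sexp_of_array sexp_of_float [| 1.0; 2.0; 3.0 |]);
        Printf.printf "worker says: %f\n" (float_of_sexp (read_sexp ic));
        ignore (Unix.wait ())

The line-based framing only works because Sexp.to_string emits a single
line, and of course the whole point is that copying large data-sets around
like this is exactly what I want to avoid.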

> Hugo, I'd like to do some more experiments into this direction. Is there
> a simple version of machine learning algorithm I could try to
> parallelize?
>

Unfortunately not. The "system" currently distributes the experiments
to 8 machines, with 8 CPUs each, connected in a cluster. The algorithms
themselves, used in each experiment, do _not_ use parallel processing. I
am now working on these (dealing with the issue of noisy data). My last
attempt tried to use the Ancient module but failed because I needed to
alter complex data-structures.

Once I get the basic algorithm down, I will try to use parallel
processing again. Lots of work until then 8-(.
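
When I do, I expect it to look something like the update-manager idea Gerd
sketches above. A rough toy of what I have in mind, with ordinary threads
and a home-made mailbox standing in for real processes and Netmulticore's
message passing (the "matrix", the update type and all names are invented):

  (* Workers compute updates on disjoint parts of a shared matrix and send
     them over a mailbox; a single manager thread applies them.
     Needs the threads library (compile with -thread). *)

  type 'a mailbox = { q : 'a Queue.t; m : Mutex.t; c : Condition.t }

  let mailbox () =
    { q = Queue.create (); m = Mutex.create (); c = Condition.create () }

  let send mb msg =
    Mutex.lock mb.m;
    Queue.push msg mb.q;
    Condition.signal mb.c;
    Mutex.unlock mb.m

  let recv mb =
    Mutex.lock mb.m;
    while Queue.is_empty mb.q do Condition.wait mb.c mb.m done;
    let msg = Queue.pop mb.q in
    Mutex.unlock mb.m;
    msg

  type update = Update of int * float | Done   (* (row index, new value) *)

  let () =
    let matrix = Array.make 8 0.0 in            (* the "shared" structure *)
    let inbox = mailbox () in
    let worker lo hi =
      Thread.create
        (fun () ->
           for i = lo to hi - 1 do
             send inbox (Update (i, float i *. 2.0))   (* fake learning step *)
           done;
           send inbox Done)
        ()
    in
    let w1 = worker 0 4 and w2 = worker 4 8 in
    (* the manager is the only one that ever touches the matrix *)
    let finished = ref 0 in
    while !finished < 2 do
      (match recv inbox with
       | Update (i, v) -> matrix.(i) <- v
       | Done -> incr finished)
    done;
    Thread.join w1;
    Thread.join w2;
    Array.iteri (fun i v -> Printf.printf "row %d: %f\n" i v) matrix

In the real thing the workers would be separate processes, the matrix would
live in shared memory, and the mailbox would be whatever message-passing
primitive the framework provides; the toy is only meant to show the shape
of the data flow, with a single writer applying all updates.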

Note: a quick look at Netmulticore seems to indicate that I will still
have to jump through some hoops.

Hugo

> Gerd
>
> [1] Netmulticore: now available in ocamlnet-3.3.0test1:
> http://blog.camlcity.org/blog/multicore2.html
>
>>
>>
>> Best,
>>
>> --
>> Eray Ozkural, PhD candidate.  Comp. Sci. Dept., Bilkent University,
>> Ankara
>> http://groups.yahoo.com/group/ai-philosophy
>> http://myspace.com/arizanesil http://myspace.com/malfunct
>>
>
>


Thread overview: 74+ messages
     [not found] <2054357367.219171.1300974318806.JavaMail.root@zmbs4.inria.fr>
2011-03-24 23:13 ` Fabrice Le Fessant
2011-03-25  0:23   ` [Caml-list] " Sylvain Le Gall
2011-03-25  9:55   ` [Caml-list] " Alain Frisch
2011-03-25 11:44     ` Gerd Stolpmann
     [not found]   ` <1396338209.232813.1301046980856.JavaMail.root@zmbs4.inria.fr>
2011-03-25 10:23     ` Fabrice Le Fessant
2011-03-25 12:07       ` Gerd Stolpmann
2011-04-16 12:12         ` Jon Harrop
2011-03-25 10:51   ` Hugo Ferreira
2011-03-25 12:25     ` Gerd Stolpmann
2011-03-25 12:58       ` Hugo Ferreira
     [not found]       ` <341494683.237537.1301057887481.JavaMail.root@zmbs4.inria.fr>
2011-03-25 13:10         ` Fabrice Le Fessant
2011-03-25 13:41           ` Dario Teixeira
2011-03-30 18:12             ` Jon Harrop
2011-03-25 15:44           ` Hugo Ferreira
2011-03-25 18:24             ` Martin Jambon
2011-03-25 19:19               ` Hugo Ferreira
2011-03-25 20:26                 ` Gerd Stolpmann
2011-03-26  9:11                   ` Hugo Ferreira
2011-03-26 10:31                   ` Richard W.M. Jones
2011-03-30 16:56                     ` Jon Harrop
2011-03-30 19:24                       ` Richard W.M. Jones
2011-04-20 21:44                   ` Jon Harrop
2011-04-19  9:57                 ` Eray Ozkural
2011-04-19 10:05                   ` Hugo Ferreira
2011-04-19 20:26                   ` Gerd Stolpmann
2011-04-20  7:59                     ` Hugo Ferreira [this message]
2011-04-20 12:30                       ` Markus Mottl
2011-04-20 12:53                         ` Hugo Ferreira
2011-04-20 13:22                           ` Markus Mottl
2011-04-20 14:00                       ` Edgar Friendly
2011-04-19 22:49                 ` Jon Harrop
2011-03-30 17:02               ` Jon Harrop
2011-04-20 19:23               ` Jon Harrop
2011-04-20 20:05                 ` Alexy Khrabrov
2011-04-20 23:00                   ` Jon Harrop
     [not found]                   ` <76544177.594058.1303341821437.JavaMail.root@zmbs4.inria.fr>
2011-04-21  7:48                     ` Fabrice Le Fessant
2011-04-21  8:35                       ` Hugo Ferreira
2011-04-23 17:32                         ` Jon Harrop
2011-04-21  9:09                       ` Alain Frisch
     [not found]                         ` <20110421.210304.1267840107736400776.Christophe.Troestler+ocaml@umons.ac.be>
2011-04-21 19:53                           ` Hezekiah M. Carty
2011-04-22  8:34                           ` Alain Frisch
     [not found]                         ` <799994864.610698.1303412613509.JavaMail.root@zmbs4.inria.fr>
2011-04-22  8:06                           ` Fabrice Le Fessant
2011-04-22  9:11                             ` Gerd Stolpmann
2011-04-23 10:17                               ` Eray Ozkural
2011-04-23 13:47                                 ` Alexy Khrabrov
2011-04-23 17:39                                   ` Eray Ozkural
2011-04-23 20:18                                     ` Alexy Khrabrov
2011-04-23 21:18                                       ` Jon Harrop
2011-04-24  0:33                                       ` Eray Ozkural
2011-04-28 14:42                                         ` orbitz
2011-04-23 19:02                                 ` Jon Harrop
2011-04-22  9:44                             ` Vincent Aravantinos
2011-04-21 10:09                       ` Philippe Strauss
2011-04-23 17:44                         ` Jon Harrop
2011-04-23 17:05                       ` Jon Harrop
2011-04-20 20:30                 ` Gerd Stolpmann
2011-04-20 23:33                   ` Jon Harrop
2011-03-25 20:27             ` Philippe Strauss
2011-04-19 22:47           ` Jon Harrop
     [not found]           ` <869445701.579183.1303253283515.JavaMail.root@zmbs4.inria.fr>
2011-04-20  9:25             ` Fabrice Le Fessant
2011-03-25 18:45   ` Andrei Formiga
2011-03-30 17:00     ` Jon Harrop
2011-04-13  3:36   ` Lucas Dixon
2011-04-13 13:01     ` Gerd Stolpmann
2011-04-13 13:09       ` Lucas Dixon
2011-04-13 23:04       ` Goswin von Brederlow
2011-04-16 13:54     ` Jon Harrop
2011-03-24 13:44 Alexy Khrabrov
2011-03-24 14:57 ` Gerd Stolpmann
2011-03-24 15:03   ` Joel Reymont
2011-03-24 15:28     ` Guillaume Yziquel
2011-03-24 15:48       ` Gerd Stolpmann
2011-03-24 15:38     ` Gerd Stolpmann
2011-03-25 19:49   ` Richard W.M. Jones
