From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by yquem.inria.fr (Postfix) with ESMTP id B574DBBAF for ; Sat, 17 Jul 2010 21:35:26 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AhMDAESjQUzZSMDjkWdsb2JhbACfchUBAQEBCQsKBxEDH78uhSUE X-IronPort-AV: E=Sophos;i="4.55,219,1278280800"; d="scan'208";a="55717461" Received: from fmmailgate02.web.de ([217.72.192.227]) by mail2-smtp-roc.national.inria.fr with ESMTP; 17 Jul 2010 21:35:25 +0200 Received: from smtp03.web.de ( [172.20.0.65]) by fmmailgate02.web.de (Postfix) with ESMTP id 17DC716C4676A; Sat, 17 Jul 2010 21:35:25 +0200 (CEST) Received: from [78.43.204.177] (helo=frosties.localdomain) by smtp03.web.de with asmtp (TLSv1:AES256-SHA:256) (WEB.DE 4.110 #4) id 1OaDAS-0003az-00; Sat, 17 Jul 2010 21:35:24 +0200 Received: from mrvn by frosties.localdomain with local (Exim 4.72) (envelope-from ) id 1OaDAS-0008RU-9Y; Sat, 17 Jul 2010 21:35:24 +0200 From: Goswin von Brederlow To: Eray Ozkural Cc: Goswin von Brederlow , Rich Neswold , "caml-list@yquem.inria.fr" Subject: Re: [Caml-list] Smart ways to implement worker threads References: <87sk3mcaeq.fsf@frosties.localdomain> <87zkxsfvsg.fsf@frosties.localdomain> <87hbk0vzut.fsf@frosties.localdomain> <942FD9F6-601C-4F02-83E8-84CD280A04FF@gmail.com> Date: Sat, 17 Jul 2010 21:35:24 +0200 In-Reply-To: <942FD9F6-601C-4F02-83E8-84CD280A04FF@gmail.com> (Eray Ozkural's message of "Sat, 17 Jul 2010 21:34:12 +0300") Message-ID: <87tynxc35v.fsf@frosties.localdomain> User-Agent: Gnus/5.110009 (No Gnus v0.9) XEmacs/21.4.22 (linux, no MULE) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: goswin-v-b@web.de X-Sender: goswin-v-b@web.de X-Provags-ID: V01U2FsdGVkX1+UjZ21Z0C0l4CYmztwwGr1eMtQraay0x17ccrO mITG/jnwISjMmdSro6xC4ci/y8FlAulwwprBDFmhEivpKNiTzT rODRvbcA0= X-Spam: no; 0.00; eray:01 ozkural:01 abstraction:01 pointers:01 re-use:01 ocaml:01 trivial:01 algebra:01 topology:01 topology:01 ocaml:01 cheers:01 eray:01 parallelism:01 parallelism:01 Eray Ozkural writes: > When I'm implementing a parallel/dist algorithm I take care of making > the communication code abstract enough to be re-used. Since > abstraction in C derivative languages (function pointers, templates > etc) are a joke; one need not bother with this as expected future code > re-use isn't much. > > On the other hand in ocaml we have the best module system which can be > used to create advanced parallel data structures and algorithms. A > shared mem queue would be among the most trivial of such > constructs. If we think of world domination we have to see the whole > picture. Parallel programming is considered difficult because common > programmers don't understand enough algebra to see that most problems > could be solved by substituting an operator in a generic parallel > algorithm template. Or that optimal algorithms could be re-used > combinatorially (consider the relation of a mesh topology to a linear > topology with s/f routing) Yeah. I would love to have all the basic ocaml modules also as a thread safe flavour. > The fact is that an efficient parallel algorithm need not be long > (theory suggests that fastest is shortest). It is our collective lack > of creativity that usually makes it long. > > Cheers, > > Eray > > Ps: parallelism is not tangential here. I believe it is unnecessary to > implement asynchronous processes just for the sake of handling > overlapping I/O and computation. That's like parallelism for high > school programming class. I'm a big fan of doing IO asynchronously in a single thread. Given that ocaml can only run one thread at a time anyway there is usualy no speed gain in using multiple threads. The overhead of making the code thread save is big in complexity (if only we hade thread save modules :) and all you end up doing is waiting on more cores. The exception is when you offload work to C code that can run within enter_blocking_section() / leave_blocking_section() and you have enough work to keep multiple cores busy. For example doing blockwise sha256 sums with 4 cores is 3-3.8 times as fast as single threaded depending on blocksize. MfG Goswin