caml-list - the Caml user's mailing list
From: Jon Harrop <jon@ffconsultancy.com>
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] JIT & HLVM, LLVM
Date: Wed, 30 Sep 2009 16:22:57 +0100
Message-ID: <200909301622.57394.jon@ffconsultancy.com>
In-Reply-To: <caee5ad80909291808p5cb034aq62c3e71a7b189170@mail.gmail.com>

On Wednesday 30 September 2009 02:08:06 Mikkel Fahnøe Jørgensen wrote:
> 2009/9/27 Jon Harrop <jon@ffconsultancy.com>:
> > On Sunday 27 September 2009 20:23:13 kcheung@math.carleton.ca wrote:
> >> If Grand Central Dispatch makes its way
> >> into *nix...
> >
> > Incidentally, what exactly (technically) is Grand Central and how does it
> > relate to existing alternatives and the state-of-the-art? I Googled it
> > but failed to find anything useful...
>
> Grand Central Dispatch...

Thanks for the explanation! This seems to be attacking the same problem as 
Cilk and the .NET Task Parallel Library (TPL), but my immediate impression 
(based entirely upon your description) is that Cilk and the TPL are probably 
better foundations.

Using one work-stealing deque per core is much more efficient than the 
work-sharing queue you described, for two reasons (sketched below):

1. Less global synchronization.

2. Subtasks are likely to be executed on the core that spawned them, which 
improves cache efficiency.
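
For concreteness, a minimal sketch of the per-core deque in OCaml (using a 
mutex for brevity; a real scheduler would use a lock-free deque such as 
Chase-Lev, and the names here are only illustrative):

(* One deque per core: the owner pushes and pops at the bottom, idle cores
   steal from the top. *)
type 'a deque = {
  mutable items : 'a list;        (* head of the list is the bottom *)
  lock : Mutex.t;                 (* a lock-free deque would avoid this *)
}

let create () = { items = []; lock = Mutex.create () }

(* The owning core pushes and pops at the bottom, so it mostly contends only
   with the occasional thief, not with every other core. *)
let push_bottom d x =
  Mutex.lock d.lock;
  d.items <- x :: d.items;
  Mutex.unlock d.lock

let pop_bottom d =
  Mutex.lock d.lock;
  let r = match d.items with
    | [] -> None
    | x :: rest -> d.items <- rest; Some x
  in
  Mutex.unlock d.lock;
  r

(* An idle core steals the oldest task from the top, which tends to be a
   large chunk of work far from the owner's cache-hot data. *)
let steal_top d =
  Mutex.lock d.lock;
  let r = match List.rev d.items with
    | [] -> None
    | x :: rest -> d.items <- List.rev rest; Some x
  in
  Mutex.unlock d.lock;
  r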

The parallel "for" loop you described is similar to the TPL's and leaves a lot 
to be desired. Specifically, you want to execute clusters of consecutive 
tasks on the same core for two reasons:

1. Using the same core improves cache efficiency because the end of one task 
is often spatially related to the start of the next, e.g. parallel quicksort, 
linear algebra or image processing.

2. Executing clusters of tasks amortizes the cost of queueing tasks.
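
To make the clustering point concrete, here is a sketch in OCaml (assuming 
OCaml 5's Domain module) of a parallel "for" loop that hands each spawned 
task a cluster of consecutive indices. The fixed cluster size and the 
one-domain-per-cluster spawning are simplifications, and the recursive 
subdivision discussed next is the better way to obtain such clusters under 
work stealing:

(* Each task covers a cluster of consecutive indices, so the queueing cost is
   paid once per cluster rather than once per index. A real scheduler would
   feed the clusters to a fixed pool of workers instead of spawning a domain
   per cluster. *)
let parallel_for ?(cluster = 1024) lo hi f =
  let rec clusters i acc =
    if i > hi then acc
    else
      let j = min hi (i + cluster - 1) in
      clusters (j + 1) ((i, j) :: acc)
  in
  clusters lo []
  |> List.map (fun (i, j) ->
         Domain.spawn (fun () -> for k = i to j do f k done))
  |> List.iter Domain.join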

The former is easily solved with the Cilk/TPL design by using recursive 
subdivision instead of the index-sharing scheme you described, because 
subtasks are likely to be executed on the same core. However, that will not 
work with a work-sharing queue, because subtasks are executed on random 
cores. Moreover, a trivial extension to this higher-order function allows you 
to pass in another function that describes the complexity of each task. This 
allows the recursive function to subdivide the given range more intelligently 
into clusters of variable-complexity tasks with similar total running times. 
This is the technique I use in our commercial F# libraries, but I have not 
seen it described elsewhere.
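
A sketch of that idea in OCaml (illustrative only, not the F# library code): 
the range is split where a user-supplied complexity estimate is roughly 
halved, so the resulting clusters of variable-complexity tasks have similar 
estimated running times.

(* Recursive subdivision guided by a per-index [complexity] estimate.
   Recomputing the estimates at every level keeps the sketch short, and
   spawning a domain per subdivision is only for illustration: a real
   implementation hands the halves to a work-stealing pool. *)
let rec parallel_range ~complexity ~threshold lo hi f =
  let total = ref 0.0 in
  for i = lo to hi do total := !total +. complexity i done;
  if lo >= hi || !total <= threshold then
    for i = lo to hi do f i done          (* small enough: run serially *)
  else begin
    (* Find the split point where the cumulative estimate reaches half. *)
    let mid = ref lo and acc = ref 0.0 in
    (try
       for i = lo to hi - 1 do
         acc := !acc +. complexity i;
         if !acc >= !total /. 2.0 then (mid := i; raise Exit)
       done;
       mid := hi - 1
     with Exit -> ());
    let left =
      Domain.spawn (fun () -> parallel_range ~complexity ~threshold lo !mid f)
    in
    parallel_range ~complexity ~threshold (!mid + 1) hi f;
    Domain.join left
  end

For example, parallel_range ~complexity:float_of_int ~threshold:1e6 0 100_000 f 
splits the upper end of the range more finely than the lower end, because the 
estimated cost grows with the index.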

Cilk and the TPL also just rely upon the programmer to make tasks sufficiently 
complex that the time spent queueing and dequeueing them is insignificant. I 
solved this problem by injecting run-time profiling code (with global 
synchronizations per cluster of tasks) to measure the proportion of the time 
spent spawning rather than executing tasks, and then using exponential 
backoff to increase the cluster size until a sufficiently small proportion of 
the time is spent spawning. Again, I have not seen this technique described 
elsewhere.
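
The control loop for that is tiny. A sketch in OCaml, where run_batch is a 
hypothetical function that executes n tasks in clusters of the given size and 
returns the time spent spawning and the time spent executing:

(* Keep doubling the cluster size until the measured spawning overhead drops
   below [target_overhead] (e.g. 0.05 for 5%), or until one cluster covers
   the whole batch. *)
let tune_cluster_size ~run_batch ~target_overhead n =
  let rec go cluster =
    let spawn_time, exec_time = run_batch ~cluster n in
    let overhead = spawn_time /. (spawn_time +. exec_time) in
    if overhead <= target_overhead || cluster >= n then cluster
    else go (cluster * 2)                 (* exponential backoff *)
  in
  go 1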

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e


Thread overview: 18+ messages
2009-09-27 17:18 David McClain
2009-09-27 17:22 ` [Caml-list] " Vincent Aravantinos
2009-09-27 17:35   ` Edgar Friendly
2009-09-27 18:51     ` Jon Harrop
2009-09-27 19:07       ` Edgar Friendly
2009-09-27 19:23         ` kcheung
2009-09-27 19:33           ` Jon Harrop
2009-09-27 21:10           ` Jon Harrop
2009-09-27 21:45             ` Vincent Aravantinos
2009-09-30  1:08             ` Mikkel Fahnøe Jørgensen
2009-09-30 11:57               ` Stéphane Glondu
2009-09-30 12:54                 ` Mikkel Fahnøe Jørgensen
2009-09-30 13:42                   ` Stéphane Glondu
2009-09-30 19:11                     ` Mikkel Fahnøe Jørgensen
2009-09-30 15:22               ` Jon Harrop [this message]
2009-09-30 19:35                 ` Mikkel Fahnøe Jørgensen
2009-09-28 12:25     ` Gerd Stolpmann
     [not found] <20090927214558.D40C5BC5C@yquem.inria.fr>
2009-09-27 21:59 ` CUOQ Pascal
