caml-list - the Caml user's mailing list
From: Jon Harrop <jon@ffconsultancy.com>
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] MetaOcaml and high-performance [was: AST versus Ocaml]
Date: Tue, 10 Nov 2009 15:38:54 +0000
Message-ID: <200911101538.54857.jon@ffconsultancy.com>
In-Reply-To: <20091109042328.9330E1727F@Adric.ern.nps.edu>

On Monday 09 November 2009 04:23:28 oleg@okmij.org wrote:
> Because offshoring produces a portable C or Fortran code file, you can
> use the code on 32 or 64-bit platform. The reason the native MetaOCaml
> without offshoring does not work on amd64 is because at that time
> OCaml didn't emit PIC code for amd64. So, dynamic linking was
> impossible. That problem has long been fixed in later versions of
> OCaml...

Has the problem been fixed in MetaOCaml?

> Fortunately, some people have considered MetaOCaml to be a viable
> option for performance users and have reported good results. For
> example,
>
> 	Tuning MetaOCaml Programs for High Performance
> 	Diploma Thesis of Tobias Langhammer.
> 	http://www.infosun.fmi.uni-passau.de/cl/arbeiten/Langhammer.pdf
>
> Here is a good quotation from the Introduction:
>
> ``This thesis proposes MetaOCaml for enriching the domain of
> high-performance computing by multi-staged programming. MetaOCaml extends
> the OCaml language.
> ...
>     Benchmarks for all presented implementations confirm that the
> execution time can be reduced significantly by high-level
> optimizations. Some MetaOCaml programs even run as fast as respective
> C implementations. Furthermore, in situations where optimizations in
> pure MetaOCaml are limited, computation hotspots can be explicitly or
> implicitly exported to C. This combination of high-level and low-level
> techniques allows optimizations which cannot be obtained in pure C
> without enormous effort.''

That thesis contains three benchmarks:

1. Dense float matrix-matrix multiply.

2. Blur of an int image matrix as convolution with a 3x3 stencil matrix.

3. Polynomial multiplication with distributed parallelism.
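
For reference, the first two kernels can be sketched in plain OCaml as
below. This is only an illustrative baseline, not the code from the thesis:
the names `matmul` and `blur3x3`, the array-of-arrays representation, the
`norm` divisor and the choice to copy border pixels unchanged are all
assumptions made for the sketch.

```ocaml
(* Naive dense float matrix-matrix multiply, C = A * B, for square
   n x n matrices stored as float array array (assumed layout).
   The i-k-j loop order walks rows of b and c contiguously. *)
let matmul a b =
  let n = Array.length a in
  let c = Array.make_matrix n n 0.0 in
  for i = 0 to n - 1 do
    for k = 0 to n - 1 do
      let aik = a.(i).(k) in
      for j = 0 to n - 1 do
        c.(i).(j) <- c.(i).(j) +. aik *. b.(k).(j)
      done
    done
  done;
  c

(* Blur of an int image as a convolution with a 3x3 stencil.
   Border pixels are copied unchanged and the weighted sum is divided
   by [norm] (default 1); both choices are assumptions of this sketch. *)
let blur3x3 ?(norm = 1) stencil img =
  let h = Array.length img in
  let w = if h = 0 then 0 else Array.length img.(0) in
  let out = Array.map Array.copy img in
  for y = 1 to h - 2 do
    for x = 1 to w - 2 do
      let acc = ref 0 in
      for dy = -1 to 1 do
        for dx = -1 to 1 do
          acc := !acc + stencil.(dy + 1).(dx + 1) * img.(y + dy).(x + dx)
        done
      done;
      out.(y).(x) <- !acc / norm
    done
  done;
  out
```

A box blur would be `blur3x3 ~norm:9 ones img`, where `ones` is a 3x3
matrix of 1s. Neither loop nest is blocked, vectorised or parallelised, so
these are starting points rather than tuned implementations.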

I don't know about polynomial multiplication (suffice it to say that it does
not leverage shared-memory parallelism, which is what performance users value
in today's multicore era), but the code for the first two benchmarks is
probably 10-100x slower than any decent implementation. For example, his
fastest 2048x2048 matrix multiply takes 167s whereas Matlab takes only 3.6s
here.
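
As a rough sanity check on those two timings, here is a minimal sketch that
converts them into sustained floating-point rates. It assumes the
conventional count of 2n^3 operations for a dense n x n multiply; the
function names `matmul_flops` and `gflops` are made up for the
illustration.

```ocaml
(* Conventional operation count for a dense n x n matrix multiply:
   n^3 multiplications plus roughly n^3 additions (an assumption;
   the exact count is n^2 * (2n - 1)). *)
let matmul_flops n = 2.0 *. (float_of_int n ** 3.0)

(* Sustained rate in GFLOP/s for a run that took [seconds]. *)
let gflops n seconds = matmul_flops n /. seconds /. 1e9

let () =
  let n = 2048 in
  Printf.printf "thesis: %.2f GFLOP/s\n" (gflops n 167.0);
  Printf.printf "Matlab: %.2f GFLOP/s\n" (gflops n 3.6);
  Printf.printf "ratio:  %.1fx\n" (167.0 /. 3.6)
```

Under that assumption the figures work out to about 0.10 GFLOP/s against
about 4.77 GFLOP/s, a ratio of roughly 46x. The two timings were presumably
taken on different machines, so the ratio is indicative only.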

In essence, the performance gain (if any) from offshoring to C or Fortran is 
dwarfed by the lack of shared-memory parallelism.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

