From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Tue, 6 Nov 2001 15:06:21 +0100
From: Xavier Leroy <xavier.leroy@inria.fr>
To: Chris Hecker <checker@d6.com>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] native code optimization priorities
Message-ID: <20011106150621.B3221@pauillac.inria.fr>
References: <4.3.2.7.2.20011030174718.02888870@arda.pair.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <4.3.2.7.2.20011030174718.02888870@arda.pair.com>; from checker@d6.com on Tue, Oct 30, 2001 at 07:08:37PM -0800
Sender: owner-caml-list@pauillac.inria.fr
Precedence: bulk

> However, I have a bunch of small things I'd like to implement (or
> see implemented) for making native numerical code faster.  This is
> primarily for my video game work, but the kinds of things I have in
> mind will also help any numerically intensive application.  So, here
> are my questions:
> 
> 0.  How important is optimization to the team?

Generating efficient machine code has always been an important aspect
of OCaml, and I spent quite a bit of work on this at the beginning of
the OCaml development (1995-97).  Nowadays, we are largely satisfied
with the performance of the generated code, and get very few requests
for improving it, so this aspect of the OCaml implementation has
received little attention recently.

Also, I believe we've hit the point of diminishing returns: the major
optimizations (that lead to significant speedups on many programs) are
already in the ocamlopt compiler; further optimizations would (I
believe) result in tiny speedups (less than 5%) or be extremely
specific to a couple of test programs.

> 1.  Are there any new (big or small) optimizations planned or in the works?

Not really.  Like other members of the OCaml development team, I have
vague ideas about things that could be done, e.g. a Pentium-4 back-end
that would use SSE2 registers for floating-point, but this is all
low priority.

Of course, we are committed to track changes in dominant processor
architectures; for instance, if the IA64 becomes widespread (heaven
forbid), some effort will have to be invested in cross-basic-block
instruction scheduling, if-conversion, and perhaps exploitation of
advanced loads.  But the fact is that computer architectures viewed
from the compiler writer's standpoint haven't changed significantly in
the last 5 years: these hardware guys do such a good job of cranking
out better and faster processors that require no change in the compiler...

> 2.  What's the relative priority of new features versus compiler
> optimizations?

As I said above, the demand for more optimizations is low.  Moreover,
advanced compiler optimizations require a lot of implementation and
testing work.

> 3.  Is there some kind of standard suite of test applications the
> caml team runs to figure out whether an optimization is worth it to
> include?

I make intensive use of the small benchmark suite available at:
        http://camlcvs.inria.fr/cgi-bin/cvsweb.cgi/ocaml/test/
These are mostly small benchmarks, but some of them (KB, fft, nucleic)
predict the performance of bigger applications fairly well.
The ICFP programming contest entries of the last three years have also
been used as benchmarks several times.  Finally, the Coq theorem
prover stresses the compiler and runtime system quite well as far as
symbolic processing is concerned.

> 4.  Are numerical operations an important area for ocaml to succeed?

Although ML is historically rooted in symbolic processing, I did quite
a bit of work on the compiler to achieve decent floating-point performance.
Still, symbolic processing is OCaml's bread-and-butter, and takes
precedence over floating-point performance.

>  Put another way, if an optimization helps numerical code but does
>  not help other code (or even slightly hurts it), how would that patch
>  be received?

Does not help: OK.  Slightly hurts it: that might be a problem.  OCaml
contains one instance of this: float arrays are special-cased in a way
that tremendously improves the performance of floating-point code,
but slows down polymorphic code operating on arrays.  I still think
this was an acceptable trade-off, but not everyone agrees.
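
A minimal sketch of how this special case shows up from user code,
assuming a default OCaml configuration in which flat float arrays are
enabled.  The function names are hypothetical, and Obj is used here
only to inspect the representation:

```ocaml
(* Illustration only: assumes flat float arrays are enabled, which is
   the default configuration.  All function names are hypothetical. *)

(* A float array is stored as one block of unboxed doubles, which the
   runtime marks with a dedicated tag. *)
let is_flat_float_array (a : 'a array) : bool =
  Obj.tag (Obj.repr a) = Obj.double_array_tag

(* Monomorphic access: the element type is statically known to be
   float, so the compiler can emit a direct unboxed load. *)
let sum_floats (a : float array) : float =
  let s = ref 0.0 in
  for i = 0 to Array.length a - 1 do
    s := !s +. a.(i)
  done;
  !s

(* Polymorphic access: the element type is unknown here, so each access
   must test at run time whether the array is a flat float array.
   This is the extra cost paid by polymorphic array code. *)
let first (a : 'a array) : 'a = a.(0)

let () =
  assert (is_flat_float_array [| 1.0; 2.0; 3.0 |]);
  assert (not (is_flat_float_array [| 1; 2; 3 |]));
  assert (sum_floats [| 1.0; 2.0; 3.0 |] = 6.0);
  assert (first [| 1.5; 2.5 |] = 1.5)
```

The same source expression a.(i) is thus compiled differently
depending on whether the element type is known at compile time.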

Some of my earlier work on type-directed compilation (the Gallium
experimental compiler) was abandoned because while it improved the
performance of floating-point and integer computations, it slowed down
the garbage collector too much, causing pure symbolic processing to
take an unacceptable performance hit.

> What about command line options for optimization (of which there
> are very few now) to offset this effect?

Only if we absolutely must.  The problem with having lots of compiler
flags is that it makes testing the compiler much harder -- in
principle, all combinations of flags should be tested...
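
To make the growth concrete: n independent on/off flags give 2^n
configurations to test.  A small sketch that enumerates them; the flag
names below are made up for illustration and are not real ocamlopt
options:

```ocaml
(* Hypothetical flag names, for illustration only; these are not real
   ocamlopt options. *)
let flags = [ "-opt-a"; "-opt-b"; "-opt-c" ]

(* All subsets of a list of flags: each flag is either absent or
   present, so the number of results doubles with every flag. *)
let rec combinations = function
  | [] -> [ [] ]
  | f :: rest ->
      let without = combinations rest in
      without @ List.map (fun c -> f :: c) without

let () =
  let all = combinations flags in
  (* 3 flags give 2^3 = 8 command lines to test. *)
  assert (List.length all = 8);
  List.iter
    (fun c -> print_endline (String.concat " " ("ocamlopt" :: c)))
    all
```

With ten such flags the same enumeration would already produce 1024
configurations.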

> 5.  How does the team feel about optimizations added to the x86 code
> generator that don't help other platforms?

Fine with me.  Like all compiler writers, I hate the IA32
architecture, but that's what everyone uses these days.  The ocamlopt
back-end already contains quite a bit of IA32-specific code (in the
instruction selection phase, for instance).

Hope this answers your questions.

- Xavier Leroy