From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id XAA25653; Sun, 18 Aug 2002 23:37:44 +0200 (MET DST)
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id XAA25676 for <caml-list@pauillac.inria.fr>; Sun, 18 Aug 2002 23:37:43 +0200 (MET DST)
Received: from praseodumium.btinternet.com (praseodumium.btinternet.com [194.73.73.82])
	by nez-perce.inria.fr (8.11.1/8.11.1) with ESMTP id g7ILYOj20634
	for <caml-list@inria.fr>; Sun, 18 Aug 2002 23:34:59 +0200 (MET DST)
Received: from host217-39-105-36.in-addr.btopenworld.com ([217.39.105.36] helo=beertje.william.bogus)
	by praseodumium.btinternet.com with esmtp (Exim 3.22 #8)
	id 17gXgV-0005uf-00; Sun, 18 Aug 2002 22:34:08 +0100
Received: (from williamc@localhost)
	by beertje.william.bogus (8.11.6/8.11.6/SuSE Linux 0.5) id g7ILbSu01884;
	Sun, 18 Aug 2002 22:37:28 +0100
X-Authentication-Warning: beertje.william.bogus: williamc set sender to williamc@paneris.org using -f
From: William Chesters <williamc@paneris.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <15712.5015.83450.380242@beertje.william.bogus>
Date: Sun, 18 Aug 2002 22:37:27 +0100
To: caml-list@inria.fr
Subject: Re: [Caml-list] O'Caml vs C++: a little benchmark
In-Reply-To: <200208181906.PAA00297@hickory.cc.columbia.edu>
References: <200208181716.NAA10426@hickory.cc.columbia.edu>
	<15711.57520.125841.25102@beertje.william.bogus>
	<200208181906.PAA00297@hickory.cc.columbia.edu>
X-Mailer: VM 6.95 under 21.4 (patch 4) "Artificial Intelligence" XEmacs Lucid
Sender: owner-caml-list@pauillac.inria.fr
Precedence: bulk

Oleg writes:
 > > You would see this clearly in the assembler output
 > > if you used ocamlopt -S.
 > 
 > Is there a tutorial for reading those *.s files ocamlopt produces?

Um.  "*.s files" are a human-readable representation of the basic
machine codes on which the CPU in your computer actually operates.
This so-called "assembler language" is standardised by the CPU
manufacturer.  For instance, if you are using Linux the CPU is
probably a Pentium and you can read about the assembler language at

    http://www.intel.com/design/intarch/techinfo/pentium/instsum.htm

or try

    http://www.google.com/search?q=%22intel+assembler

The advantage of looking at these files is that they tell
unambiguously what the CPU will be doing when it executes your
program, in a way which is comparable across languages.  If you are
interested in performance, programming idioms, and compilers, you
should definitely learn about these things.  (With modern CPUs it's no
longer so easy to relate the assembler code directly to performance
but it's still a fair guide to what the compiler is doing.)

 > In view of this, what stopped O'Caml creators from letting ocamlopt inline 
 > functions across module boundaries (especially if it's true that this could 
 > be responsible for a 100x speed boost)?

Well, it's not quite as simple as that.  You could say there are two
ways to achieve high performance.  The simple way is to use a nice
clean, but effective, compiler, and use programming styles which you
can see directly will lead to efficient code if compiled in the
obvious way.  The tricky way is to use programming styles which are,
on the face of it, highly inefficient, and rely on a funky compiler to
"see how to make them efficient".  You have attempted the latter
approach, but ocaml follows the former philosophy, so your program
works badly.

Of course the "funky compiler" approach seems very attractive.  The
point, though, is that no (current) compiler is capable of
automatically doing a great job under *all* circumstances.  In fact if
you try to use high-level abstractions in performance-sensitive code,
you very quickly get into a complicated and fairly pointless game of
trying to figure out how to fine-tune your code to make the compiler
do what you want.  Some compilers (like the KAI C++ compiler) let you
go considerably further than ocaml, but it's still entirely
satisfactory for moderately complex applications.  The problem is that
the more you rely on the compiler to bridge the gap between "what you
say" in the source and "what you mean" in the machine code, the more
uncertainty there is about what it will really do.

In practice what you end up doing is using abstractions in the outer
loops where performance is less important, and low-level idioms in the
inner loops.  You have to do that with ocaml, but in most cases even
with the best C++ compilers.

So ocaml's fairly minimalist approach is more defensible than you
might think.

Having said that I for one would like to see inter-module inlining in
the compiler and I don't think it would be hard to do.  One day I may
find the time ...
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners