From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id TAA26425; Sat, 4 Jan 2003 19:31:20 +0100 (MET)
X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id TAA26417 for <caml-list@pauillac.inria.fr>; Sat, 4 Jan 2003 19:31:19 +0100 (MET)
Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35])
	by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id h04IVIr23940
	for <caml-list@inria.fr>; Sat, 4 Jan 2003 19:31:18 +0100 (MET)
Received: (from xleroy@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id TAA26386 for caml-list@inria.fr; Sat, 4 Jan 2003 19:31:18 +0100 (MET)
Date: Sat, 4 Jan 2003 19:31:18 +0100
From: Xavier Leroy <xavier.leroy@inria.fr>
To: caml-list@inria.fr
Subject: Re: Coyote Gulch test in Caml (was Re: [Caml-list] speed )
Message-ID: <20030104193118.A26208@pauillac.inria.fr>
References: <3E15B3B3.3040106@163.com> <20030103143221.B29601@pauillac.inria.fr> <200301021753.h02Hr7Xh005056@nautilus-chet.watson.ibm.com> <20030103071042.T22850@speakeasy.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <20030103071042.T22850@speakeasy.org>; from shawnw@speakeasy.org on Fri, Jan 03, 2003 at 07:10:42AM -0800
Sender: owner-caml-list@pauillac.inria.fr
Precedence: bulk

> http://raevnos.pennmush.org/code/almabench-ocaml.tar.gz
> It's pretty much a straight translation of the C++ version, and not very
> impressive speed-wise on my system compared to the C++ one.

Thanks a lot for the OCaml translation.  As you say, the speed of the
OCaml version is about 50% of that of the C++ version, both on Athlon
with g++, and on Alpha with the Tru64 cxx compiler.  This is both
reassuring and disappointing:

Reassuring, because our blanket performance statement "OCaml
delivers at least 50% of the performance of a decent C compiler" is
not invalidated :-)

Disappointing, because the assembly code generated by ocamlopt isn't
too ugly despite the code not being very Caml-ish in style.  In
particular, (almost) all float and ref boxing is correctly eliminated.
Given this, I was expecting maybe 75% of the performances of C++, not
50%.  Simple hand optimization (CSE, loop unrolling) doesn't affect
the speed significantly.  Apparently, the ocamlopt-generated code
offers less instruction-level parallelism than the g++-generated code
for the float computations.  Still, I haven't really understood where
the factor of 2 comes from.  

Thanks again for an interesting benchmark,

- Xavier Leroy
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners