From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id RAA11781; Thu, 27 Mar 2003 17:06:55 +0100 (MET) Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id RAA12018 for ; Thu, 27 Mar 2003 17:06:53 +0100 (MET) Received: from nef.ens.fr (nef.ens.fr [129.199.96.32]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id h2RG6p524094 for ; Thu, 27 Mar 2003 17:06:52 +0100 (MET) Received: from dmi.ens.fr (dmi.ens.fr [129.199.96.11]) by nef.ens.fr (8.12.8/1.01.28121999) with ESMTP id h2RG6oqZ059620 for ; Thu, 27 Mar 2003 17:06:50 +0100 (CET) Received: from basilic.ens.fr (monniaux@basilic [129.199.99.48]) by dmi.ens.fr (8.10.1/jb-1.3-180200) id h2RG6k728662 ; Thu, 27 Mar 2003 17:06:46 +0100 (MET) Received: from localhost (monniaux@localhost) by basilic.ens.fr (8.11.0/jb-1.1) id h2RG6ix03308 ; Thu, 27 Mar 2003 17:06:44 +0100 (MET) X-Authentication-Warning: basilic.ens.fr: monniaux owned process doing -bs Date: Thu, 27 Mar 2003 17:06:44 +0100 (MET) From: David Monniaux X-Sender: monniaux@basilic.ens.fr To: Liste CAML cc: Antoine Mine Subject: OCaml performance (was: Re: [Caml-list] DFT in OCaml vs. C) In-Reply-To: <20030327153247.A5015@pauillac.inria.fr> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 Content-Transfer-Encoding: 8BIT X-Spam: no; 0.00; monniaux:01 caml-list:01 gcc:01 delivers:99 sourceforge:01 monomorphic:01 curiously:01 compactor:01 17%:98 compiler:01 ocaml:01 caml:01 garbage:01 surprising:01 assembler:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk > The "Pentium 4 SSE2" column is an experimental code generator for the > Pentium 4 that uses SSE2 instructions and registers for floating-point > computations. (Before you ask: no, it's not publically available, In this case, to get meaningful comparison results, you should use gcc -march=pentium4 -msse2 or icc -march=pentium4 > and it delivers about 2/3 of the performances of C, even on the Pentium. Let me tell you about our experience here. We are developing a large program consisting of - a large part of Caml code handling complex data structures - a smaller C library handling certain numerical matrix computations that are triggered by the Caml code - some C (+ assembler) libraries dealing with system-dependent issues. I profiled the code using OProfile (http://oprofile.sourceforge.net), for expenses in clock cycles and cache faults. Earlier attempts were made with gprof. It turned out that we spent a significant amount of time in: - The Caml polymorphic compare function (15% time + some cache faults) Part of the problem seems to lie with the fact that the same function is called when comparing strings, int64's and other types, thus the processor has to do lots of tests and jumps just to get at the correct comparison function. Wouldn't it be reasonable to define String.compare and Int64.compare to call monomorphic functions? - The garbage collector (15% time + lots of cache faults) There's little we can do about it. Changing the size of the minor heap, adjusting it to optimize the use of L2 cache seems to gain 2.30% of the total running time. Curiously, using the compactor seems to slow things slightly. Would it be possible to optimize the GC cache-wise? For instance, have it ask the processor to "prefetch" data. - 17% in a particular matrix function written in C. There's little we can do except trying to optimize it carefully and compiling it with the best C compiler around. - The rest of the time is spent within the Caml code. Now this was a bit surprising to us, because we thought we spent far more time in the numerical computations. Now back to the original question about DFTs. In your real-life application, will DFT computations make a major part of the clock cycles spent by the program? David Monniaux http://www.di.ens.fr/~monniaux Laboratoire d'informatique de l'École Normale Supérieure, Paris, France ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners