From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id WAA22072; Thu, 27 Mar 2003 22:21:48 +0100 (MET) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id WAA21695 for ; Thu, 27 Mar 2003 22:21:47 +0100 (MET) Received: from marble.he.net (marble.he.net [216.218.230.2]) by nez-perce.inria.fr (8.11.1/8.11.1) with ESMTP id h2RLLhX15913 for ; Thu, 27 Mar 2003 22:21:46 +0100 (MET) Received: from ucdavis.edu ([208.227.214.58] (may be forged)) by marble.he.net (8.8.6/8.8.2) with ESMTP id NAA14192 for ; Thu, 27 Mar 2003 13:21:40 -0800 Message-ID: <3E836CD4.5030206@ucdavis.edu> Date: Thu, 27 Mar 2003 13:27:48 -0800 From: Issac Trotts User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.1) Gecko/20020913 Debian/1.1-1 MIME-Version: 1.0 To: OCaml List Subject: Re: OCaml performance (was: Re: [Caml-list] DFT in OCaml vs. C) References: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam: no; 0.00; issac:01 trotts:01 ijtrotts:01 caml-list:01 monniaux:01 gcc:01 delivers:99 sourceforge:01 monomorphic:01 curiously:01 compactor:01 17%:98 compiler:01 ocaml:01 caml:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk David Monniaux wrote: >>The "Pentium 4 SSE2" column is an experimental code generator for the >>Pentium 4 that uses SSE2 instructions and registers for floating-point >>computations. (Before you ask: no, it's not publically available, >> >> > >In this case, to get meaningful comparison results, you should use >gcc -march=pentium4 -msse2 or icc -march=pentium4 > > > >>and it delivers about 2/3 of the performances of C, even on the Pentium. >> >> > >Let me tell you about our experience here. We are developing a large >program consisting of >- a large part of Caml code handling complex data structures >- a smaller C library handling certain numerical matrix computations that > are triggered by the Caml code >- some C (+ assembler) libraries dealing with system-dependent issues. > >I profiled the code using OProfile (http://oprofile.sourceforge.net), for >expenses in clock cycles and cache faults. Earlier attempts were made with >gprof. > >It turned out that we spent a significant amount of time in: > >- The Caml polymorphic compare function (15% time + some cache faults) > > Part of the problem seems to lie with the fact that the same function is > called when comparing strings, int64's and other types, thus the > processor has to do lots of tests and jumps just to get at the correct > comparison function. > > Wouldn't it be reasonable to define String.compare and Int64.compare to > call monomorphic functions? > >- The garbage collector (15% time + lots of cache faults) > > There's little we can do about it. Changing the size of the minor heap, > adjusting it to optimize the use of L2 cache seems to gain 2.30% of the > total running time. > > Curiously, using the compactor seems to slow things slightly. > > Would it be possible to optimize the GC cache-wise? For instance, have > it ask the processor to "prefetch" data. > >- 17% in a particular matrix function written in C. There's little we can > do except trying to optimize it carefully and compiling it with the best > C compiler around. > >- The rest of the time is spent within the Caml code. > >Now this was a bit surprising to us, because we thought we spent far more >time in the numerical computations. > > >Now back to the original question about DFTs. In your real-life >application, will DFT computations make a major part of the clock cycles >spent by the program? > There's a small image processing experiment I want to do that will compute lots of DFTs on small sub-images and will probably spend most of its clock cycles doing the transforms. - Issac ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners