From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 18 Jan 2003 15:50:54 -0800
From: Shawn Wagner
To: caml-list@inria.fr
Subject: Re: Coyote Gulch test in Caml (was Re: [Caml-list] speed )
Message-ID: <20030118155054.C22850@speakeasy.org>
References: <3E15B3B3.3040106@163.com> <20030103071042.T22850@speakeasy.org> <20030104193118.A26208@pauillac.inria.fr> <200301181749.48295.oleg_inconnu@myrealbox.com>
In-Reply-To: <200301181749.48295.oleg_inconnu@myrealbox.com>; from oleg_inconnu@myrealbox.com on Sat, Jan 18, 2003 at 05:49:47PM -0500

On Sat, Jan 18, 2003 at 05:49:47PM -0500, Oleg wrote:
> On Saturday 04 January 2003 01:31 pm, Xavier Leroy wrote:
> > Apparently, the ocamlopt-generated code
> > offers less instruction-level parallelism than the g++-generated code
> > for the float computations. Still, I haven't really understood where
> > the factor of 2 comes from.
>
> It's been a couple of weeks. I'm wondering if you got any new insights into
> this?
>
> Just as wild guess: the code contains calls to "sin" and "cos" on the same
> value.
> Perhaps GCC manages to optimize those into one call to "sincos"

It doesn't. I tried making a C++ version that does when I was fooling around
with it, and it didn't really help. The single greatest speed increase I got
(it cut the runtime roughly in half) came from -ffast-math, which cuts out
the trig function calls in favor of direct use of the proper x86
instructions. But the inlined sincos (__sincos()) in glibc caused a segfault
on my Athlon when I tried using it. :P

Something like gcc's -ffast-math for ocamlopt would be nice, but improving
the scheduler so that it can target specific CPUs, as gcc does with good
results, is probably of more general use. Targeting athlon instead of i386
cut the almabench time by 30 seconds, for example. I don't know anything
about the code-generation bits of ocamlopt, though, so I have no idea how
big a project that would be.

-- 
Shawn Wagner
shawnw@speakeasy.org
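
For anyone who wants to try the combined-call variant discussed above, here is a minimal sketch of what "one call to sincos" could look like in C++. It is not the version tested in this thread; the helper names sin_cos and rotate are hypothetical, and the fast path assumes glibc's sincos() GNU extension is visible (g++ normally defines _GNU_SOURCE). Elsewhere it falls back to two ordinary calls.

```cpp
#include <cmath>

// Hypothetical helper (name not from the thread): obtain sin(x) and cos(x)
// for the same argument through a single entry point.
inline void sin_cos(double x, double& s, double& c) {
#if defined(__GLIBC__) && defined(_GNU_SOURCE)
    // Assumption: glibc's GNU extension is declared; one library call
    // fills in both results.
    ::sincos(x, &s, &c);
#else
    // Portable fallback: two separate calls, same results.
    s = std::sin(x);
    c = std::cos(x);
#endif
}

// Illustrative caller: rotate the point (x, y) by theta radians,
// evaluating the sine and cosine of theta only once.
inline void rotate(double theta, double& x, double& y) {
    double s = 0.0, c = 0.0;
    sin_cos(theta, s, c);
    const double nx = c * x - s * y;
    const double ny = s * x + c * y;
    x = nx;
    y = ny;
}
```

Whether this helps at all depends on the compiler and libm in use, so timing it against the plain sin/cos version with and without -ffast-math is the only way to tell.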