From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id ADA4ABC88 for ; Tue, 8 Feb 2005 12:26:42 +0100 (CET) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j18BQgRJ002963 for ; Tue, 8 Feb 2005 12:26:42 +0100 Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id MAA17122 for ; Tue, 8 Feb 2005 12:26:41 +0100 (MET) Received: from will.iki.fi (will.iki.fi [217.169.64.20]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j18BQf3P030369; Tue, 8 Feb 2005 12:26:41 +0100 Received: from acerf.exomi.com (fa-3-0-0.fw.exomi.com [217.169.64.99]) by will.iki.fi (Postfix) with ESMTP id 49D2F138; Tue, 8 Feb 2005 13:26:41 +0200 (EET) Subject: Re: [Caml-list] [Benchmark] NBody From: Ville-Pertti Keinonen To: Xavier Leroy Cc: Christophe TROESTLER , "O'Caml Mailing List" In-Reply-To: <20050208104312.GA10035@yquem.inria.fr> References: <20050207.195724.87945401.Christophe.Troestler@umh.ac.be> <20050208104312.GA10035@yquem.inria.fr> Content-Type: text/plain Date: Tue, 08 Feb 2005 13:26:36 +0200 Message-Id: <1107861996.654.50.camel@localhost> Mime-Version: 1.0 X-Mailer: Evolution 2.0.3 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit X-Miltered: at nez-perce with ID 4208A1F2.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at concorde with ID 4208A1F1.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 wrote:01 gcc:01 ocaml:01 gcc:01 stack:01 ocaml:01 unboxed:01 stack:01 allocations:01 aligned:01 aligned:01 binary:01 ...:98 variables:02 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.2 X-Spam-Level: On Tue, 2005-02-08 at 11:43 +0100, Xavier Leroy wrote: > The best gcc output is faster than the best OCaml output by about 30%. > Looking at the asm code, the main difference is that gcc keeps some > float variables (dx, dy, dz, etc) in the floating-point stack while > OCaml stores them (unboxed) to the stack. Maybe the Java > implementation you used manages to use the float stack. Who knows. An interesting question is whether Java aligns allocations and stack to 4-byte or 8-byte boundaries on x86. A few years ago, when keeping the stack aligned for better floating point performance was a new gcc feature, and only worked as long as it was aligned initially (and consistently kept it misaligned if it wasn't), I played around with a floating point intensive C program that would exhibit an almost 40% difference in performance depending on what the binary was called...