From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id D2015BC84 for ; Wed, 30 Mar 2005 09:37:52 +0200 (CEST) Received: from ptb-relay03.plus.net (ptb-relay03.plus.net [212.159.14.214]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j2U6jFSR028560 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Wed, 30 Mar 2005 08:45:15 +0200 Received: from [80.229.56.224] (helo=chetara) by ptb-relay03.plus.net with esmtp (Exim) id 1DGWx0-0002SS-Qe for caml-list@yquem.inria.fr; Wed, 30 Mar 2005 06:45:15 +0000 From: Jon Harrop Organization: Flying Frog Consultancy Ltd. To: caml-list@yquem.inria.fr Subject: 32- and 64-bit performance Date: Wed, 30 Mar 2005 03:40:15 +0100 User-Agent: KMail/1.7.1 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200503300340.15874.jon@ffconsultancy.com> X-Miltered: at concorde with ID 424A4AFB.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; timings:01 ocamlopt:01 ocaml:01 ocaml:01 model:01 2048:01 ocamlopt:01 frog:98 slower:01 integer:01 checking:01 primes:01 caml:02 debian:02 bounds:02 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.2 X-Spam-Level: I just bought a new Athlon 64 laptop and installed 32- and 64-bit Debian. Here are some timings, showing the performance change when moving from 32- to 64-bit using ocamlopt (3.08.2) and g++ (3.4.4): Sieve primes up to 10^8 (bit-twiddling/array limited): 32-bit OCaml: 7.102s 64-bit OCaml: 5.697s Ratio: 1.25 32-bit C++: 19.145s 64-bit C++: 13.433s Ratio: 1.43 100th-nearest neighbours from a 10k-atom model of amorphous silicon (de/allocation limited): 32-bit OCaml: 28.407s 64-bit OCaml: 35.538s Ratio: 0.80 32-bit C++: 14.035s 64-bit C++: 12.392s Ratio: 1.13 Generate, bubble sort and accumulate an array of 10^4 double-precision random floating-point numbers: 32-bit OCaml: 1.185s 64-bit OCaml: 0.785s Ratio: 1.51 32-bit C++: 1.471s 64-bit C++: 0.957s Ratio: 1.54 without bounds checking: 32-bit OCaml: 0.992s 64-bit OCaml: 0.591s Ratio: 1.68 32-bit C++: 1.249s 64-bit C++: 0.705s Ratio: 1.77 2048^2 mandelbrot (float-arithmetic limited): 32-bit OCaml: 2.946s 64-bit OCaml: 1.704s Ratio: 1.73 32-bit C++: 1.479s 64-bit C++: 1.161s Ratio: 1.27 1024 FFTs and iFFTs (float-arithmetic limited): 32-bit OCaml: 31.491s 64-bit OCaml: 9.260s Ratio: 3.40 32-bit C++: 8.441s 64-bit C++: 8.562s Ratio: 0.99 Accumulate a Lorentzian over the number of integer triples (i, j, k) which lie in i^2 + j^2 + k^2 < 400 (float-arithmetic limited): 32-bit OCaml: 16.329s 64-bit OCaml: 9.459s Ratio: 1.73 32-bit C++: 8.002s 64-bit C++: 5.933s Ratio: 1.35 So ocamlopt does seem to generate significantly better code in these examples, particularly when they are floating point intensive. Also, only one test is slower in 64-bit, due to its heavy use of trees. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. Objective CAML for Scientists http://www.ffconsultancy.com/products/ocaml_for_scientists