From mboxrd@z Thu Jan 1 00:00:00 1970
Mime-Version: 1.0 (Apple Message framework v619)
Content-Transfer-Encoding: 7bit
Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="Apple-Mail-1-287674356"
To: caml-list@yquem.inria.fr
From: "Will M. Farr"
Subject: ocamlprof versus gprof on Mac os X
Date: Sun, 9 Jan 2005 13:46:49 -0500

--Apple-Mail-1-287674356
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=US-ASCII; format=flowed

Hello all,

I've been trying to optimize a quick nbody gravitational calculation in ocaml (it's for the shootout benchmark at http://shootout.alioth.debian.org/ ), and I've been getting some weird results out of gprof. I have a function which is used twice to compute the total energy of the system --- once at the beginning of the integration, and once at the end. The output of ocamlprof on a bytecode executable reflects the number of times this function is called correctly:

    let energy bl =
      (* 2 *) let t_accumulator acc b =
        (* 10 *) acc +. (t b) in
      let v_accumulator acc a b =
        (* 20 *) acc +. (v a b) in
      let acc1 = List.fold_left t_accumulator 0.0 bl in
      fold_left_n2 v_accumulator acc1 bl;;

However, no matter how many integration steps I take (1000, 1000000, 10000000), this function appears to take up the same fraction of the runtime according to gprof (after compiling with ocamlopt -p):

      %   cumulative    self               self     total
     time   seconds    seconds    calls  ms/call  ms/call  name
     28.4     28.52     28.52                              _camlNbody__v_accumulator_198 [1]
     22.8     51.47     22.95                              ___sqrt [2]
     16.8     68.38     16.91                              _camlNbody__force_on_135 [3]
      6.5     74.95      6.57                              _camlNbody__norm_93 [4]
      5.0     80.00      5.05                              _camlList__iter_98 [5]
      4.4     84.41      4.41                              _camlNbody__update_pos_170 [6]
      4.2     88.60      4.19                              _camlNbody__norm2_91 [7]
      3.2     91.77      3.17                              _camlNbody__update_forces_145 [8]
      2.4     94.14      2.37                              _camlNbody__up_177 [9]
      1.7     95.82      1.68                              _camlNbody__loop_152 [10]
      1.2     97.04      1.22                              _caml_curry2 [11]
      0.8     97.88      0.84                              _camlNbody__zero_body_force_133 [12]
      0.6     98.52      0.64                              _caml_curry2_1 [13]
      0.4     98.95      0.43                              _camlNbody__iter_n2_149 [14]
      0.4     99.35      0.40                              _camlNbody__step_174 [15]
      0.3     99.68      0.33                              _camlList__tl_66 [16]
      0.2     99.90      0.22                              _camlList__hd_63 [17]
      0.2    100.09      0.19                              _camlNbody__main_223 [18]
      0.1    100.18      0.09                              _caml_oldify_local_roots [19]
      0.1    100.24      0.06                              _caml_major_collection_slice [20]
      0.0    100.29      0.05                              _caml_darken [21]
      0.0    100.33      0.04                              _caml_call_gc [22]
      0.0    100.36      0.03                              _caml_empty_minor_heap [23]
      0.0    100.39      0.03                              _caml_fl_allocate [24]
      0.0    100.42      0.03                              _caml_minor_collection [25]
      0.0    100.44      0.02                              _caml_gc_message [26]
      0.0    100.46      0.02                              _caml_oldify_one [27]
      0.0    100.48      0.02                              _szone_malloc [28]
      0.0    100.49      0.01                              _alloc_to_do [29]
      0.0    100.50      0.01                              _caml_compact_heap_maybe [30]
      0.0    100.51      0.01                              _caml_do_roots [31]
      0.0    100.52      0.01                              _caml_final_do_strong_roots [32]
      0.0    100.53      0.01                              _caml_final_empty_young [33]
      0.0    100.54      0.01                              _caml_final_update [34]
      0.0    100.55      0.01                              _caml_fl_merge_block [35]
      0.0    100.56      0.01                              _caml_oldify_mopup [36]
      0.0    100.57      0.01                              _mark_slice [37]
      0.0    100.58      0.01                              _sweep_slice [38]

This is on a run with 1 000 000 integration steps.
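As a sanity check on the ocamlprof counts above: 2 calls to energy, 10 to t_accumulator and 20 to v_accumulator are what one would expect for 5 bodies if fold_left_n2 visits each unordered pair of bodies once (10 pairs per call). The source of fold_left_n2 is not shown in this message, so the definition below is a hypothetical reconstruction, and count_pair_calls is a helper invented purely for illustration; treat it as a minimal sketch under that assumption, not the actual code:

```ocaml
(* Hypothetical reconstruction of fold_left_n2 (assumption: it folds f
   over every unordered pair of distinct list elements, visiting each
   pair exactly once). *)
let rec fold_left_n2 f acc = function
  | [] -> acc
  | x :: rest ->
      (* Pair x with every later element, then recurse on the tail. *)
      let acc = List.fold_left (fun a y -> f a x y) acc rest in
      fold_left_n2 f acc rest

(* Illustrative helper: count how often the folded function is invoked. *)
let count_pair_calls l =
  let calls = ref 0 in
  let _ = fold_left_n2 (fun () _ _ -> incr calls) () l in
  !calls

(* Five elements give 4 + 3 + 2 + 1 = 10 pair visits per call. *)
let () = Printf.printf "%d\n" (count_pair_calls [1; 2; 3; 4; 5])
```

Under this assumption, two calls to energy on a 5-body list give 2 x 10 = 20 calls to v_accumulator, matching the ocamlprof annotation.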

Is ocamlopt -p broken on Mac OS X? Several candidate functions that could plausibly be taking that much time do not appear in the list, so it looks like gprof has simply confused the names of various functions; is there a way I can fix this?

Thanks for the help!

Will Farr

BTW, on my system, this calculation takes about twice as long as a Java program that does the same thing; if anyone wants to defend the honor of OCaml better, I would be happy to pass along the source code, and we can see whether something faster can be done.

--Apple-Mail-1-287674356
content-type: application/pgp-signature; x-mac-type=70674453; name=PGP.sig
content-description: This is a digitally signed message part
content-disposition: inline; filename=PGP.sig
content-transfer-encoding: 7bit

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.0 (Darwin)

iD8DBQFB4XwgjFCrhUweU3MRArSrAJ9H03DNbb3GXsDeYMcld3G/4T1YlQCgtFtU
YQrQ19pMR0Hbcvaif3o5HS4=
=FBTH
-----END PGP SIGNATURE-----

--Apple-Mail-1-287674356--