From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id E30BB7EE51 for ; Wed, 24 Apr 2013 19:40:16 +0200 (CEST) Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of Xavier.Leroy@inria.fr) identity=pra; client-ip=212.27.42.2; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="Xavier.Leroy@inria.fr"; x-sender="Xavier.Leroy@inria.fr"; x-conformance=sidf_compatible Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of Xavier.Leroy@inria.fr) identity=mailfrom; client-ip=212.27.42.2; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="Xavier.Leroy@inria.fr"; x-sender="Xavier.Leroy@inria.fr"; x-conformance=sidf_compatible Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@smtp2-g21.free.fr) identity=helo; client-ip=212.27.42.2; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="Xavier.Leroy@inria.fr"; x-sender="postmaster@smtp2-g21.free.fr"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvsBAJUXeFHUGyoClGdsb2JhbABQgzwBvmSBABYOAQEBAQcNCQkUAyWCHwEBAwEBUyUGCwshFg8JAwIBAgFFEwYCAQEQh3oKCL4xjW2BShaDNQOXHIYOjiI X-IPAS-Result: AvsBAJUXeFHUGyoClGdsb2JhbABQgzwBvmSBABYOAQEBAQcNCQkUAyWCHwEBAwEBUyUGCwshFg8JAwIBAgFFEwYCAQEQh3oKCL4xjW2BShaDNQOXHIYOjiI X-IronPort-AV: E=Sophos;i="4.87,542,1363129200"; d="scan'208";a="12095464" Received: from smtp2-g21.free.fr ([212.27.42.2]) by mail3-smtp-sop.national.inria.fr with ESMTP; 24 Apr 2013 19:40:15 +0200 Received: from [192.168.1.2] (unknown [82.237.71.191]) by smtp2-g21.free.fr (Postfix) with ESMTP id 4F4E14B00C6 for ; Wed, 24 Apr 2013 19:40:12 +0200 (CEST) Message-ID: <517818FB.8050002@inria.fr> Date: Wed, 24 Apr 2013 19:40:11 +0200 From: Xavier Leroy User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130329 Thunderbird/17.0.5 MIME-Version: 1.0 To: caml-list@inria.fr References: <20130424183543.e3a4290382f7f9ce7b522a57@gmail.com> In-Reply-To: <20130424183543.e3a4290382f7f9ce7b522a57@gmail.com> X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: Re: [Caml-list] ackermann microbenchmark strange results On 24/04/13 12:35, ygrek wrote: > Running `make bench` here consistently gives the following (ack1, ack3 - tuples, ack2, ack4 - curried) : [...] > Moreover, the generated assembly code for the main loop is the same, > afaics. The only difference is the initialization of structure > fields and the initial call to ack. Please can anybody explain the > performance difference? As others said, probably code placement. A good tool to explore these issues under Linux is "perf", in particular "perf stats", which lets you count various hardware events. Running it on your 4 programs, you'll see that the number of instructions executed is almost exactly the same, but the branch mispredictions (for instance) vary significantly, in a way correlated with the overall execution time. (Note that this doesn't have anything to do with OCaml: you'd observe crazy variations like these with any compiled language.) > I understand that microbenchmarks are no way the basis to draw > performance conclusions upon, but I cannot explain these results to > myself in any meaninful way. Nobody can except perhaps a few engineers at Intel or AMD who have the whole microarchitecture of their processors imprinted in their heads. (And then they will not tell you.) See, modern microarchitectures contain so many clever heuristics that performance is generally very good but essentially unpredictable... rixed@happyleptic.org adds: > Remember me of a paper I read some years ago that was measuring the > effect of the various optimisation levels of gcc against the effect > of addresses choices (randomised using environment strings of > various lengths!), and which conclusion was that the effect of -O3 > compared to -O2 was less than the effect of "choosing" a good > environment string :-) Couldn't find it again using Google ; maybe > someone remember this paper? It's probably "Producing Wrong Data Without Doing Anything Obviously Wrong!" by Diwan et al, ASPLOS 2009, http://www-plan.cs.colorado.edu/diwan/asplos09.pdf This paper made quite a splash in the compiler community... - Xavier Leroy