From: John Carr
To: ygrek
cc: caml-list@inria.fr
Date: Wed, 24 Apr 2013 11:57:29 -0400
Subject: Re: [Caml-list] ackermann microbenchmark strange results
Message-Id: <201304241557.r3OFvT9a012995@outgoing.mit.edu>
In-reply-to: <20130424183543.e3a4290382f7f9ce7b522a57@gmail.com>

Try changing loop alignment by editing assembly code. The address of ack is different in the different versions. Modern Intel processors are sensitive to code alignment.
There is a limit on the number of branch prediction table entries per cache line. An instruction that crosses a cache line boundary may be slower than an instruction within a cache line. I am not surprised to see a 20% difference caused by an apparently irrelevant code change.

> Moreover, the generated assembly code for the main loop is the same, afaics. The only
> difference is the initialization of structure fields and the initial call to ack. Please can anybody
> explain the performance difference? I understand that microbenchmarks are in no way the basis to draw
> performance conclusions upon, but I cannot explain these results to myself in any meaningful way.
> Please help! :)
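
To make the cache-line remark in the reply concrete, here is a rough sketch in C of the arithmetic involved. It is not taken from the thread. The 64-byte line size is an assumption (common on x86, but check your processor), and the function names crosses_line and padding_to_align are made up for illustration.

```c
#include <stdint.h>

/* Assumption: 64-byte cache lines. Adjust for the processor at hand. */
#define CACHE_LINE 64u

/* Returns 1 if an instruction of `len` bytes starting at `addr`
   occupies bytes in two different cache lines, 0 otherwise. */
int crosses_line(uintptr_t addr, unsigned len)
{
    if (len == 0)
        return 0;
    return (addr / CACHE_LINE) != ((addr + len - 1) / CACHE_LINE);
}

/* Number of padding bytes needed to move `addr` up to the next
   multiple of `align`. `align` must be a power of two. */
unsigned padding_to_align(uintptr_t addr, unsigned align)
{
    return (unsigned)((align - (addr & (align - 1))) & (align - 1));
}
```

Feed it addresses and instruction lengths read from a disassembly of each build. For example, a 5-byte instruction at the hypothetical address 0x40103c starts at offset 60 within its line and so spills into the next one, while the same instruction at 0x401030 does not. padding_to_align tells you how many bytes of padding would put a loop head on, say, a 16-byte boundary when you edit the assembly by hand.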