From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <skaller@users.sourceforge.net>
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled 
	version=3.1.3
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from discorde.inria.fr (discorde.inria.fr [192.93.2.38])
	by yquem.inria.fr (Postfix) with ESMTP id DA444BC0A
	for <caml-list@yquem.inria.fr>; Sat, 21 Apr 2007 04:57:15 +0200 (CEST)
Received: from ipmail01.adl2.internode.on.net (ipmail01.adl2.internode.on.net [203.16.214.140])
	by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id l3L2vDa8005197
	for <caml-list@inria.fr>; Sat, 21 Apr 2007 04:57:15 +0200
X-IronPort-AV: E=Sophos;i="4.14,434,1170595800"; 
   d="scan'208";a="117569346"
Received: from ppp8-148.lns1.syd7.internode.on.net (HELO [192.168.1.201]) ([59.167.8.148])
  by ipmail01.adl2.internode.on.net with ESMTP; 21 Apr 2007 12:27:09 +0930
Subject: Re: [Caml-list] Slow allocations with 64bit code?
From: skaller <skaller@users.sourceforge.net>
To: Markus Mottl <markus.mottl@gmail.com>
Cc: ocaml <caml-list@inria.fr>,
	yaron jane <yminsky@janestcapital.com>
In-Reply-To: <f8560b80704201331j2f8abe89q71b15c0616609f35@mail.gmail.com>
References: <f8560b80704201331j2f8abe89q71b15c0616609f35@mail.gmail.com>
Content-Type: text/plain
Date: Sat, 21 Apr 2007 12:57:04 +1000
Message-Id: <1177124224.14498.46.camel@rosella.wigram>
Mime-Version: 1.0
X-Mailer: Evolution 2.10.1 
Content-Transfer-Encoding: 7bit
X-Miltered: at discorde with ID 46297D89.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)!
X-Spam: no; 0.00; allocations:01 markus:01 mottl:01 pointers:01 pointers:01 model:01 enforces:01 model:01 compactor:01 unboxing:01 heap:01 heap:01 sourceforge:01 garbage:01 wrote:01 

On Fri, 2007-04-20 at 16:31 -0400, Markus Mottl wrote:

> This is only a difference of about 10%, but I have seen more complex
> cases where there are timing differences in excess of 50%, which is
> already pretty substantial.

I am surprised! The 64 bit code is so fast!

You are using 64 bit pointers. They're twice as big as 32 bit
pointers. So every 'box' and all heap slot are double the size.

On a memory intensive operation, you'd expect the 64 bit model
to run at half the speed. In your example:

let () =
  for i = 1 to 100000000 do
    ignore (Int32.add 42l 24l)
  done

it would appear 'ignore' enforces an allocation which is subsequently
garbage collected. So you have both allocation and collection which
hits double the memory than on a 32 bit model. It seems likely
the reason the code is only 10% slower here is that the minor
heap compactor is successfully preventing this code hitting
much memory, possibly keeping the whole thing in cache.

This will be a gc tuning detail.

Try folding over a large list. The 64 bit version should
take twice as long because the memory accesses are the only
part of the operation that takes significant time.
[Everything else should fit in the cache except reading
the list: boxing unboxing the accumulator and invoking
the argument closure should all be effectively zero cost]


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net