From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pierre.weis@inria.fr>
Delivered-To: caml-list@yquem.inria.fr
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78])
	by yquem.inria.fr (Postfix) with ESMTP id 39E55BC32
	for <caml-list@yquem.inria.fr>; Sun, 13 Mar 2005 13:52:59 +0100 (CET)
Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35])
	by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j2DCqw6U027507
	for <caml-list@yquem.inria.fr>; Sun, 13 Mar 2005 13:52:58 +0100
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id NAA00804 for <caml-list@pauillac.inria.fr>; Sun, 13 Mar 2005 13:52:57 +0100 (MET)
Received: from mail.physik.uni-muenchen.de (mail.physik.uni-muenchen.de [192.54.42.129])
	by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j2DCqvD4027500
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL)
	for <caml-list@inria.fr>; Sun, 13 Mar 2005 13:52:57 +0100
Received: from localhost (unknown [127.0.0.1])
	by mail.physik.uni-muenchen.de (Postfix) with ESMTP id 2DB3520032;
	Sun, 13 Mar 2005 13:52:57 +0100 (CET)
Received: from mail.physik.uni-muenchen.de ([127.0.0.1])
 by localhost (mail.physik.uni-muenchen.de [127.0.0.1]) (amavisd-new, port 10024)
 with LMTP id 07781-01-6; Sun, 13 Mar 2005 13:52:55 +0100 (CET)
Received: from mailhost.cip.physik.uni-muenchen.de (kaiser.cip.physik.uni-muenchen.de [141.84.136.1])
	by mail.physik.uni-muenchen.de (Postfix) with ESMTP id 626FC20017;
	Sun, 13 Mar 2005 13:52:55 +0100 (CET)
Received: from eiger.cip.physik.uni-muenchen.de (eiger.cip.physik.uni-muenchen.de [141.84.136.54])
	by mailhost.cip.physik.uni-muenchen.de (Postfix) with ESMTP
	id 585C026EDA; Sun, 13 Mar 2005 13:52:55 +0100 (CET)
Received: by eiger.cip.physik.uni-muenchen.de (Postfix, from userid 3092)
	id 172CA1E4B9; Sun, 13 Mar 2005 13:52:55 +0100 (CET)
Received: from localhost (localhost [127.0.0.1])
	by eiger.cip.physik.uni-muenchen.de (Postfix) with ESMTP id 123F9F2BC;
	Sun, 13 Mar 2005 13:52:55 +0100 (CET)
Date: Sun, 13 Mar 2005 13:52:55 +0100 (CET)
From: Thomas Fischbacher <Thomas.Fischbacher@Physik.Uni-Muenchen.DE>
To: Eijiro Sumii <eijiro_sumii@anet.ne.jp>
Cc: caml-list@inria.fr, sumii@saul.cis.upenn.edu
Subject: Re: [Caml-list] how to "disable" GC?
In-Reply-To: <20050312.225725.125114910.eijiro_sumii@anet.ne.jp>
Message-ID: <Pine.LNX.4.61.0503131341420.8010@eiger.cip.physik.uni-muenchen.de>
References: <20050312.225725.125114910.eijiro_sumii@anet.ne.jp>
X-BOFH: Daemons did it
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Virus-Scanned: amavisd-new at physik.uni-muenchen.de
X-Miltered: at nez-perce with ID 423437AA.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)!
X-Miltered: at nez-perce with ID 423437A9.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)!
X-Spam: no; 0.00; caml-list:01 eijiro:01 sumii:01 runtime:01 buffer:01 integers:01 cip:98 cip:98 lambda:01 lambda:01 wrote:01 short:01 slower:01 encode:01 caches:01 
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr
X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled 
	version=3.0.2
X-Spam-Level: 


On Sat, 12 Mar 2005, Eijiro Sumii wrote:

> 2. Some programs get much slower for larger heaps, even though they
>    don't seem to trigger any GC.  An example is such programs is given
>    below.  Why is this?  (There also exist programs that are not
>    affected at all, so this is not because of the initialization
>    overhead of the runtime system.)

Quite in general, Cache hierarchy may be an issue here. The CPU can only 
hold a small amount of data in its L1 cache, which is quite fast. Next, 
there are L2 and then L3 caches (should you have that), but if you have to 
do a lot of real memory access, that usually will kill you. Especially if 
your address space is so badly scrambled that the CPU constantly needs to 
re-load its address translation cache (called translation lookaside buffer 
for x86). Worst case situation on a 3-level MMU system is that you will
have to do four memory reads for one word of data.

So, as a quite general rule when doing computationally challenging 
algebraic stuff in a functional language: try hard to design your data 
representation in such a way that you minimize RAM access. If you can 
e.g. encode short lists of small integers into a fixnum-integer, then do 
so. A few shift and masking operations do not matter much for many 
pipelined processors, provided the code can be held in the cache. Memory 
access does. One can overdo it, though: if you need ten times more code to 
get rid of a simple memory access, you're on the wrong track. Code also 
has to be transported to the CPU core.

-- 
regards,               tf@cip.physik.uni-muenchen.de              (o_
 Thomas Fischbacher -  http://www.cip.physik.uni-muenchen.de/~tf  //\
(lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y)           V_/_
(if (= x 0) y (g g (- x 1) (* x y)))) n 1))                  (Debian GNU)