From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id 39E55BC32 for ; Sun, 13 Mar 2005 13:52:59 +0100 (CET) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j2DCqw6U027507 for ; Sun, 13 Mar 2005 13:52:58 +0100 Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id NAA00804 for ; Sun, 13 Mar 2005 13:52:57 +0100 (MET) Received: from mail.physik.uni-muenchen.de (mail.physik.uni-muenchen.de [192.54.42.129]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j2DCqvD4027500 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL) for ; Sun, 13 Mar 2005 13:52:57 +0100 Received: from localhost (unknown [127.0.0.1]) by mail.physik.uni-muenchen.de (Postfix) with ESMTP id 2DB3520032; Sun, 13 Mar 2005 13:52:57 +0100 (CET) Received: from mail.physik.uni-muenchen.de ([127.0.0.1]) by localhost (mail.physik.uni-muenchen.de [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 07781-01-6; Sun, 13 Mar 2005 13:52:55 +0100 (CET) Received: from mailhost.cip.physik.uni-muenchen.de (kaiser.cip.physik.uni-muenchen.de [141.84.136.1]) by mail.physik.uni-muenchen.de (Postfix) with ESMTP id 626FC20017; Sun, 13 Mar 2005 13:52:55 +0100 (CET) Received: from eiger.cip.physik.uni-muenchen.de (eiger.cip.physik.uni-muenchen.de [141.84.136.54]) by mailhost.cip.physik.uni-muenchen.de (Postfix) with ESMTP id 585C026EDA; Sun, 13 Mar 2005 13:52:55 +0100 (CET) Received: by eiger.cip.physik.uni-muenchen.de (Postfix, from userid 3092) id 172CA1E4B9; Sun, 13 Mar 2005 13:52:55 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by eiger.cip.physik.uni-muenchen.de (Postfix) with ESMTP id 123F9F2BC; Sun, 13 Mar 2005 13:52:55 +0100 (CET) Date: Sun, 13 Mar 2005 13:52:55 +0100 (CET) From: Thomas Fischbacher To: Eijiro Sumii Cc: caml-list@inria.fr, sumii@saul.cis.upenn.edu Subject: Re: [Caml-list] how to "disable" GC? In-Reply-To: <20050312.225725.125114910.eijiro_sumii@anet.ne.jp> Message-ID: References: <20050312.225725.125114910.eijiro_sumii@anet.ne.jp> X-BOFH: Daemons did it MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: amavisd-new at physik.uni-muenchen.de X-Miltered: at nez-perce with ID 423437AA.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at nez-perce with ID 423437A9.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 eijiro:01 sumii:01 runtime:01 buffer:01 integers:01 cip:98 cip:98 lambda:01 lambda:01 wrote:01 short:01 slower:01 encode:01 caches:01 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.2 X-Spam-Level: On Sat, 12 Mar 2005, Eijiro Sumii wrote: > 2. Some programs get much slower for larger heaps, even though they > don't seem to trigger any GC. An example is such programs is given > below. Why is this? (There also exist programs that are not > affected at all, so this is not because of the initialization > overhead of the runtime system.) Quite in general, Cache hierarchy may be an issue here. The CPU can only hold a small amount of data in its L1 cache, which is quite fast. Next, there are L2 and then L3 caches (should you have that), but if you have to do a lot of real memory access, that usually will kill you. Especially if your address space is so badly scrambled that the CPU constantly needs to re-load its address translation cache (called translation lookaside buffer for x86). Worst case situation on a 3-level MMU system is that you will have to do four memory reads for one word of data. So, as a quite general rule when doing computationally challenging algebraic stuff in a functional language: try hard to design your data representation in such a way that you minimize RAM access. If you can e.g. encode short lists of small integers into a fixnum-integer, then do so. A few shift and masking operations do not matter much for many pipelined processors, provided the code can be held in the cache. Memory access does. One can overdo it, though: if you need ten times more code to get rid of a simple memory access, you're on the wrong track. Code also has to be transported to the CPU core. -- regards, tf@cip.physik.uni-muenchen.de (o_ Thomas Fischbacher - http://www.cip.physik.uni-muenchen.de/~tf //\ (lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y) V_/_ (if (= x 0) y (g g (- x 1) (* x y)))) n 1)) (Debian GNU)