From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <4542667A.4040700@rftp.com>
Date: Fri, 27 Oct 2006 13:05:14 -0700
From: Robert Roessler
To: Caml-list
Subject: Re: [Caml-list] caml_oldify_local_roots takes 50% of the total runtime
References: <17727.34685.561877.977822@tandem.cs.ru.nl> <1161802911.12050.5.camel@rosella.wigram> <1161870528.20369.27.camel@localhost.localdomain> <454206CF.1080306@janestcapital.com>
In-Reply-To: <454206CF.1080306@janestcapital.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

Brian Hurt wrote:
> With respect to 4146 (minor heap size adjusts to memory size)- I'm not
> sure this is a good idea. 32K is small enough to fit into L1 cache with
> space left over on pretty much all systems these days (64K L1 cache
> seems to be standard). Having the minor heap small enough to fit into
> L1 cache, and thus having the minor heap live in L1 cache, seems to me
> to be a big advantage. If for no other reason than it makes minor
> collections a lot faster- the collector doesn't have to fault memory
> into cache, it's already there.
>
> If anything, I'd be inclined to add more generations. I haven't done
> any timings, but my first guess for reasonable values would be: Gen1
> (youngest): 32K (fits into L1 cache), Gen2: 256K (fits into L2 cache),
> Gen3: 8M (fits into memory), Gen4 (oldest): as large as necessary.

I have not yet looked at the details of OCaml's memory management, but
in my commercial Smalltalk-80 implementation, I used a simple scheme:

Twin 32K "young" spaces... the allocations come from one of these (like
OCaml's, using trivial and fast code) until it is exhausted. Then all
of the reachable [young] objects are copied to the other [empty]
"young" space. Eventually, a more or less conventional mark-and-sweep
is performed on the "old" space.

Adjustable parameters control 1) how many times an object ping-pongs
between the two "young" spaces before it is promoted to the "old" space
(4 was a reasonable number), and 2) what size an allocation needs to be
before it is created as "old" (this somewhat controls the consumption
of the "young" spaces at the expense of some improper "old"
promotions).

This approach allowed an 8MHz 286 to perform at close to Xerox
"Dolphin" speeds, while using [typically] less than 1% of total time
for memory management.

Humorously, the 32K was chosen not because it would fit into cache
(there wasn't any), but because it would allow both "young" spaces to
fit into a single x86 64K "segment" - remember those? :)

Robert Roessler
robertr@rftp.com
http://www.rftp.com
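
For readers who want the twin-"young"-space scheme above in concrete form, here is a minimal Python sketch. It is a toy model under stated assumptions, not the original Smalltalk-80 code: the names `Obj`, `Heap`, `alloc` and `scavenge` and the 1024-byte large-object threshold are hypothetical. "Copying" is modelled by rebuilding a list, every "old" object is conservatively treated as a root instead of keeping a remembered set, and the mark-and-sweep of the "old" space is left out. An object is promoted on the 4th scavenge it survives, which is one possible reading of the ping-pong count.

```python
from dataclasses import dataclass, field

YOUNG_SPACE_BYTES = 32 * 1024  # size of each "young" space (from the text)
PROMOTE_AFTER = 4              # survived scavenges before promotion (from the text)
LARGE_OBJECT_BYTES = 1024      # hypothetical cutoff for allocating directly as "old"


@dataclass(eq=False)           # eq=False: objects compare by identity
class Obj:
    size: int
    refs: list = field(default_factory=list)  # outgoing references
    survivals: int = 0
    old: bool = False


class Heap:
    def __init__(self):
        self.young = []   # contents of the active "young" space
        self.used = 0     # bytes used in the active "young" space
        self.old = []     # the "old" space (never collected in this sketch)
        self.roots = []   # objects the program holds on to

    def alloc(self, size):
        obj = Obj(size)
        if size >= LARGE_OBJECT_BYTES:
            # Parameter 2: large allocations are created as "old".
            obj.old = True
            self.old.append(obj)
            return obj
        if self.used + size > YOUNG_SPACE_BYTES:
            # Active space exhausted: copy the survivors to the other space.
            self.scavenge()
            if self.used + size > YOUNG_SPACE_BYTES:
                raise MemoryError("young space is full of live objects")
        self.young.append(obj)
        self.used += size
        return obj

    def scavenge(self):
        survivors, used, seen = [], 0, set()
        # Simplification: all "old" objects act as roots (no remembered set).
        stack = list(self.roots) + list(self.old)
        while stack:
            obj = stack.pop()
            if id(obj) in seen:
                continue
            seen.add(id(obj))
            stack.extend(obj.refs)
            if obj.old:
                continue
            obj.survivals += 1
            if obj.survivals >= PROMOTE_AFTER:
                # Parameter 1: enough ping-pongs, promote to "old".
                obj.old = True
                self.old.append(obj)
            else:
                survivors.append(obj)
                used += obj.size
        # The other space, now holding only survivors, becomes the active one.
        self.young, self.used = survivors, used
```

In this model an object must be reachable from `heap.roots` (or from an "old" object) to survive a scavenge, so a caller would append anything it wants to keep to `heap.roots` before allocating further.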