From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <goswin-v-b@web.de>
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105])
	by yquem.inria.fr (Postfix) with ESMTP id A0804BC37
	for <caml-list@yquem.inria.fr>; Fri, 13 Nov 2009 18:09:08 +0100 (CET)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AksCAEAh/UrZSMDqi2dsb2JhbACBTZpGAQEBCgsKBxEFuWCEPAQ
X-IronPort-AV: E=Sophos;i="4.44,738,1249250400"; 
   d="scan'208";a="50301902"
Received: from fmmailgate03.web.de ([217.72.192.234])
  by mail4-smtp-sop.national.inria.fr with ESMTP; 13 Nov 2009 18:09:08 +0100
Received: from smtp05.web.de (fmsmtp05.dlan.cinetic.de [172.20.4.166])
	by fmmailgate03.web.de (Postfix) with ESMTP id 6A36913295FA8;
	Fri, 13 Nov 2009 18:07:42 +0100 (CET)
Received: from [95.208.117.111] (helo=frosties.localdomain)
	by smtp05.web.de with asmtp (TLSv1:AES256-SHA:256)
	(WEB.DE 4.110 #314)
	id 1N8zcc-0007MF-00; Fri, 13 Nov 2009 18:07:42 +0100
Received: from mrvn by frosties.localdomain with local (Exim 4.69)
	(envelope-from <goswin-v-b@web.de>)
	id 1N8zcb-0001pg-R0; Fri, 13 Nov 2009 18:07:41 +0100
From: Goswin von Brederlow <goswin-v-b@web.de>
To: Richard Jones <rich@annexia.org>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] How to write a GC for size > memory
References: <87vdhfxb6c.fsf@frosties.localdomain>
	<20091113083802.GA3468@annexia.org>
Date: Fri, 13 Nov 2009 18:07:41 +0100
In-Reply-To: <20091113083802.GA3468@annexia.org> (Richard Jones's message of
	"Fri, 13 Nov 2009 08:38:02 +0000")
Message-ID: <87r5s2qroi.fsf@frosties.localdomain>
User-Agent: Gnus/5.110006 (No Gnus v0.6) XEmacs/21.4.22 (linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: goswin-v-b@web.de
X-Sender: goswin-v-b@web.de
X-Provags-ID: V01U2FsdGVkX18fuQkQ2BYWS6a4K+lDZUQXZGW4OIG+C/41rnjp
	1MwNL2qKfrY6I7ggSWfr5SjqAzZXRz+4BYE+EFS3CX6O/zJT3w
	vsjVVCkiM=
X-Spam: no; 0.00; 0100,:01 generational:01 mutables:01 node:01 nodes:01 nodes:01 generational:01 ocaml:01 ocaml:01 tib:01 bigarray:01 invariants:01 node:01 unacceptable:01 chunks:01 

Richard Jones <rich@annexia.org> writes:

> On Fri, Nov 13, 2009 at 06:10:03AM +0100, Goswin von Brederlow wrote:
>> Normaly the GC would switch between defrag and freeing chunk
>> mode. Both would be concurrent to normal filesystem
>> operations. Possibly only run when the filesystem is idle. The
>> compation mode would only happen in situations where the FS would
>> otherwise have to return ENOSPC.
>> 
>> To further improve this I would like to add a generational component
>> into the GC. I have no mutables (except the roots) so an old chunk (or
>> an old B-Tree node) can hold no references to a newer chunk. Also new
>> files are far more likely to be deleted again than old files and new
>> B-Tree nodes more likely to be modified again than old B-Tree
>> nodes. Seems to screem for a generational approach. A generational
>> approach should allow the GC to only scan a fraction of the B-Tree for
>> its sweep. But I'm not quite sure how to go about that.
>> 
>> Comments, Ideas, Improvements, Urls welcome.
>
> As I said on IRC I still think ocaml-ancient is a good fit to this,
> unless you are planning to modify the OCaml GC itself.

I'm not trying to use the OCaml GC. This is totaly impossible for the
simple reason that a 32bit cpu can not mmap 8 TiB of disk space and
use that as Ocaml heap. I could use ocaml-ancient to map selected
blocks of the disk into memory but I'm already using libaio and
Bigarray for this. The ocaml GC itself can cope fine with mapped
blocks. What I need to do is grabage collect the disk blocks
itself. And, as said, even using 1 bit of memory per disk block is too
much.

> ocaml-ancient essentially converts parts of the heap to use manual
> memory allocation, on top of which you can write a more suitable GC in
> OCaml for your long-lived rarely-changing blocks (eg. one based on
> reference counting).  The invariants you describe above are exactly
> the ones which ocaml-ancient needs.
>
> Rich.

Even if that where possible that would change absolutely nothing. I
still need to write the GC. So the question of how to do that best
remains.

As a side node: I considered doing reference counting on the
blocks. Since I have no loops that would be free of leaks. But would
require verry frequent change in the reference count and that count
would have to be stored in a different location than the data
itself. For every B-Tree block written I would have to inrement the
counter for every block it points to, which can be each in a different
block on disk. So instead of writing 4k I need to write maybe 500k,
which is an unacceptable slowdown.

The reference counting described in my last mail would only be to
prioritize chunks where probably the least amount of used blocks are.

MfG
        Goswin