caml-list - the Caml user's mailing list
From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: Yoann Padioleau <pad@fb.com>
Cc: Caml List <caml-list@yquem.inria.fr>
Subject: Re: [Caml-list] strategies to deal with huge in-memory "object" graphs?
Date: Sun, 10 Aug 2014 08:51:43 +0200
Message-ID: <1407653503.5797.28.camel@e130>
In-Reply-To: <7F92DE4E-875E-4020-AF4F-5BC19080225A@fb.com>


On Friday, 2014-08-08, at 22:20 +0000, Yoann Padioleau wrote:
> Hi list,
> 
> I have an application that is gradually creating a graph (using ocamlgraph), and 
> the amount of memory it uses is around 3 or 4 GB (my machine has 
> 74 GB of RAM). There are lots of nodes and edges in this graph. The problem is that
> building this graph takes a huge amount of time. As the build progresses, it gets
> slower and slower. My guess is that the “object” graph is getting really huge, so
> the GC needs to explore more and more of it on each collection. I’ve tried things like
> 
>   (* see www.elehack.net/michael/blog/2010/06/ocaml-memory-tuning *)
>   Gc.set { (Gc.get()) with Gc.minor_heap_size = 4_000_000 };
>   (* goes from 5300s to 3000s for building db for www *)
>   Gc.set { (Gc.get()) with Gc.major_heap_increment = 8_000_000 };
>   Gc.set { (Gc.get()) with Gc.space_overhead = 300 };
> 
> 
> but it does not really help. It is still really slow.
> 
> In the past I have sometimes used the Marshal module to reduce the number of “objects”,
> but it forces me to rewrite quite a lot of the code. 
> 
> Is there a way to partition the heap so that, for instance in my case, all the
> graph-related things are put in a different area that the GC does not have to explore each time?
> I’d like a minor heap, a major heap, and then a do_not_gc_this_heap_it_is_only_growing_there_is_no_garbage_here_to_collect.

The latter can be accomplished by setting space_overhead to a very
large value (say 1E6), so that the major GC collects as rarely as
possible. Also set max_overhead to 1E6, which effectively disables
compaction.
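
For example (a minimal sketch; the concrete values are only
illustrative, not tuned for your workload):

  (* Make the major GC extremely lazy and effectively disable
     compaction; 1_000_000 here simply means "a huge percentage". *)
  Gc.set { (Gc.get ()) with
           Gc.space_overhead = 1_000_000;
           Gc.max_overhead   = 1_000_000 }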

Once you have built the graph, you can move the whole beast (provided
it is read-only) to a non-GC-managed area with either Ancient (simpler
to use, fewer features) or Ocamlnet's Netmulticore. Note that you can
only move the graph as a whole, together with everything reachable
from it, and any later mutation of the moved data will crash the
program.
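
A rough sketch with Ancient (untested here; I am assuming its
mark/follow interface), to be applied once the graph is complete:

  (* Ancient.mark deep-copies a value out of the OCaml heap, so the
     GC no longer traverses it; Ancient.follow gives back a view of
     that copy, which must be treated as strictly read-only. *)
  let freeze (v : 'a) : 'a =
    Ancient.follow (Ancient.mark v)

  (* e.g.  let graph = freeze graph  once the graph is fully built *)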

Gerd

-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
My OCaml site:          http://www.camlcity.org
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
------------------------------------------------------------



Thread overview: 3 messages
2014-08-08 22:20 Yoann Padioleau
2014-08-09 19:57 ` Gabriel Kerneis
2014-08-10  6:51 ` Gerd Stolpmann [this message]
