caml-list - the Caml user's mailing list
From: Goswin von Brederlow <goswin-v-b@web.de>
To: William Le Ferrand <William.Le-Ferrand@polytechnique.edu>
Cc: caml users <caml-list@inria.fr>
Subject: Re: [Caml-list] Storing ocaml values outside ocaml's heap
Date: Thu, 08 Dec 2011 11:33:16 +0100
Message-ID: <871usf4akz.fsf@frosties.localnet>
In-Reply-To: <CAGS5m-=5CvbV+V9Bu4XVYaRj_jy0E4-u1Vg1oGgZLGRrA=89Lw@mail.gmail.com> (William Le Ferrand's message of "Wed, 7 Dec 2011 20:35:29 -0800")

William Le Ferrand <William.Le-Ferrand@polytechnique.edu> writes:

> Dear list, 
>
> We are building a cache in ocaml and we're wondering if it would make sense to
> store ocaml values outside the reach of the gc. (gc on a 20GB cache hangs the
> process for a second or so).
>
> To run some experiments, we wrote a small library
> (https://github.com/besport/ocaml-everlasting) that exposes two functions,
> get and set.
>
> When inserting a value, we recursively copy the blocks out of the reach of
> the gc (and put the resulting value in some C array). When getting the value,
> we simply pass the pointer to the copied value to the ocaml code (the structure
> is still coherent and the value is directly usable). We also wrote an "update"
> function that compares a new value with the existing value in cache, to avoid
> unnecessary memory allocation/deallocation.
>
> It does not seem very stable though, and I don't know if that is due to a bug
> in the update function or simply because this approach is not reasonable. Do
> you have any thoughts? Is there a clever way to build a large cache in an
> ocaml app?
>
> Thanks in advance for any tips!
>
> Best
>
> William

In the general case you will have to inspect each value, tell the GC
about every other OCaml value it points to, and track every heap value
that holds a reference to your external value so that you don't free it
while it is still live. In short, you need to write your own GC.
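
To give an idea of what "inspecting the values" involves, here is a
minimal sketch that checks, one level deep, whether a value holds any
pointers to other blocks. The name is_flat is made up for illustration,
and the Obj module is unsafe and tied to the runtime representation, so
treat this as an experiment rather than something to ship.

```ocaml
(* Hypothetical helper: true if [v] is an immediate value, an opaque
   block (string, boxed float, float array, custom block, ...) or a
   block whose fields are all immediates. Only one level is examined. *)
let is_flat (v : 'a) : bool =
  let r = Obj.repr v in
  if Obj.is_int r then true                         (* int, bool, constant constructor *)
  else if Obj.tag r >= Obj.no_scan_tag then true    (* the GC never scans these blocks *)
  else begin
    let flat = ref true in
    for i = 0 to Obj.size r - 1 do
      (* a field that is not an immediate is a pointer to something else *)
      if not (Obj.is_int (Obj.field r i)) then flat := false
    done;
    !flat
  end
```

A value for which this returns false drags other blocks along with it,
and those are the ones the GC would have to be told about.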

In special cases you can store completely self-contained values, e.g. an
array of floats or a record with no pointer fields: the simplest of
types.
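
As an illustration of the float case, a Bigarray keeps its payload
outside the OCaml heap, so the GC only sees a small proxy block and
never scans the data. The sketch below assumes a cache of floats indexed
by slot number and uses NaN to mark an empty slot; the names
float_cache, create_cache, cache_set and cache_get are invented for this
example. On OCaml older than 4.07 the bigarray library has to be linked
explicitly.

```ocaml
(* Assumed layout: one float64 per slot, NaN meaning "empty". *)
type float_cache =
  (float, Bigarray.float64_elt, Bigarray.c_layout) Bigarray.Array1.t

let create_cache (n : int) : float_cache =
  let a = Bigarray.Array1.create Bigarray.float64 Bigarray.c_layout n in
  Bigarray.Array1.fill a nan;       (* mark every slot as empty *)
  a

let cache_set (c : float_cache) (i : int) (v : float) : unit =
  Bigarray.Array1.set c i v

let cache_get (c : float_cache) (i : int) : float option =
  let v = Bigarray.Array1.get c i in
  if v <> v then None else Some v   (* NaN is the only float not equal to itself *)
```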


A cache often holds millions of objects of the same type, and those
often contain pointers but do not share what they point to. Then it
becomes feasible to copy everything they point to into the cache as
well. Or write a little C glue to store the data in an abstract or
custom block that contains no pointers to anything outside the block.
Often you can also store the data much more compactly than OCaml allows,
which reduces the memory footprint and increases cache efficiency.
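
One way to get a compact layout without writing any C is to keep one
Bigarray per field instead of one heap block per object. The sketch
below assumes a hypothetical entry type with a 32-bit id and a score
that tolerates single precision; the types and the functions create,
store and load are made up for this example and are not taken from
ocaml-everlasting.

```ocaml
(* Hypothetical cached object, for illustration only. *)
type entry = { id : int32; score : float }

(* One array per field: 4 + 4 bytes per entry, all outside the OCaml heap. *)
type packed_cache = {
  ids    : (int32, Bigarray.int32_elt, Bigarray.c_layout) Bigarray.Array1.t;
  scores : (float, Bigarray.float32_elt, Bigarray.c_layout) Bigarray.Array1.t;
}

let create (n : int) : packed_cache =
  let ids = Bigarray.Array1.create Bigarray.int32 Bigarray.c_layout n in
  let scores = Bigarray.Array1.create Bigarray.float32 Bigarray.c_layout n in
  Bigarray.Array1.fill ids 0l;
  Bigarray.Array1.fill scores 0.0;
  { ids; scores }

(* Copy the fields out of the heap record into the packed arrays. *)
let store (c : packed_cache) (i : int) (e : entry) : unit =
  Bigarray.Array1.set c.ids i e.id;
  Bigarray.Array1.set c.scores i e.score   (* rounded to single precision *)

(* Rebuild a fresh heap record on demand. *)
let load (c : packed_cache) (i : int) : entry =
  { id = Bigarray.Array1.get c.ids i; score = Bigarray.Array1.get c.scores i }
```

The price is that load allocates a new record on every lookup, and that
scores are rounded to float32 when stored.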

MfG
        Goswin
