caml-list - the Caml user's mailing list
From: Richard Jones <rich@annexia.org>
To: Yoann Padioleau <pad@facebook.com>
Cc: caml-list@inria.fr
Subject: Re: ancient module
Date: Tue, 14 Sep 2010 21:46:24 +0100
Message-ID: <20100914204624.GA1246@annexia.org>
In-Reply-To: <7366F08F-88A4-40BA-95EE-1E682BEDBEFA@facebook.com>

On Tue, Sep 14, 2010 at 08:19:49PM +0000, Yoann Padioleau wrote:
> Hi,
>
> I am trying to use your Ancient module to avoid having the garbage
> collector spend lots of time iterating over huge data in memory. It
> works quite well for arrays, but for Hashtbl I have some problems:
> I am not able to find keys that were clearly in the original
> Hashtbl (before calling Ancient.mark on it).
>
> In the doc it says: 
> 
> (1) Ad-hoc polymorphic primitives (structural equality, marshalling
> and hashing) do not work on ancient data structures, meaning that you
> will need to provide your own comparison and hashing functions.  

The issue is described by Xavier Leroy:
http://caml.inria.fr/pub/ml-archives/caml-list/2006/09/977818689f4ceb2178c592453df7a343.en.html

As far as I understand it, the OCaml compare function (or its C
equivalent in the runtime) looks at the two string pointers and
decides that, since both point outside the normal heap, they are just
opaque objects.  So it doesn't compare the contents of the strings,
it only checks pointer equality.  This massively breaks assumptions
in some ordinary OCaml code, in this instance in Hashtbl.

> Does that mean I have to transform my code using Hashtbl.xxx into
> code using the functorized version of Hashtbl?  I have Hashtbls from
> strings to a complex data type.  What would be a good hash function
> for strings?

It may be that Map also has the same problems.  You wouldn't really
know except by examining the code.
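
Since Map.Make takes the ordering as a parameter, one way to avoid
the question is to hand it a comparison written in plain OCaml.  The
following is only a sketch under that assumption: ByteOrderedString
and AMap are hypothetical names, the code uses ordinary in-heap
strings and does not call Ancient itself, and I have not checked it
against marked data.

```ocaml
(* Hypothetical module name; a sketch, not code from the ancient library. *)
module ByteOrderedString = struct
  type t = string

  (* Lexicographic comparison done byte by byte in OCaml code.  Only
     ints are compared, so the polymorphic compare is never applied
     to the strings themselves. *)
  let compare (a : string) (b : string) : int =
    let la = String.length a and lb = String.length b in
    let n = if la < lb then la else lb in
    let rec go i =
      if i >= n then la - lb (* common prefix: the shorter string sorts first *)
      else
        let ca = Char.code a.[i] and cb = Char.code b.[i] in
        if ca <> cb then ca - cb else go (i + 1)
    in
    go 0
end

(* A Map keyed on strings that uses the comparison above. *)
module AMap = Map.Make (ByteOrderedString)

let example_map = AMap.add "b" 2 (AMap.add "a" 1 AMap.empty)
```

Here AMap.find "a" example_map goes through ByteOrderedString.compare
only.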

Later you wrote:
> Actually it seems I have the problem only with Hashtbl from strings
> to whatever.  I also have some Hashtbl from int to whatever and they
> work fine after the Ancient.mark.

ints aren't compared in the same way.  They are immediate values, not
pointers to blocks, so they are always compared directly by value and
there's no issue.

I've only used ancient to store simple arrays, and when we needed to
do string equality I remember writing a function which was aware of
the above issue (you can compare them byte for byte just fine, even
from OCaml code).
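
A minimal sketch of that kind of function, plugged into the
functorized Hashtbl asked about earlier.  This is not the code I
wrote back then: ByteString and StrTbl are hypothetical names, and
the hash is a simple djb2-style multiply-and-add picked only for
illustration.  The example uses ordinary in-heap strings and does not
call Ancient itself.

```ocaml
(* Hypothetical names; a sketch, not part of the ancient library. *)
module ByteString = struct
  type t = string

  (* Byte-for-byte equality.  Only ints are compared, so the
     polymorphic (=) is never applied to the strings themselves. *)
  let equal (a : string) (b : string) : bool =
    let n = String.length a in
    let rec go i =
      i >= n || (Char.code a.[i] = Char.code b.[i] && go (i + 1))
    in
    n = String.length b && go 0

  (* djb2-style hash computed in OCaml from the bytes of the string,
     masked so the result is never negative. *)
  let hash (s : string) : int =
    let h = ref 5381 in
    for i = 0 to String.length s - 1 do
      h := (!h * 33) + Char.code s.[i]
    done;
    !h land max_int
end

(* Hashtbl keyed on strings using the equality and hash above
   instead of the polymorphic primitives. *)
module StrTbl = Hashtbl.Make (ByteString)

let example_tbl : int StrTbl.t =
  let t = StrTbl.create 16 in
  StrTbl.add t "foo" 1;
  StrTbl.add t "bar" 2;
  t
```

Lookups such as StrTbl.find example_tbl "foo" then use only
ByteString.hash and ByteString.equal.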

Rich.

-- 
Richard Jones
Red Hat

