Thank you for your answers and suggestions. I upgraded to ocaml 4.00, and the update to the hash function is great and appreciated. I haven't studied closely hash.c yet, but I am very curious about it and will look at it, thanks for pointing it.

However, the hope was to be able to use a hash table with (potentially) circular lists as keys. And, interestingly, this does not quite work, even with the update of the hashing function:

let t1 = Hashtbl.create 10;;
let rec ones = 1 :: ones;;
let rec twos = 2 :: twos;;
let ones2 = 1 :: 1 :: ones;;
let rec ones3 = 1 :: 1 :: ones3;;
(* ones, ones2 and ones3 are observationally equivalent, and their hash is the same: *)
Hashtbl.hash ones;;
- : int = 312056965
Hashtbl.hash ones2;;
- : int = 312056965
Hashtbl.hash ones3;;
- : int = 312056965

Hashtbl.add t1 ones 1;;
Hashtbl.add t1 twos 2;;
Hashtbl.find t1 ones;;
- : int = 1 (* as expected *)

Hashtbl.add t1 ones2 11;;
Hashtbl.find t1 ones;;
- : int = 11 (* as expected *)
Hashtbl.find t1 ones2;;
- : int = 11 (* as expected *)

(* However this does not work and runs forever *)
Hashtbl.find t1 ones3;;

Looking a little bit further into the implementation of Hashtbl.find (available at https://github.com/ocaml/ocaml/blob/trunk/stdlib/hashtbl.ml), it appears that inside each bucket, the keys are compared using Pervasives.compare. Pervasives.compare halts on ones and ones2 (probably because they have sublists that are physically equal, though I haven't checked the implementation of compare), but it does not halt on ones and ones3, or on ones2 and ones3:

compare ones ones2;;
- : int = 0
compare ones ones3;; (* runs forever *)
compare ones2 ones3;; (* runs forever *)

I would be curious to know if this is by design (it is supposed not to work), or if it is a problem with the implementation of compare, or of Hashtbl.find. In particular, if it is by design, why have updated the hash function to support circular lists?
I am also now stuck on creating an (efficient) hashtable supporting circular data structures as keys. Any idea on this?

Thank you,
Jean-Baptiste Jeannin

On 01/17/2013 12:05 PM, Nicholas Lucaroni wrote:
You should update to OCaml 4.xx I don't think the hash function works at all on lists in 3.12.1 (in the sense that it, as you've found, it always returns 0).

in ocaml 4.00+rc1
# Hashtbl.hash (6 :: 7 :: 8 :: 9 :: 10 :: 11 :: 12 :: 13:: 14 :: 15 :: []);;
- : int = 418135511

in ocaml 3.12.1
# Hashtbl.hash (6 :: 7 :: 8 :: 9 :: 10 :: 11 :: 12 :: 13:: 14 :: 15 :: []);;
- : int = 0



On Thu, Jan 17, 2013 at 11:41 AM, Edgar Friendly <thelema314@gmail.com> wrote:
The source is quite easy to read, and is available here:

https://github.com/ocaml/ocaml/blob/trunk/byterun/hash.c

It turns out that even seeded MurmurHash3 does not prevent algorithmic attacks as expected, as demonstrated and explained at 29c3: http://events.ccc.de/congress/2012/Fahrplan/events/5152.en.html and www.youtube.com/watch?v=Vdrab3sB7MU.

E.


On Thu, Jan 17, 2013 at 10:46 AM, Nicholas Lucaroni <nicholas.r.lucaroni@gmail.com> wrote:
https://sympa.inria.fr/sympa/arc/caml-list/2011-07/msg00117.html

That thread talks about the previous and a better alternative (which is in 4.x). 

Xavier had said,
The SVN trunk for OCaml contains a complete reimplementation of the
generic hash function that addresses both issues: lists and other
complex keys are traversed breadth-first in a more cautious manner
than before, and the mixing functions are based on MurmurHash 3, which
exhibits very good statistical properties.  All this should go in the
next major release 3.13.  So, everyone with an interest in efficient
hash tables is welcome to try the trunk sources and let me know of any
problems encountered.

On Thu, Jan 17, 2013 at 10:35 AM, Jean-Baptiste Jeannin <jeannin@cs.cornell.edu> wrote:
Hello,

The default OCaml function (Hashtbl.hash) says that it "always terminates, even on cyclic structures.". I would be curious to know what its complexity is, both on a finite list and on a cyclic list (created by let rec l = 1 :: l for example). Is the algorithm that is being used published anywhere?

Also, this hashing function seems to be returning 0 on any cyclic list (at least the ones I tried). Is this normal? Does anyone know any better hashing function on cyclic lists?

# Hashtbl.hash [ 1 ; 2 ];;
- : int = 131199
# let rec ones = 1 :: ones;;
val ones : int list =
  [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
   ...]
# Hashtbl.hash ones;;
- : int = 0
# Hashtbl.hash (5 :: 4 :: ones);;
- : int = 0

Thank you,
Jean-Baptiste Jeannin

--
Caml-list mailing list.  Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs