caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] Hash function: complexity and circular structures
@ 2013-01-17 15:35 Jean-Baptiste Jeannin
  2013-01-17 15:46 ` Nicholas Lucaroni
  0 siblings, 1 reply; 7+ messages in thread
From: Jean-Baptiste Jeannin @ 2013-01-17 15:35 UTC (permalink / raw)
  To: caml-list

Hello,

The default OCaml function (Hashtbl.hash) says that it "always 
terminates, even on cyclic structures.". I would be curious to know what 
its complexity is, both on a finite list and on a cyclic list (created 
by let rec l = 1 :: l for example). Is the algorithm that is being used 
published anywhere?

Also, this hashing function seems to be returning 0 on any cyclic list 
(at least the ones I tried). Is this normal? Does anyone know any better 
hashing function on cyclic lists?

# Hashtbl.hash [ 1 ; 2 ];;
- : int = 131199
# let rec ones = 1 :: ones;;
val ones : int list =
   [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 
1; 1;
    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 
1; 1;
    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 
1; 1;
    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 
1; 1;
    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 
1; 1;
    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 
1; 1;
    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 
1; 1;
    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 
1; 1;
    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 
1; 1;
    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 
1; 1;
    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 
1; 1;
    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
    ...]
# Hashtbl.hash ones;;
- : int = 0
# Hashtbl.hash (5 :: 4 :: ones);;
- : int = 0

Thank you,
Jean-Baptiste Jeannin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Hash function: complexity and circular structures
  2013-01-17 15:35 [Caml-list] Hash function: complexity and circular structures Jean-Baptiste Jeannin
@ 2013-01-17 15:46 ` Nicholas Lucaroni
  2013-01-17 16:41   ` Edgar Friendly
  0 siblings, 1 reply; 7+ messages in thread
From: Nicholas Lucaroni @ 2013-01-17 15:46 UTC (permalink / raw)
  To: Jean-Baptiste Jeannin; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 2856 bytes --]

https://sympa.inria.fr/sympa/arc/caml-list/2011-07/msg00117.html

That thread talks about the previous and a better alternative (which is in
4.x).

Xavier had said,
*The SVN trunk for OCaml contains a complete reimplementation of the*
*generic hash function that addresses both issues: lists and other
complex keys are traversed breadth-first in a more cautious manner
than before, and the mixing functions are based on MurmurHash 3, which
exhibits very good statistical properties.  All this should go in the
next major release 3.13.  So, everyone with an interest in efficient
hash tables is welcome to try the trunk sources and let me know of any
problems encountered.*

On Thu, Jan 17, 2013 at 10:35 AM, Jean-Baptiste Jeannin <
jeannin@cs.cornell.edu> wrote:

> Hello,
>
> The default OCaml function (Hashtbl.hash) says that it "always terminates,
> even on cyclic structures.". I would be curious to know what its complexity
> is, both on a finite list and on a cyclic list (created by let rec l = 1 ::
> l for example). Is the algorithm that is being used published anywhere?
>
> Also, this hashing function seems to be returning 0 on any cyclic list (at
> least the ones I tried). Is this normal? Does anyone know any better
> hashing function on cyclic lists?
>
> # Hashtbl.hash [ 1 ; 2 ];;
> - : int = 131199
> # let rec ones = 1 :: ones;;
> val ones : int list =
>   [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
> 1;
>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
> 1;
>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
> 1;
>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
> 1;
>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
> 1;
>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
> 1;
>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
> 1;
>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
> 1;
>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
> 1;
>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
> 1;
>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
> 1;
>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>    ...]
> # Hashtbl.hash ones;;
> - : int = 0
> # Hashtbl.hash (5 :: 4 :: ones);;
> - : int = 0
>
> Thank you,
> Jean-Baptiste Jeannin
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa.inria.fr/sympa/**arc/caml-list<https://sympa.inria.fr/sympa/arc/caml-list>
> Beginner's list: http://groups.yahoo.com/group/**ocaml_beginners<http://groups.yahoo.com/group/ocaml_beginners>
> Bug reports: http://caml.inria.fr/bin/caml-**bugs<http://caml.inria.fr/bin/caml-bugs>
>

[-- Attachment #2: Type: text/html, Size: 7294 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Hash function: complexity and circular structures
  2013-01-17 15:46 ` Nicholas Lucaroni
@ 2013-01-17 16:41   ` Edgar Friendly
  2013-01-17 17:05     ` Nicholas Lucaroni
  0 siblings, 1 reply; 7+ messages in thread
From: Edgar Friendly @ 2013-01-17 16:41 UTC (permalink / raw)
  To: Nicholas Lucaroni; +Cc: Jean-Baptiste Jeannin, caml-list

[-- Attachment #1: Type: text/plain, Size: 3399 bytes --]

The source is quite easy to read, and is available here:

https://github.com/ocaml/ocaml/blob/trunk/byterun/hash.c

It turns out that even seeded MurmurHash3 does not prevent algorithmic
attacks as expected, as demonstrated and explained at 29c3:
http://events.ccc.de/congress/2012/Fahrplan/events/5152.en.html and
www.youtube.com/watch?v=Vdrab3sB7MU.

E.


On Thu, Jan 17, 2013 at 10:46 AM, Nicholas Lucaroni <
nicholas.r.lucaroni@gmail.com> wrote:

> https://sympa.inria.fr/sympa/arc/caml-list/2011-07/msg00117.html
>
> That thread talks about the previous and a better alternative (which is in
> 4.x).
>
> Xavier had said,
> *The SVN trunk for OCaml contains a complete reimplementation of the*
> *generic hash function that addresses both issues: lists and other
> complex keys are traversed breadth-first in a more cautious manner
> than before, and the mixing functions are based on MurmurHash 3, which
> exhibits very good statistical properties.  All this should go in the
> next major release 3.13.  So, everyone with an interest in efficient
> hash tables is welcome to try the trunk sources and let me know of any
> problems encountered.*
>
> On Thu, Jan 17, 2013 at 10:35 AM, Jean-Baptiste Jeannin <
> jeannin@cs.cornell.edu> wrote:
>
>> Hello,
>>
>> The default OCaml function (Hashtbl.hash) says that it "always
>> terminates, even on cyclic structures.". I would be curious to know what
>> its complexity is, both on a finite list and on a cyclic list (created by
>> let rec l = 1 :: l for example). Is the algorithm that is being used
>> published anywhere?
>>
>> Also, this hashing function seems to be returning 0 on any cyclic list
>> (at least the ones I tried). Is this normal? Does anyone know any better
>> hashing function on cyclic lists?
>>
>> # Hashtbl.hash [ 1 ; 2 ];;
>> - : int = 131199
>> # let rec ones = 1 :: ones;;
>> val ones : int list =
>>   [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>> 1; 1;
>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>    ...]
>> # Hashtbl.hash ones;;
>> - : int = 0
>> # Hashtbl.hash (5 :: 4 :: ones);;
>> - : int = 0
>>
>> Thank you,
>> Jean-Baptiste Jeannin
>>
>> --
>> Caml-list mailing list.  Subscription management and archives:
>> https://sympa.inria.fr/sympa/**arc/caml-list<https://sympa.inria.fr/sympa/arc/caml-list>
>> Beginner's list: http://groups.yahoo.com/group/**ocaml_beginners<http://groups.yahoo.com/group/ocaml_beginners>
>> Bug reports: http://caml.inria.fr/bin/caml-**bugs<http://caml.inria.fr/bin/caml-bugs>
>>
>
>

[-- Attachment #2: Type: text/html, Size: 8018 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Hash function: complexity and circular structures
  2013-01-17 16:41   ` Edgar Friendly
@ 2013-01-17 17:05     ` Nicholas Lucaroni
  2013-01-18 17:46       ` Jean-Baptiste Jeannin
  0 siblings, 1 reply; 7+ messages in thread
From: Nicholas Lucaroni @ 2013-01-17 17:05 UTC (permalink / raw)
  To: Edgar Friendly; +Cc: Jean-Baptiste Jeannin, caml-list

[-- Attachment #1: Type: text/plain, Size: 3965 bytes --]

You should update to OCaml 4.xx I don't think the hash function works at
all on lists in 3.12.1 (in the sense that it, as you've found, it always
returns 0).

in ocaml 4.00+rc1
# Hashtbl.hash (6 :: 7 :: 8 :: 9 :: 10 :: 11 :: 12 :: 13:: 14 :: 15 :: []);;
- : int = 418135511

in ocaml 3.12.1
# Hashtbl.hash (6 :: 7 :: 8 :: 9 :: 10 :: 11 :: 12 :: 13:: 14 :: 15 :: []);;
- : int = 0



On Thu, Jan 17, 2013 at 11:41 AM, Edgar Friendly <thelema314@gmail.com>wrote:

> The source is quite easy to read, and is available here:
>
> https://github.com/ocaml/ocaml/blob/trunk/byterun/hash.c
>
> It turns out that even seeded MurmurHash3 does not prevent algorithmic
> attacks as expected, as demonstrated and explained at 29c3:
> http://events.ccc.de/congress/2012/Fahrplan/events/5152.en.html and
> www.youtube.com/watch?v=Vdrab3sB7MU.
>
> E.
>
>
> On Thu, Jan 17, 2013 at 10:46 AM, Nicholas Lucaroni <
> nicholas.r.lucaroni@gmail.com> wrote:
>
>> https://sympa.inria.fr/sympa/arc/caml-list/2011-07/msg00117.html
>>
>> That thread talks about the previous and a better alternative (which is
>> in 4.x).
>>
>> Xavier had said,
>> *The SVN trunk for OCaml contains a complete reimplementation of the*
>> *generic hash function that addresses both issues: lists and other
>> complex keys are traversed breadth-first in a more cautious manner
>> than before, and the mixing functions are based on MurmurHash 3, which
>> exhibits very good statistical properties.  All this should go in the
>> next major release 3.13.  So, everyone with an interest in efficient
>> hash tables is welcome to try the trunk sources and let me know of any
>> problems encountered.*
>>
>> On Thu, Jan 17, 2013 at 10:35 AM, Jean-Baptiste Jeannin <
>> jeannin@cs.cornell.edu> wrote:
>>
>>> Hello,
>>>
>>> The default OCaml function (Hashtbl.hash) says that it "always
>>> terminates, even on cyclic structures.". I would be curious to know what
>>> its complexity is, both on a finite list and on a cyclic list (created by
>>> let rec l = 1 :: l for example). Is the algorithm that is being used
>>> published anywhere?
>>>
>>> Also, this hashing function seems to be returning 0 on any cyclic list
>>> (at least the ones I tried). Is this normal? Does anyone know any better
>>> hashing function on cyclic lists?
>>>
>>> # Hashtbl.hash [ 1 ; 2 ];;
>>> - : int = 131199
>>> # let rec ones = 1 :: ones;;
>>> val ones : int list =
>>>   [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1; 1;
>>>    1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>>> 1;
>>>    ...]
>>> # Hashtbl.hash ones;;
>>> - : int = 0
>>> # Hashtbl.hash (5 :: 4 :: ones);;
>>> - : int = 0
>>>
>>> Thank you,
>>> Jean-Baptiste Jeannin
>>>
>>> --
>>> Caml-list mailing list.  Subscription management and archives:
>>> https://sympa.inria.fr/sympa/**arc/caml-list<https://sympa.inria.fr/sympa/arc/caml-list>
>>> Beginner's list: http://groups.yahoo.com/group/**ocaml_beginners<http://groups.yahoo.com/group/ocaml_beginners>
>>> Bug reports: http://caml.inria.fr/bin/caml-**bugs<http://caml.inria.fr/bin/caml-bugs>
>>>
>>
>>
>

[-- Attachment #2: Type: text/html, Size: 8961 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Hash function: complexity and circular structures
  2013-01-17 17:05     ` Nicholas Lucaroni
@ 2013-01-18 17:46       ` Jean-Baptiste Jeannin
  2013-01-18 18:27         ` Gabriel Scherer
  0 siblings, 1 reply; 7+ messages in thread
From: Jean-Baptiste Jeannin @ 2013-01-18 17:46 UTC (permalink / raw)
  To: Nicholas Lucaroni; +Cc: Edgar Friendly, caml-list

[-- Attachment #1: Type: text/plain, Size: 6806 bytes --]

Thank you for your answers and suggestions. I upgraded to ocaml 4.00, 
and the update to the hash function is great and appreciated. I haven't 
studied closely hash.c yet, but I am very curious about it and will look 
at it, thanks for pointing it.

However, the hope was to be able to use a hash table with (potentially) 
circular lists as keys. And, interestingly, this does not quite work, 
even with the update of the hashing function:

let t1 = Hashtbl.create 10;;
let rec ones = 1 :: ones;;
let rec twos = 2 :: twos;;
let ones2 = 1 :: 1 :: ones;;
let rec ones3 = 1 :: 1 :: ones3;;
(* ones, ones2 and ones3 are observationally equivalent, and their hash 
is the same: *)
Hashtbl.hash ones;;
- : int = 312056965
Hashtbl.hash ones2;;
- : int = 312056965
Hashtbl.hash ones3;;
- : int = 312056965

Hashtbl.add t1 ones 1;;
Hashtbl.add t1 twos 2;;
Hashtbl.find t1 ones;;
- : int = 1 (* as expected *)

Hashtbl.add t1 ones2 11;;
Hashtbl.find t1 ones;;
- : int = 11 (* as expected *)
Hashtbl.find t1 ones2;;
- : int = 11 (* as expected *)

(* However this does not work and runs forever *)
Hashtbl.find t1 ones3;;

Looking a little bit further into the implementation of Hashtbl.find 
(available at 
https://github.com/ocaml/ocaml/blob/trunk/stdlib/hashtbl.ml), it appears 
that inside each bucket, the keys are compared using Pervasives.compare. 
Pervasives.compare halts on ones and ones2 (probably because they have 
sublists that are physically equal, though I haven't checked the 
implementation of compare), but it does not halt on ones and ones3, or 
on ones2 and ones3:

compare ones ones2;;
- : int = 0
compare ones ones3;; (* runs forever *)
compare ones2 ones3;; (* runs forever *)

I would be curious to know if this is by design (it is supposed not to 
work), or if it is a problem with the implementation of compare, or of 
Hashtbl.find. In particular, if it is by design, why have updated the 
hash function to support circular lists?
I am also now stuck on creating an (efficient) hashtable supporting 
circular data structures as keys. Any idea on this?

Thank you,
Jean-Baptiste Jeannin

On 01/17/2013 12:05 PM, Nicholas Lucaroni wrote:
> You should update to OCaml 4.xx I don't think the hash function works 
> at all on lists in 3.12.1 (in the sense that it, as you've found, it 
> always returns 0).
>
> in ocaml 4.00+rc1
> # Hashtbl.hash (6 :: 7 :: 8 :: 9 :: 10 :: 11 :: 12 :: 13:: 14 :: 15 :: 
> []);;
> - : int = 418135511
>
> in ocaml 3.12.1
> # Hashtbl.hash (6 :: 7 :: 8 :: 9 :: 10 :: 11 :: 12 :: 13:: 14 :: 15 :: 
> []);;
> - : int = 0
>
>
>
> On Thu, Jan 17, 2013 at 11:41 AM, Edgar Friendly <thelema314@gmail.com 
> <mailto:thelema314@gmail.com>> wrote:
>
>     The source is quite easy to read, and is available here:
>
>     https://github.com/ocaml/ocaml/blob/trunk/byterun/hash.c
>
>     It turns out that even seeded MurmurHash3 does not prevent
>     algorithmic attacks as expected, as demonstrated and explained at
>     29c3:
>     http://events.ccc.de/congress/2012/Fahrplan/events/5152.en.html
>     and www.youtube.com/watch?v=Vdrab3sB7MU
>     <http://www.youtube.com/watch?v=Vdrab3sB7MU>.
>
>     E.
>
>
>     On Thu, Jan 17, 2013 at 10:46 AM, Nicholas Lucaroni
>     <nicholas.r.lucaroni@gmail.com
>     <mailto:nicholas.r.lucaroni@gmail.com>> wrote:
>
>         https://sympa.inria.fr/sympa/arc/caml-list/2011-07/msg00117.html
>
>         That thread talks about the previous and a better alternative
>         (which is in 4.x).
>
>         Xavier had said,
>         /The SVN trunk for OCaml contains a complete reimplementation of the/
>         /generic hash function that addresses both issues: lists and other
>         complex keys are traversed breadth-first in a more cautious manner
>         than before, and the mixing functions are based on MurmurHash 3, which
>         exhibits very good statistical properties.  All this should go in the
>         next major release 3.13.  So, everyone with an interest in efficient
>         hash tables is welcome to try the trunk sources and let me know of any
>         problems encountered./
>
>         On Thu, Jan 17, 2013 at 10:35 AM, Jean-Baptiste Jeannin
>         <jeannin@cs.cornell.edu <mailto:jeannin@cs.cornell.edu>> wrote:
>
>             Hello,
>
>             The default OCaml function (Hashtbl.hash) says that it
>             "always terminates, even on cyclic structures.". I would
>             be curious to know what its complexity is, both on a
>             finite list and on a cyclic list (created by let rec l = 1
>             :: l for example). Is the algorithm that is being used
>             published anywhere?
>
>             Also, this hashing function seems to be returning 0 on any
>             cyclic list (at least the ones I tried). Is this normal?
>             Does anyone know any better hashing function on cyclic lists?
>
>             # Hashtbl.hash [ 1 ; 2 ];;
>             - : int = 131199
>             # let rec ones = 1 :: ones;;
>             val ones : int list =
>               [1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>             1; 1; 1; 1; 1; 1; 1;
>                1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>             1; 1; 1; 1; 1; 1; 1;
>                1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>             1; 1; 1; 1; 1; 1; 1;
>                1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>             1; 1; 1; 1; 1; 1; 1;
>                1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>             1; 1; 1; 1; 1; 1; 1;
>                1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>             1; 1; 1; 1; 1; 1; 1;
>                1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>             1; 1; 1; 1; 1; 1; 1;
>                1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>             1; 1; 1; 1; 1; 1; 1;
>                1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>             1; 1; 1; 1; 1; 1; 1;
>                1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>             1; 1; 1; 1; 1; 1; 1;
>                1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>             1; 1; 1; 1; 1; 1; 1;
>                1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1; 1;
>             1; 1; 1; 1; 1; 1;
>                ...]
>             # Hashtbl.hash ones;;
>             - : int = 0
>             # Hashtbl.hash (5 :: 4 :: ones);;
>             - : int = 0
>
>             Thank you,
>             Jean-Baptiste Jeannin
>
>             -- 
>             Caml-list mailing list.  Subscription management and archives:
>             https://sympa.inria.fr/sympa/arc/caml-list
>             Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>             Bug reports: http://caml.inria.fr/bin/caml-bugs
>
>
>
>


[-- Attachment #2: Type: text/html, Size: 18485 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Hash function: complexity and circular structures
  2013-01-18 17:46       ` Jean-Baptiste Jeannin
@ 2013-01-18 18:27         ` Gabriel Scherer
  2013-01-18 20:11           ` Nick Lucaroni
  0 siblings, 1 reply; 7+ messages in thread
From: Gabriel Scherer @ 2013-01-18 18:27 UTC (permalink / raw)
  To: Jean-Baptiste Jeannin; +Cc: Nicholas Lucaroni, Edgar Friendly, caml-list

A blunt point of view: comparing implicitly circular structures is a
sure road to hell, and you should use an explicit representation of
circularity (eg. with a element that just means "nothing here, you
should rewind to the other side") that will not blow up at each
occasion it gets -- and is generally much more flexible.

On Fri, Jan 18, 2013 at 6:46 PM, Jean-Baptiste Jeannin
<jeannin@cs.cornell.edu> wrote:
>
> I would be curious to know if this is by design (it is supposed not to
> work), or if it is a problem with the implementation of compare, or of
> Hashtbl.find. In particular, if it is by design, why have updated the hash
> function to support circular lists?
> I am also now stuck on creating an (efficient) hashtable supporting circular
> data structures as keys. Any idea on this?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Hash function: complexity and circular structures
  2013-01-18 18:27         ` Gabriel Scherer
@ 2013-01-18 20:11           ` Nick Lucaroni
  0 siblings, 0 replies; 7+ messages in thread
From: Nick Lucaroni @ 2013-01-18 20:11 UTC (permalink / raw)
  To: Gabriel Scherer; +Cc: Jean-Baptiste Jeannin, Edgar Friendly, caml-list

[-- Attachment #1: Type: text/plain, Size: 1639 bytes --]

Calling find from the values ones and ones2 work because the compare
function does a physical comparison before structural comparison and both
of these values are derived from the same list you used as the key. So as
the structural comparison affirms the physical differences between the two
types, a physical comparison ends the loop of the recursive structure. And
this is also why looking up ones3 will result in an infinite loop of
comparisons.

You can implement your own comparison algorithm in the equal function of
the Hashtbl.HashedType module for the Hashtbl.Make functor if you must
proceed with cyclic lists in this fashion as keys. But as Gabriel has said,
you should look at a different data-structures.


On Fri, Jan 18, 2013 at 1:27 PM, Gabriel Scherer
<gabriel.scherer@gmail.com>wrote:

> A blunt point of view: comparing implicitly circular structures is a
> sure road to hell, and you should use an explicit representation of
> circularity (eg. with a element that just means "nothing here, you
> should rewind to the other side") that will not blow up at each
> occasion it gets -- and is generally much more flexible.
>
> On Fri, Jan 18, 2013 at 6:46 PM, Jean-Baptiste Jeannin
> <jeannin@cs.cornell.edu> wrote:
> >
> > I would be curious to know if this is by design (it is supposed not to
> > work), or if it is a problem with the implementation of compare, or of
> > Hashtbl.find. In particular, if it is by design, why have updated the
> hash
> > function to support circular lists?
> > I am also now stuck on creating an (efficient) hashtable supporting
> circular
> > data structures as keys. Any idea on this?
>

[-- Attachment #2: Type: text/html, Size: 2142 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-01-18 20:11 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-17 15:35 [Caml-list] Hash function: complexity and circular structures Jean-Baptiste Jeannin
2013-01-17 15:46 ` Nicholas Lucaroni
2013-01-17 16:41   ` Edgar Friendly
2013-01-17 17:05     ` Nicholas Lucaroni
2013-01-18 17:46       ` Jean-Baptiste Jeannin
2013-01-18 18:27         ` Gabriel Scherer
2013-01-18 20:11           ` Nick Lucaroni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).