caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] ocaml defaut hash ?
@ 2016-04-23  9:16 Christophe Raffalli
  2016-04-23 11:21 ` Gerd Stolpmann
  0 siblings, 1 reply; 4+ messages in thread
From: Christophe Raffalli @ 2016-04-23  9:16 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 598 bytes --]


Hello,

I am hashing value of type (bool * int array), where the size of the
array is 8.

According to the documentation, I think all words should be considered
by the default hash function.  Nevertheless, I get the following
histogram for bucket size, showing a lot of collision:

0 -> 9369
1 -> 506
2 -> 3387
3 -> 221
4 -> 1871
5 -> 119
6 -> 594
7 -> 34
8 -> 195
9 -> 16
10 -> 51
11 -> 7
12 -> 11
13 -> 1
14 -> 2

By the way what I really need is a hash function for arrays that I can
update when I update one entry in the array.  Does anyone known of
such a hash function ?

Cheers,
Christophe

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] ocaml defaut hash ?
  2016-04-23  9:16 [Caml-list] ocaml defaut hash ? Christophe Raffalli
@ 2016-04-23 11:21 ` Gerd Stolpmann
  2016-04-23 15:41   ` Christophe Raffalli
  0 siblings, 1 reply; 4+ messages in thread
From: Gerd Stolpmann @ 2016-04-23 11:21 UTC (permalink / raw)
  To: Christophe Raffalli; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 1062 bytes --]

Am Samstag, den 23.04.2016, 11:16 +0200 schrieb Christophe Raffalli:
> By the way what I really need is a hash function for arrays that I can
> update when I update one entry in the array.  Does anyone known of
> such a hash function ?

It would have to be commutative then (you need to remove the old version
and add the new one, for any element for the array). You could use XOR
or + of the hashes of the individual elements.

If this doesn't work good enough, I'd try to define larger hash blocks.
E.g. consider 4 consecutive elements as a block, and compute the hash of
the block, and take the XOR of all blocks you have. Quality depends a
little bit on what is in the elements.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
My OCaml site:          http://www.camlcity.org
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
------------------------------------------------------------


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] ocaml defaut hash ?
  2016-04-23 11:21 ` Gerd Stolpmann
@ 2016-04-23 15:41   ` Christophe Raffalli
  2016-04-23 17:22     ` Gerd Stolpmann
  0 siblings, 1 reply; 4+ messages in thread
From: Christophe Raffalli @ 2016-04-23 15:41 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 1908 bytes --]

On 16-04-23 13:21:40, Gerd Stolpmann wrote:
> Am Samstag, den 23.04.2016, 11:16 +0200 schrieb Christophe Raffalli:
> > By the way what I really need is a hash function for arrays that I can
> > update when I update one entry in the array.  Does anyone known of
> > such a hash function ?
>
> It would have to be commutative then (you need to remove the old version
> and add the new one, for any element for the array). You could use XOR
> or + of the hashes of the individual elements.

Hello,

xor or + was not enough and it does not have to be commutative.

I tried

let hash_array a =
  let r = ref 0 in
    Array.iteri (fun i x -> r := !r lxor Hashtbl.hash (i,x)) a;
      !r

Which works not to bad ... but some small hash value (below 10, after
moduli) seems to come too often ...

> If this doesn't work good enough, I'd try to define larger hash blocks.
> E.g. consider 4 consecutive elements as a block, and compute the hash of
> the block, and take the XOR of all blocks you have.

This is already done. My array is in fact a packed array of values that
can be represented on few bits, implemented cleanly via a functor.

The property I would like if you think that the array is a 2 dimensional bitmaps are
- the hash changes when any bit changes
- the hash changes for all (most) translation at angle multiple of pi/4
with zéro padding (making xor no sufficient)
- covering all range of caml integer as usual


> Quality depends a
> little bit on what is in the elements.
>
> Gerd
> --
> ------------------------------------------------------------
> Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
> My OCaml site:          http://www.camlcity.org
> Contact details:        http://www.camlcity.org/contact.html
> Company homepage:       http://www.gerd-stolpmann.de
> ------------------------------------------------------------
>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] ocaml defaut hash ?
  2016-04-23 15:41   ` Christophe Raffalli
@ 2016-04-23 17:22     ` Gerd Stolpmann
  0 siblings, 0 replies; 4+ messages in thread
From: Gerd Stolpmann @ 2016-04-23 17:22 UTC (permalink / raw)
  To: Christophe Raffalli; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 3203 bytes --]

Am Samstag, den 23.04.2016, 17:41 +0200 schrieb Christophe Raffalli:
> On 16-04-23 13:21:40, Gerd Stolpmann wrote:
> > Am Samstag, den 23.04.2016, 11:16 +0200 schrieb Christophe Raffalli:
> > > By the way what I really need is a hash function for arrays that I can
> > > update when I update one entry in the array.  Does anyone known of
> > > such a hash function ?
> >
> > It would have to be commutative then (you need to remove the old version
> > and add the new one, for any element for the array). You could use XOR
> > or + of the hashes of the individual elements.
> 
> Hello,
> 
> xor or + was not enough and it does not have to be commutative.

Well, you need to be able to "subtract" the hash of the old version
somehow from the aggregate hash. Well, algebra was a long time ago.

> I tried
> 
> let hash_array a =
>   let r = ref 0 in
>     Array.iteri (fun i x -> r := !r lxor Hashtbl.hash (i,x)) a;
>       !r
> 
> Which works not to bad ... but some small hash value (below 10, after
> moduli) seems to come too often ...

Instead of (i,x) you could also use any function of these. If you can
spend some RAM: Let's rnd(i) the i-th random number of a memorized
sequence. The hash of the i-th element is then hash(rnd(i)*x). You could
also use another operation instead of * but multiplication is fairly
cheap, and "mangles" already a lot. If you want to spend more time,
hash(rnd(i)*x mod p) for a large prime p (instead of the implicit
modulus 2^31/2^63), because the set of values is largest with a primal
modulus.

> > If this doesn't work good enough, I'd try to define larger hash blocks.
> > E.g. consider 4 consecutive elements as a block, and compute the hash of
> > the block, and take the XOR of all blocks you have.
> 
> This is already done. My array is in fact a packed array of values that
> can be represented on few bits, implemented cleanly via a functor.
> 
> The property I would like if you think that the array is a 2 dimensional bitmaps are
> - the hash changes when any bit changes
> - the hash changes for all (most) translation at angle multiple of pi/4
> with zéro padding (making xor no sufficient)

I think when you make the element hash dependent on the position i you
get exactly this property, even with xor.

Gerd

> - covering all range of caml integer as usual
> 
> 
> > Quality depends a
> > little bit on what is in the elements.
> >
> > Gerd
> > --
> > ------------------------------------------------------------
> > Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
> > My OCaml site:          http://www.camlcity.org
> > Contact details:        http://www.camlcity.org/contact.html
> > Company homepage:       http://www.gerd-stolpmann.de
> > ------------------------------------------------------------
> >

-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
My OCaml site:          http://www.camlcity.org
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
------------------------------------------------------------


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-04-23 17:22 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-04-23  9:16 [Caml-list] ocaml defaut hash ? Christophe Raffalli
2016-04-23 11:21 ` Gerd Stolpmann
2016-04-23 15:41   ` Christophe Raffalli
2016-04-23 17:22     ` Gerd Stolpmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).