From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 5762CBBAF for ; Sat, 20 Feb 2010 09:54:25 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApoBADMyf0vRVdzgkGdsb2JhbACDBY4kiVoIFQEBAQEJCQwHEwMfrAoygVWEVYhyAQEDBYN/ZASDEw X-IronPort-AV: E=Sophos;i="4.49,507,1262559600"; d="scan'208";a="45182181" Received: from mail-fx0-f224.google.com ([209.85.220.224]) by mail3-smtp-sop.national.inria.fr with ESMTP; 20 Feb 2010 09:54:24 +0100 Received: by fxm24 with SMTP id 24so911211fxm.17 for ; Sat, 20 Feb 2010 00:54:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type; bh=/LqFOcvOxeNS+7+niLkbPXPVgv4bN3LZO/41IDAlNwo=; b=hhjxmYEy3yDCOq/qll2xkvD6DsLLrVLx9O/IbKuzAY6lyRN53bQqaBsK4OdLL8fPQu yp68hSHrGGdcC57YbfF01nvrOGiQjpJTPsBNqwoQAL5kpL88sTgfTT8z/NW66zoRKKd8 W152PCEdS+C02dVWLD4gHypc4d8TMuN+fff/U= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=LjEhkXZK8ye9dMFzdMTM0X25I43AfyPgAJ5jROmKCcYxFhUcZNkPsiSH799kAPB+9x 4vdcQbNsQ4MlZSEQCq/UvDwOPykbzJSRM7Z36Sr++6zUIgIdry8/eafeO7Fx8XMgT/OJ U9FSJrltdoP9v9rc55GasIxTRToi+IDf0JpTw= MIME-Version: 1.0 Received: by 10.239.134.148 with SMTP id 20mr1286469hbz.42.1266656063964; Sat, 20 Feb 2010 00:54:23 -0800 (PST) Date: Sat, 20 Feb 2010 14:24:23 +0530 Message-ID: <1ae8fe881002200054s2e9752d8x620695c3c4e27939@mail.gmail.com> Subject: range of hash function From: =?UTF-8?Q?Gr=C3=A9goire_Seux?= To: caml-list Content-Type: multipart/alternative; boundary=001485f945e0bec79304800458ae X-Spam: no; 0.00; hash:01 hash:01 hashes:01 hashtbl:01 hashes:01 hashtbl:01 gregoire:01 probability:01 probability:01 strings:01 strings:01 string:02 string:02 uniform:04 uniform:04 X-Attachments: cset="UTF-8" cset="UTF-8" --001485f945e0bec79304800458ae Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable hello ! i would like to use the polymorhpic hash function on strings. But i would like to know what is the probability of a collision between two hashes. my first question is about the range of the Hashtbl.hash function: what is its range ? ( string -> [1..N] ?) the second question is : can i assume that the result is a uniform distribution over [1..N] ? (for 10=E2=81=B6 words which is an estimation of= the english vocabulary size) the third one is : is it possible to predict which will be the collision ? = I mean collisions are between words which are very 'similar' (for ex: "boy" and "boys") or are completely unpredictable. thanks ! --=20 Gr=C3=A9goire Seux --001485f945e0bec79304800458ae Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable hello !

i would like to use the polymorhpic hash functio= n on strings. But i would like to know what is the probability of a collisi= on between two hashes.=C2=A0

my first question is = about the range of the Hashtbl.hash function: what is its range ? ( string = -> [1..N] ?)

the second question is : can i assume that the result i= s a uniform distribution over [1..N] ? (for 10=E2=81=B6 words which is an e= stimation of the english vocabulary size)

t= he third one is : is it possible to predict which will be the collision ? I= mean collisions are between words which are very 'similar' (for ex= : "boy" and "boys") or are completely unpredictable.

thanks !


--
Gr=C3=A9go= ire Seux
--001485f945e0bec79304800458ae--