From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id q189f20m018833 for ; Wed, 8 Feb 2012 10:41:05 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AloEACJCMk/ZSMDdjmdsb2JhbABAA69mIgEBAQEHDSQpgXIBAQQBOj8QCw4TJQ8BBCghE4d8A7Iyi04CAgkFDAcGATQGgn8dAyhbETeDIgSacY0Z X-IronPort-AV: E=Sophos;i="4.73,383,1325458800"; d="scan'208";a="143350744" Received: from fmmailgate01.web.de ([217.72.192.221]) by mail1-smtp-roc.national.inria.fr with ESMTP; 08 Feb 2012 10:41:05 +0100 Received: from moweb002.kundenserver.de (moweb002.kundenserver.de [172.19.20.108]) by fmmailgate01.web.de (Postfix) with ESMTP id B9E741A9F5309 for ; Wed, 8 Feb 2012 10:41:04 +0100 (CET) Received: from frosties.localnet ([95.208.118.96]) by smtp.web.de (mrweb001) with ESMTPA (Nemesis) id 0MRl2f-1S6JNB2fqz-00T7JF; Wed, 08 Feb 2012 10:41:04 +0100 Received: from mrvn by frosties.localnet with local (Exim 4.77) (envelope-from ) id 1Rv41Q-0001Iu-07; Wed, 08 Feb 2012 10:41:04 +0100 From: Goswin von Brederlow To: "Gerd Stolpmann" Cc: "Goswin von Brederlow" , caml-list@inria.fr References: <1325263446.5036.104.camel@samsung> <87zkd531kl.fsf@frosties.localnet> <403c4e4bb2cfce801aad217c80365879.squirrel@gps.dynxs.de> Date: Wed, 08 Feb 2012 10:41:03 +0100 In-Reply-To: <403c4e4bb2cfce801aad217c80365879.squirrel@gps.dynxs.de> (Gerd Stolpmann's message of "Tue, 31 Jan 2012 15:16:37 +0100") Message-ID: <8739alfys0.fsf@frosties.localnet> User-Agent: Gnus/5.110009 (No Gnus v0.9) XEmacs/21.4.22 (linux, no MULE) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Provags-ID: V02:K0:4fhahQpavo6y87DxIe6OBGJKnz5mRrVn1K51dDyOhTe JM5t1iJVGCEE5Sjr7slIBSxeupwfzJ4ZqzRTBoXyBz2rywJv9u 8a1xx+/TvQ0KbckpHbLNnjvoxwUzLRzCi/Thk2cOpn1ucdSWqg V4JnMI1WFVO6weRGkvr3zvfv47uAKyT2wrklzgymT83cOconpZ QG2j0Pk1doe4hDpf/hWrg== Subject: Re: [Caml-list] Hashtbl and security "Gerd Stolpmann" writes: >> Gerd Stolpmann writes: >> You are forgetting a variable in this. To create a DOS in the hashtable >> you need to hit the same bucket again and again. And bucket = hash % size. >> >> You focused on the hash but lets talk about the size for a moment. >> >> The size is rather limited and large hashtabels simply won't increate >> size anymore and allways have lots of collisions. So even if someone >> isn't actively trying to create DOS the performace still tanks as you >> add more items. > > I cannot completely follow here. If you add more elements, the bucket > array is resized. The ocaml Hashtbl does this when the number of elements > exceeds half of the array size. The argument of Hashtbl.create is only the > initial size of the bucket array. > > The upper bound is Sys.max_array_length, of course. Granted, on 32 bit > platforms it is low (4M). But nobody of sound mind would run ocaml > programs on 32 bit for production purposes. (If you do, consider to > switch. You can use more memory and get a little performance boost, not > only for ocaml programs.) > > Gerd In practice I see a serious performance degradation with large hashtables. So much so that I had to use an array of hashtables or hashtable of hashtables and split the values across them. Maybe the size of the hashtable indeed isn't the problem but the hash function. Here is something odd: # for i = 1 to 63 do Printf.printf "%d %d\n" (1 lsl i) (Hashtbl.hash (1 lsl i)) done;; 2 2 4 4 8 8 16 16 32 32 64 64 128 128 256 256 512 512 1024 1024 2048 2048 4096 4096 8192 8192 16384 16384 32768 32768 65536 65536 131072 131072 262144 262144 524288 524288 1048576 1048576 2097152 2097152 4194304 4194304 8388608 8388608 16777216 16777216 33554432 33554432 67108864 67108864 134217728 134217728 268435456 268435456 536870912 536870912 1073741824 0 2147483648 0 4294967296 0 8589934592 0 17179869184 0 34359738368 0 68719476736 0 137438953472 0 274877906944 0 549755813888 0 1099511627776 0 2199023255552 0 4398046511104 0 8796093022208 0 17592186044416 0 35184372088832 0 70368744177664 0 140737488355328 0 281474976710656 0 562949953421312 0 1125899906842624 0 2251799813685248 0 4503599627370496 0 9007199254740992 0 18014398509481984 0 36028797018963968 0 72057594037927936 0 144115188075855872 0 288230376151711744 0 576460752303423488 0 1152921504606846976 0 2305843009213693952 0 -4611686018427387904 0 0 0 - : unit = () Is every value over a billion hashed to 0? MfG Goswin