From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id 43F93BBB7 for ; Tue, 25 Jul 2006 18:35:29 +0200 (CEST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id k6PGZSa5009944 for ; Tue, 25 Jul 2006 18:35:28 +0200 Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id SAA18871 for ; Tue, 25 Jul 2006 18:35:27 +0200 (MET DST) Received: from nf-out-0910.google.com (nf-out-0910.google.com [64.233.182.189]) by nez-perce.inria.fr (8.13.6/8.13.6) with ESMTP id k6PGZRwS014531 for ; Tue, 25 Jul 2006 18:35:27 +0200 Received: by nf-out-0910.google.com with SMTP id y38so269315nfb for ; Tue, 25 Jul 2006 09:35:27 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; b=H1RS+4JCFK/qJnxZXdemaW+z77O0m2GKMROvCZhim/5dIKssiPcu6hme+ZeGFCrdGmZEQ+pFlFOhXBIwK9q8cNpWgTDt/XWARZNGRFRra9QQBVNwzhVrrV++53/GUTijhaS8yozVYU4FwEFH+d26whZnDJStYSMfn9yPzyuEsSs= Received: by 10.48.242.9 with SMTP id p9mr920635nfh; Tue, 25 Jul 2006 09:35:23 -0700 (PDT) Received: by 10.49.35.19 with HTTP; Tue, 25 Jul 2006 09:35:23 -0700 (PDT) Message-ID: Date: Tue, 25 Jul 2006 18:35:23 +0200 From: Tom To: "Christoph Bauer" Subject: Re: AW: [Caml-list] generic Hashtbl.to_array Cc: "Brian Hurt" , caml-list In-Reply-To: <26EB47FDD566A7469FC862DAF373792F017112F4@kaiserslautern1.lmsintl.com> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_203531_12669535.1153845323549" References: <26EB47FDD566A7469FC862DAF373792F017112F4@kaiserslautern1.lmsintl.com> X-j-chkmail-Score: MSGID : 44C6484F.000 on nez-perce : j-chkmail score : X : 0/20 1 X-Miltered: at concorde with ID 44C64850.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at nez-perce with ID 44C6484F.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; hashtbl:01 toploop:01 hashtbl:01 hashtable:01 toploop:01 hashtable:01 W8:98 W10:98 W8:98 caml-list:01 W15:98 corrected:01 corrected:01 int:01 int:01 X-Attachments: cset="UTF-8" cset="UTF-8" X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.1 required=5.0 tests=HTML_40_50,HTML_MESSAGE, RCVD_BY_IP autolearn=disabled version=3.0.3 ------=_Part_203531_12669535.1153845323549 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline I'm sorry to say that, but I believe that you results are flawed... If we look at the code of to_array_1 and to_array_5, there is no possibility that the former was faster... if nothing else, it has an additional if jump each and every loop. I simply couldn't believe your results. Upon inspecting your code with Toploop, I found out some flaws... let h () = let h = Hashtbl.create 100000 in for i = 0 to 99999 do (* <<< not Hashtbl.length h, as it returns 0 for ampty hashtable *) Hashtbl.add h (Random.int max_int) (Random.int max_int); done; h let to_array_1 t = let dummy = Array.init 0 (fun _ -> raise Not_found) in fst (Hashtbl.fold (fun k v (a, i) -> if i = 0 then let a = Array.make (Hashtbl.length t) (k, v) in (a, 1) (* <<<<< Not 0, as it causes no progress *) else (a.(i) <- (k, v); (a, i + 1))) t (dummy, 0)) I also corrected my implementation: let mgc = Obj.magic 0 <<< So that the function is executed only once. let to_array_5 t = let a = Array.make (Hashtbl.length t) mgc in ignore (Hashtbl.fold (fun k v i -> a.(i) <- (k, v); i + 1) t 0) ; a I tried to do some benchmarking, but I do not have much time... anyhow, my implementation is faster as far as I tested it. Believe in your dreams! ------=_Part_203531_12669535.1153845323549 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 7bit Content-Disposition: inline I'm sorry to say that, but I believe that you results are flawed...

If we look at the code of to_array_1 and to_array_5, there is no possibility that the former was faster... if nothing else, it has an additional if jump each and every loop. I simply couldn't believe your results.

Upon inspecting your code with Toploop, I found out some flaws...

let h () =
  let h = Hashtbl.create 100000 in
    for i = 0 to 99999 do            (* <<< not Hashtbl.length h, as it returns 0 for ampty hashtable *)
      Hashtbl.add h (Random.int max_int) (Random.int max_int);
    done;
    h

let to_array_1 t =
  let dummy =  Array.init 0 (fun _ -> raise Not_found) in
    fst
      (Hashtbl.fold

         (fun k v (a, i) ->
            if i = 0 then
              let a = Array.make (Hashtbl.length t) (k, v) in
                (a, 1)           (* <<<<< Not 0, as it causes no progress *)
            else (a.(i) <- (k, v); (a, i + 1)))
         t (dummy, 0))

I also corrected my implementation:

let mgc = Obj.magic 0      <<< So that the function is executed only once.
let to_array_5 t =
 let a =  Array.make (Hashtbl.length t) mgc in
   ignore
     (Hashtbl.fold (fun k v i -> a.(i) <- (k, v); i + 1) t 0) ;
     a

I tried to do some benchmarking, but I do not have much time... anyhow, my implementation is faster as far as I tested it.

Believe in your dreams!
------=_Part_203531_12669535.1153845323549--