From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id 45FE4BBB7 for ; Tue, 25 Jul 2006 17:53:10 +0200 (CEST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.6/8.13.6) with ESMTP id k6PFr919008745 for ; Tue, 25 Jul 2006 17:53:09 +0200 Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id RAA18628 for ; Tue, 25 Jul 2006 17:53:08 +0200 (MET DST) Received: from mail-outfwd.lms.be (Kaiserslautern1.lms-gmbh.de [213.68.136.230]) by nez-perce.inria.fr (8.13.6/8.13.6) with ESMTP id k6PFr8sB008738 for ; Tue, 25 Jul 2006 17:53:08 +0200 Received: from localhost (unknown [127.0.0.1]) by mail-outfwd.lms.be (Postfix) with ESMTP id 3A6137F441; Tue, 25 Jul 2006 18:13:09 +0200 (CEST) Received: from mail-kl.lmsintl.com ([127.0.0.1]) by localhost (kl-ftp [127.0.0.1]) (amavisd-new, port 20024) with ESMTP id 32506-05; Tue, 25 Jul 2006 18:13:09 +0200 (CEST) Received: from kaiserslautern1.lmsintl.com (unknown [10.2.0.100]) by mail-kl.lmsintl.com (Postfix) with ESMTP id 1B68CB6EC6; Tue, 25 Jul 2006 18:13:09 +0200 (CEST) Received: by kaiserslautern1.lmsintl.com with Internet Mail Service (5.5.2653.19) id <32PZS81C>; Tue, 25 Jul 2006 17:53:07 +0200 Message-ID: <26EB47FDD566A7469FC862DAF373792F017112F4@kaiserslautern1.lmsintl.com> From: Christoph Bauer To: Brian Hurt , caml-list Subject: AW: AW: [Caml-list] generic Hashtbl.to_array Date: Tue, 25 Jul 2006 17:53:01 +0200 MIME-Version: 1.0 X-Mailer: Internet Mail Service (5.5.2653.19) Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C6B002.6856BC88" X-Virus-Scanned: by IT Services X-j-chkmail-Score: MSGID : 44C63E65.000 on nez-perce : j-chkmail score : XX : 5/20 0 X-j-chkmail-Score: MSGID : 44C63E64.000 on nez-perce : j-chkmail score : XX : 5/20 0 X-Miltered: at nez-perce with ID 44C63E65.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at nez-perce with ID 44C63E64.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; hashtbl:01 hashtbl:01 val:01 val:01 abstr:01 pointer:01 abstr:01 pointer:01 1.0:98 2.0:98 W6:98 W6:98 1.0:98 2.0:98 caml-list:01 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.1 required=5.0 tests=FORGED_RCVD_HELO,HTML_MESSAGE, HTML_TITLE_EMPTY autolearn=disabled version=3.0.3 This message is in MIME format. Since your mail reader does not understand this format, some or all of this message may not be legible. ------_=_NextPart_001_01C6B002.6856BC88 Content-Type: text/plain The dirtiest solution: let to_array t = let a = Array.make (Hashtbl.length t) (Obj.magic 0) in ignore (Hashtbl.fold (fun k v i -> a.(i) <- (k, v); i + 1) t 0) ; a Does it work correctly for floats? Looks good for floats. # let to_array t = let a = Array.make (Hashtbl.length t) (Obj.magic 0) in ignore (Hashtbl.fold (fun k v i -> a.(i) <- (k, v); i + 1) t 0) ; a ;; val to_array : ('a, 'b) Hashtbl.t -> ('a * 'b) array = # let h = Hashtbl.create 0;; val h : ('_a, '_b) Hashtbl.t = # Hashtbl.add h 1.0 2.0;; - : unit = () # to_array h;; - : (float * float) array = [|(1., 2.)|] # Gc.compact ();; - : unit = () # BTW, the array should store a pointer to a tuple of two floats, so I thinkt float or ints doesn't matter. I won't use this solution, because it isn't better than others. Christoph Bauer ------_=_NextPart_001_01C6B002.6856BC88 Content-Type: text/html

 
The dirtiest solution:

let to_array t =
 let a =  Array.make (Hashtbl.length t) (Obj.magic 0)  in
   ignore
     (Hashtbl.fold (fun k v i -> a.(i) <- (k, v); i + 1) t 0) ;
     a
  
 
 

Does it work correctly for floats? 
 
Looks good for floats.
 
# let to_array t =
   let a =  Array.make (Hashtbl.length t) (Obj.magic 0)  in
     ignore
       (Hashtbl.fold (fun k v i -> a.(i) <- (k, v); i + 1) t 0) ;
       a
 
  ;;
val to_array : ('a, 'b) Hashtbl.t -> ('a * 'b) array = <fun>
# let h = Hashtbl.create 0;;
val h : ('_a, '_b) Hashtbl.t = <abstr>
# Hashtbl.add h 1.0 2.0;;
- : unit = ()
# to_array h;;
- : (float * float) array = [|(1., 2.)|]
# Gc.compact ();;
- : unit = ()
#
 
 BTW, the array should store a pointer to a tuple of two floats, so
I thinkt float or ints doesn't matter. I won't use this solution, because
 it isn't better than others. 

 Christoph Bauer 

------_=_NextPart_001_01C6B002.6856BC88-- From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id 43F93BBB7 for ; Tue, 25 Jul 2006 18:35:29 +0200 (CEST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id k6PGZSa5009944 for ; Tue, 25 Jul 2006 18:35:28 +0200 Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id SAA18871 for ; Tue, 25 Jul 2006 18:35:27 +0200 (MET DST) Received: from nf-out-0910.google.com (nf-out-0910.google.com [64.233.182.189]) by nez-perce.inria.fr (8.13.6/8.13.6) with ESMTP id k6PGZRwS014531 for ; Tue, 25 Jul 2006 18:35:27 +0200 Received: by nf-out-0910.google.com with SMTP id y38so269315nfb for ; Tue, 25 Jul 2006 09:35:27 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; b=H1RS+4JCFK/qJnxZXdemaW+z77O0m2GKMROvCZhim/5dIKssiPcu6hme+ZeGFCrdGmZEQ+pFlFOhXBIwK9q8cNpWgTDt/XWARZNGRFRra9QQBVNwzhVrrV++53/GUTijhaS8yozVYU4FwEFH+d26whZnDJStYSMfn9yPzyuEsSs= Received: by 10.48.242.9 with SMTP id p9mr920635nfh; Tue, 25 Jul 2006 09:35:23 -0700 (PDT) Received: by 10.49.35.19 with HTTP; Tue, 25 Jul 2006 09:35:23 -0700 (PDT) Message-ID: Date: Tue, 25 Jul 2006 18:35:23 +0200 From: Tom To: "Christoph Bauer" Subject: Re: AW: [Caml-list] generic Hashtbl.to_array Cc: "Brian Hurt" , caml-list In-Reply-To: <26EB47FDD566A7469FC862DAF373792F017112F4@kaiserslautern1.lmsintl.com> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_203531_12669535.1153845323549" References: <26EB47FDD566A7469FC862DAF373792F017112F4@kaiserslautern1.lmsintl.com> X-j-chkmail-Score: MSGID : 44C6484F.000 on nez-perce : j-chkmail score : X : 0/20 1 X-Miltered: at concorde with ID 44C64850.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at nez-perce with ID 44C6484F.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; hashtbl:01 toploop:01 hashtbl:01 hashtable:01 toploop:01 hashtable:01 W8:98 W10:98 W8:98 caml-list:01 W15:98 corrected:01 corrected:01 int:01 int:01 X-Attachments: cset="UTF-8" cset="UTF-8" X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.1 required=5.0 tests=HTML_40_50,HTML_MESSAGE, RCVD_BY_IP autolearn=disabled version=3.0.3 ------=_Part_203531_12669535.1153845323549 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline I'm sorry to say that, but I believe that you results are flawed... If we look at the code of to_array_1 and to_array_5, there is no possibility that the former was faster... if nothing else, it has an additional if jump each and every loop. I simply couldn't believe your results. Upon inspecting your code with Toploop, I found out some flaws... let h () = let h = Hashtbl.create 100000 in for i = 0 to 99999 do (* <<< not Hashtbl.length h, as it returns 0 for ampty hashtable *) Hashtbl.add h (Random.int max_int) (Random.int max_int); done; h let to_array_1 t = let dummy = Array.init 0 (fun _ -> raise Not_found) in fst (Hashtbl.fold (fun k v (a, i) -> if i = 0 then let a = Array.make (Hashtbl.length t) (k, v) in (a, 1) (* <<<<< Not 0, as it causes no progress *) else (a.(i) <- (k, v); (a, i + 1))) t (dummy, 0)) I also corrected my implementation: let mgc = Obj.magic 0 <<< So that the function is executed only once. let to_array_5 t = let a = Array.make (Hashtbl.length t) mgc in ignore (Hashtbl.fold (fun k v i -> a.(i) <- (k, v); i + 1) t 0) ; a I tried to do some benchmarking, but I do not have much time... anyhow, my implementation is faster as far as I tested it. Believe in your dreams! ------=_Part_203531_12669535.1153845323549 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: 7bit Content-Disposition: inline I'm sorry to say that, but I believe that you results are flawed...

If we look at the code of to_array_1 and to_array_5, there is no possibility that the former was faster... if nothing else, it has an additional if jump each and every loop. I simply couldn't believe your results.

Upon inspecting your code with Toploop, I found out some flaws...

let h () =
  let h = Hashtbl.create 100000 in
    for i = 0 to 99999 do            (* <<< not Hashtbl.length h, as it returns 0 for ampty hashtable *)
      Hashtbl.add h (Random.int max_int) (Random.int max_int);
    done;
    h

let to_array_1 t =
  let dummy =  Array.init 0 (fun _ -> raise Not_found) in
    fst
      (Hashtbl.fold

         (fun k v (a, i) ->
            if i = 0 then
              let a = Array.make (Hashtbl.length t) (k, v) in
                (a, 1)           (* <<<<< Not 0, as it causes no progress *)
            else (a.(i) <- (k, v); (a, i + 1)))
         t (dummy, 0))

I also corrected my implementation:

let mgc = Obj.magic 0      <<< So that the function is executed only once.
let to_array_5 t =
 let a =  Array.make (Hashtbl.length t) mgc in
   ignore
     (Hashtbl.fold (fun k v i -> a.(i) <- (k, v); i + 1) t 0) ;
     a

I tried to do some benchmarking, but I do not have much time... anyhow, my implementation is faster as far as I tested it.

Believe in your dreams!
------=_Part_203531_12669535.1153845323549-- From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id 44823BB83 for ; Tue, 15 Aug 2006 10:26:41 +0200 (CEST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.6/8.13.6) with ESMTP id k7F8QeBn005362 for ; Tue, 15 Aug 2006 10:26:40 +0200 Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id KAA29912 for ; Tue, 15 Aug 2006 10:26:40 +0200 (MET DST) Received: from rouge.crans.org (rouge.crans.org [138.231.136.3]) by nez-perce.inria.fr (8.13.6/8.13.6) with ESMTP id k7F8QdSA005356 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL) for ; Tue, 15 Aug 2006 10:26:40 +0200 Received: from localhost (rouge.adm.crans.org [138.231.144.3]) by rouge.crans.org (Postfix) with ESMTP id ADA29BB80 for ; Tue, 15 Aug 2006 10:26:39 +0200 (CEST) Received: from rouge.crans.org ([138.231.144.3]) by localhost (rouge [138.231.144.3]) (amavisd-new, port 10024) with LMTP id 01578-04-8 for ; Tue, 15 Aug 2006 10:26:39 +0200 (CEST) Received: from [10.27.0.2] (korell.crans.org [138.231.142.254]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by rouge.crans.org (Postfix) with ESMTP id 0688CBB6B for ; Tue, 15 Aug 2006 10:26:38 +0200 (CEST) Message-ID: <44E1853B.4010402@crans.org> Date: Tue, 15 Aug 2006 17:26:35 +0900 From: =?UTF-8?B?U3TDqXBoYW5lIEdsb25kdQ==?= User-Agent: Thunderbird 1.5.0.5 (X11/20060728) MIME-Version: 1.0 To: caml-list Subject: Re: [Caml-list] generic Hashtbl.to_array References: <26EB47FDD566A7469FC862DAF373792F017112F4@kaiserslautern1.lmsintl.com> In-Reply-To: X-Enigmail-Version: 0.94.0.0 OpenPGP: id=FCE03DAA Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Virus-Scanned: by amavisd-new-20030616-p10 (Debian) at crans.org X-Miltered: at nez-perce with ID 44E18540.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at nez-perce with ID 44E1853F.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; hashtbl:01 inlined:01 phane:98 phane:98 caml-list:01 integer:01 corrected:01 seems:03 let:03 compiled:04 stephane:07 generic:08 function:08 basically:09 ...:90 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.3 We shouldn't talk about Obj.magic, but... Tom a =C3=A9crit : > I also corrected my implementation: >=20 > let mgc =3D Obj.magic 0 <<< So that the function is executed only = once. Does this provide any benefit? It seems to me that Obj.magic is the (inlined) identity (so basically Obj.magic 0 is compiled directly into the integer 0). --=20 St=C3=A9phane