From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id 15E98BBB7 for ; Tue, 25 Jul 2006 18:20:13 +0200 (CEST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id k6PGKCfP007453 for ; Tue, 25 Jul 2006 18:20:12 +0200 Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id SAA18870 for ; Tue, 25 Jul 2006 18:20:11 +0200 (MET DST) Received: from smtp3.adl2.internode.on.net (smtp3.adl2.internode.on.net [203.16.214.203]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id k6PGK2A3007394; Tue, 25 Jul 2006 18:20:09 +0200 Received: from rosella (ppp232-211.lns3.syd7.internode.on.net [59.167.232.211]) by smtp3.adl2.internode.on.net (8.13.6/8.13.5) with ESMTP id k6PGJogu041853; Wed, 26 Jul 2006 01:49:51 +0930 (CST) (envelope-from skaller@users.sourceforge.net) Subject: Re: [Caml-list] generic Hashtbl.to_array From: skaller To: Christoph Bauer Cc: Damien Doligez , caml-list In-Reply-To: <26EB47FDD566A7469FC862DAF373792F017112A1@kaiserslautern1.lmsintl.com> References: <26EB47FDD566A7469FC862DAF373792F017112A1@kaiserslautern1.lmsintl.com> Content-Type: text/plain Date: Wed, 26 Jul 2006 02:19:49 +1000 Message-Id: <1153844390.1389.53.camel@rosella.wigram> Mime-Version: 1.0 X-Mailer: Evolution 2.6.1 Content-Transfer-Encoding: 7bit X-Miltered: at concorde with ID 44C644BC.002 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at concorde with ID 44C644B2.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; hashtbl:01 hashtbl:01 iter:01 surprising:01 ocaml:01 caching:01 2006:98 recycling:98 paging:98 wrote:01 sourceforge:01 caml-list:01 slower:01 data:02 match:02 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.3 On Tue, 2006-07-25 at 14:00 +0200, Christoph Bauer wrote: > Hi, > > > let to_array t = > > let init = ref None in > > begin try Hashtbl.iter (fun k v -> init := Some (k,v); > > raise Exit) t > > with Exit -> () > > end; > > match !init with > > | None -> [| |] > > | Some i -> > > let a = Array.make (Hashtbl.length t) i in > > ignore (Hashtbl.fold (fun k v i -> a.(i) <- (k, v); i + 1) t 0); > > a > > ;; > > it's curious, but this solution is slower than the others! > > [skaller's solution seems to be the same, so I > include only this one in the "benchmark"] This is not entirely surprising, since Ocaml wastes time first initialising the array.. and then assigning the same cells new values, recycling the same store through the cache twice. However there is no need for a secondary data structure, which may delay hitting limits of the next level of caching (for example VM paging disk). I see in your tests the number of elements is 100,000. It would be interesting to increase this in a series of tests to see if the performance tradeoffs change: cache effects would predict a 'kink' when you hit cache boundaries? -- John Skaller Felix, successor to C++: http://felix.sf.net