From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from discorde.inria.fr (discorde.inria.fr [192.93.2.38]) by yquem.inria.fr (Postfix) with ESMTP id 7CAB4BC69 for ; Wed, 18 Jul 2007 18:08:09 +0200 (CEST) Received: from orion.metastack.com (no-dns-yet.demon.co.uk [80.177.38.218] (may be forged)) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id l6IG88Xw014504 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL) for ; Wed, 18 Jul 2007 18:08:09 +0200 Received: from treble (cpc2-cmbg6-0-0-cust535.cmbg.cable.ntl.com [81.107.34.24]) (authenticated bits=0) by orion.metastack.com (8.13.4/8.13.3) with ESMTP id l6IFvn90002654 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO) for ; Wed, 18 Jul 2007 16:57:49 +0100 From: "David Allsopp" To: "OCaml List" References: <469E3171.7040205@gnu.org> Subject: RE: [Caml-list] modifying hash tables Date: Wed, 18 Jul 2007 17:08:07 +0100 Organization: MetaStack Solutions Ltd. Message-ID: <003601c7c955$d4db6660$6a7ba8c0@treble> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 In-Reply-To: <469E3171.7040205@gnu.org> Thread-Index: AcfJTxG0AgHBoEiUTDu4cmFCWXTM4AAAjVoQ X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.3138 X-Miltered: at discorde with ID 469E3AE8.001 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; hash:01 hashtable:01 iter:01 hashtable:01 iter:01 hash:01 hashtbl:01 hashtbl:01 printf:01 printf:01 bindings:01 bindings:01 o'caml:01 idioms:01 inherently:01 > I want to modify the _current_ datum in Hashtable.iter, e.g., increment > a count. Is this code valid? > Hashtable.iter ht ~f:(fun ~key ~data -> > Hashtable.replace ht ~key ~data:(data+1)) > (it appears to work, but I was told that "it is not guaranteed to work", > i.e., the documentation does not promise that). > If the above code is wrong, what is TRT? You're referring to module "Hashtable" - where's this from? The potential problem is that you're changing the hash table while it's being "read" - it's a matter of "definition" on what that sort of use means. For example, let t = Hashtbl.create 32 in Hashtbl.add t 1 "a"; Hashtbl.add t 2 "b"; Hashtbl.add t 3 "c"; (* As it happens, these will print in order... *) Hashtbl.iter (fun k v -> Printf.printf "%d: %s\n" k v) t; (* Now let's mess around... *) let f k v = Printf.printf "%d: %s\n" k v; if k = 2 then Hashtbl.remove t 3 in Hashtbl.iter f t;; prints three lines resulting from the initial iter but only two lines resulting from the second iter yet the documentation for Hashtbl.iter states "Hashtbl.iter f tbl applies f to all bindings in table tbl" - the definition of "all bindings" is left somewhat fuzzy (probably on purpose!). Just to make things weirder - if you re-run the code above but change the 32 to 3 in the first line then all three bindings are printed twice because of the different ordering of the elements in the hash table! You can get some even "stranger" (by which I mean unpredictable) results if you come up with a function that adds elements and so causes the table to be resized mid-way through the iter... In your code, you're only replacing the data of existing bindings so you're likely to be fine using iter (at least in the current implementation) as the keys won't change - it's probably worthy of a big safety comment in your code, though. "Safer" solutions include ExtLib's Hashtbl.ExtHashtbl.map function or the Map module. Both of these have performance hits that may be of concern - maps are on average slower than hash tables in O'Caml and Hashtbl.map copies the table each time (though it's not clear to me why it bothers doing that - it wouldn't take much to alter it to do an in-place map). The "problem" as such is applying functional idioms such as fold and iter to an inherently imperative data structure... David