caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: "David Allsopp" <dra-news@metastack.com>
To: "OCaml List" <caml-list@yquem.inria.fr>
Subject: RE: [Caml-list] modifying hash tables
Date: Wed, 18 Jul 2007 17:08:07 +0100	[thread overview]
Message-ID: <003601c7c955$d4db6660$6a7ba8c0@treble> (raw)
In-Reply-To: <469E3171.7040205@gnu.org>

> I want to modify the _current_ datum in Hashtable.iter, e.g., increment
> a count. Is this code valid?
> Hashtable.iter ht ~f:(fun ~key ~data ->
>   Hashtable.replace ht ~key ~data:(data+1))
> (it appears to work, but I was told that "it is not guaranteed to work",
> i.e., the documentation does not promise that).
> If the above code is wrong, what is TRT?

You're referring to module "Hashtable" - where's this from?

The potential problem is that you're changing the hash table while it's
being "read" - it's a matter of "definition" on what that sort of use means.
For example,

let t = Hashtbl.create 32
in
  Hashtbl.add t 1 "a"; Hashtbl.add t 2 "b"; Hashtbl.add t 3 "c";
  (* As it happens, these will print in order... *)
  Hashtbl.iter (fun k v -> Printf.printf "%d: %s\n" k v) t;
  (* Now let's mess around... *)
  let f k v =
    Printf.printf "%d: %s\n" k v;
    if k = 2
    then Hashtbl.remove t 3
  in
    Hashtbl.iter f t;;

prints three lines resulting from the initial iter but only two lines
resulting from the second iter yet the documentation for Hashtbl.iter states
"Hashtbl.iter f tbl applies f to all bindings in table tbl" - the definition
of "all bindings" is left somewhat fuzzy (probably on purpose!). Just to
make things weirder - if you re-run the code above but change the 32 to 3 in
the first line then all three bindings are printed twice because of the
different ordering of the elements in the hash table! You can get some even
"stranger" (by which I mean unpredictable) results if you come up with a
function that adds elements and so causes the table to be resized mid-way
through the iter...

	In your code, you're only replacing the data of existing bindings so
you're likely to be fine using iter (at least in the current implementation)
as the keys won't change - it's probably worthy of a big safety comment in
your code, though. "Safer" solutions include ExtLib's Hashtbl.ExtHashtbl.map
function or the Map module. Both of these have performance hits that may be
of concern - maps are on average slower than hash tables in O'Caml and
Hashtbl.map copies the table each time (though it's not clear to me why it
bothers doing that - it wouldn't take much to alter it to do an in-place
map).

	The "problem" as such is applying functional idioms such as fold and
iter to an inherently imperative data structure...


David


  parent reply	other threads:[~2007-07-18 16:08 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-07-18 15:27 Sam Steingold
2007-07-18 16:00 ` [Caml-list] " Christopher L Conway
2007-07-18 16:01 ` Sam Steingold
2007-07-18 16:36   ` [Caml-list] " Brian Hurt
2007-07-18 16:08 ` David Allsopp [this message]
2007-07-18 16:38 ` Zheng Li
2007-07-18 20:38 ` [Caml-list] " Jon Harrop

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='003601c7c955$d4db6660$6a7ba8c0@treble' \
    --to=dra-news@metastack.com \
    --cc=caml-list@yquem.inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).