caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Jacques Garrigue <garrigue@kurims.kyoto-u.ac.jp>
To: Thomas.Fischbacher@Physik.Uni-Muenchen.DE
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Hashtbl and destructive operations on keys
Date: Tue, 23 Mar 2004 09:59:14 +0900	[thread overview]
Message-ID: <20040323095914D.garrigue@kurims.kyoto-u.ac.jp> (raw)
In-Reply-To: <Pine.LNX.4.58.0403222348530.30579@seekar.cip.physik.uni-muenchen.de>

From: Thomas Fischbacher <Thomas.Fischbacher@Physik.Uni-Muenchen.DE>

> What I rather want to do is roughly along the following lines:
> 
> 
> let nonconsing_accum f_combine f_copy ht h_key h_val =
>   if Hashtbl.mem ht h_key
>   then Hashtbl.replace ht h_key 
>       (f_combine (Hashtbl.find ht h_key) h_val)
>   else Hashtbl.add ht (f_copy h_key) h_val
> ;;
> 
> let result =
>   (
>    let ht = Hashtbl.create 10 in
>    let arr = Array.make 4 0 in
>    for i = 0 to 1000 do
>      for k = 0 to 4-1 do
>        arr.(k) <- Random.int 5;
>      done;
>      nonconsing_accum (+) Array.copy ht arr 1;
>    done;
>    Hashtbl.fold (fun k v so_far -> (k,v)::so_far) ht [];
>   )
> ;;
> 
> Former experience with CMU CL tells me that this technique of not 
> unnecessarily consing hash keys can easily lead to a performance gain of 
> a factor of 10 - depending on the application, of course.
> 
> But I have a very bad feeling about it as long as I do not have any 
> guarantee that I really can re-use the corpus of a hash key passed to 
> Hashtbl.replace.

If I understand correctly your intention, you are asking whether
Hashtbl.replace may end-up putting parts of your key inside the
hash table?
Looking at the source code for Hashtbl.replace, this may only happen
if an equivalent key was not already in the table, and since your
program only calls replace for keys already in the table, this is safe.

But you probably already know that, so what you really want is a
sentence like "the key itself is inserted in the hashtable only if
there was no element to replace, otherwise the original key is kept"
added to the documentation.
Note however that such a comment may be seen as an overspecification:
it is more confusing than useful for normal use...

Moreover, the behaviour for the Map module is different: there, the
new key is used. (This is for the add function, but it happens to have
the same semantics as Hashtbl.replace)
And this is also different from using Hashtbl.remove followed by
Hashtbl.add, which was supposed to have the same semantics as
Hashtbl.replace.

> Yes, one answer certainly is "just make the hash value a ref and work on 
> its contents". But there are indeed some crazy situations where one is 
> better off using the above technique.

I'm wondering what kind of crazy situation.
Looks to me that that using a ref would be more efficient:
  let nonconsing_accum f_combine f_copy ht h_key h_val =
    try
      let h_ref = Hashtbl.find ht h_key in
      h_ref := f_combine !h_ref h_val
    with Not_found ->
      Hashtbl.add ht (f_copy h_key) (ref h_val)
  ;;
In your approach your are accessing 3 times the hash table, where once
should be enough.

However one might argue that nonconsing_accum should be a primitive on
hashtables, as it can be implemented more efficiently by directly
accessing the internal representation.

Jacques Garrigue

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


  reply	other threads:[~2004-03-23  0:59 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-03-22 18:16 Thomas Fischbacher
2004-03-22 19:09 ` Remi Vanicat
2004-03-22 23:13   ` Thomas Fischbacher
2004-03-23  0:59     ` Jacques Garrigue [this message]
2004-03-23  9:20     ` Correnson Loïc
2004-03-23 12:12       ` Thomas Fischbacher
2004-03-23  9:45     ` Oliver Bandel
2004-03-22 23:44 Oliver Bandel
2004-03-22 23:44 Oliver Bandel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040323095914D.garrigue@kurims.kyoto-u.ac.jp \
    --to=garrigue@kurims.kyoto-u.ac.jp \
    --cc=Thomas.Fischbacher@Physik.Uni-Muenchen.DE \
    --cc=caml-list@inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).