From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id NAA13765; Tue, 23 Mar 2004 13:12:38 +0100 (MET) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id NAA11163 for ; Tue, 23 Mar 2004 13:12:37 +0100 (MET) Received: from mail.physik.uni-muenchen.de (mail.physik.uni-muenchen.de [192.54.42.129]) by nez-perce.inria.fr (8.12.10/8.12.10) with ESMTP id i2NCDAKW016434 for ; Tue, 23 Mar 2004 13:13:11 +0100 Received: from localhost (unknown [127.0.0.1]) by mail.physik.uni-muenchen.de (Postfix) with ESMTP id E815420156; Tue, 23 Mar 2004 12:12:35 +0000 (UTC) Received: from mail.physik.uni-muenchen.de ([127.0.0.1]) by localhost (mail.physik.uni-muenchen.de [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 10127-01-99; Tue, 23 Mar 2004 13:12:24 +0100 (CET) Received: from mailhost.cip.physik.uni-muenchen.de (kaiser.cip.physik.uni-muenchen.de [141.84.136.1]) by mail.physik.uni-muenchen.de (Postfix) with ESMTP id 266002027E; Tue, 23 Mar 2004 13:12:24 +0100 (CET) Received: from seekar.cip.physik.uni-muenchen.de (seekar.cip.physik.uni-muenchen.de [141.84.136.52]) by mailhost.cip.physik.uni-muenchen.de (Postfix) with ESMTP id D65798514F; Tue, 23 Mar 2004 13:12:23 +0100 (CET) Received: by seekar.cip.physik.uni-muenchen.de (Postfix, from userid 3092) id CE95179FD; Tue, 23 Mar 2004 13:12:23 +0100 (CET) Received: from localhost (localhost [127.0.0.1]) by seekar.cip.physik.uni-muenchen.de (Postfix) with ESMTP id CC37779FC; Tue, 23 Mar 2004 13:12:23 +0100 (CET) Date: Tue, 23 Mar 2004 13:12:23 +0100 (CET) From: Thomas Fischbacher To: =?iso-8859-1?Q?Correnson_Lo=EFc?= Cc: Ocaml Subject: Re: [Caml-list] Hashtbl and destructive operations on keys In-Reply-To: <00b901c410b8$0c1adb10$0203a8c0@hoedic> Message-ID: References: <871xnky9zk.dlv@vanicat.homelinux.org> <00b901c410b8$0c1adb10$0203a8c0@hoedic> X-BOFH: Daemons did it MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=iso-8859-1 Content-Transfer-Encoding: QUOTED-PRINTABLE X-Virus-Scanned: by amavisd-new at physik.uni-muenchen.de X-Miltered: at nez-perce by Joe's j-chkmail ("http://j-chkmail.ensmp.fr")! X-Loop: caml-list@inria.fr X-Spam: no; 0.00; physik:01 caml-list:01 hashtbl:01 hashtbl:01 low-level:01 hashing:01 hashing:01 reuse:01 ocaml's:01 hash:01 collision:01 low-level:01 fragile:01 hash:01 hashes:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk X-Status: X-Keywords: X-UID: 330 On Tue, 23 Mar 2004, Correnson Lo=EFc wrote: > Notice also that implementing your own hashtbl would not be so far > difficult. Yes, I know, and I did so in lisp before. But one thing you must not=20 forget: when doing research, your primary constraint is time. You want to= =20 get problem X solved with as little effort as possible (since everyone=20 looks at your scientific throughput - already assuming your research is of= =20 reasonable quality). Note that I am not talking about=20 compiler/language/other CS research here! Every day, you have a certain=20 amount of mental energy which you can use. And you do not want to spend=20 too much of it on low-level details like implementing your own hashing=20 functions when you have more pressing problems in your mind, like=20 devising a new and contrived algorithm to solve a new and contrived=20 problem, or like the actual background of the problem you are=20 investigating. > For your dedicated use, the code for hashing and replacing this key to th= at > value would not exceed 10 lines, > assuming you can reuse Ocaml's hashing function and implements hash > collision with a simple array and associative lists. This suggestion is not far away from "why don't you use C right away?" - I= =20 do not want to, I want to make use of available powerful tools to I do not= =20 have to spend too much time on uninteresting low-level details. > Instead, your *codes* to work around Hashtbl with or without references m= ay > never answer your question and will remain actually under specified and > fragile for your own purpose. That's precisely why I asked for additional specification. > Moreover, it is clear in your case that the > capability for ocaml hash tables to register several values for one key i= s > useless, since your always use replace and never add for existing key. This is in fact true, but then I must ask whether the "other", more usual= =20 type of hashes provided e.g. by LISP isn't the one that is more widely=20 used, and if there really is so much of an overhead in this feature,=20 why ocaml does not provide the other type of hashes? > To > finish with, your work-around code uses 3 times the hashtbl functions, he= nce > you hash the key 3 times. This is incompatible with looking for performan= ce! Before I talk long... look at the following example: let ht =3D Hashtbl.create 1000000;; for i =3D 0 to 1000000 do let s =3D String.create 5 in for k =3D 0 to 5-1 do String.set s k (Char.chr ((i lsr (k lsl 2)) land=20 15)) done; Hashtbl.add ht s i; done;; (* Linux-specific... *) let h =3D open_in "/proc/self/status" in let rec scan () =3D let line =3D input_line h in if String.sub line 0 7 =3D "VmSize:" then (close_in h; line) else scan() in Printf.printf "MEM: %s" (scan()) ;; Try it. Then substitute the Hashtbl.add s i by Hashtbl.add s (ref i).=20 You'll see that there is roughly 25% difference in RAM consumption. Now,=20 if you are working close to the limit of virtual address space, or even=20 only if this 25% makes the difference whether your program has to page out= =20 or not, this is indeed relevant. But of course, it would be terrific if there were officially=20 sanctioned variants of the hashtable functions that take a pre-computed=20 hash key as an extra arg. --=20 regards, tf@cip.physik.uni-muenchen.de (o_ Thomas Fischbacher - http://www.cip.physik.uni-muenchen.de/~tf //\ (lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y) V_/_ (if (=3D x 0) y (g g (- x 1) (* x y)))) n 1)) (Debian GNU) ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners