From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: weis Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id SAA26062 for caml-redistribution; Tue, 26 Jan 1999 18:43:52 +0100 (MET) Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id BAA12422 for ; Tue, 26 Jan 1999 01:15:44 +0100 (MET) Received: from miss.wu-wien.ac.at (miss.wu-wien.ac.at [137.208.107.17]) by nez-perce.inria.fr (8.8.7/8.8.7) with ESMTP id BAA05475 for ; Tue, 26 Jan 1999 01:15:42 +0100 (MET) Received: (from mottl@localhost) by miss.wu-wien.ac.at (8.9.0/8.9.0) id BAA17324 for caml-list@inria.fr; Tue, 26 Jan 1999 01:15:37 +0100 (MET) From: Markus Mottl Message-Id: <199901260015.BAA17324@miss.wu-wien.ac.at> Subject: implementation of set (standard library) To: caml-list@inria.fr (OCAML) Date: Tue, 26 Jan 1999 01:15:37 +0100 (MET) X-Mailer: ELM [version 2.4 PL21] MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit Sender: weis Hello again, I have taken a look at the implementation of sets in the standard library and thought that the "add" function could be implemented differently (possibly faster). Here the alternative code: exception Found let rec add2 x s = let rec add2' = function Empty -> Node(Empty, x, Empty, 1) | Node(l, v, r, _) as t -> let c = Ord.compare x v in if c = 0 then raise Found else if c < 0 then bal (add2' l) v r else bal l v (add2' r) in try add2' s with Found -> s I have measured the performance of this function: If about 5% (or more, probably less) of all "adds" meet an already existing element in the set, this function is faster. One should also note that the memory requirements are lower: there is no copying of paths in the case of already existing elements. This trick could also be applied to "add" in "Map". Maybe there are other functions where it can be applied. Two functions I would really like to see in "Set": exception Found let add_elt x s = let el = ref x in let rec add_elt' = function Empty -> Node(Empty, x, Empty, 1) | Node(l, v, r, _) as t -> let c = Ord.compare x v in if c = 0 then begin el := v; raise Found end else if c < 0 then bal (add_elt' l) v r else bal l v (add_elt' r) in try (add_elt' s, x) with Found -> (s, !el) let rec find x = function Empty -> raise Not_found | Node(l, v, r, _) -> let c = Ord.compare x v in if c = 0 then v else if c < 0 then find x l else find x r The "alternative" add function returns a tuple of the set and the element. Note: the element could be a (mutable) object. It could be possible that one wants to insert an object into a set and send further messages to it. But if the object already exists in the set (or better: the comparison function tells me so), it would be fine to send the messages to the object already in the set and to discard the object that would have been added. At least the "find"-function in "set" would be fine. The reason is the same as above: one could send messages to an object found in the set. The only workaround I have at the moment is to use a map instead, where key and value are the same object -> this allows me to use find. Probably not very elegant... A "last wish": an iteration function that respects the order of elements in set and also map. At the moment there is just the function "elements" (only in set), which allows to get the elements "in order". An ordered insertion function would be very powerful, because one could emulate other functions like e.g. "fold" with it (with mutable accumulators), too - but the elements would be used in the computation in the appropriate order... Best regards, Markus -- Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl