From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id RAA02002; Sat, 11 May 2002 17:47:28 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id RAA02177 for ; Sat, 11 May 2002 17:47:27 +0200 (MET DST) Received: from moutng0.schlund.de (moutng0.kundenserver.de [212.227.126.170]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id g4BFl9n04431 for ; Sat, 11 May 2002 17:47:15 +0200 (MET DST) Received: from [212.227.126.160] (helo=mrelayng0.kundenserver.de) by moutng0.schlund.de with esmtp (Exim 3.22 #2) id 176Z5R-0003JD-00; Sat, 11 May 2002 17:47:09 +0200 Received: from [80.129.100.158] (helo=gate.gerd-stolpmann.de) by mrelayng0.kundenserver.de with asmtp (Exim 3.22 #2) id 176Z5Q-0001FZ-00; Sat, 11 May 2002 17:47:08 +0200 Received: from ice.gerd-stolpmann.de (ice.gerd-stolpmann.de [192.168.0.13]) by gate.gerd-stolpmann.de (Postfix) with ESMTP id 83628CCCF; Sat, 11 May 2002 17:47:06 +0200 (CEST) Received: from ice (localhost [127.0.0.1]) by ice.gerd-stolpmann.de (Postfix) with ESMTP id 6F3F01AD75; Sat, 11 May 2002 17:47:05 +0200 (MEST) Date: Sat, 11 May 2002 17:47:04 +0200 From: Gerd Stolpmann To: Stefano Zacchiroli Cc: caml-list@inria.fr Subject: Re: [Caml-list] Suggestion about balanced trees in stdlib Message-ID: <20020511174704.A632@ice.gerd-stolpmann.de> References: <20020510185954.C635@ice.gerd-stolpmann.de> <20020510223110.GB31219@cs.unibo.it> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit In-Reply-To: <20020510223110.GB31219@cs.unibo.it>; from zack@cs.unibo.it on Sat, May 11, 2002 at 00:31:10 +0200 X-Mailer: Balsa 1.2.4 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk On 2002.05.11 00:31 Stefano Zacchiroli wrote: > On Fri, May 10, 2002 at 06:59:54PM +0200, Gerd Stolpmann wrote: > > Of course, Set and Map would recur to that more special module. > > > > Balanced_tree would have the representation of Map, i.e. the elements are > > pairs of keys and attached values. To emulate Set, the value () is used. > > I don't know how the compiler treat unit values wrt optimization or > such, but using an unit in each node to emulate Set seems to me > inefficient for a module of the standard library. > > I think, but I haven't thought a lot about it, that the problem can be > solved using an implementation polymorphic in the leaf type hidden to > the use that access Set and Map through functors as usual. type 'a map = Empty | Node of 'a map * key * 'a * 'a map * int type set = Empty | Node of set * elt * set * int The difference is that every map node has 6 words, and every set node consists of 5 words. My suggestion is to prefer the map representation, and define set = unit map, wasting one word per node, and making sets 20% larger. The other possibility would be to prefer the set representation, e.g. type 'a baltree = Empty | Node of 'a baltree * 'a elt * 'a baltree * int and to reduce map and set with some functor tricks to this representation. Now sets are not larger, but maps require 8 words of memory, because you have to use an additional pair 'a elt = ('a, map_elt), i.e. maps are 33% larger. I think this is the worse choice. BTW, if memory consumption matters, the current representation can be still optimized by removing the balance factor, and encoding it in the variant tag: type set = Empty | Node_0 of set * elt * set | Node_l of | Node_r of Of course, the readability of the code would suffer, but one can argue that the overall performance is more important than code beauty for the standard library. However, the point is that it is a matter of discussion what is acceptable in the stdlib, because there are a lot of views to this central library. I would prefer to pay an increased memory consumption of 20% for sets, and to get the possibility to use balanced trees directly for that costs. Maybe other people have different views on this. > Anyway the idea seems really good. > > I also suggest to add different kind of visit (i.e. not only iter) like > {pre,post,in}visit via different functions or specifing the desired > behaviour when using the functor or both. Why not export functions that access the subtrees directly: left_subtree, right_subtree, top_element. Gerd -- ---------------------------------------------------------------------------- Gerd Stolpmann Telefon: +49 6151 997705 (privat) Viktoriastr. 45 64293 Darmstadt EMail: gerd@gerd-stolpmann.de Germany ---------------------------------------------------------------------------- ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners