From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id 4EA11BC88 for ; Sun, 6 Feb 2005 23:26:26 +0100 (CET) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.197]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j16MQPaj010646 for ; Sun, 6 Feb 2005 23:26:25 +0100 Received: by rproxy.gmail.com with SMTP id i8so498151rne for ; Sun, 06 Feb 2005 14:26:25 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:references; b=brI30XeaVsBMZqcHTpzGh7SdCNe0rSaQzmf++OhgdE4qo4wluUSDGgRWDzAVQIpP3TEGvawGcwZuzUMmZjQi1Fum1UCGJwCsvowvmVOW0V5jXkjWiuVy4WTPKQVnVioKSZFA4SpqGipSmnc3ekTn6LzV3m7Qcs/NFuwpO1LnLB4= Received: by 10.38.101.30 with SMTP id y30mr18268rnb; Sun, 06 Feb 2005 14:26:25 -0800 (PST) Received: by 10.38.65.58 with HTTP; Sun, 6 Feb 2005 14:26:25 -0800 (PST) Message-ID: <7f8e92aa05020614264edc454@mail.gmail.com> Date: Mon, 7 Feb 2005 00:26:25 +0200 From: Radu Grigore Reply-To: Radu Grigore To: Jon Subject: Re: [Caml-list] The boon of static type checking Cc: caml-list@yquem.inria.fr In-Reply-To: Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit References: <891bd33905020213315a2ebb18@mail.gmail.com> <7f8e92aa0502060222383aac60@mail.gmail.com> X-Miltered: at nez-perce with ID 42069991.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 wrote:01 ocaml:01 stack:01 stack:01 recursive:01 unbalanced:01 corresponds:01 ocaml:01 unreasonable:01 stl:01 stl:01 ocaml's:01 unreasonable:01 hashtbl:01 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr X-Spam-Status: No, score=0.0 required=5.0 tests=RCVD_BY_IP autolearn=disabled version=3.0.2 X-Spam-Level: On Sun, 06 Feb 2005 09:32:19 -0800 (PST), Jon wrote: > Thus, I would consider the ability of the > (functional) ocaml Set and Map to fork lineages to be more important. By "fork lineages" you mean keeping "old" versions of the map around? I never needed that but I agree it is cute and that it is likely that I'll need it sometime soon. > > True, the > > stack will probably never go above 40 calls but it is a problem if you > > want to accumulate large structures. > > I would like to think that your stack will be big enough to fit 40 > recursive calls if you expect your computer to handle trillion-element > maps! In the worst case (most unbalanced) 40 corresponds to a few million keys, not trillions. If I haven't made a mistake the minimum number of keys in a (OCaml) Map with height h is given by the recurrence: T(h)=1+T(h-1)+T(h-3). > > Yet another problem is the missing cardinal function. > An arbitrary number of functions are "missing". You can't expect INRIA to > implement all infinity of them for you! :-) Considering the fact that the code in set.ml and map.ml looks like it is copy&pasted I guess asking for some consistency in the interface is not unreasonable. In any case copy&pasting the cardinal function wouldn't provide better performance (I think). > The STL will probably take <<1ms because the STL's size() is probably O(1) > whereas Map's fold is O(n). In contrast, forking lineages is probably O(n) > with the STL (requiring a copy) and O(1) with ocaml's Map. I haven't actually tested with STL but the implementation of size() is indeed a simple return. Do you have an example where forking lineages is useful? As I said, I don't doubt there are situations where it is useful; just that right now I'm having trouble coming up with a realistic example. > You seem to be complaining about poor performance when you know you are > using the wrong data structure. Talk about unreasonable expectations. :-) I am exploring possibilities right now. My only "complaint" is that switching from Hashtbl to Map for a dictionary of about 200000 keys decreases the performance a LOT! I would have expected a factor of 20-30 but it seems to be a lot more: probably due to memory management (as Gerd Stolpmann notices) due to the functional nature of Map. > IIRC, the exact data structure you want is already on the hump. HTH. Thanks, I'll search for it. -- regards, radu http://rgrig.idilis.ro/