From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id CB3F6BC88 for ; Sun, 6 Feb 2005 11:22:53 +0100 (CET) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.206]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j16AMrir005310 for ; Sun, 6 Feb 2005 11:22:53 +0100 Received: by rproxy.gmail.com with SMTP id i8so411457rne for ; Sun, 06 Feb 2005 02:22:52 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:references; b=XGoyTnihET0plsHFW9zVIPH/n+FunbnHrPS50st/N0XJudOR2XD9Pan3l8zC+hjfU9lV2Wcq7lhpTDIQi3u4BsVXPZI1vVttJRaQpZ5QibXdWjs8p5eYNrdB+VClmhB85F/Rjo7kpZW6D6+3nD3tVXx8yCe2JJZpqdpANA9dEFo= Received: by 10.38.97.13 with SMTP id u13mr211478rnb; Sun, 06 Feb 2005 02:22:52 -0800 (PST) Received: by 10.38.65.58 with HTTP; Sun, 6 Feb 2005 02:22:52 -0800 (PST) Message-ID: <7f8e92aa0502060222383aac60@mail.gmail.com> Date: Sun, 6 Feb 2005 12:22:52 +0200 From: Radu Grigore Reply-To: Radu Grigore To: Jon Harrop Subject: Re: [Caml-list] The boon of static type checking Cc: caml-list@yquem.inria.fr In-Reply-To: <200502041026.56107.jon@jdh30.plus.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit References: <891bd33905020213315a2ebb18@mail.gmail.com> <877e9a170502031856175260c8@mail.gmail.com> <877e9a17050203185674680413@mail.gmail.com> <200502041026.56107.jon@jdh30.plus.com> X-Miltered: at nez-perce with ID 4205EFFD.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 wrote:01 stl:01 stl:01 recursive:01 stack:01 ocaml's:01 associative:01 hashtbl:01 hashtbl:01 hand-written:01 1975:98 structures:01 tail:01 strings:01 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr X-Spam-Status: No, score=0.0 required=5.0 tests=RCVD_BY_IP autolearn=disabled version=3.0.2 X-Spam-Level: On Fri, 4 Feb 2005 10:26:55 +0000, Jon Harrop wrote: > The code to implement a simple tree is unwieldy in C++ (e.g. 1975 LOC > in stl_tree.h and stl_map.h vs 198 LOC in map.ml). The implementation in stl_tree is not a "simple" tree. The difference in size is also because the STL implementation is more efficient. For one thing RB trees are better balanced than HB(2) trees. Another observation is that fold on maps is not tail recursive. True, the stack will probably never go above 40 calls but it is a problem if you want to accumulate large structures. Yet another problem is the missing cardinal function. I have a StringMap with about 200000: it takes about 3 minutes (!!!) to get its length using a fold. That would take something like 1 millisecond in C++. I'm a bit annoyed right now with OCaml's Map because I'm writting a tool where I need to construct big associative tables from strings to other data. An implementation with Hashtbl makes the program work in 1 minute. The problem is that the next execution phase needs to fold over the tables in increasing order of keys. Simple, no? Switch from Hashtbl to Map. Right. Now the same program doesn't finish executing in 3 hours. I would have expected a running time of about 20-30 minutes. Maybe a hand-written lexical-tree will be good enough. -- regards, radu http://rgrig.idilis.ro/