From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id A4E1BBC6B for ; Wed, 27 Jun 2007 22:55:35 +0200 (CEST) Received: from smtp2f.orange.fr (smtp2f.orange.fr [80.12.242.152]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id l5RKtZhQ016787 for ; Wed, 27 Jun 2007 22:55:35 +0200 Received: from me-wanadoo.net (localhost [127.0.0.1]) by mwinf2f21.orange.fr (SMTP Server) with ESMTP id 2B0A47000092 for ; Wed, 27 Jun 2007 22:55:35 +0200 (CEST) Received: from [192.168.1.162] (ALagny-153-1-98-23.w90-3.abo.wanadoo.fr [90.3.145.23]) by mwinf2f21.orange.fr (SMTP Server) with ESMTP id E5E8F7000089; Wed, 27 Jun 2007 22:55:34 +0200 (CEST) X-ME-UUID: 20070627205534941.E5E8F7000089@mwinf2f21.orange.fr Mime-Version: 1.0 (Apple Message framework v752.2) In-Reply-To: <4682CA02.4040006@janestcapital.com> References: <200706271314.35134.jon@ffconsultancy.com> <1A1D6F56-B3DB-4552-969C-9859482175AC@lrde.epita.fr> <46826C69.5060802@functionality.de> <20070627191633.73ab011a@kerneis.info> <342AAB2F-A76C-42C8-A5F1-4139BE2FE4A9@lrde.epita.fr> <4682BF04.8090606@janestcapital.com> <4682CA02.4040006@janestcapital.com> Content-Type: text/plain; charset=ISO-8859-1; delsp=yes; format=flowed Message-Id: <25B2FF5F-7C58-4F70-B1AE-0FD7577FB521@lrde.epita.fr> Content-Transfer-Encoding: quoted-printable From: =?ISO-8859-1?Q?Qu=F4c_Peyrot?= Subject: Re: [Caml-list] Book about functional design patterns Date: Wed, 27 Jun 2007 22:55:33 +0200 To: Brian Hurt , caml-list@yquem.inria.fr X-Mailer: Apple Mail (2.752.2) X-Miltered: at concorde with ID 4682CEC7.002 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; lrde:01 ocaml:01 allocations:01 allocating:01 stack:01 long-time:01 ocaml:01 allocating:01 stack:01 mutable:01 pointer:01 node:01 nodes:01 node:01 nodes:01 On Jun 27, 2007, at 10:35 PM, Brian Hurt wrote: > Qu=F4c Peyrot wrote: > >> >> On Jun 27, 2007, at 9:48 PM, Brian Hurt wrote: >> >>> In Ocaml, allocations are relatively cheap- a cost similiar to =20 >>> that of allocating on the stack. Which is why when you tell =20 >>> long-time Ocaml programmers that you want to avoid an allocation =20= >>> cost by allocating on the stack, they tend to go "um, why?" >>> >>> Mutable data structures have their cost as well. When you assign =20= >>> a pointer into an object old enough to be in the major heap, =20 >>> Ocaml kicks off a minor collection. For small N, this can often =20= >>> make O (log N) purely functional structures faster than their O=20 >>> (1) imperative counterparts. >>> >>> No to mention the correctness advantages, plus other advantages. >> >> >> If I have a tree/map datastructure and I add an element to it, my =20= >> understanding it that, when building the new tree, all the node up =20= >> to the root are going to be replaced. Is my understanding correct? > > No. Only those elements that change need to be reallocated- the =20 > rest can be shared between the old and new tree. So, assuming the =20 > tree is a balanced tree, only the log N or so nodes between the =20 > root and the changed node will need to be reallocated, of the N =20 > nodes in the tree- so you're sharing N - log N nodes between the =20 > two trees. That's exactly what I meant with "all the nodes up to the root" (you =20 need to change a leaf, for that you need to change its parent, but =20 for that you ... and so on up to the root). >> Now let's say I want to build a tree with millions elements, and =20 >> I'm only interested in the final result, i.e. I don't need to be =20 >> able to rollback to a previous state or fancy stuff like that (I =20 >> therefore never keep a reference to the root of the old tree, =20 >> each time I add a new element). >> Is there a way of writing the building code (with a fold-type of =20 >> construct or something like that) such that the compiler will =20 >> understand it, and will generate a code equivalent to the =20 >> imperative code (i.e. we don't allocate new nodes up to the root =20 >> each time we insert an element)? >> > With a million element tree, you're only reallocating 20-30 nodes =20 > (assuming it's balanced), and sharing the other 999,970-999,980 =20 > nodes between the two trees. Allocating each new node is only =20 > going to be a small number of clock cycles, like 5-10 clocks. So =20 > the total allocation cost is only 100-200 clocks or so. I don't understand that part. Each time you add a node, log2(n) node =20 needs to be re-allocated. Hence the total number of reallocations is: Sum(ceil(log2(i)), i =3D 1..N) with N =3D 1000000, which is significant. Where is my mistake in these maths? --=20 Best Regards, Qu=F4c