From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <chojin@lrde.epita.fr>
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled 
	version=3.1.3
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39])
	by yquem.inria.fr (Postfix) with ESMTP id A4E1BBC6B
	for <caml-list@yquem.inria.fr>; Wed, 27 Jun 2007 22:55:35 +0200 (CEST)
Received: from smtp2f.orange.fr (smtp2f.orange.fr [80.12.242.152])
	by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id l5RKtZhQ016787
	for <caml-list@yquem.inria.fr>; Wed, 27 Jun 2007 22:55:35 +0200
Received: from me-wanadoo.net (localhost [127.0.0.1])
	by mwinf2f21.orange.fr (SMTP Server) with ESMTP id 2B0A47000092
	for <caml-list@yquem.inria.fr>; Wed, 27 Jun 2007 22:55:35 +0200 (CEST)
Received: from [192.168.1.162] (ALagny-153-1-98-23.w90-3.abo.wanadoo.fr [90.3.145.23])
	by mwinf2f21.orange.fr (SMTP Server) with ESMTP id E5E8F7000089;
	Wed, 27 Jun 2007 22:55:34 +0200 (CEST)
X-ME-UUID: 20070627205534941.E5E8F7000089@mwinf2f21.orange.fr
Mime-Version: 1.0 (Apple Message framework v752.2)
In-Reply-To: <4682CA02.4040006@janestcapital.com>
References: <200706271314.35134.jon@ffconsultancy.com>	<1A1D6F56-B3DB-4552-969C-9859482175AC@lrde.epita.fr>	<46826C69.5060802@functionality.de>	<F2BC247F-45B1-4BD3-A006-D214D635ABC6@lrde.epita.fr>	<20070627191633.73ab011a@kerneis.info>	<342AAB2F-A76C-42C8-A5F1-4139BE2FE4A9@lrde.epita.fr>	<4682BF04.8090606@janestcapital.com> <F4E26A72-D556-4FEF-806B-157A901C8B17@lrde.epita.fr> <4682CA02.4040006@janestcapital.com>
Content-Type: text/plain; charset=ISO-8859-1; delsp=yes; format=flowed
Message-Id: <25B2FF5F-7C58-4F70-B1AE-0FD7577FB521@lrde.epita.fr>
Content-Transfer-Encoding: quoted-printable
From: =?ISO-8859-1?Q?Qu=F4c_Peyrot?= <chojin@lrde.epita.fr>
Subject: Re: [Caml-list] Book about functional design patterns
Date: Wed, 27 Jun 2007 22:55:33 +0200
To: Brian Hurt <bhurt@janestcapital.com>, caml-list@yquem.inria.fr
X-Mailer: Apple Mail (2.752.2)
X-Miltered: at concorde with ID 4682CEC7.002 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)!
X-Spam: no; 0.00; lrde:01 ocaml:01 allocations:01 allocating:01 stack:01 long-time:01 ocaml:01 allocating:01 stack:01 mutable:01 pointer:01 node:01 nodes:01 node:01 nodes:01 


On Jun 27, 2007, at 10:35 PM, Brian Hurt wrote:

> Qu=F4c Peyrot wrote:
>
>>
>> On Jun 27, 2007, at 9:48 PM, Brian Hurt wrote:
>>
>>> In Ocaml, allocations are relatively cheap- a cost similiar to =20
>>> that  of allocating on the stack.  Which is why when you tell =20
>>> long-time  Ocaml programmers that you want to avoid an allocation =20=

>>> cost by  allocating on the stack, they tend to go "um, why?"
>>>
>>> Mutable data structures have their cost as well.  When you assign =20=

>>> a  pointer into an object old enough to be in the major heap, =20
>>> Ocaml  kicks off a minor collection.  For small N, this can often =20=

>>> make O (log N) purely functional structures faster than their O=20
>>> (1)  imperative counterparts.
>>>
>>> No to mention the correctness advantages, plus other advantages.
>>
>>
>> If I have a tree/map datastructure and I add an element to it, my  =20=

>> understanding it that, when building the new tree, all the node up =20=

>> to  the root are going to be replaced. Is my understanding correct?
>
> No.  Only those elements that change need to be reallocated- the =20
> rest can be shared between the old and new tree.  So, assuming the =20
> tree is a balanced tree, only the log N or so nodes between the =20
> root and the changed node will need to be reallocated, of the N =20
> nodes in the tree- so you're sharing N - log N nodes between the =20
> two trees.

That's exactly what I meant with "all the nodes up to the root" (you =20
need to change a leaf, for that you need to change its parent, but =20
for that you ... and so on up to the root).

>> Now let's say I want to build a tree with millions elements, and =20
>> I'm  only interested in the final result, i.e. I don't need to be =20
>> able to  rollback to a previous state or fancy stuff like that (I =20
>> therefore  never keep a reference to the root of the old tree, =20
>> each time I add a  new element).
>> Is there a way of writing the building code (with a fold-type of  =20
>> construct or something like that) such that the compiler will  =20
>> understand it, and will generate a code equivalent to the =20
>> imperative  code (i.e. we don't allocate new nodes up to the root =20
>> each time we  insert an element)?
>>
> With a million element tree, you're only reallocating 20-30 nodes =20
> (assuming it's balanced), and sharing the other 999,970-999,980 =20
> nodes between the two trees.  Allocating each new node is only =20
> going to be a small number of clock cycles, like 5-10 clocks.  So =20
> the total allocation cost is only 100-200 clocks or so.

I don't understand that part. Each time you add a node, log2(n) node =20
needs to be re-allocated. Hence the total number of reallocations is:
Sum(ceil(log2(i)), i =3D 1..N) with N =3D 1000000, which is significant.
Where is my mistake in these maths?

--=20
Best Regards,
Qu=F4c