From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <skaller@users.sourceforge.net>
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39])
	by yquem.inria.fr (Postfix) with ESMTP id 5D2D3BB9A
	for <caml-list@yquem.inria.fr>; Wed,  5 Oct 2005 10:42:09 +0200 (CEST)
Received: from smtp3.adl2.internode.on.net (smtp3.adl2.internode.on.net [203.16.214.203])
	by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j958g7HH028070
	for <caml-list@yquem.inria.fr>; Wed, 5 Oct 2005 10:42:08 +0200
Received: from rosella (ppp2-148.lns1.syd7.internode.on.net [59.167.2.148])
	by smtp3.adl2.internode.on.net (8.12.9/8.12.6) with ESMTP id j958g1cv000848;
	Wed, 5 Oct 2005 18:12:02 +0930 (CST)
	(envelope-from skaller@users.sourceforge.net)
Subject: Re: Ant: Re: Ant: Re: Ant: Re: Ant: Re: [Caml-list] Avoiding
	shared data
From: skaller <skaller@users.sourceforge.net>
To: Martin Chabr <martin_chabr@yahoo.de>
Cc: caml-list@yquem.inria.fr
In-Reply-To: <20051004180926.81268.qmail@web26810.mail.ukl.yahoo.com>
References: <20051004180926.81268.qmail@web26810.mail.ukl.yahoo.com>
Content-Type: text/plain
Date: Wed, 05 Oct 2005 18:42:01 +1000
Message-Id: <1128501721.16267.93.camel@rosella>
Mime-Version: 1.0
X-Mailer: Evolution 2.2.1.1 
Content-Transfer-Encoding: 7bit
X-Miltered: at concorde with ID 434391DF.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)!
X-Spam: no; 0.00; caml-list:01 avoiding:01 mutable:01 caching:01 computed:01 overloading:01 unification:01 recursive:01 cached:01 hashtbl:01 hash:01 hashtables:01 caching:01 wrote:01 sourceforge:01 
X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled 
	version=3.0.3

On Tue, 2005-10-04 at 20:09 +0200, Martin Chabr wrote:

> > Mutable shared data is very useful: caching is
> > an obvious example where this is precisely what you
> > want.

> This sounds very interesting. Can you say a little
> more about it please?

Well others know more about techniques like HashConsing 
and memoisation than I do .. but basically it is sometimes
useful for a pure function which is expensive to evaluate,
to have a cache of results already computed.

For example, in Felix I have to do 'lookup' to bind
string names of functions to entries in the symbol table.

The lookup is very expensive for functions, because Felix
supports overloading, which means the argument type has
to be calculated, and unification against bound signatures
of candidates is used to select the most specialised matching
function.

the problem is that to bind the types required to do this
*also* requires lookup (and it can even be recursive).

So in the process of doing all this lookup, it would be nice
to cache the results so that the 'expensive' lookup is only
done once, and after that the cached value is used.

In fact I tried that using a Hashtbl .. but because the keys 
were complex, it turned out that polymorphic compare and hash
functions were actually *slower* than my nasty purely functional
lookup code.

However, I do cache the environments, and this save some
time: environments are created by a call like

	build_env (parent)

which follows the chain of parents to ground, gathering
all the symbols defined in each one (resulting in a list
of hashtables). This process isn't all that slow .. the
problem is that in a purely functional system with
random symbol lookups, it has to be done once for EVERY
lookup .. not just once when you do a lookup, since as
noted above doing a lookup can spawn a hundreds of other
lookups. So caching the 'environment' in which to do the
lookups was a good way to speed up the process, without
changing the functional style of the code.

-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net