From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id 5D2D3BB9A for ; Wed, 5 Oct 2005 10:42:09 +0200 (CEST) Received: from smtp3.adl2.internode.on.net (smtp3.adl2.internode.on.net [203.16.214.203]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j958g7HH028070 for ; Wed, 5 Oct 2005 10:42:08 +0200 Received: from rosella (ppp2-148.lns1.syd7.internode.on.net [59.167.2.148]) by smtp3.adl2.internode.on.net (8.12.9/8.12.6) with ESMTP id j958g1cv000848; Wed, 5 Oct 2005 18:12:02 +0930 (CST) (envelope-from skaller@users.sourceforge.net) Subject: Re: Ant: Re: Ant: Re: Ant: Re: Ant: Re: [Caml-list] Avoiding shared data From: skaller To: Martin Chabr Cc: caml-list@yquem.inria.fr In-Reply-To: <20051004180926.81268.qmail@web26810.mail.ukl.yahoo.com> References: <20051004180926.81268.qmail@web26810.mail.ukl.yahoo.com> Content-Type: text/plain Date: Wed, 05 Oct 2005 18:42:01 +1000 Message-Id: <1128501721.16267.93.camel@rosella> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-Miltered: at concorde with ID 434391DF.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 avoiding:01 mutable:01 caching:01 computed:01 overloading:01 unification:01 recursive:01 cached:01 hashtbl:01 hash:01 hashtables:01 caching:01 wrote:01 sourceforge:01 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.3 On Tue, 2005-10-04 at 20:09 +0200, Martin Chabr wrote: > > Mutable shared data is very useful: caching is > > an obvious example where this is precisely what you > > want. > This sounds very interesting. Can you say a little > more about it please? Well others know more about techniques like HashConsing and memoisation than I do .. but basically it is sometimes useful for a pure function which is expensive to evaluate, to have a cache of results already computed. For example, in Felix I have to do 'lookup' to bind string names of functions to entries in the symbol table. The lookup is very expensive for functions, because Felix supports overloading, which means the argument type has to be calculated, and unification against bound signatures of candidates is used to select the most specialised matching function. the problem is that to bind the types required to do this *also* requires lookup (and it can even be recursive). So in the process of doing all this lookup, it would be nice to cache the results so that the 'expensive' lookup is only done once, and after that the cached value is used. In fact I tried that using a Hashtbl .. but because the keys were complex, it turned out that polymorphic compare and hash functions were actually *slower* than my nasty purely functional lookup code. However, I do cache the environments, and this save some time: environments are created by a call like build_env (parent) which follows the chain of parents to ground, gathering all the symbols defined in each one (resulting in a list of hashtables). This process isn't all that slow .. the problem is that in a purely functional system with random symbol lookups, it has to be done once for EVERY lookup .. not just once when you do a lookup, since as noted above doing a lookup can spawn a hundreds of other lookups. So caching the 'environment' in which to do the lookups was a good way to speed up the process, without changing the functional style of the code. -- John Skaller Felix, successor to C++: http://felix.sf.net