From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id 2C4B0BC69 for ; Tue, 14 Aug 2007 22:52:53 +0200 (CEST) Received: from pih-relay06.plus.net (pih-relay06.plus.net [212.159.14.133]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id l7EKqqq2027522 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NO) for ; Tue, 14 Aug 2007 22:52:52 +0200 Received: from [80.229.56.224] (helo=beast.local) by pih-relay06.plus.net with esmtp (Exim) id 1IL3Nj-0003Sa-Vc for caml-list@yquem.inria.fr; Tue, 14 Aug 2007 21:52:52 +0100 From: Jon Harrop Organization: Flying Frog Consultancy Ltd. To: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Weak hash table for attaching extra data to an object Date: Tue, 14 Aug 2007 21:44:15 +0100 User-Agent: KMail/1.9.7 References: <20070814101535.GA14485@furbychan.cocan.org> <46C18D1B.2070303@functionality.de> In-Reply-To: <46C18D1B.2070303@functionality.de> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200708142144.15414.jon@ffconsultancy.com> X-Miltered: at concorde with ID 46C21624.001 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; hash:01 node:01 nodes:01 arises:01 algebra:01 expr:01 expr:01 seq:01 seq:01 nodes:01 recursive:01 aexpr:01 aexpr:01 -rectypes:01 hashing:01 On Tuesday 14 August 2007 12:08:11 Thomas Fischbacher wrote: > Richard Jones wrote: > > I have a module which I can't edit[1]. This module, let's call it > > Document, stores a representation of an XML tree in internal node > > objects: > > > > type intnode > > > > What I want to do is -- completely outside Document -- attach my own > > extra information to these nodes. > > Ah, so the situation really is somewhat common, and not something > just I am running into occasionally. (The problem in particular > also arises occasionally when writing symbolic algebra code when one > wants to attach meta-data (e.g. on typesetting) to terms without > touching the underlying term representations in any way.) I think this is an important design choice that people should be aware of when they first write the code. If you think the product types inside your expr type will be extended: type expr = | Int of int | Var of string | Seq of expr * expr array;; with pattern matches like this: Seq(h, t) -> ... Then a good first step is to turn n-ary constructor arguments into a single record: type expr = | Int of int | Var of string | Seq of seq and seq = {h: expr; t: expr array};; Then you write your pattern matches like this: Seq{h=h; t=t} -> ... and you are free to add more fields to the record without having to update such pattern matches. You also need to replace direct construction with an explicit constructor function: let seq h t = Seq{h=h; t=t} You can also parameterize the type over metadata: type 'a expr = | Int of int | Var of string | Seq of 'a seq and 'a seq = {h: 'a expr; t: 'a expr array; meta: 'a};; Now you can insert arbitrary metadata into Seq nodes. Another possible solution (at design time) is to "untie the recursive knot" of the data structure: type 'expr aexpr = | Int of int | Var of string | Seq of 'expr * 'expr array;; This allows you to bind data structures together and, in particular, lets you inject metadata nodes between levels of exprs in the tree: type 'a expr = Meta of 'a * 'a expr aexpr;; (This can be done more efficiently using -rectypes) If your metadata are heavily used (e.g. hashing or culling information) and you are designing your own data structure then these approaches can all be useful (as well as polymorphic variants, that allow other forms of extension) because these approaches all result in fast code. If your metadata are sporadically used or, in particular, only sparsely present or you do not have access to the library then the weak hash table approach is probably the way to go. -- Dr Jon D Harrop, Flying Frog Consultancy Ltd. OCaml for Scientists http://www.ffconsultancy.com/products/ocaml_for_scientists/?e