From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id RAA19880; Thu, 2 Aug 2001 17:56:17 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id RAA19875 for ; Thu, 2 Aug 2001 17:56:16 +0200 (MET DST) Received: from fichte.ai.univie.ac.at (fichte.ai.univie.ac.at [131.130.174.156]) by nez-perce.inria.fr (8.11.1/8.10.0) with ESMTP id f72FuFr26415 for ; Thu, 2 Aug 2001 17:56:15 +0200 (MET DST) Received: from kastanie.ai.univie.ac.at (root@kastanie.ai.univie.ac.at [131.130.174.141]) by fichte.ai.univie.ac.at (8.9.3/8.9.3/Debian 8.9.3-21) with ESMTP id RAA17694; Thu, 2 Aug 2001 17:56:13 +0200 Received: (from markus@localhost) by kastanie.ai.univie.ac.at (8.9.3/8.9.3/Debian 8.9.3-21) id RAA18116; Thu, 2 Aug 2001 17:56:13 +0200 Date: Thu, 2 Aug 2001 17:56:13 +0200 From: Markus Mottl To: Gregory Morrisett Cc: OCAML Subject: Re: [Caml-list] "super-compaction" of values Message-ID: <20010802175613.A16445@kastanie.ai.univie.ac.at> References: <8E31D6933A2FE64F8AE3CC1381EEDCE70B319A@NT.kal.com> <24C06C2959910949931A2337CB264DCB052044@opus.cs.cornell.edu> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <24C06C2959910949931A2337CB264DCB052044@opus.cs.cornell.edu>; from jgm@cs.cornell.edu on Thu, Aug 02, 2001 at 09:28:04 -0400 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk On Thu, 02 Aug 2001, Gregory Morrisett wrote: > Similarly, Zhong Shao's Flint IL uses hash-consing for type terms > quite effectively. I can dig up the reference if you like. In > these two settings, one has to worry about the exponential blow > up you can get by turning a DAG into a tree. This is not a problem for me, because the trees really stay trees (as datastructure), only that OCaml will represent them as DAGs in memory after the transformation. Of course, if you want to print such a tree, you'll have to cut exponentially many trees for the paper ;) So there are really two kinds of DAGs: the explicitly constructed one that uses (integer) indices to refer to subtrees, and the "normal" tree, which uses sharing. Unfortunately, I can't use the sharing-tree representation with memory addresses as indices alone, because the GC can move things, which destroys any kind of order relation based on memory addresses to search for matching subtrees... :( > In addition, the hash-consing made structural equality tests quite cheap > (O(1)) which is important for type checking or proof verification. Well, one has to "hash-cons" a freshly generated tree at least once, but then one can find equivalent (sub)trees very fast, because they'll be pointer-equal. If they aren't, they cannot be structurally equivalent. The algorithm I have in mind shouldn't require more than O(n*log(n)) time (n being the number of nodes in the tree) for merging a fresh tree into the hash-consing datastructure. Hm, let's see... > For TAL, GC was not an issue, though we thought it might be. For Flint, > if memory serves, they periodically flushed the table or used some > finalization/weak-pointer tricks. In the first attempt I'll probably use "manual" removing of unneeded trees from the hash-consing datastructure. Maybe then I'll play some finalization tricks that remove unreachable trees automatically, only leaving subtrees shared with not yet removed trees. Thanks to all of you for your numerous answers! Regards, Markus Mottl -- Markus Mottl markus@oefai.at Austrian Research Institute for Artificial Intelligence http://www.oefai.at/~markus ------------------- Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr