caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* RE: [Caml-list] "super-compaction" of values
@ 2001-08-02 13:46 Dave Berry
  2001-08-02 13:28 ` Gregory Morrisett
  0 siblings, 1 reply; 11+ messages in thread
From: Dave Berry @ 2001-08-02 13:46 UTC (permalink / raw)
  To: Gregory Morrisett, Markus Mottl, John R Harrison; +Cc: caml-list

In a similar vein, way back in the 1980's Kevin Mitchell added
hash-consing to the top-level environment of the Edinburgh ML
interpreter.  This saved a great deal of space.

-----Original Message-----
From: Gregory Morrisett [mailto:jgm@cs.cornell.edu]
Sent: 02 August 2001 14:28
To: Markus Mottl; John R Harrison
Cc: caml-list@inria.fr
Subject: RE: [Caml-list] "super-compaction" of values


> > I ended up writing a naive implementation, without making stuff
> > garbage-collectible, but only using it for structures I knew were
> > persistent. To my chagrin, it turned out that Malcolm was absolutely
> > right. Space usage actually went *up*, presumably because 
> the hashing
> > datastructures were large enough to overwhelm the small amount of
> > sharing.

For the TAL type-checker, we used hash-consing to represent type terms
(which are quite big for TAL) and this had a significantly good impact
on performance.  See:

Dan Grossman and Greg Morrisett.  Scalable Certification for Typed
Assembly Language.  In the 2000 ACM SIGPLAN Workshop on Types in
Compilation, Montreal, Canada, September 2000.
www.cs.cornell.edu/talc/papers/tal_scale.ps

Similarly, Zhong Shao's Flint IL uses hash-consing for type terms
quite effectively.  I can dig up the reference if you like.  In
these two settings, one has to worry about the exponential blow
up you can get by turning a DAG into a tree.  In addition, the
hash-consing made structural equality tests quite cheap (O(1))
which is important for type checking or proof verification.  For
TAL, GC was not an issue, though we thought it might be.  For Flint,
if memory serves, they periodically flushed the table or used
some finalization/weak-pointer tricks.  

And long ago, I remember that Eric Cooper hacked the SML/NJ collector
to do hash-consing on major collections for immutable objects so as
to generically compress the heap.  If memory serves, he got fantastic
reductions in overall space (at the expense of much slower collections.)
Those were the days when SML/NJ did a lot of paging on a standard
workstation...

Another interesting data point is that Scott Nettles and his group
at Penn did some work compressing heap images and found that they
compress remarkably well, which suggests that we waste a lot of space.

-Greg
-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs  FAQ:
http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-request@inria.fr  Archives:
http://caml.inria.fr
-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs  FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-request@inria.fr  Archives: http://caml.inria.fr


^ permalink raw reply	[flat|nested] 11+ messages in thread
[parent not found: <200108011606.JAA29019@dhpc0010.pdx.intel.com>]
* Re:  [Caml-list] "super-compaction" of values
@ 2001-08-01 13:12 Damien Doligez
  0 siblings, 0 replies; 11+ messages in thread
From: Damien Doligez @ 2001-08-01 13:12 UTC (permalink / raw)
  To: caml-list

>From: Markus Mottl <markus@mail4.ai.univie.ac.at>

>would return "la" unchanged but "lb" would now share the common structure
>with list_a (= la) (also when inserted in any other order).

Look for "hash-consing" in the (LISP) literature.  But doing it for
arbitrary types looks complex.


>  * Support automatic GC? Or should the user make sure that possible
>    administrative memory overhead is prevented by telling "'a t" that
>    some value need not be remembered for sharing with further values
>    anymore.  The "automatic" version is probably less efficient and
>    much more complicated.

Module Weak is here to help you solve this problem.

-- Damien
-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs  FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-request@inria.fr  Archives: http://caml.inria.fr


^ permalink raw reply	[flat|nested] 11+ messages in thread
* [Caml-list] "super-compaction" of values
@ 2001-08-01 10:20 Markus Mottl
  2001-08-01 13:41 ` Thorsten Ohl
  2001-08-02  1:48 ` Alexander V. Voinov
  0 siblings, 2 replies; 11+ messages in thread
From: Markus Mottl @ 2001-08-01 10:20 UTC (permalink / raw)
  To: OCAML

Hello,

has anybody already attempted to implement some kind of "super-compaction"
of arbitrary values, e.g.:

  let list_a = [1, 2, 3, 4]
  let list_b =       [3, 4]

Both are in completely distinct parts of the memory. What I'd find
interesting is a module like:

  type 'a t

  val empty : unit -> 'a t
  val add : 'a -> 'a t -> 'a

which, when applied like this:

  let t = empty () in
  let la = add list_a t in
  let lb = add list_b t in
  ...

would return "la" unchanged but "lb" would now share the common structure
with list_a (= la) (also when inserted in any other order).

If this worked for arbitrary datastructures, one could exploit this to
produce very compact representations of values by maximizing the amount
of sharing between them.

One could use the "Obj"-module for this purpose. Problems, however, are:

  * Probably not useful for cyclic datastructures (too inefficient), but
    support of acyclic structures only would be ok.

  * Support automatic GC? Or should the user make sure that possible
    administrative memory overhead is prevented by telling "'a t" that
    some value need not be remembered for sharing with further values
    anymore.  The "automatic" version is probably less efficient and
    much more complicated.

An extreme form of this implementation could even have functions +
types like:

  val add : 'a -> t -> 'a

This would even allow sharing between representations of different types.

A correct and efficient implementation is probably everything else but
trivial. If anybody has already tried this crazy idea, I'd be interested
to know!

Regards,
Markus Mottl

-- 
Markus Mottl                                             markus@oefai.at
Austrian Research Institute
for Artificial Intelligence                  http://www.oefai.at/~markus
-------------------
Bug reports: http://caml.inria.fr/bin/caml-bugs  FAQ: http://caml.inria.fr/FAQ/
To unsubscribe, mail caml-list-request@inria.fr  Archives: http://caml.inria.fr


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2001-08-02 15:56 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-08-02 13:46 [Caml-list] "super-compaction" of values Dave Berry
2001-08-02 13:28 ` Gregory Morrisett
2001-08-02 15:56   ` Markus Mottl
     [not found] <200108011606.JAA29019@dhpc0010.pdx.intel.com>
2001-08-02  9:42 ` Markus Mottl
  -- strict thread matches above, loose matches on Subject: below --
2001-08-01 13:12 Damien Doligez
2001-08-01 10:20 Markus Mottl
2001-08-01 13:41 ` Thorsten Ohl
2001-08-01 14:09   ` Markus Mottl
2001-08-01 14:29     ` Thorsten Ohl
2001-08-01 15:25       ` Markus Mottl
2001-08-02  1:48 ` Alexander V. Voinov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).