caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Structure sharing
@ 1994-03-16 17:01 John Harrison
  1994-03-16 20:26 ` Xavier Leroy
  1994-03-17  9:21 ` Chet Murthy
  0 siblings, 2 replies; 3+ messages in thread
From: John Harrison @ 1994-03-16 17:01 UTC (permalink / raw)
  To: caml-list; +Cc: John.Harrison


My research work is in theorem proving and I'm quite interested in the impact
on efficiency of structure-sharing the datatypes representing terms etc.

Has anyone tried implementing global hash consing in CAML? (That is, ensuring
that you never keep two distinct copies of any "cons cell".) Presumably this
would make the system slower in most applications (the faster EQ for structural
equality testing being overwhelmed by the slowness of constructors). But it
might make it much more usable on really small systems, which is consonant with
the CAML Light ideology.

Alternatively, how well would it work if I used the "hashtbl" library and did
everything myself for the datatypes I'm interested in? Am I making things non
GC-able by putting them in a hash table? How do the hash tables in "hashtbl"
work exactly?

Thanks,

  John Harrison (jrh@cl.cam.ac.uk)

  Hardware Verification Group
  University of Cambridge Computer Laboratory
  New Museums Site
  Pembroke Street
  Cambridge CB2 3QG
  England.




^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Structure sharing
  1994-03-16 17:01 Structure sharing John Harrison
@ 1994-03-16 20:26 ` Xavier Leroy
  1994-03-17  9:21 ` Chet Murthy
  1 sibling, 0 replies; 3+ messages in thread
From: Xavier Leroy @ 1994-03-16 20:26 UTC (permalink / raw)
  To: John Harrison; +Cc: caml-list, John.Harrison

To complement Pierre's answer:

> But [hash-consing] might make it much more usable on really small
> systems, which is consonant with the CAML Light ideology.

Generally speaking, the design of Caml Light goes a long way to get
compact code, but does not try very hard to represent data compactly.
For instance, cons cells are 3 words instead of 2. In practice, the
only system where more compact data representations would make a big
difference is the 8086 PC, with its 640k limit; as soon as, say, 2M
are available (which is the case on all Macintoshes and PCs nowadays),
the current representations seems to be unproblematic.

> Alternatively, how well would it work if I used the "hashtbl" library
> and did everything myself for the datatypes I'm interested in? Am I
> making things non GC-able by putting them in a hash table?

Yes. By programming directly in C and using special features of the
current garbage collector (finalized objects and the fact that objects
allocated in the old generation are never relocated), you could get a
more efficient implementation of hash-consing that does not prevent
garbage collection. That's definitely non-trivial.

> How do the hash tables in "hashtbl" work exactly?

The only bit of magic is the hashing primitive, which walks the data
structure representing any Caml Light value (in the same way as
generic structural equality) and computes a hash value based on the
nodes encountered. Apart from this, it's standard dynamic hashing.

- Xavier Leroy




^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Structure sharing
  1994-03-16 17:01 Structure sharing John Harrison
  1994-03-16 20:26 ` Xavier Leroy
@ 1994-03-17  9:21 ` Chet Murthy
  1 sibling, 0 replies; 3+ messages in thread
From: Chet Murthy @ 1994-03-17  9:21 UTC (permalink / raw)
  To: John Harrison; +Cc: caml-list


Hi, John,

I also did some work recently with structure-sharing in Coq.
Specifically, I wrote a caml-light generic hash-conser which walks the
in-memory representation in an unsafe manner, "re-sharing" a
datastructure.  Since I don't have type information, I can't detect
the fact that something's mutable, so this only works for purely
applicative data.

Whatever.  Anyway, I had an application where I wanted to read
something in off of disk, and then "un-share" it completely - walk it
completely, so I could hash the leaves of the datastructure into a
symbol table.  But then, well, the datastructure is unshared - I ought
to re-share it, no?  Well, hash-consing was a modest loss on a
reasonable-sized benchmark.

That is, for Constructions at least, the datastructures are simple
enough, and small enough, that hash-consing costs more than it is
worth when it is done "globally"

On the other hand, when I did it on the output of the type-checker,
and the term which I fed as input (coupled, so that the type-checker's
output will be hash-consed "onto" the term being type-checked), I got
a slight speedup.

All this to say that it probably isn't a big win either way, in
caml-light, for Coq.

--chet--




^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~1994-03-17 13:44 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1994-03-16 17:01 Structure sharing John Harrison
1994-03-16 20:26 ` Xavier Leroy
1994-03-17  9:21 ` Chet Murthy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).