caml-list - the Caml user's mailing list
* Resources on OCaml's sharing
@ 2010-03-10  9:45 Matthias Puech
  2010-03-10 17:36 ` [Caml-list] " Peter Hawkins
  0 siblings, 1 reply; 2+ messages in thread
From: Matthias Puech @ 2010-03-10  9:45 UTC
  To: caml-list

Dear Camlists,

Does anyone know of a description of the sharing mechanism inside
OCaml's heap (I guess)? I'm interested in any kind of material, from a
formal account in a paper (even one not directly related to Caml) to an
informal description and usage tips, anything that could fill in my
shameful ignorance on that matter. Particularly, I'm trying to understand:
- where architecturally it takes place in the compiler,
- how it is an approximation from the perfect case of maximal sharing
(hash-consing I guess), i.e. what's the algorithm
- when can I safely state that a = b implies a == b.
- how is it that the function below is "smarter" than List.map? What do
we gain, what do we lose?

I understand sharing is part of ML's folklore and I didn't find any
resource on it, but maybe I missed something...

Thank you all in advance,
	-m

<<

let rec list_smartmap f l = match l with
    [] -> l
  | h::tl ->
      let h' = f h and tl' = list_smartmap f tl in
        if h'==h && tl'==tl then l
        else h'::tl'

>>



* Re: [Caml-list] Resources on OCaml's sharing
  2010-03-10  9:45 Resources on OCaml's sharing Matthias Puech
@ 2010-03-10 17:36 ` Peter Hawkins
  0 siblings, 0 replies; 2+ messages in thread
From: Peter Hawkins @ 2010-03-10 17:36 UTC
  To: Matthias Puech; +Cc: caml-list

Hi...

Here are some comments for ML-like languages in general, not specific to OCaml.

On Wed, Mar 10, 2010 at 1:45 AM, Matthias Puech <puech@cs.unibo.it> wrote:
> shameful ignorance on that matter. Particularly, I'm trying to understand:
> - where architecturally it takes place in the compiler,

It doesn't. The compiler doesn't have any support for recognizing and
exploiting sharing. If a program creates sharing, then that's all well
and good, but the compiler doesn't try to find sharing that wasn't
already present. Indeed, in the presence of mutable state, as in ML,
it's difficult for a compiler to introduce sharing without changing
the semantics of the program.
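
To make the point about mutable state concrete, here is a minimal sketch
(the names "a" and "b" are only illustrative). Two references that are
structurally equal must still stay physically distinct, because merging
them would change what the program observes after an assignment:

```ocaml
(* Two separately allocated references with equal contents. *)
let a = ref 0
let b = ref 0

let () =
  assert (a = b);         (* structurally equal: both contain 0 *)
  assert (not (a == b));  (* but two distinct heap cells *)
  a := 1;
  (* Had a compiler merged a and b into one cell, b would now hold 1. *)
  assert (!b = 0)
```

So a compiler could only introduce sharing here if it could prove that
neither cell is ever mutated.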

How do we ever create sharing as a programmer? Consider the following
OCaml session:

# let x = [1;2;3];;
val x : int list = [1; 2; 3]

This code introduces sharing, by creating two lists "y" and "z" with a
pointer to a common shared tail "x":
# let y = 7::x;;
val y : int list = [7; 1; 2; 3]
# let z = 42::x;;
val z : int list = [42; 1; 2; 3]
# List.tl y == x;;
- : bool = true
# List.tl z == x;;
- : bool = true

On the other hand, this code creates two distinct yet semantically
equal tail lists:
# let y' = 7::[1;2;3];;
val y' : int list = [7; 1; 2; 3]
# let z' = 42::[1;2;3];;
val z' : int list = [42; 1; 2; 3]
# List.tl y' == x;;
- : bool = false
# List.tl z' == x;;
- : bool = false
# List.tl y' == List.tl z';;
- : bool = false


We have semantic equality, but not reference equality:
# y = y';;
- : bool = true
# z = z';;
- : bool = true
# y == y';;
- : bool = false
# z == z';;
- : bool = false


You might want to google for "purely functional data structures" or
"persistent data structures", e.g.:
http://en.wikipedia.org/wiki/Purely_functional

Chris Okasaki's writing is the standard reference, I believe.

> - how it is an approximation from the perfect case of maximal sharing
> (hash-consing I guess), i.e. what's the algorithm

See above. The only sharing is that explicitly created by the programmer.
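
If you do want maximal sharing, you have to build it yourself. As a
rough sketch only, here is string interning, probably the simplest
instance of the hash-consing idea. The names "table" and "intern" are
hypothetical, not part of any library; only Hashtbl from the standard
library is assumed:

```ocaml
(* Hypothetical helper: every canonical string handed out so far. *)
let table : (string, string) Hashtbl.t = Hashtbl.create 16

(* Return the canonical physical copy of s, registering s if it is new. *)
let intern (s : string) : string =
  match Hashtbl.find_opt table s with
  | Some canonical -> canonical
  | None -> Hashtbl.add table s s; s

let () =
  (* Two separately allocated but structurally equal strings. *)
  let a = intern (String.make 3 'x') in
  let b = intern (String.make 3 'x') in
  assert (a = b);
  assert (a == b)  (* interning maps both to the same heap object *)
```

For values that all go through "intern", a = b does imply a == b, at
the price of a table lookup on every construction. Hash-consing of
structured terms follows the same pattern with a table keyed on the
term's constructor and its already-shared children.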

> - when can I safely state that a = b implies a == b.

Pretty much never. The only way you can know that "a == b" holds is if
you know that "b" was bound to the very same value as "a" (for example
by "let b = a"), or vice versa, rather than built separately.
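
A small sketch of the cases, assuming the standard OCaml implementation,
where ints and constant constructors are immediate values rather than
heap blocks (the variable names are only illustrative):

```ocaml
let () =
  (* Immediate values are not heap-allocated, so = and == agree here. *)
  let a = 42 and b = 40 + 2 in
  assert (a = b && a == b);
  assert ([] == []);
  (* Heap values built separately are equal but not identical. *)
  let s = String.make 3 'x' and t = String.make 3 'x' in
  assert (s = t && not (s == t));
  (* Binding a second name to a value does not copy it. *)
  let u = s in
  assert (u == s)
```

Outside of immediate values and aliases like "u", it is safer not to
rely on "==" telling you anything about "=".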

> - how is it that the function below is "smarter" than List.map? What do
> we gain, what do we lose?
> <<
>
> let rec list_smartmap f l = match l with
>    [] -> l
>  | h::tl ->
>      let h' = f h and tl' = list_smartmap f tl in
>        if h'==h && tl'==tl then l
>        else h'::tl'
>


Gain: If "f" returns its argument unchanged (physically the same value)
for every element of some suffix of the list, you save some space,
because the original instance of that suffix is returned rather than a
copy.
Loss: You spend time on pointer comparisons at every element, and those
comparisons almost never succeed unless "f" behaves like the identity
function on much of the list.
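
To see the gain in practice, here is a small sketch using the function
from the original message. The list "l" and the helper "bump_one" are
made up for illustration; "bump_one" changes only the first element:

```ocaml
let rec list_smartmap f l = match l with
  | [] -> l
  | h :: tl ->
      let h' = f h and tl' = list_smartmap f tl in
      if h' == h && tl' == tl then l else h' :: tl'

let l = [1; 2; 3; 4]

(* Hypothetical f: rewrites 1 to 10 and leaves everything else alone. *)
let bump_one x = if x = 1 then 10 else x

let () =
  let r = list_smartmap bump_one l in
  assert (r = [10; 2; 3; 4]);
  assert (List.tl r == List.tl l);             (* suffix shared, not copied *)
  assert (list_smartmap (fun x -> x) l == l);  (* nothing changed: same list *)
  assert (not (List.map (fun x -> x) l == l))  (* List.map rebuilds the list *)
```

Note that for boxed elements the test "h' == h" only succeeds when "f"
returns the physically same element, not merely an equal one.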

Peter

