live variables

caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed

* live variables
@ 2001-01-12 18:02 Charles Martin
  2001-01-15 15:22 ` Xavier Leroy
  0 siblings, 1 reply; 2+ messages in thread
From: Charles Martin @ 2001-01-12 18:02 UTC (permalink / raw)
  To: caml-list

I would like to follow up to the mailing list a series of posts on 
comp.lang.functional.  Jerome Vouillon explained that the native code 
compiler compiles

	let rec foldl f q = function
	| [] -> q
	| x :: xx -> foldl f (f q x) xx

as if it were written

	let rec foldl f q = function
	| [] -> q
	| l ->
	  let x = List.hd l in
	  let r = f q x in
	  let xx = List.tl l in
	  foldl f r xx

As Daniel Wang points out, this does not preserve the basic liveness 
properties of the original program.  As he asked on the newsgroup, is there 
a deep reason for this, or is this just a bug that has yet to be fixed?

Charles

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: live variables
  2001-01-12 18:02 live variables Charles Martin
@ 2001-01-15 15:22 ` Xavier Leroy
  0 siblings, 0 replies; 2+ messages in thread
From: Xavier Leroy @ 2001-01-15 15:22 UTC (permalink / raw)
  To: Charles Martin; +Cc: caml-list

> I would like to follow up to the mailing list a series of posts on 
> comp.lang.functional.  Jerome Vouillon explained that the native code 
> compiler compiles
> 	let rec foldl f q = function
> 	| [] -> q
> 	| x :: xx -> foldl f (f q x) xx
> as if it were written
> 	let rec foldl f q = function
> 	| [] -> q
> 	| l ->
> 	  let x = List.hd l in
> 	  let r = f q x in
> 	  let xx = List.tl l in
> 	  foldl f r xx
> As Daniel Wang points out, this does not preserve the basic liveness 
> properties of the original program.  As he asked on the newsgroup, is there 
> a deep reason for this, or is this just a bug that has yet to be fixed?

There are two notions of "liveness":
- GC liveness: when is the value of a local variable treated as a
  memory root for the garbage collector?
- Dataflow liveness: is the value of a local variable going to be used
  again later in the same activation of the same function?

For the bytecode interpreter, everything that is stored in the
evaluation stack is a GC root, i.e. live for the GC.  This roughly
means that a variable is GC live at every point in its lexical scope
(with obvious exceptions for tail-calls and for function
abstractions).  Hence, the second version of foldl above
behaves exactly like the alternative version below:

 	let rec foldl f q = function
 	| [] -> q
 	| l ->
 	  let x = List.hd l in
 	  let xx = List.tl l in
 	  let r = f q x in
 	  foldl f r xx

I.e. the variable "l" remains GC live until foldl tail-calls in both
cases.

The native-code compiler actually ties GC liveness with the result of
its dataflow liveness analysis: only local variables that have been
determined dataflow live at a GC point are treated as memory roots.
This is an optimization that may allow earlier deallocation.
For instance, "l" is not live during the call to "f" in the
third version of "foldl" above, but "l" is live during the call to "f"
in the second version.  The head of "l" as well as its head cons could
therefore be reclaimed earlier in the former case than in the latter.

But this is just an optimization, not a guaranteed property.  Indeed,
"the basic liveness properties of the original programs" are quite
hard to define independently of a particular execution model and
liveness analysis algorithm.  So, I certainly don't see this behavior
(no early deallocation of the head of l in the second form above)
as a bug; at most, it's a quality of implementation issue.

Now, why is it that "foldl" is translated to the second form above and
not the third?  This is due to one of the "let" optimizations performed by
ocamlc and ocamlopt:
        let x = e in ... x ... (* one occurence of x only *)
is replaced by
        ... e ...
(This is performed only for certain compiler-generated lets where "e"
is guaranteed to be side-effect free.  There are similar optimizations
when "x" does not occur in the body of the let, and also when "e" is
another variable "y".)  So, for "foldl", the pattern-matcher compiler
first generate something very naive with lots of intermediate lets:
        let rec foldl f q t1 =
          let t2 = t1 in
          match t2 with
            [] -> q
          | _::_ -> let t3 = hd t2 and t4 = tl t2 in
                    foldl f (f q t3) t2
which then gets transformed to:
        let rec foldl f q t1 =
          match t1 with
            [] -> q
          | _::_ -> foldl f (f q (hd t1)) (tl t1)

These optimizations on compiler-generated "let"s is a big win for the
bytecode compiler, both in terms of code size and execution speed, and
also help the native-code compiler to some extent by decreasing
register pressure and workload on the register allocator.

Unfortunately, this can conflict with early reclaimation of data
structures in some cases (for the native-code compiler).  It's really
a code size/speed vs. memory usage conflict.

Ideally, the "let" optimizations should take the life span of
variables into account, at least for the native-code compiler, but I
really don't know how to do this.

- Xavier Leroy

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2001-01-16 10:42 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2001-01-12 18:02 live variables Charles Martin
2001-01-15 15:22 ` Xavier Leroy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).