On Mon, 2005-07-25 at 20:20 -0500, Brian Hurt wrote:

> No- variable length arrays are needed when you're implementing other data 
> structures on top of arrays, like stacks or queues.  At which point I 
> start asking which data structure you really need- a variable length 
> array, or a stack or queue?

I need to represent what is naturally a graph of some kind:
I have a Hashtable mapping function number to code,
where the code is a list of instructions (including labels
and branches to them). The main
kind of optimisation is inlining function calls.

I am not currently using any sophisticated analysis,
although I would like to: in part this is because
the data structure I'm using isn't very good.

There is a tension between a good data structure
in the abstract (for example a graph) and one which
is good in Ocaml -- Ocaml can work with inductive
data structures like lists well because of pattern
matching, but abstract data structures are much
harder to use.

The main problem is that rewriting rules are intrinsically
mutations, which makes caching very hard. I do some of this
and it is a mess if I forget to update a cache.

An entirely functional approach is likely to be absurdly
inefficient: you simply cannot recalculate the properties
of the code after a rewrite rule is applied from
scratch every time.

What my code actually does is optimise special cases
found by pattern matching -- and other algorithms
try to coerce the code into this form. For example:


	fun f (x:int):int = {
		y := f (x - 1);
		return y
	|

This representation of a function does not contain
a direct tail call. However by subtitution of y
we can reduce the code to:

	return f (x - 1);

which is in tail form, and can be recognized by a
pattern match as a tail call. Additional logic determines
the call is a self call, so the function is rewritten as:

	fun f(x:int):int = {
		var a = x;
	loop:>
	 	a' := a - 1;
		a = a';
		goto loop;
	}

and a bit more work eliminates a' (this is entirely
non-trivial when the parameter is a tuple, recall
previous discussion of parallel assignment).

The point is that processing this requires three or four 
distinct processes: the substitution is justified by a limited
kind of linear data flow analysis which is enough to handle
unpacking an expression in to SSA form, the tail-call is
done with a single pattern match, and the tail-recursion
is handled by a check and a transformation that needs
to know the current function name, and also needs to 
modify the code at the start of the function.  In addition
you don't want to do that twice if there are TWO 
self-tail-rec calls.

Whilst there is no particular guarantee of closure
elimination, in common cases the process does seem
to generate efficient code -- testing on Ackermann's
function shows the result is faster than any other
language translator including both Ocamlopt and gcc,
probably because one less word is required on the
machine stack.

In addition, I doubt there is any single best
theory of optimisation with guarantees, at least
in 2005 :)

So given my explanation of the kinds of things
I'm doing I'd be interested to learn what kind
of data structure you think I should be using.

-- 
John Skaller <skaller at users dot sourceforge dot net>