caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: John Carr <jfc@MIT.EDU>
To: Ville-Pertti Keinonen <will@exomi.com>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Function Inlining
Date: Tue, 02 Aug 2005 11:03:05 -0400	[thread overview]
Message-ID: <200508021503.j72F35hR011108@all-night-tool.mit.edu> (raw)
In-Reply-To: Your message of "Tue, 02 Aug 2005 09:44:29 +0300." <42EF164D.1010201@exomi.com>


> You're probably running into restrictions other than size that OCaml has 
> on function inlining, such as function definitions inside the function, 
> structured constants and let rec.
> 
> Getting rid of these restrictions could potentially improve OCaml code 
> generation considerably, but as far as I can tell, that would require 
> some additional features in the code generation such as sharing 
> structured constants, explicitly transforming tail calls to loops etc.

I have been working on part of this.

My 64 bit SPARC code generator produces bad assembly code when
compiling the TK library.  The failure is caused by poor code
generation in the intermediate phase of the compiler.  There is a
nested (not top level) module named Tkintf containing hundreds of
values.  The code generated to initialize Tk.Tkintf looks like:

	1. compute the hundreds of values to be stored in the Tkintf module
	2. allocate a block to hold Tkintf
	3. store the hundreds of values computed earlier into the Tkintf block
	4. store a pointer to the Tkintf block into the appropriate field of Tk

This sequence results in hundreds of live values.  Most have to be
spilled to the stack between 1 and 3, but the stack can only hold
about 256 values.  (Because ocamlopt does not use register windows
it is limited to a 2 kilobyte stack frame.)  The assembler fails
with address offset out of range errors.

The ocaml developers were aware of this sort of problem and a
different method is used to initialize top level modules, though
there is a separate inefficiency there because it isn't necessary to
compute at runtime the majority of module values which are constant,
either integer constants or pointers to statically allocated blocks.

I have been experimenting with making both constant closures and
structured constants shareable and (for top level modules) compiling
constant initializers into the data section.  So given a module like

	let x = 1 let y = (1,2,3) let rec f x = ... and g x = ...

the assembly code would look like (with tag values omitted for clarity):

	.data
L1:	# begin three value tuple
	.word	3	# integer 1
	.word	5	# integer 2
	.word	7	# integer 3
L2:	# begin block describing mutually recursive functions f and g
	.word	1	# f arity
	.word	f	# pointer to function f
	.word	1	# g arity
	.word	g	# pointer to function g
module:
	.word	3	# integer 1
	.word	L1	# pointer to (1,2,3)
	.word	L2	# pointer to closure for f
	.word	L2+8	# pointer to closure for g

Values not computable at compile time cause a .skip directive and the
module entry code computes the initial value as in the current compiler.

This solves the stack overflow problem.  I'm not convinced that
the asmcomp phase of the compiler is the right place to do this
optimization.  Perhaps it should be moved to the intermediate
phase where the top level module optimization is done now.

(This is more than just optimizing the performance of code that is
executed once.  For TK a substantial fraction of the code size is
the module entry point.  Also, a few jobs ago I cut the startup time
of a product I worked on by several seconds by optimizing the dynamic
loading process, which is similarly a one time act.)


  reply	other threads:[~2005-08-02 15:03 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-08-01 21:24 Jonathan Bryant
2005-08-02  6:44 ` [Caml-list] " Ville-Pertti Keinonen
2005-08-02 15:03   ` John Carr [this message]
2005-08-03  7:16     ` Ville-Pertti Keinonen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200508021503.j72F35hR011108@all-night-tool.mit.edu \
    --to=jfc@mit.edu \
    --cc=caml-list@inria.fr \
    --cc=will@exomi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).