caml-list - the Caml user's mailing list
* [Caml-list] [PATCH] Dynamic freeing of dynamically loaded code
From: Nuutti Kotivuori @ 2003-12-22  7:55 UTC
  To: caml-list

* NOTE * NOTE * NOTE * NOTE * NOTE * NOTE * NOTE * NOTE * NOTE * NOTE *

  This is a preliminary release of this patch and still requires a lot
  of tweaking to be decently usable. Feel free to play around with it,
  but don't expect it to work or solve your problems.

* NOTE * NOTE * NOTE * NOTE * NOTE * NOTE * NOTE * NOTE * NOTE * NOTE *

Yuletime tidings!

So, as the questions on the list on how to make dynamically loaded
code garbage collectable evolved into implementation plans, now the
implementation plans evolved into actual code.

I'll outline really briefly what the implementation consists of:

 - Staticalloc module to have static_alloc'd blocks that are freed on
 finalization.

 - Modification to byterun - on CLOSURE and CLOSUREREC instructions,
 if nvars is below zero, negate it before using and append the last
 element of current env into the created closure.

   - Explanation: making two new bytecode instructions, CLOSUREDYN and
   CLOSURERECDYN, proved to be more difficult than I assumed, so I
   went the trivial way for now.

 - Implement reify_bytecode_with_ref - do as reify_bytecode, but go
 through the codeblock and negate all nvars parameters to CLOSURE and
 CLOSUREREC and append the given ref to the closure generated.

 - Make Dynlink use all this.

 - Fix utils/consistbl.ml to only store last crc given for an
 interface name. Should not affect functionality at all, but prevents
 memory bloat. (In the process of finding this I also noticed that
 Hashtbl resizing will use a non-tail-recursive function to reallocate
 a single bucket - and if that bucket is large enough: stack
 overflow. I filed a bug report on this already.)
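
The consistbl point can be illustrated with the standard Hashtbl module: Hashtbl.add keeps every earlier binding for a key hidden underneath the newest one, while Hashtbl.replace keeps only the latest. This is a minimal sketch of that difference, not the code from utils/consistbl.ml; the helper names and the CRC strings are made up for the example.

```ocaml
(* Illustration only: this is not the code from utils/consistbl.ml. *)

(* Hashtbl.add stacks a new binding on top of the old ones. *)
let record_all tbl name crc = Hashtbl.add tbl name crc

(* Hashtbl.replace drops the previous binding for the key. *)
let record_last tbl name crc = Hashtbl.replace tbl name crc

(* Record [n] made-up CRCs for the same interface name and count how
   many bindings the table ends up holding for it. *)
let bindings_after n record =
  let tbl : (string, string) Hashtbl.t = Hashtbl.create 16 in
  for i = 1 to n do
    record tbl "Test" (Printf.sprintf "crc-%d" i)
  done;
  List.length (Hashtbl.find_all tbl "Test")

let () =
  Printf.printf "add: %d bindings, replace: %d binding\n"
    (bindings_after 1000 record_all)
    (bindings_after 1000 record_last)
```

With add, the table grows by one binding per load of the same interface; with replace it stays at one.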

What is the result now then:

 - Running Dynlink.loadfile "test.cmo";; on a file 1,000,000 times
 results in a memory image of roughly 12MB when finished. Even 50,000
 times used to make it 40MB and crash on a stack overflow. Since the
 memory image of ocamlrun at start seems to be around 4MB, a
 conservative estimate for the amount of memory taken by the loading
 would be 8MB - that is, roughly 8 bytes per load. And at least four
 bytes will necessarily go to the allocation of a new global each time
 around.

 - Performance impacts of this on either normal bytecode, or
 dynamically loaded code, were totally unnoticeable. Somebody should
 run much longer tests and prepare them better though. The impact of
 the code should be that executing CLOSURE and CLOSUREREC instructions
 is a tiny bit more expensive, and each closure created from
 dynamically loaded code takes one word more memory.
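
The negative-nvars trick and the extra word per closure can be pictured with a toy model. The sketch below is plain OCaml, not the byterun C code from the patch; the types value and closure, the function make_closure and the Code_ref constructor are all assumptions made for the illustration.

```ocaml
(* Toy model of the interpreter change; not the patch's C code. *)
type value =
  | Int of int
  | Code_ref of string  (* stands for the ref that keeps a code block alive *)

type closure = { code_offset : int; vars : value array }

(* [captured] are the values the instruction would capture, [env] is the
   environment of the closure that is currently running. A negative
   [nvars] marks dynamically loaded code: it is negated before use and
   the last slot of [env] is appended to the new closure. In this toy
   model a closure with nvars = 0 cannot be marked this way. *)
let make_closure ~nvars ~code_offset ~captured ~env =
  if Array.length captured <> abs nvars then
    invalid_arg "make_closure: wrong number of captured values";
  if nvars >= 0 then { code_offset; vars = captured }
  else
    let code_ref = env.(Array.length env - 1) in
    { code_offset; vars = Array.append captured [| code_ref |] }

let () =
  let loader_env = [| Code_ref "test.cmo" |] in
  let outer =
    make_closure ~nvars:(-2) ~code_offset:0
      ~captured:[| Int 1; Int 2 |] ~env:loader_env in
  (* A closure created while [outer] runs inherits the same reference. *)
  let inner =
    make_closure ~nvars:(-1) ~code_offset:8
      ~captured:[| Int 3 |] ~env:outer.vars in
  Printf.printf "outer: %d slots, inner: %d slots\n"
    (Array.length outer.vars) (Array.length inner.vars)
```

In this model every closure made from dynamically loaded code carries one slot more than its nvars, which corresponds to the one word of extra memory, and the code reference stays reachable as long as any such closure does.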

What are the limitations then:

 - If the module defines any toplevel functions, it cannot ever be
 freed, because the closures are referenced from the global
 table. This is true even for loadfile_private. I.e. code that defines
 functions will still not be garbage collected.

 - If the module has any literals, the literals will not be freed and
 will be reallocated on each time it is loaded.

 - The global_data table will bloat for each load and the space will
 not be reclaimed.

 - Toploop doesn't use this yet.

What next:

 - Removing the limitations as much as possible - some might be
 unavoidable, but the cost shouldn't be higher than what it is now -
 and 8 bytes per loaded file is quite acceptable in all but the
 most stringent environments.

So, in summary:

SUCCESS!

There is still a lot to do, though, and I will keep on working on the
implementation. Any review on the code anyone might wish to do would
be welcome indeed - as well as general comments on the subject.

I fear to attach the diff here as I don't know the policy of this list
on them, so I will put it online. If it's okay to post patches here, I
will do so in future revisions of this code. Also I wonder how I might
supply this patch when it's finished to the OCaml maintainers - they
will probably wish to do several things differently though - or if
just having it here on the mailing list is enough.

The patch should be applied on top of a current cvs checkout of
ocaml. Making it compile is a bit tricky though. I include hand hacked
Makefile and .depend changes in it to make it easier - yet I didn't
wish to supply a new ocamlc binary or anything. When starting to use
the code, first run 'make world' - this will break at some point
because the primitives are different - after that do 'make bootstrap',
which should complete gracefully. After that you can do whatever, like
'make clean', 'make world.opt' and 'make install' - everything should
work.

So, here's the link:

  http://www.iki.fi/naked/ocaml-dynlink-free.diff

Thanks for listening, and have a nice holiday everyone!
-- Naked

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* Re: [Caml-list] [PATCH] Dynamic freeing of dynamically loaded code
From: Richard Jones @ 2003-12-22 11:43 UTC
  Cc: caml-list

On Mon, Dec 22, 2003 at 09:55:40AM +0200, Nuutti Kotivuori wrote:
>  - If the module defines any toplevel functions, it cannot ever be
>  freed, because the closures are referenced from the global
>  table. This is true even for loadfile_private. I.e. code that defines
>  functions will still not be garbage collected.

Does this apply to:

let () = ...

code?  I assume these aren't really "toplevel functions".

More seriously, what about toplevel functions which aren't referenced
outside the code (they may even be made private using an .mli file).
I have a LOT of code which does this sort of thing:

let run r =
  let q = new cgi r in

  (* ... blah blah the CGI script ... *)

(* Register the script's run function. *)
let () =
  register_script run

The toplevel 'run' function is there, but never referenced directly
from outside the code, although of course it is called from outside
the code.

Rich.

-- 
Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
C2LIB is a library of basic Perl/STL-like types for C. Vectors, hashes,
trees, string funcs, pool allocator: http://www.annexia.org/freeware/c2lib/


* Re: [Caml-list] [PATCH] Dynamic freeing of dynamically loaded code
From: Nuutti Kotivuori @ 2003-12-22 17:04 UTC
  To: Richard Jones; +Cc: caml-list

Richard Jones wrote:
> On Mon, Dec 22, 2003 at 09:55:40AM +0200, Nuutti Kotivuori wrote:
>> - If the module defines any toplevel functions, it cannot ever be
>> freed, because the closures are referenced from the global
>> table. This is true even for loadfile_private. I.e. code that
>> defines functions will still not be garbage collected.
>
> Does this apply to:

It applies if you define a _name_ at toplevel.

> let () = ...
>
> code?  I assume these aren't really "toplevel functions".

No, they aren't.

> More seriously, what about toplevel functions which aren't
> referenced outside the code (they may even be made private using an
> .mli file).  I have a LOT of code which does this sort of thing:
>
> let run r =
>   let q = new cgi r in
>
>   (* ... blah blah the CGI script ... *)
>
> (* Register the script's run function. *)
> let () =
>   register_script run
>
> The toplevel 'run' function is there, but never referenced directly
> from outside the code, although of course it is called from outside
> the code.

I'm afraid this will define run as a global function. You would have
to define it like:

let () =
  let run r = ... in
  register_script run

or in some similar manner. So this is a real bother to work around
in common code in general.
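
A fuller, self-contained version of that shape might look like the sketch below. The registry (scripts and register_script) is a hypothetical stand-in for whatever the host program provides, and the script body is invented; the only point is that the loaded code binds no name at toplevel.

```ocaml
(* Hypothetical host-side registry. In a real setup this would live in
   the host program, not in the dynamically loaded file. *)
let scripts : (string -> string) list ref = ref []
let register_script f = scripts := f :: !scripts

(* What the loaded file would contain: everything sits inside a single
   let (), so no toplevel name is defined. *)
let () =
  let greet name = "Hello, " ^ name in
  let run r = greet r in
  register_script run

(* Host side again: call whatever has been registered. *)
let () =
  List.iter (fun f -> print_endline (f "caml-list")) !scripts
```

Here run is reachable only through the host's registry, so once the host drops its entry nothing in the global table refers to it. Whether the code is then actually reclaimed depends on the patch and its other limitations.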

But, if you don't need the code working right away, I'm rather hopeful
I can make this problem go away by modifying the Symtable
implementation to keep track of which module uses which module - and
optionally reusing old global table entries.

But in fact, handling the literals leaking seems to be a more
difficult operation - with the symtable, you can remove global table
entries when they cannot be accessed any more, and let the gc take
care of the code - but with literals, the literals would need to stick
around as long as the code does, irrespective of the symbol table - so
if we don't wish to put them all in the closure environments, the
freeing of literals would need to be tied into the freeing of the
code.

Of course this issue can be lessened quite a lot by re-using old
literals if they happen to share the same contents - but still if you
have a long literal you modify each time around, loading the code, it
will bloat if something isn't done.
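
Re-using literals with identical contents amounts to interning them. The following is a minimal sketch of the idea, assuming for simplicity that literals are strings; literal_table and intern are hypothetical names, and nothing here is taken from the patch.

```ocaml
(* Sketch only: literals are assumed to be strings here. *)
let literal_table : (string, string) Hashtbl.t = Hashtbl.create 64

(* Return the copy that is already stored if one with the same contents
   exists, otherwise store this one and return it. *)
let intern (lit : string) : string =
  match Hashtbl.find_opt literal_table lit with
  | Some existing -> existing
  | None ->
    Hashtbl.add literal_table lit lit;
    lit

let () =
  (* Two separately allocated strings with equal contents. *)
  let first = intern (String.concat "" [ "hello"; " world" ]) in
  let second = intern (String.concat "" [ "hello"; " world" ]) in
  Printf.printf "shared: %b, table size: %d\n"
    (first == second) (Hashtbl.length literal_table)
```

Reloading unchanged code then adds nothing to the table, but a literal whose contents differ on each load still adds one entry every time, which is the remaining bloat described above.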

Perhaps the literal table could use weak references to the code
references, and clean up after they've changed to Empty, but that
seems elaborate.
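
As a rough picture of that idea, the sketch below uses the standard Weak module: each table entry holds a weak pointer to a stand-in for the code reference, and a sweep drops the entries whose pointer has been emptied by the GC. The types code_ref and entry and the functions register and sweep are assumptions made for the illustration, not part of the patch.

```ocaml
(* Hypothetical stand-in for the reference to a loaded code block. *)
type code_ref = { name : string }

(* One table entry: the literals of a code block plus a weak pointer to
   the block's reference, so the table itself does not keep it alive. *)
type entry = { owner : code_ref Weak.t; literals : string list }

let table : entry list ref = ref []

let register (code : code_ref) (literals : string list) =
  let owner = Weak.create 1 in
  Weak.set owner 0 (Some code);
  table := { owner; literals } :: !table

(* Drop the entries whose code reference has been collected, i.e.
   whose weak pointer is now empty. *)
let sweep () =
  table := List.filter (fun e -> Weak.check e.owner 0) !table

let () =
  let code = { name = "test.cmo" } in
  register code [ "a long literal" ];
  Gc.full_major ();
  sweep ();
  (* [code] is still used here, so its entry has to survive the sweep. *)
  Printf.printf "%s: %d entry left after sweep\n"
    code.name (List.length !table)
```

Someone still has to call sweep at suitable times, and an entry only goes away after the GC has actually cleared the pointer, which is part of what makes the scheme elaborate.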

But, I'm just brainstorming here now.

-- Naked
