caml-list - the Caml user's mailing list
* [Caml-list] Suggestion: a possible alternative to shared libraries?
@ 2004-06-14 14:23 Benjamin Geer
  2004-06-14 15:03 ` Gerd Stolpmann
From: Benjamin Geer @ 2004-06-14 14:23 UTC
  To: caml-list; +Cc: caml

The pros and cons of being able to create shared libraries in Caml have 
been abundantly discussed on this list.  I've been thinking about the 
reasons why I originally thought it would be a good idea[1], and Xavier 
Leroy's very reasonable objections[2], and then I read about this 
enhancement which Sun has added to its 1.5 JVM:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4416624

(I've included the most relevant bits of that page at the bottom of this 
email.)

Reading this, it occurred to me to ask whether a similar approach might 
provide similar benefits for Caml.

The reasoning is as follows: if many small programs use the same large 
library, each of the small programs will require a lot of memory. 
Rather than create shared libraries to get around this, the idea is to 
write the in-memory representation of libraries (ideally on an as-needed 
basis) into a cache consisting of one or more files that can be mapped 
read-only into memory; that memory could then be shared by multiple Caml 
processes that needed to use the same libraries.
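
A minimal sketch of the mechanism described above, written in Python
rather than Caml for brevity. The cache file name and the byte string
standing in for a library's in-memory representation are made up for
illustration; nothing here reflects how the Caml runtime actually lays
out libraries.

```python
import mmap
import os
import tempfile

def write_cache(path: str, image: bytes) -> None:
    """Dump a (hypothetical) in-memory library image to a cache file."""
    with open(path, "wb") as f:
        f.write(image)

def map_cache(path: str) -> mmap.mmap:
    """Map the cache file read-only so the OS can share its pages."""
    with open(path, "rb") as f:
        # The mapping remains valid after the file object is closed.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

cache_path = os.path.join(tempfile.mkdtemp(), "biglib.cache")
write_cache(cache_path, b"\x00CAMLLIB" + bytes(4096))

first = map_cache(cache_path)
second = map_cache(cache_path)  # stands in for a second process
```

Because both mappings are read-only views of the same file, the kernel
can back them with the same physical pages; an attempt to write through
either mapping is rejected.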

Perhaps Caml's existing MD5 signatures for libraries could be used to 
distinguish between an older version of a library and a newer one, so 
each Caml process would only memory-map the version that it was 
originally linked with, in order to avoid the "DLL hell" problem that 
Xavier points out.

Could someone in the Caml development team tell me whether this is a 
completely crazy idea?

Ben

[1] http://caml.inria.fr/bin/caml-bugs/feature%20wish?id=2372;user=guest;selectid=2372

[2] http://caml.inria.fr/archives/200405/msg00295.html

**************************

[Description of the Sun JVM enhancement:]

When the JRE is installed on
supported platforms using the Sun provided installer, the installer
loads a set of classes from the system jar file into a private
internal representation, and dumps that representation to a file,
called a "shared archive".... During subsequent JVM invocations, the 
shared archive is memory-mapped in, saving the cost of loading
those classes and allowing much of the JVM's metadata for these
classes to be shared among multiple JVM processes....

The footprint cost of new JVM instances has been reduced in two ways.
First, a portion of the shared archive, currently between five and six
megabytes, is mapped read-only and therefore shared among multiple JVM
processes. Previously this data was replicated in each JVM instance.
Second, less data is loaded out of the shared archive because the
metadata for unused methods remains completely untouched as opposed to
being created and processed during class loading. These savings allow
more applications to be run concurrently on the same machine.

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* Re: [Caml-list] Suggestion: a possible alternative to shared libraries?
  2004-06-14 14:23 [Caml-list] Suggestion: a possible alternative to shared libraries? Benjamin Geer
@ 2004-06-14 15:03 ` Gerd Stolpmann
  2004-06-14 15:37   ` Benjamin Geer
From: Gerd Stolpmann @ 2004-06-14 15:03 UTC
  To: Benjamin Geer; +Cc: caml-list

On Mon, 2004-06-14 at 16:23, Benjamin Geer wrote:
> The reasoning is as follows: if many small programs use the same large 
> library, each of the small programs will require a lot of memory. 
> Rather than create shared libraries to get around this, the idea is to 
> write the in-memory representation of libraries (ideally on an as-needed 
> basis) into a cache consisting of one or more files that can be mapped 
> read-only into memory; that memory could then be shared by multiple Caml 
> processes that needed to use the same libraries.

Basically, a shared library _is_ an in-memory representation of the code
to be mapped into memory. There is a bit of additional stuff that makes
it complicated; in particular, the file format is organized such that
most of the file can be mapped read-only, and only a small fraction is
mapped read-write (e.g. for address relocations).
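
The split between a shared read-only part and a small writable part can
be sketched as follows, in Python for brevity. The file layout (code
padded to a page boundary, followed by a relocation area) is a
hypothetical one chosen for the example, not the ELF format or any
format used by O'Caml.

```python
import mmap
import os
import tempfile

PAGE = mmap.ALLOCATIONGRANULARITY  # mapping offsets must be multiples of this

def build_image(path, code, relocs):
    """Write code padded to a page boundary, then the relocation area."""
    padded = code + bytes(-len(code) % PAGE)
    with open(path, "wb") as f:
        f.write(padded + relocs)
    return len(padded)

def map_image(path, code_len, reloc_len):
    """Map the code read-only and the relocation area copy-on-write."""
    with open(path, "rb") as f:
        code = mmap.mmap(f.fileno(), code_len, access=mmap.ACCESS_READ)
        data = mmap.mmap(f.fileno(), reloc_len,
                         access=mmap.ACCESS_COPY, offset=code_len)
    return code, data

image_path = os.path.join(tempfile.mkdtemp(), "lib.img")
code_len = build_image(image_path, b"CODE", bytes(4))
code, data = map_image(image_path, code_len, 4)

# "Relocate" an address: the write lands in this process's private copy.
data[0:4] = (0x1000).to_bytes(4, "little")
```

Only the pages of the relocation area that are actually written become
private to the process; the code pages stay shared and the file on disk
is left unchanged.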

So you are only re-inventing the wheel, at least if you want to do it
for native code.

I read the announcement differently. As Java is based on bytecode, and
the .class files are transformed after being loaded into memory (to
increase speed; this may or may not include JIT compiling), they are
dumping the memory after this transformation, thereby emulating a kind
of shared library for their execution model. They cannot use normal
shared libraries because they don't use the normal code representation,
which is why it is reasonable to port this wheel to the Java vehicle.

One can discuss whether one should do the same for O'Caml bytecode. In
particular, neither bytecode executables nor CMAs are currently loaded
by memory mapping. If I am not completely wrong, executables could be
loaded by mapping them read-only (at least when no endianness
transformation is necessary), as they are mainly a memory image. For
CMAs, a non-trivial change of the file format would be necessary, as the
loader performs relocations, and these relocations should be restricted
to certain memory areas to maximize memory sharing.

> Perhaps Caml's existing MD5 signatures for libraries could be used to 
> distinguish between an older version of a library and a newer one, so 
> each Caml process would only memory-map the version that it was 
> originally linked with, in order to avoid the "DLL hell" problem that 
> Xavier points out.

Much easier: Include the MD5 signature in the name of the shared
library.
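
As a rough illustration (in Python, for brevity), a digest-bearing name
could be derived like this. The `stem-digest.so` naming scheme and the
function name are assumptions made for the example, not an existing
convention.

```python
import hashlib

def versioned_name(stem: str, contents: bytes, ext: str = ".so") -> str:
    """Build a library file name that embeds the MD5 of its contents."""
    digest = hashlib.md5(contents).hexdigest()  # 32 hex characters
    return f"{stem}-{digest}{ext}"

old_name = versioned_name("libfoo", b"version 1 of the library")
new_name = versioned_name("libfoo", b"version 2 of the library")
```

Two builds with different contents get different names, so both can be
installed side by side and each program keeps resolving the exact
version it was linked against.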

However, the hell persists: You don't know which versions are still in
use.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------


* Re: [Caml-list] Suggestion: a possible alternative to shared libraries?
  2004-06-14 15:03 ` Gerd Stolpmann
@ 2004-06-14 15:37   ` Benjamin Geer
From: Benjamin Geer @ 2004-06-14 15:37 UTC
  To: Gerd Stolpmann; +Cc: caml-list

Gerd Stolpmann wrote:
> Much easier: Include the MD5 signature in the name of the shared
> library.

And use the operating system's run-time linker?  That would make for 
very long library names... but maybe that wouldn't be too terrible.

> However, the hell persists: You don't know which versions are still in
> use.

Do you mean in memory, or by installed programs that aren't necessarily 
running?  To take care of the first problem, wouldn't it be possible to 
maintain a count of the processes that currently have a library mapped 
into memory?  Could GODI take care of the second problem?
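
One possible way to keep such a count, sketched in Python for brevity,
is a small counter file next to the library, updated under an advisory
lock. The counter file and the function name are hypothetical, the
`fcntl` module is Unix-only, and the sketch does not handle a process
that crashes before it decrements the count.

```python
import fcntl
import os
import tempfile

def adjust_count(counter_path: str, delta: int) -> int:
    """Add delta to the user count stored in counter_path; return the new value."""
    fd = os.open(counter_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # serialise concurrent updaters
        raw = os.read(fd, 32).decode().strip()
        count = (int(raw) if raw else 0) + delta
        os.lseek(fd, 0, os.SEEK_SET)
        os.ftruncate(fd, 0)
        os.write(fd, str(count).encode())
        return count
    finally:
        os.close(fd)  # closing the descriptor releases the lock

counter = os.path.join(tempfile.mkdtemp(), "libfoo.users")
after_first_map = adjust_count(counter, +1)   # a process maps the library
after_second_map = adjust_count(counter, +1)  # another process maps it
after_unmap = adjust_count(counter, -1)       # one of them exits
```

A library version would be a candidate for removal only once its count
has dropped to zero and no installed program still refers to it.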

Benjamin
