Re: Dynamic loading. Again.

caml-list - the Caml user's mailing list

* Re: Dynamic loading. Again.
@ 2000-11-28 11:14 Ken Wakita
  0 siblings, 0 replies; 13+ messages in thread
From: Ken Wakita @ 2000-11-28 11:14 UTC
  To: vsl; +Cc: caml-list

Partial applications are implemented by auxiliary functions (such as
caml_curry_n_m) that are produced in an auxiliary C file.  Extensive calls to
them under a PIC scheme may incur some performance degradation.  Under UNIX
you can use the -dstartup option to make the ocamlopt compiler leave
the auxiliary C file in the temporary directory (in my
configuration, /tmp).
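
To make the role of such helpers concrete, here is a minimal Python sketch
of what a currying helper does: it collects arguments over several calls
and invokes the real function only once all of them have arrived. The names
`curry` and `add3` are hypothetical, and this mirrors only the behaviour of
the generated helpers, not their actual implementation.

```python
def curry(func, arity):
    """Return a version of func that accepts its arguments over several calls."""
    def collect(received):
        def step(*args):
            so_far = received + args
            if len(so_far) > arity:
                raise TypeError("too many arguments")
            if len(so_far) == arity:
                return func(*so_far)      # all arguments present: real call
            return collect(so_far)        # still partial: return a new closure
        return step
    return collect(())


def add3(a, b, c):
    return a + b + c


curried = curry(add3, 3)
partial = curried(1)(2)    # two partial applications, each one extra call
result = partial(3)        # the third argument triggers the real call
```

Every partial application goes through one more call to the helper, which
is why the cost of calling the helpers matters when such calls are frequent.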

Ken

* Re: Dynamic loading. Again.
  2000-12-01 16:49           ` John Max Skaller
@ 2000-12-04 21:06             ` Gerd Stolpmann
  0 siblings, 0 replies; 13+ messages in thread
From: Gerd Stolpmann @ 2000-12-04 21:06 UTC
  To: John Max Skaller, fabrice.le_fessant; +Cc: Vitaly Lugovsky, caml-list

On Fri, 01 Dec 2000, John Max Skaller wrote:
>	Suppose I load library X. It gets mapped to
>
>	0xFF00-0000
>
>Now suppose YOU load library X. It is already in memory.
>The loader knows that. It gets mapped to
>
>	0xFF00-0000
>
>The same address for both of us.  Now I load Y, and you load Z.
>Y loads first, under X. Then Z loads, under Y.
>Or the other way around: the order doesn't matter, but the 
>address space used had better be the same. We all share
>the same physical memory at the same address (for the shared
>libraries).

This is impractical. What about libraries loading libraries, especially
libraries loading system libraries? System libraries are loaded at addresses
you cannot determine, and the loader will probably load the system library at
different addresses for different processes. So your patching approach will
fail, and you have to work around the problem. (Unless you have your own
operating system...)

The world uses a different solution, and it does not make sense to build a
parallel world with its own rules. Furthermore there are very good reasons not
to patch the text segment:

- The startup time of executables is much shorter (relocation on demand)

- The number of libraries per system is not limited by the address space.
  Say you have 1GB for shared libraries, and every library is 512k
  ==> maximum number of libraries per system = 1GB / 512k = 2048
  This is rather small for big systems, and may be much smaller still
  because of the next issue

- No artificial fragmentation of the address space:
  Unloading libraries that are no longer used fragments the address space
  of the whole system. You may get into the situation
  that you cannot load a big library because there is no free region
  of addresses that is large enough

- It is possible to load the same library several times because the read-write
  mapped sections of the library (i.e. global variables) can be mapped twice at
  different addresses

- It is not necessary to move text segments of libraries to swap files if the
  library is swapped out; the already working image of the text segment can be
  reloaded from the original library file
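
The address-space limit and the fragmentation problem from the list above
can be checked with a small Python sketch. The 1GB region and the 512k
library size are the figures from the text; the unloading pattern (every
second library) and the helper name `largest_free_run` are assumptions made
purely for illustration.

```python
REGION = 1 << 30          # 1 GB reserved system-wide for shared libraries
LIB_SIZE = 512 * 1024     # every library occupies 512 KB

max_libraries = REGION // LIB_SIZE   # upper bound when nothing is wasted


def largest_free_run(slots):
    """Length of the longest run of free (False) slots."""
    best = run = 0
    for used in slots:
        run = 0 if used else run + 1
        best = max(best, run)
    return best


# Fill the whole region, then unload every second library.
slots = [True] * max_libraries
for i in range(0, max_libraries, 2):
    slots[i] = False

free_bytes = slots.count(False) * LIB_SIZE
largest_hole = largest_free_run(slots) * LIB_SIZE
```

In this scenario half of the region is free, yet the largest contiguous
hole is a single 512k slot, so a library of 1MB could not be placed at any
fixed system-wide address.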

Of course, on the IA32 the code of shared libraries runs slower; however this
is not true for properly designed processors. For C code, the slowdown is 5% to
20% depending on the nature of the code. I do not see any good reason why a
comparable factor could not be reached for O'Caml.

Gerd
-- 
----------------------------------------------------------------------------
Gerd Stolpmann      Telefon: +49 6151 997705 (privat)
Viktoriastr. 100             
64293 Darmstadt     EMail:   gerd@gerd-stolpmann.de
Germany                     
----------------------------------------------------------------------------



* Re: Dynamic loading. Again.
  2000-12-01 14:24         ` Fabrice Le Fessant
@ 2000-12-01 16:49           ` John Max Skaller
  2000-12-04 21:06             ` Gerd Stolpmann
  0 siblings, 1 reply; 13+ messages in thread
From: John Max Skaller @ 2000-12-01 16:49 UTC
  To: fabrice.le_fessant; +Cc: Vitaly Lugovsky, caml-list

Fabrice Le Fessant wrote:

> >   The per process data lives at the same address for
> > every process, but the underlying memory is swapped for
> > every task switch (at least, on the 486 this is done).
> 
> What about libraries mapped at the same address, 

	They can't be.

> or on overlapping
> segments ? They could be loaded first by different processes, 

	Only one library can be loaded at a time.
The loader must block while it is loading a library.
[At least, it must block while allocating the address space]

>then another one would want to use both of them and would not be able to
> map both of them !!! You MUST have position independent code if you
> really want your code to be SHARED.

	Suppose I load library X. It gets mapped to

	0xFF00-0000

Now suppose YOU load library X. It is already in memory.
The loader knows that. It gets mapped to

	0xFF00-0000

The same address for both of us.  Now I load Y, and you load Z.
Y loads first, under X. Then Z loads, under Y.
Or the other way around: the order doesn't matter, but the 
address space used had better be the same. We all share
the same physical memory at the same address (for the shared
libraries).

Perhaps there is a confusion here: ALL the code is built to
run at address zero. ALL the code is patched to the actual
load address. The address space for each library is allocated
so it cannot overlap, at the first load attempt. Subsequent
attempts to load the library simply refer to it at the 
address at which it is already loaded. Everyone maps the code
to the same address it is actually loaded at! This means
your code address space will consist of 'chunks' (with holes
for the loaded libraries you are not using).
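
The scheme described above can be sketched as a toy system-wide loader in
Python. The class name `GlobalLoader`, the library sizes, and the choice to
stack libraries downwards from a fixed top address are assumptions for
illustration only; the text only says that each library gets one
non-overlapping range on first load and that everyone reuses it afterwards.

```python
class GlobalLoader:
    """Toy system-wide loader: one fixed address per library for all processes."""

    def __init__(self, top):
        self.floor = top     # next library is placed directly below this
        self.base = {}       # library name -> base address

    def load(self, process, name, size):
        if name not in self.base:        # first load: allocate a fresh range
            self.floor -= size
            self.base[name] = self.floor
        process.add(name)                # later loads: reuse the same address
        return self.base[name]


loader = GlobalLoader(top=0xFF100000)
me, you = set(), set()                   # libraries mapped by each process

x_me = loader.load(me, "X", 0x100000)    # first load of X
x_you = loader.load(you, "X", 0x100000)  # X is already placed
y_me = loader.load(me, "Y", 0x80000)     # Y goes under X
z_you = loader.load(you, "Z", 0x80000)   # Z goes under Y
```

Both processes see X at the same address, while the range taken by Z is a
hole in the first process and the range taken by Y is a hole in the second.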

In some sense, the library image is position independent, because
it can be loaded at any address. But this is a loader function,
not a property of the addressing modes of instructions used.

In the old days, I used to write assembler for code that was
loaded directly into memory straight off disk. This code
had to be position independent in the sense that there
was NO 'loader' to patch anything: the disk image was
a memory image of the code, ready to run. So absolute
addressing was used only for reference to things like ROM
in fixed positions.

The problem is always the data, not the code. Relative jumps
and calls are easy. It's not so easy to do 'relative' data,
if there are many processes each requiring its own copy
of writable store: that required indirection through a register.
But these days, that indirection is provided by system level
hardware like VM, rather than an application register.

The reason is probably that application code cannot
EVER use real absolute addresses. Only the OS is allowed
to do that.  Instead, absolute addresses are translated
using an invisible 'register', there being no good reason
to waste an application register for the job. Why waste
bits specifying the register in the instruction set?
[I cite this argument without believing it :-]

-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
checkout Vyper http://Vyper.sourceforge.net
download Interscript http://Interscript.sourceforge.net

* Re: Dynamic loading. Again.
  2000-12-01 12:59       ` John Max Skaller
@ 2000-12-01 14:24         ` Fabrice Le Fessant
  2000-12-01 16:49           ` John Max Skaller
  0 siblings, 1 reply; 13+ messages in thread
From: Fabrice Le Fessant @ 2000-12-01 14:24 UTC
  To: John Max Skaller; +Cc: Vitaly Lugovsky, caml-list

>   The per process data lives at the same address for
> every process, but the underlying memory is swapped for 
> every task switch (at least, on the 486 this is done).

What about libraries mapped at the same address, or on overlapping
segments ? They could be loaded first by different processes, then
another one would want to use both of them and would not be able to
map both of them !!! You MUST have position independent code if you
really want your code to be SHARED.

- Fabrice

Homepage: http://pauillac.inria.fr/~lefessan

* Re: Dynamic loading. Again.
  2000-11-30 18:00     ` Chris Hecker
@ 2000-12-01 13:34       ` John Max Skaller
  0 siblings, 0 replies; 13+ messages in thread
From: John Max Skaller @ 2000-12-01 13:34 UTC
  To: Chris Hecker; +Cc: fabrice.le_fessant, Vitaly Lugovsky, caml-list

Chris Hecker wrote:
> 
> >[I think all this stinks, and is a result of using a stupid language
> >like C for systems programming .. but that's another story]
> 
> Okay, I'll bite.  Why does the current situation stink, and how would you change it?
> 
> Chris

	I'd start by eliminating global variables;
probably, I'd eliminate the stack as well and use continuation objects.
Function pointers would denote closures (not just code objects).
I'd throw out all primitive data types (except possibly bool).
main would go. Compiled interfaces with type-safe linkage.
Decent syntax. More formal standard. 
Ummm.. just about everything you can think of is wrong with C.

	I'm currently developing an application level language
(called Felix) that does some of this. [No global variables,
closures, procedural continuations, garbage collection.
Functional code still uses the machine stack for performance.
Functions cannot have side effects (but they can depend on
variables in their environment).]

	The translator control inverts procedural code so that
one writes blocking reads, but the generated code is actually
event driven. 

	There are no primitive data types (except, sort of, bool).

	Gak: the generated code is C/C++. For a low level systems programming
language, you'd need to target assembler (which is much harder).
The syntax is a bit C like, to attract the C/C++ people.

-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
checkout Vyper http://Vyper.sourceforge.net
download Interscript http://Interscript.sourceforge.net



* Re: Dynamic loading. Again.
  2000-11-30 16:08     ` Fabrice Le Fessant
@ 2000-12-01 12:59       ` John Max Skaller
  2000-12-01 14:24         ` Fabrice Le Fessant
  0 siblings, 1 reply; 13+ messages in thread
From: John Max Skaller @ 2000-12-01 12:59 UTC
  To: fabrice.le_fessant; +Cc: Vitaly Lugovsky, caml-list

Fabrice Le Fessant wrote:
 
> Well, you are right. You don't need to generate position independent
> code, but it is better if you want the code to be shared between
> processes. Otherwise, each process has its own code, and a shared
> library is not that shared ...

	Nope. The code can be shared anyhow: it is relocated
by patching at load time and everyone uses the same load address.
Only one copy of the code, used by all clients.

	The per process data lives at the same address for
every process, but the underlying memory is swapped for 
every task switch (at least, on the 486 this is done).

	So exactly the same logical memory addresses can be used
by multiple processes without interference. No application level
register or use of relative addressing is required.

	I don't know about Linux, but NT copies the per process
data lazily (that is, each process's copy of the data space
is mapped onto a read-only memory page at the same address,
and the page is copied on write to new physical memory,
which is located, for THAT process only, at the same address).
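
The lazy copying described above can be modelled in a few lines of Python.
This is only a toy model under stated assumptions: the class `Process`, the
page number `DATA_PAGE`, and the use of a `bytearray` as a physical frame
are all hypothetical, and a real kernel does this with page tables and
page faults rather than explicit checks.

```python
class Process:
    """Toy address space: virtual page number -> physical frame (a bytearray)."""

    def __init__(self, image):
        self.pages = dict(image)     # share the frames, do not copy them
        self.private = set()         # pages already copied for this process

    def read(self, page, offset):
        return self.pages[page][offset]

    def write(self, page, offset, value):
        if page not in self.private:             # first write: copy the frame
            self.pages[page] = bytearray(self.pages[page])
            self.private.add(page)
        self.pages[page][offset] = value


DATA_PAGE = 0x400   # hypothetical virtual page holding the library's globals
image = {DATA_PAGE: bytearray(16)}

a, b = Process(image), Process(image)
shared_before = a.pages[DATA_PAGE] is b.pages[DATA_PAGE]
a.write(DATA_PAGE, 0, 42)   # only process a gets a private copy
```

After the write, both processes still use the same virtual page number,
but only the writer sees the new value.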

	Summary: all application level NT and Linux-486 code is
relocatable, no application level register is required,
absolute address modes can be used freely. The big problem is, 
in fact, sharing memory between processes!

-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
checkout Vyper http://Vyper.sourceforge.net
download Interscript http://Interscript.sourceforge.net



* Re: Dynamic loading. Again.
  2000-11-30  7:02   ` John Max Skaller
  2000-11-30 16:08     ` Fabrice Le Fessant
  2000-11-30 18:00     ` Chris Hecker
@ 2000-11-30 20:55     ` Gerd Stolpmann
  2 siblings, 0 replies; 13+ messages in thread
From: Gerd Stolpmann @ 2000-11-30 20:55 UTC
  To: John Max Skaller, fabrice.le_fessant; +Cc: Vitaly Lugovsky, caml-list

On Thu, 30 Nov 2000, John Max Skaller wrote:
>Fabrice Le Fessant wrote:
>> 
>> Last year, Mark Hayden and I did some work on dynamic linking of
>> native code for Linux. It worked, with few modifications in the
>> compiler to generate relocatable code in the ELF format, but the code
>> was really big (something like twice the normal size) and really slow
>> (about twice slower). 
>
>	Do you know why it was slower??
>
>	Normally, static and load time linkage produce identical code,
>and the code doesn't have to be position independent: any code can
>be shared, and have distinct per process data at the same virtual
>address. Absolute addresses are relocated by patching once at load time.

As far as I know, ELF executables do not patch the code directly because this
would have the disadvantage that the text segments could not be shared by
several processes (it is unlikely that two processes load the same library at
the same start address). There are several sections at the beginning of the
library containing the addresses of the resolved symbols; only these sections
are patched. As far as I remember one of these special sections contains small
stubs implementing jumps and calls to other libraries, and another section is a
table of addresses of foreign variables. The text segment must be
position-independent.
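
The idea of patching only a table of addresses, and never the text itself,
can be sketched in Python. Everything here is a simplified assumption: the
two-instruction `TEXT`, the symbol names, and the addresses are made up,
and real ELF tables are filled in by the dynamic linker rather than by a
function like `make_table`.

```python
# Shared, read-only "text": instructions refer to table slots, never addresses.
TEXT = (("call", 0), ("load", 1))     # slot 0: a function, slot 1: a variable
SYMBOLS = ("printf", "errno")         # symbol wanted by each slot


def make_table(resolved):
    """Per-process table: the only thing the loader has to patch."""
    return [resolved[name] for name in SYMBOLS]


def run(text, table):
    """Resolve each instruction through the table at execution time."""
    return [(op, table[slot]) for op, slot in text]


# Two processes that happen to map the same library at different addresses.
table_a = make_table({"printf": 0x40001000, "errno": 0x40200000})
table_b = make_table({"printf": 0x50001000, "errno": 0x50200000})

trace_a = run(TEXT, table_a)
trace_b = run(TEXT, table_b)
```

The same `TEXT` object serves both processes unchanged; only the small
per-process tables differ, at the price of one extra indirection per
reference.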

>What's usually required is segmentation (splitting the code into
>executable and data segments).
>
>[I think all this stinks, and is a result of using a stupid language
>like C for systems programming .. but that's another story]

It's assembly language. 

Gerd
-- 
----------------------------------------------------------------------------
Gerd Stolpmann      Telefon: +49 6151 997705 (privat)
Viktoriastr. 100             
64293 Darmstadt     EMail:   gerd@gerd-stolpmann.de
Germany                     
----------------------------------------------------------------------------



* Re: Dynamic loading. Again.
  2000-11-30  7:02   ` John Max Skaller
  2000-11-30 16:08     ` Fabrice Le Fessant
@ 2000-11-30 18:00     ` Chris Hecker
  2000-12-01 13:34       ` John Max Skaller
  2000-11-30 20:55     ` Gerd Stolpmann
  2 siblings, 1 reply; 13+ messages in thread
From: Chris Hecker @ 2000-11-30 18:00 UTC
  To: John Max Skaller, fabrice.le_fessant; +Cc: Vitaly Lugovsky, caml-list


>[I think all this stinks, and is a result of using a stupid language
>like C for systems programming .. but that's another story]

Okay, I'll bite.  Why does the current situation stink, and how would you change it?

Chris



* Re: Dynamic loading. Again.
  2000-11-30  7:02   ` John Max Skaller
@ 2000-11-30 16:08     ` Fabrice Le Fessant
  2000-12-01 12:59       ` John Max Skaller
  2000-11-30 18:00     ` Chris Hecker
  2000-11-30 20:55     ` Gerd Stolpmann
  2 siblings, 1 reply; 13+ messages in thread
From: Fabrice Le Fessant @ 2000-11-30 16:08 UTC
  To: John Max Skaller; +Cc: Vitaly Lugovsky, caml-list

>  Fabrice Le Fessant wrote:
>  > 
>  > Last year, Mark Hayden and I did some work on dynamic linking of
>  > native code for Linux. It worked, with few modifications in the
>  > compiler to generate relocatable code in the ELF format, but the code
>  > was really big (something like twice the normal size) and really slow
>  > (about twice slower). 
>  
>  	Do you know why it was slower??
>  
>  	Normally, static and load time linkage produce identical code,
>  and the code doesn't have to be position independent: any code can
>  be shared, and have distinct per process data at the same virtual
>  address. Absolute addresses are relocated by patching once at load time.
>  What's usually required is segmentation (splitting the code into
>  executable and data segments).

Well, you are right. You don't need to generate position independent
code, but it is better if you want the code to be shared between
processes. Otherwise, each process has its own code, and a shared
library is not that shared ... 

The problem with generating position independent code is that you need
one register for the GOT pointer. On the i386 arch, this is a big
penalty. Moreover, all function calls are indirect (we did not
optimize for the case of "static" functions), and the GOT pointer must be
saved and re-computed -- if used -- at every function entry point,
since it is different for each module.

I think some optimizations should probably improve this a lot !

- Fabrice

Homepage: http://pauillac.inria.fr/~lefessan

* Re: Dynamic loading. Again.
  2000-11-29 16:42 ` Fabrice Le Fessant
@ 2000-11-30  7:02   ` John Max Skaller
  2000-11-30 16:08     ` Fabrice Le Fessant
                       ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: John Max Skaller @ 2000-11-30  7:02 UTC
  To: fabrice.le_fessant; +Cc: Vitaly Lugovsky, caml-list

Fabrice Le Fessant wrote:
> 
> Last year, Mark Hayden and I did some work on dynamic linking of
> native code for Linux. It worked, with few modifications in the
> compiler to generate relocatable code in the ELF format, but the code
> was really big (something like twice the normal size) and really slow
> (about twice slower). 

	Do you know why it was slower??

	Normally, static and load time linkage produce identical code,
and the code doesn't have to be position independent: any code can
be shared, and have distinct per process data at the same virtual
address. Absolute addresses are relocated by patching once at load time.
What's usually required is segmentation (splitting the code into
executable and data segments).

[I think all this stinks, and is a result of using a stupid language
like C for systems programming .. but that's another story]

-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
checkout Vyper http://Vyper.sourceforge.net
download Interscript http://Interscript.sourceforge.net



* Re: Dynamic loading. Again.
  2000-11-26  0:09 Vitaly Lugovsky
  2000-11-28  1:27 ` John Max Skaller
@ 2000-11-29 16:42 ` Fabrice Le Fessant
  2000-11-30  7:02   ` John Max Skaller
  1 sibling, 1 reply; 13+ messages in thread
From: Fabrice Le Fessant @ 2000-11-29 16:42 UTC
  To: Vitaly Lugovsky; +Cc: caml-list

Last year, Mark Hayden and I did some work on dynamic linking of
native code for Linux. It worked, with few modifications in the
compiler to generate relocatable code in the ELF format, but the code
was really big (something like twice the normal size) and really slow
(about twice slower). It was for ocaml 2.??, but it should probably
work with a few changes for the current version. If you want the
patch, I will try to find it in my archives. It was a few days' work,
so we did not optimize it as we should have, but it can be used as a
starting point.

If you are only interested in linking some code that is not time
critical, you can try the dynlink library which is included in the efuns
package (http://pauillac.inria.fr/efuns). It will allow you to
dynamically link BYTECODE modules inside NATIVE code programs. These
modules will run slower than native code, and even slower
than bytecode run by ocamlrun, but they will use native functions
for all functions which were included in the program, so it should
be good enough for executing powerful configuration scripts or computation
orders.

Regards,

- Fabrice

Homepage: http://pauillac.inria.fr/~lefessan

* Re: Dynamic loading. Again.
  2000-11-26  0:09 Vitaly Lugovsky
@ 2000-11-28  1:27 ` John Max Skaller
  2000-11-29 16:42 ` Fabrice Le Fessant
  1 sibling, 0 replies; 13+ messages in thread
From: John Max Skaller @ 2000-11-28  1:27 UTC
  To: Vitaly Lugovsky; +Cc: caml-list

Vitaly Lugovsky wrote:
> 
>  I think there is a reason to implement dynamic loading of native compiled
> code for Ocaml (e.g. PIC code generation for some platforms),
> as well as dynamically linked runtime code (I just managed to link
> bytecode runtime dynamically, so, it's not so impossible). It's just a
> way to kill Java. 

	My problem was different: in writing my Python interpreter,
I needed to provide emulations for a set of dynamically loadable Python
extensions. I had to link them all in statically, even though every
'module' has the same signature: when I added modules, I had to update
a string -> function table. No way I would want clients adding
extensions to have to mess with the source of the interpreter.

	Funny thing is, I can actually dynamically load and execute
_Python_ extensions under my Ocaml written interpreter! They're
written in C, and it's not so hard to write an Ocaml/C interface
that can do dynamic loading. 
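
The difference between a hand-maintained string -> function table and
loading by name can be sketched in Python. All names here (`init_math`,
`STATIC_TABLE`, `load_static`, `load_dynamic`) are hypothetical, and the
standard `importlib` module merely stands in for whatever dynamic loading
mechanism the host actually provides.

```python
import importlib


def init_math():
    """Extension compiled into the host; all extensions share this signature."""
    return {"sqrt": lambda x: x ** 0.5}


# Static linking: every extension must be listed in the host's own source.
STATIC_TABLE = {"math": init_math}


def load_static(name):
    return STATIC_TABLE[name]()          # KeyError unless the table was edited


def load_dynamic(name, entry="sqrt"):
    """Find the extension at run time; there is no table in the host to update."""
    module = importlib.import_module(name)
    return {entry: getattr(module, entry)}


static_sqrt = load_static("math")["sqrt"]
dynamic_sqrt = load_dynamic("cmath")["sqrt"]   # never mentioned in the host
```

With the static table, adding `cmath` means editing and rebuilding the
host; with loading by name, the host code stays untouched.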

-- 
John (Max) Skaller, mailto:skaller@maxtal.com.au
10/1 Toxteth Rd Glebe NSW 2037 Australia voice: 61-2-9660-0850
checkout Vyper http://Vyper.sourceforge.net
download Interscript http://Interscript.sourceforge.net



* Dynamic loading. Again.
@ 2000-11-26  0:09 Vitaly Lugovsky
  2000-11-28  1:27 ` John Max Skaller
  2000-11-29 16:42 ` Fabrice Le Fessant
  0 siblings, 2 replies; 13+ messages in thread
From: Vitaly Lugovsky @ 2000-11-26  0:09 UTC
  To: caml-list

 I think there is a reason to implement dynamic loading of native compiled
code for Ocaml (e.g. PIC code generation for some platforms),
as well as dynamically linked runtime code (I just managed to link
bytecode runtime dynamically, so, it's not so impossible). It's just a
way to kill Java. I'm using JSP->Oracle BC4J->Oracle for now, and I hate
it. Ocaml would be much better for this kind of task, but to implement a
JSP-like engine we need dynamic loading. Ocaml is pretty fast in native
code, so it will be a great argument against Java for application
servers. And, sure, it's a way for Ocaml into the enterprise. Let's beat'em all.
"Real world" must be turned to use real technologies.

 For now, I'm trying to implement the engine using bytecode only, just to
prove the concept.

--

   V.S.Lugovsky aka Mauhuur (http://ontil.ihep.su/~vsl) (UIN=45482254)
