From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <info@gerd-stolpmann.de>
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39])
	by yquem.inria.fr (Postfix) with ESMTP id E0915BB81
	for <caml-list@yquem.inria.fr>; Sun, 23 Oct 2005 12:21:45 +0200 (CEST)
Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.126.190])
	by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j9NALiDc008701
	for <caml-list@yquem.inria.fr>; Sun, 23 Oct 2005 12:21:45 +0200
Received: from [212.227.126.160] (helo=mrelayng.kundenserver.de)
	by moutng.kundenserver.de with esmtp (Exim 3.35 #1)
	id 1ETcyz-0002ob-00; Sun, 23 Oct 2005 12:21:41 +0200
Received: from [84.58.131.163] (helo=gate.lan.gerd-stolpmann.de)
	by mrelayng.kundenserver.de with asmtp (Exim 3.35 #1)
	id 1ETcyz-0001SH-00; Sun, 23 Oct 2005 12:21:41 +0200
Received: from flakew.lan.gerd-stolpmann.de (flakew.lan.gerd-stolpmann.de [192.168.0.32])
	by gate.lan.gerd-stolpmann.de (Postfix) with ESMTP id C0288C085;
	Sun, 23 Oct 2005 12:21:40 +0200 (CEST)
Subject: Re: [Caml-list] The Bytecode Interpreter...
From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: Jonathan Roewen <jonathan.roewen@gmail.com>
Cc: David MENTRE <david.mentre@gmail.com>, caml-list@yquem.inria.fr
In-Reply-To: <ad8cfe7e0510221803x6015eaa8w7b5f5a322d4ebdf@mail.gmail.com>
References: <ad8cfe7e0510210301n69e7a384s23e45f2870de9395@mail.gmail.com>
	 <3d13dcfc0510210427g5ea98df7s@mail.gmail.com>
	 <ad8cfe7e0510221803x6015eaa8w7b5f5a322d4ebdf@mail.gmail.com>
Content-Type: text/plain
Date: Sun, 23 Oct 2005 12:21:39 +0200
Message-Id: <1130062899.27787.126.camel@localhost.localdomain>
Mime-Version: 1.0
X-Mailer: Evolution 2.2.1.1 
Content-Transfer-Encoding: 7bit
X-Provags-ID: kundenserver.de abuse@kundenserver.de auth:a6865a839c0178d9aa0ce41878507ea2
X-Miltered: at concorde with ID 435B6438.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)!
X-Spam: no; 0.00; caml-list:01 bytecode:01 gerd:01 stolpmann:01 toplevel:01 bytecode:01 toplevel:01 ocaml:01 cmi:01 ocamlc:01 ocamlc:01 command-line:01 compiler:01 syntax:01 surprising:01 
X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.1 required=5.0 tests=FORGED_RCVD_HELO 
	autolearn=disabled version=3.0.3

Am Sonntag, den 23.10.2005, 14:03 +1300 schrieb Jonathan Roewen:
> > No difference. Toplevel expression are compiled as bytecode and then
> > executed by bytecode interpreter.
> 
> I think I'm starting to understand some of how the toplevel & bytecode
> interpreter work. Some of it, I don't understand why, so will continue
> from there ;-)
> 
> In the toplevel, you can #load in an ocaml library, but you still need
> all the .cmi files in order to open/use them. Why is that? Shouldn't
> it be in the library?

I don't know exactly the reasons, but I think this scheme is driven by
the needs of ocamlc, not the toplevel. If the interfaces were packed
into the .cma files, one would have to specify the .cma even for ocamlc
-c. As the command-line of ocamlc is patterned after the C compiler
command syntax, this would be quite surprising. Some hacks would become
impossible (e.g. several .cma implementations for the same set of .cmi).
I think the idea is not bad. One clear disadvantage would be that the
interfaces would be stored twice (in .cma and .cmxa) if one had both a
bytecode and a native code version of the library.

> Also, I presume loading libraries on the fly, and having access to
> -all- symbols defined in .cmi files available does -not- use the
> Dynlink module. Is this correct?

As far as I know, both Dynlink and the toplevel share parts of the
infrastructure needed for that. Dynlink only loads the bytecode,
relocates it, and resolves symbols pointing to already loaded libraries.
For this purpose, Dynlink maintains a symbol table (i.e. a mapping from
identifiers to addresses).

When one #loads within a toploop, more or less the same happens.
However, the toploop is a full compiler, and does not only have the
symbol table, but also the current environment, i.e. the set of
currently visible types and values with the still-needed parts of their
definitions. As far as I know, #load does not modify the environment
(because nothing changes - the loaded modules are already in scope when
the .cmi file is in the search path).

> How can a bytecode program do the same things as the toplevel in terms
> of #load-ing libraries and accessing any given value as specifed in
> the compiled interface files?

Because it has the symbol table. It is part of every bytecode executable
and .cmo file, but normally not loaded, because not needed for
execution. When Dynlink is initialized, the symbol table is extracted
from the bytecode executable.

> Lastly, why is the Dynlink module unable to provide the ability to
> access any given value in a compiled interface file when loading a
> bytecode object? Or can it? The comments for the Dynlink module
> specify that this -isn't- possible. However, it -can- access values
> from the program the loads the .cmo -- is that correct?

In principle, it can. As mentioned, it has the symbol table, so it can
look up every symbol.

However, the question arises how to access symbols at runtime while
keeping type safety. A function like dlopen cannot be type safe and is
not provided (or better, this interface is not installed - see the
compiler module Symtable for direct access, it is part of Dynlink).

To ensure type safety at runtime, O'Caml uses interface hashes (MD5
sums). A new module can only access already loaded modules when it knows
the MD5 sums, i.e. is compiled against the same interfaces. So the
reason for the restriction of only being able to "look back" from the
new module to the existing modules is that this scheme opens a way to
check type safety. 

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Telefon: 06151/153855                  Telefax: 06151/997714
------------------------------------------------------------