From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id E0915BB81 for ; Sun, 23 Oct 2005 12:21:45 +0200 (CEST) Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.126.190]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j9NALiDc008701 for ; Sun, 23 Oct 2005 12:21:45 +0200 Received: from [212.227.126.160] (helo=mrelayng.kundenserver.de) by moutng.kundenserver.de with esmtp (Exim 3.35 #1) id 1ETcyz-0002ob-00; Sun, 23 Oct 2005 12:21:41 +0200 Received: from [84.58.131.163] (helo=gate.lan.gerd-stolpmann.de) by mrelayng.kundenserver.de with asmtp (Exim 3.35 #1) id 1ETcyz-0001SH-00; Sun, 23 Oct 2005 12:21:41 +0200 Received: from flakew.lan.gerd-stolpmann.de (flakew.lan.gerd-stolpmann.de [192.168.0.32]) by gate.lan.gerd-stolpmann.de (Postfix) with ESMTP id C0288C085; Sun, 23 Oct 2005 12:21:40 +0200 (CEST) Subject: Re: [Caml-list] The Bytecode Interpreter... From: Gerd Stolpmann To: Jonathan Roewen Cc: David MENTRE , caml-list@yquem.inria.fr In-Reply-To: References: <3d13dcfc0510210427g5ea98df7s@mail.gmail.com> Content-Type: text/plain Date: Sun, 23 Oct 2005 12:21:39 +0200 Message-Id: <1130062899.27787.126.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.2.1.1 Content-Transfer-Encoding: 7bit X-Provags-ID: kundenserver.de abuse@kundenserver.de auth:a6865a839c0178d9aa0ce41878507ea2 X-Miltered: at concorde with ID 435B6438.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 bytecode:01 gerd:01 stolpmann:01 toplevel:01 bytecode:01 toplevel:01 ocaml:01 cmi:01 ocamlc:01 ocamlc:01 command-line:01 compiler:01 syntax:01 surprising:01 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.1 required=5.0 tests=FORGED_RCVD_HELO autolearn=disabled version=3.0.3 Am Sonntag, den 23.10.2005, 14:03 +1300 schrieb Jonathan Roewen: > > No difference. Toplevel expression are compiled as bytecode and then > > executed by bytecode interpreter. > > I think I'm starting to understand some of how the toplevel & bytecode > interpreter work. Some of it, I don't understand why, so will continue > from there ;-) > > In the toplevel, you can #load in an ocaml library, but you still need > all the .cmi files in order to open/use them. Why is that? Shouldn't > it be in the library? I don't know exactly the reasons, but I think this scheme is driven by the needs of ocamlc, not the toplevel. If the interfaces were packed into the .cma files, one would have to specify the .cma even for ocamlc -c. As the command-line of ocamlc is patterned after the C compiler command syntax, this would be quite surprising. Some hacks would become impossible (e.g. several .cma implementations for the same set of .cmi). I think the idea is not bad. One clear disadvantage would be that the interfaces would be stored twice (in .cma and .cmxa) if one had both a bytecode and a native code version of the library. > Also, I presume loading libraries on the fly, and having access to > -all- symbols defined in .cmi files available does -not- use the > Dynlink module. Is this correct? As far as I know, both Dynlink and the toplevel share parts of the infrastructure needed for that. Dynlink only loads the bytecode, relocates it, and resolves symbols pointing to already loaded libraries. For this purpose, Dynlink maintains a symbol table (i.e. a mapping from identifiers to addresses). When one #loads within a toploop, more or less the same happens. However, the toploop is a full compiler, and does not only have the symbol table, but also the current environment, i.e. the set of currently visible types and values with the still-needed parts of their definitions. As far as I know, #load does not modify the environment (because nothing changes - the loaded modules are already in scope when the .cmi file is in the search path). > How can a bytecode program do the same things as the toplevel in terms > of #load-ing libraries and accessing any given value as specifed in > the compiled interface files? Because it has the symbol table. It is part of every bytecode executable and .cmo file, but normally not loaded, because not needed for execution. When Dynlink is initialized, the symbol table is extracted from the bytecode executable. > Lastly, why is the Dynlink module unable to provide the ability to > access any given value in a compiled interface file when loading a > bytecode object? Or can it? The comments for the Dynlink module > specify that this -isn't- possible. However, it -can- access values > from the program the loads the .cmo -- is that correct? In principle, it can. As mentioned, it has the symbol table, so it can look up every symbol. However, the question arises how to access symbols at runtime while keeping type safety. A function like dlopen cannot be type safe and is not provided (or better, this interface is not installed - see the compiler module Symtable for direct access, it is part of Dynlink). To ensure type safety at runtime, O'Caml uses interface hashes (MD5 sums). A new module can only access already loaded modules when it knows the MD5 sums, i.e. is compiled against the same interfaces. So the reason for the restriction of only being able to "look back" from the new module to the existing modules is that this scheme opens a way to check type safety. Gerd -- ------------------------------------------------------------ Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany gerd@gerd-stolpmann.de http://www.gerd-stolpmann.de Telefon: 06151/153855 Telefax: 06151/997714 ------------------------------------------------------------