From: Ivan Gotovchits
Date: Thu, 8 Nov 2018 08:02:06 -0500
To: frederic.fort@univ-lille.fr
Cc: caml-list
Subject: Re: [Caml-list] Dynlink plugin reevaluates modules of main program

Hi Frederic,

You're observing a glimpse of the undefined behavior that occurs when the OCaml runtime reloads a compilation unit that is already loaded. It is a well-known bug (MPR#4208, MPR#4229, MPR#4839, MPR#6462, MPR#6957, MPR#6950) which is not yet fixed [1]. In the luckier, typical case it leads to a segmentation fault. But sometimes it may just flip some bits, turn true into false, and ... happy debugging. The crux of the problem is that the runtime not only reevaluates OCaml values but also resets the roots table and other runtime data structures, which breaks GC invariants and sends the GC on a rampage through your data.

That is not to say that you can't load code dynamically in OCaml. You can, and we do this successfully in BAP, which uses plugins very extensively. It just means that you can't trust the runtime to take care of correctness; you need to ensure it yourself. Basically, your loader must track which compilation units are already loaded, and your plugin must contain meta information that tells the loader which compilation units it requires and which it provides. This requires quite a bit of cooperation from all the parts. In BAP we solved it in the following way:

1) Developed a `bapbuild` tool, which is ocamlbuild enhanced with a plugin [2] that knows how to build `*.plugin` files. A plugin is a zip file under the hood with a fixed layout (called a bundle in our parlance). It contains a MANIFEST file which includes the list of required libraries and the list of provided units, along with some meta information and, of course, the cmxs (and cma) for the code itself. Optionally, the bundle may include all the dependent libraries (to make the plugin loadable in environments where the required libraries are not provided). The `bapbuild` tool packages all the dependencies by default, and since some libraries in the OPAM universe do not provide a `cmxs` at all, it also builds a cmxs for them and packages them into the plugin.

2) Developed a `bap_plugins` runtime library [3] which loads plugins, fulfilling their dependencies and ensuring that no units are loaded twice.

3) The host program (which loads plugins) may (and will) also contain some compilation units, as it is linked from a set of compilation units that are either local to the project or come from external libraries. So we need some cooperation from the build system, which must tell us which units are already loaded (alternatively, we could parse the ELF structures of the host binary, but that doesn't sound like a very portable or robust solution). We use the `ocamlfind.dynlink` library, which enables such cooperation by storing the list of libraries and packages that were used to build a binary in an internal data structure. We wrote a small ocamlbuild plugin [4] that enables this, and the rest is done by ocamlfind (which actually generates a file and links it into the host binary).
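
To make the three steps above concrete, here is a minimal sketch of the bookkeeping in plain OCaml. It is not BAP's code: the `manifest` record, the line-based manifest format, and the functions `parse_manifest`, `register_host_units` and `load_plugin` are all invented for illustration, and the real MANIFEST layout and the findlib mechanism are not reproduced. The actual linking is passed in as a `load` function (for example `Dynlink.loadfile`), so the sketch itself depends only on the standard library.

```ocaml
(* Hypothetical sketch: these names and the manifest format are invented
   for illustration; they are not BAP's or findlib's actual API. *)
module Units = Set.Make (String)

(* Step 1: metadata shipped with each plugin. *)
type manifest = {
  requires : string list;  (* units that must already be present *)
  provides : string list;  (* units that the plugin's code defines *)
}

let words s = List.filter (fun w -> w <> "") (String.split_on_char ' ' s)

(* Parse lines of the form "requires: B C" or "provides: D";
   any other line is ignored. *)
let parse_manifest text =
  let add m line =
    match String.index_opt line ':' with
    | None -> m
    | Some i ->
      let value = String.sub line (i + 1) (String.length line - i - 1) in
      (match String.trim (String.sub line 0 i) with
       | "requires" -> { m with requires = m.requires @ words value }
       | "provides" -> { m with provides = m.provides @ words value }
       | _ -> m)
  in
  List.fold_left add { requires = []; provides = [] }
    (String.split_on_char '\n' text)

(* Step 3: the loader starts from the units linked into the host binary.
   Here the list is passed in by hand; in practice the build system
   would have to supply it. *)
let loaded = ref Units.empty

let register_host_units units =
  loaded := Units.union (Units.of_list units) !loaded

(* Step 2: refuse to link anything that would load a unit twice.
   [load] does the actual linking, e.g. [Dynlink.loadfile]. *)
let load_plugin ~(load : string -> unit) ~path m =
  let present u = Units.mem u !loaded in
  match List.filter (fun u -> not (present u)) m.requires,
        List.filter present m.provides with
  | (_ :: _ as missing), _ -> Error (`Missing missing)
  | [], (_ :: _ as clashes) -> Error (`Already_loaded clashes)
  | [], [] ->
    load path;
    loaded := Units.union (Units.of_list m.provides) !loaded;
    Ok ()
```

Applied to your example, and assuming that d.cmxs was built with B and C bundled into it, the manifest would list B, C and D as provided. A host that registered A, B and C would then get `Error (`Already_loaded ["B"; "C"])` instead of a silent reevaluation. A d.cmxs that contains only D, and merely requires B and C, would load normally.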

Everything is under the MIT license, so feel free to use it as you wish. Apart from the bap prefix, these tools are fairly independent and could be generalized with all the BAP-specific parts stripped away.

Best wishes,
Ivan Gotovchits

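[1]: https://github.com/ocaml/ocaml/pull/1063
[2]: https://github.com/BinaryAnalysisPlatform/bap/blob/master/lib/bap_build/bap_build.ml
[3]: https://github.com/BinaryAnalysisPlatform/bap/blob/master/lib/bap_plugins/bap_plugins.ml
[4]: https://github.com/BinaryAnalysisPlatform/bap/blob/master/myocamlbuild.ml.in#L41-L85
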
On Thu, Nov 8, 2018 at 5:05 AM Frédéric Fort <frederic.fort@univ-lille.fr> wrote:
> Hello,
>
> I have an existing program and would like to extend its
> functionality with plugins.
> If I simplify my code structure, it looks as follows:
>  - a.ml : "main module" of the program
>  - b.ml : additional definitions used in a.ml
>  - c.ml : interface for plugins (a collection of function refs)
>  - d.ml : plugin I would like to load
>
> Now, d.ml uses values defined in b.ml. Some of them are of type string ref,
> and it seems that the code of b.ml is reevaluated when I call
> Dynlink.loadfile "/path/to/d.cmxs", which resets them to the empty string.
>
> Is there a way to prevent this from happening?
> Using allow_only and prohibit is not an option, since multiple plugins
> would each reevaluate C
> and undo each other's modifications.
>
> Yours sincerely,
> Frédéric Fort
>
> P.S.: Here follows a minimal working example.
> I compiled it with
> ocamlbuild -use-ocamlfind -lib dynlink a.native
> ocamlbuild -use-ocamlfind d.cmxs
>
> a.ml:
> open Format
>
> let _ =
>   B.str := "abc";
>   printf "%s\n" !B.str;
>   begin
>     try
>       Dynlink.loadfile "./_build/d.cmxs"
>     with Dynlink.Error err ->
>       failwith (Dynlink.error_message err) end;
>   printf "%s\n" !B.str;
>   match !C.f with
>   | Some(f) -> printf "%s\n" (f 0)
>   | None -> ()
>
> b.ml:
> let str = ref ""
>
> c.ml:
> let f : (int -> string) option ref = ref None
>
> d.ml:
> let _ =
>   C.f := Some((fun x -> !B.str^(string_of_int x)))

--
Caml-list mailing list.  Subscription management and archives:
https://sympa.inria.fr/sympa/arc/caml-list
https://inbox.ocaml.org/caml-list
Forum: https://discuss.ocaml.org/
Bug reports: http://caml.inria.fr/bin/caml-bugs