caml-list - the Caml user's mailing list
From: Ivan Gotovchits <ivg@ieee.org>
To: frederic.fort@univ-lille.fr
Cc: caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] Dynlink plugin reevaluates modules of main program
Date: Thu, 8 Nov 2018 08:02:06 -0500
Message-ID: <CALdWJ+xjbXs5Xt8jEDu73HVnFuxUeGQz8PV8owYJZwAda-ZEfg@mail.gmail.com>
In-Reply-To: <1238750250.1711857.1541671514743.JavaMail.zimbra@univ-lille.fr>

Hi Frederic,

You're catching a glimpse of the undefined behavior that occurs when the OCaml
runtime reloads a compilation unit that is already loaded. It is a
well-known bug (MPR#4208, MPR#4229, MPR#4839, MPR#6462, MPR#6957, MPR#6950)
which is not yet fixed [1]. In the luckier, more usual case, it leads to a
segmentation fault. But sometimes it may just flip some bits, turn true
into false, and ... happy debugging. The crux of the problem is that the
runtime not only reevaluates OCaml values, but also resets the roots
table and other runtime data structures, which breaks GC invariants and
sends the GC on a rampage through your data.

That is not to say that you can't load code dynamically in OCaml. You can,
and we do this successfully in BAP, which uses plugins very extensively. It
just means that you can't trust the runtime to take care of correctness;
you need to ensure it yourself. Basically, that means
that your loader must track which compilation units are already loaded, and
your plugin must contain meta information that tells the loader which
compilation units it requires and which it provides. This requires quite a
lot of cooperation from all the parts. In BAP we solved it in the following way:

1) Developed a `bapbuild` tool, which is ocamlbuild enhanced with a
plugin [2] that knows how to build `*.plugin` files. A plugin is a zip file
under the hood with a fixed layout (called a bundle in our parlance). It
contains a MANIFEST file which includes the list of required libraries and
a list of provided units, along with some meta information and, of course,
the cmxs (and cma) for the code itself. Optionally, the bundle may include
all the dependent libraries (to make the plugin loadable in environments
where the required libraries are not provided). The `bapbuild` tool will
package all the dependencies by default, and since some libraries in the
OPAM universe do not provide `cmxs` at all, it will also build cmxs for them
and package them into the plugin. Note,

2) Developed a `bap_plugins` runtime library [3] which loads plugins,
fulfilling their dependencies and ensuring that no units are loaded twice.

3) The host program (which loads plugins) may (and will) also contain some
compilation units itself, as it will be linked from some set of compilation
units that are either local to the project or come from external libraries.
So we need some cooperation from the build system, which should tell us which
units are already loaded (alternatively we could parse the ELF structures of
the host binary, but this doesn't sound like a very portable or robust
solution). We use the `ocamlfind.dynlink` library, which enables such
cooperation by storing the list of libraries and packages that were used to
build a binary in an internal data structure. We wrote a small ocamlbuild
plugin [4] that enables this, and the rest is done by ocamlfind (which
actually generates a file and links it into the host binary).
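
To make the bookkeeping described in the steps above concrete, here is a
minimal sketch of a loader that refuses to load a unit twice. It is not the
`bap_plugins` code: the `manifest` record, the `outcome` type, and the
functions `register_units` and `load` are all hypothetical names invented for
this illustration. The `load_file` argument stands in for `Dynlink.loadfile`,
so the sketch runs without linking the dynlink library.

```ocaml
(* Hypothetical sketch; none of these names come from BAP. *)
module Units = Set.Make (String)

(* What a plugin's manifest is assumed to declare. *)
type manifest = {
  path : string;            (* path to the .cmxs file *)
  requires : string list;   (* units that must already be loaded *)
  provides : string list;   (* units contained in the plugin *)
}

type outcome =
  | Loaded
  | Skipped of string list  (* provided units that are already loaded *)
  | Missing of string list  (* required units that are not loaded *)

(* Units known to be present in the running program. *)
let loaded = ref Units.empty

let register_units units =
  loaded := List.fold_left (fun set u -> Units.add u set) !loaded units

(* Load a plugin only if it neither duplicates a loaded unit
   nor depends on a unit that is absent. *)
let load ~load_file m =
  let present u = Units.mem u !loaded in
  match List.filter present m.provides with
  | _ :: _ as dups -> Skipped dups
  | [] ->
    match List.filter (fun u -> not (present u)) m.requires with
    | _ :: _ as missing -> Missing missing
    | [] ->
      load_file m.path;            (* Dynlink.loadfile in a real program *)
      register_units m.provides;
      Loaded

let () =
  (* Units linked into the host, as reported by the build system. *)
  register_units ["A"; "B"; "C"];
  let load_file _path = () in      (* stand-in for Dynlink.loadfile *)
  (* A plugin that carries its own copies of B and C is rejected. *)
  let bundled = { path = "d.cmxs"; requires = []; provides = ["B"; "C"; "D"] } in
  (* The same plugin built to contain only D is accepted, once. *)
  let clean = { bundled with requires = ["B"; "C"]; provides = ["D"] } in
  assert (load ~load_file bundled = Skipped ["B"; "C"]);
  assert (load ~load_file clean = Loaded);
  assert (load ~load_file clean = Skipped ["D"])
```

In this sketch a plugin that provides an already loaded unit is rejected
outright; a real loader might instead load only the missing parts of a
bundle, which is a design choice the sketch does not cover.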

Everything is under the MIT license, so feel free to use it as you wish.
Apart from having the bap prefix, those tools are pretty independent and could
be generalized with all the BAP-specific parts stripped away.

Best wishes,
Ivan Gotovchits


[1]: https://github.com/ocaml/ocaml/pull/1063
[2]: https://github.com/BinaryAnalysisPlatform/bap/blob/master/lib/bap_build/bap_build.ml
[3]: https://github.com/BinaryAnalysisPlatform/bap/blob/master/lib/bap_plugins/bap_plugins.ml
[4]: https://github.com/BinaryAnalysisPlatform/bap/blob/master/myocamlbuild.ml.in#L41-L85

On Thu, Nov 8, 2018 at 5:05 AM Frédéric Fort <frederic.fort@univ-lille.fr>
wrote:

> Hello,
>
> I have an existing program and would like to allow extending its
> functionality with plugins.
> If I simplify my code structure it looks as follows:
>  - a.ml : "main module" of the program
>  - b.ml : additional definitions used in a.ml
>  - c.ml : interface for plugins (a collection of function refs)
>  - d.ml : plugin I would like to load
>
> Now, d.ml uses values defined in b.ml. Some of them are of type string ref
> and it seems that the code of b.ml is reevaluated when I call
> Dynlink.loadfile "/path/to/d.cmxs" which resets them to the empty string.
>
> Is there a way to prevent this from happening ?
> Using allow_only and prohibit is not an option, since multiple plugins
> would each reevaluate C and undo each other's modifications.
>
> Yours sincerely,
> Frédéric Fort
>
> P.S.: Here follows a minimal working example.
> I compiled it with
> ocamlbuild -use-ocamlfind -lib dynlink a.native
> ocamlbuild -use-ocamlfind d.cmxs
>
> a.ml:
> open Format
>
> let _ =
>   B.str := "abc";
>   printf "%s\n" !B.str;
>   begin
>     try
>       Dynlink.loadfile "./_build/d.cmxs"
>     with Dynlink.Error err ->
>       failwith (Dynlink.error_message err) end;
>   printf "%s\n" !B.str;
>   match !C.f with
>   | Some(f) -> printf "%s\n" (f 0)
>   | None -> ()
>
> b.ml:
> let str = ref ""
>
> c.ml:
> let f : (int -> string) option ref = ref None
>
> d.ml:
> let _ =
>   C.f := Some((fun x -> !B.str^(string_of_int x)))
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> https://inbox.ocaml.org/caml-list
> Forum: https://discuss.ocaml.org/
> Bug reports: http://caml.inria.fr/bin/caml-bugs
