caml-list - the Caml user's mailing list
From: Lauri Alanko <la@iki.fi>
To: caml-list@inria.fr
Subject: [Caml-list] Closure marshalling inconsistency
Date: Sun, 6 Feb 2011 18:36:06 +0200
Message-ID: <20110206163606.GA1023@melkinpaasi.cs.helsinki.fi>

Marshalling closures that refer to globals is a precarious
business.

(** a.ml **)
let r = ref 0
let f () = incr r
let print s = Printf.printf "After %s, !A.r = %d\n" s !r

(** b.ml **)
let f1 = A.f
let f2 () = A.f ()
let f3 () = incr A.r
let pickle f = Marshal.to_string f [Marshal.Closures]
let unpickle s : unit -> unit = Marshal.from_string s 0 
let s1 = pickle f1
let s2 = pickle f2
let s3 = pickle f3
let u1 = unpickle s1
let u2 = unpickle s2
let u3 = unpickle s3
let _ =
  A.print "start";
  u1 ();
  A.print "u1";
  u2 ();
  A.print "u2";
  u3 ();
  A.print "u3"


$ ocamlc -o a a.ml b.ml
$ ./a
After start, !A.r = 0
After u1, !A.r = 0
After u2, !A.r = 1
After u3, !A.r = 2

Why did u1 () not increase A.r? Because A.f is a _closure_ whose
environment consists of the variables of its local module that it
refers to, namely, the ref cell r. When A.f is marshalled, the current
value of r is marshalled as well, and then unmarshalled into a new
closure containing a reference to a new ref cell. When u1 () is
applied, only the newly created ref cell is incremented, not A.r.
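
The by-value copying described above can be seen without any closures at
all. The following is a minimal sketch; copy_via_marshal is a
hypothetical helper name, not part of the original example. It
round-trips a ref cell through Marshal and shows that the result is a
fresh cell, independent of the original:

```ocaml
(* Round-trip a value through Marshal; the result is a structural copy.
   copy_via_marshal is a hypothetical helper introduced for illustration. *)
let copy_via_marshal (x : 'a) : 'a =
  Marshal.from_string (Marshal.to_string x []) 0

let () =
  let r = ref 0 in
  let r' = copy_via_marshal r in
  incr r';                      (* only the copy is incremented *)
  assert (r != r');             (* physically distinct cells *)
  Printf.printf "!r = %d, !r' = %d\n" !r !r'   (* prints: !r = 0, !r' = 1 *)
```

A ref cell captured in a closure's environment is copied in the same
way, which is why u1 () leaves A.r untouched in bytecode.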

On the other hand, the _code_ of f2 contains a reference to the global
variable A.f, and there is no environment to marshal. When we
unmarshal, we get back a reference to the same code, so calling
u2() just calls A.f() (not a copy of it), which then increments A.r as
normal.

Finally, f3() does the same thing as f2(), except that instead of
calling A.f, it just increments A.r (through a global reference)
directly.

So simple eta expansion and inlining of plain functions can have an
observable effect on sharing in marshalled closures.
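
The difference between the alias and its eta expansion is already
visible before any marshalling takes place. The sketch below is a
single-file illustration (so it says nothing about what Marshal does
with each closure); it only checks with physical equality that f1 is
the same closure value as f, while f2 is a separate closure that calls
f:

```ocaml
let r = ref 0
let f () = incr r

let f1 = f          (* alias: f1 is the very same closure value as f *)
let f2 () = f ()    (* eta expansion: a new closure whose code calls f *)

let () =
  assert (f1 == f);   (* same closure, so marshalling f1 marshals f itself *)
  assert (f2 != f);   (* distinct closure with its own code *)
  f1 (); f2 ();
  Printf.printf "!r = %d\n" !r   (* prints: !r = 2 *)
```

When unmarshalled, both behave identically; they differ only in which
closure, and hence which environment, gets serialized.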

What's worse, this only applies to the bytecode compiler. In native
code things are different:

$ ocamlopt -o a.opt a.ml b.ml
$ ./a.opt
After start, !A.r = 0
After u1, !A.r = 1
After u2, !A.r = 2
After u3, !A.r = 3

In native code, the reference to the local variable r in A.f does not
happen through an environment, but is hard-coded into the code. Since
the environment is marshalled by value and code is marshalled by
reference, this again makes an observable difference. I'd consider
this a bug.

So, if you want to marshal a function that references a global, make
sure that the global is defined in a _different_ top-level module from
the one that defines the function.
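
A minimal sketch of that arrangement, assuming a hypothetical extra
file state.ml that holds the counter (the file name and the State
module are illustrative, not part of the original example):

```ocaml
(** state.ml (hypothetical file name) **)
(* The shared global lives in its own compilation unit. *)
let r = ref 0

(** a.ml **)
(* f reaches the counter through the global State.r rather than
   through a variable of its own module. *)
let f () = incr State.r
let print s = Printf.printf "After %s, !State.r = %d\n" s !State.r

(** b.ml **)
let pickle f = Marshal.to_string f [Marshal.Closures]
let unpickle s : unit -> unit = Marshal.from_string s 0
let u1 = unpickle (pickle A.f)
let _ =
  A.print "start";
  u1 ();
  A.print "u1"
```

This would be built with state.ml first on the command line, e.g.
"ocamlc -o a state.ml a.ml b.ml". By the reasoning above, A.f now has
no environment to copy, so u1 () would be expected to increment
State.r under both ocamlc and ocamlopt; that expectation is not
verified here.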


Lauri
