caml-list - the Caml user's mailing list
* Marshal, closures, bytecode and native compilers
@ 2007-02-01 18:44 Damien Pous
  2007-02-02  2:15 ` [Caml-list] " Jacques Garrigue
From: Damien Pous @ 2007-02-01 18:44 UTC
  To: caml-list

Hello,

I found a strange difference between the native and bytecode
compilers when marshaling functional values:

[damien@mostha]$ cat lift.ml
let r = ref 0
let f =
  fun () -> incr r; print_int !r; print_newline()
let () = match Sys.argv.(1) with
  | "w" ->  Marshal.to_channel stdout f [Marshal.Closures]
  | "r" ->
      let g = (Marshal.from_channel stdin: unit -> unit) in
        g (); f ()
  | _ -> assert false

[damien@mostha]$ ocamlc lift.ml; ( ./a.out w | ./a.out r )
1
1
[damien@mostha]$ ocamlopt lift.ml; ( ./a.out w | ./a.out r )
1
2
[damien@mostha]$ ocamlc -version
3.09.2

In the bytecode version, the reference [r] gets marshaled along with
[f] so that the calls [f()] and [g()] respectively affect the initial
reference of the reader, and the (fresh) marshaled reference.

By contrast, in the native version, it seems that [f] is not
`closed': only its code address is sent, so the call [g()]
affects the initial reference of the reader.

For my needs, I definitely prefer the second answer (only the address
is sent). However, if I move the declaration of the reference inside
the definition of [f], both compilers agree on the first answer: the
reference is marshaled.

[damien@mostha]$ cat refs.ml
let f =
  let r = ref 0 in
    fun () -> incr r; print_int !r; print_newline()
let () = match Sys.argv.(1) with
  | "w" ->  Marshal.to_channel stdout f [Marshal.Closures]
  | "r" ->
      let g = (Marshal.from_channel stdin: unit -> unit) in
        g (); f ()
  | _ -> assert false

[damien@mostha]$ ocamlc refs.ml; ( ./a.out w | ./a.out r )
1
1
[damien@mostha]$ ocamlopt refs.ml; ( ./a.out w | ./a.out r )
1
1


More than by the different behaviour of ocamlc and ocamlopt on "lift.ml",
I am surprised that ocamlopt does not give the same results on
"refs.ml" and "lift.ml": the second is just a `lambda-lifting' of the
first one!


Here are my questions:
 - How can one predict how deeply a functional value will be marshaled?
 - Is there a way to enforce the second behaviour, where the reference is
   not marshaled (ocamlopt lift.ml)?


Thanks a lot,
Damien



* Re: [Caml-list] Marshal, closures, bytecode and native compilers
  2007-02-01 18:44 Marshal, closures, bytecode and native compilers Damien Pous
@ 2007-02-02  2:15 ` Jacques Garrigue
From: Jacques Garrigue @ 2007-02-02  2:15 UTC
  To: Damien.Pous; +Cc: caml-list

From: Damien Pous <Damien.Pous@ens-lyon.fr>

> I found some strange difference between the native and bytecode
> compilers, when Marshaling functional values:
> 
> [damien@mostha]$ cat lift.ml
> let r = ref 0
> let f =
>   fun () -> incr r; print_int !r; print_newline()
> let () = match Sys.argv.(1) with
>   | "w" ->  Marshal.to_channel stdout f [Marshal.Closures]
>   | "r" ->
>       let g = (Marshal.from_channel stdin: unit -> unit) in
>         g (); f ()
>   | _ -> assert false
> 
> [damien@mostha]$ ocamlc lift.ml; ( ./a.out w | ./a.out r )
> 1
> 1
> [damien@mostha]$ ocamlopt lift.ml; ( ./a.out w | ./a.out r )
> 1
> 2
> [damien@mostha]$ ocamlc -version
> 3.09.2
> 
> In the bytecode version, the reference [r] gets marshaled along with
> [f] so that the calls [f()] and [g()] respectively affect the initial
> reference of the reader, and the (fresh) marshaled reference.
> 
> On the contrary in the native version, it seems that [f] is not
> `closed': its code address is directly sent, and the call [g()]
> affects the initial reference of the reader.

Interesting phenomenon. According to the usual definition of closure,
the correct solution is probably the bytecode one. But this definition
seems hardly applicable in practice, since it would also mean bringing
all dependencies with you. This is not the case even with the bytecode
version. For instance if you move "let r = ref 0" to r.ml, and replace
the first line of your program by "open R", you get the same behaviour
as for native code.

So as a first approximation, the real specification is: local
variables are transmitted with the closure, but global ones are not.
The trouble is that the definition of global differs between
bytecode and native code. With bytecode, definitions from the same
module are local, while they are global for native code.
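
The "local" half of this rule can be checked inside a single process.
What follows is only a minimal sketch, not code from this thread:
make_counter and copy_closure are hypothetical names, and a round trip
through Marshal.to_string / Marshal.from_string stands in for the pipe
between writer and reader. It assumes a compiled program (closures can
only be read back by the very same program), and the [@inline never]
attribute is there to keep an optimizing compiler from turning the
reference into a global one.

```ocaml
(* Hypothetical helper: every call allocates a fresh reference that is
   captured in the environment of the returned closure.  The attribute
   asks the native compiler not to inline the call. *)
let[@inline never] make_counter () : unit -> int =
  let r = ref 0 in
  fun () -> incr r; !r

(* Hypothetical helper: round-trip a closure through Marshal within the
   same process, standing in for the writer/reader pipe. *)
let copy_closure (f : unit -> int) : unit -> int =
  Marshal.from_string (Marshal.to_string f [Marshal.Closures]) 0

let () =
  let f = make_counter () in
  let g = copy_closure f in
  (* Bind the results one by one so the evaluation order is explicit. *)
  let a = f () in
  let b = g () in
  let c = f () in
  let d = g () in
  (* If the reference travelled with the closure, f and g count
     independently of each other. *)
  Printf.printf "f: %d %d   g: %d %d\n" a c b d
```

The global half is deliberately left out of the sketch, since that is
exactly where the two compilers disagree.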

Moreover, I believe that, through optimizations, variables that look
local may turn out to be global.

I'm not sure what would be the right fix.
A more complete specification would be a good idea.
A flag to disable optimizations would be rather costly.

For now, a rule of thumb would be:

* if you want your variable to be handled as global, even in bytecode,
  either receive it as a parameter (after marshalling) or put it in
  another compilation unit.

* if you want your variable to be handled as local, even in native
  code, then define or redefine it locally inside your function.
    let r = ref 0
    let f =
      let r = r in
      fun () -> incr r; print_int !r; print_newline()
   For the time being this seems to work.

Maybe it is better just to assume that you should not mix closure
marshalling with mutable variables. In either case, the semantics
seems fishy. It seems more reasonable to make such functions receive
their mutable state explicitly, and choose either to send it
(obtaining a "fork" behaviour) or not.
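
A minimal sketch of that last suggestion, under stated assumptions:
step, send_code and send_code_and_state are hypothetical names, the
state is a plain int ref, and an in-process Marshal.to_string /
Marshal.from_string round trip stands in for the channel. The function
takes its state as an argument, so it captures no mutable variable, and
the sender chooses whether the state travels with it.

```ocaml
(* Hypothetical example: the counter state is an explicit argument, so
   the function itself captures no mutable variable. *)
let step (r : int ref) () : int =
  incr r;
  !r

(* Send only the code: the receiver supplies its own state. *)
let send_code () : string =
  Marshal.to_string step [Marshal.Closures]

(* Send code and state together: the receiver gets a private copy of
   the state (the "fork" behaviour). *)
let send_code_and_state (r : int ref) : string =
  Marshal.to_string (step, r) [Marshal.Closures]

let () =
  let state = ref 10 in
  (* Code only: g is applied to the receiver's own reference. *)
  let g : int ref -> unit -> int = Marshal.from_string (send_code ()) 0 in
  let shared = g state () in
  (* Code and state: g' works on a copy and leaves [state] alone. *)
  let (g', copy) : (int ref -> unit -> int) * int ref =
    Marshal.from_string (send_code_and_state state) 0 in
  let forked = g' copy () in
  Printf.printf "shared=%d forked=%d state=%d\n" shared forked !state
```

Written this way, the outcome no longer depends on what the compiler
considers local or global: the state is shared or copied only because
the sender said so.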

Jacques Garrigue

