caml-list - the Caml user's mailing list
* C binding and GC interaction: storing values outside the heap
@ 2010-09-08  8:41 oleg
From: oleg @ 2010-09-08  8:41 UTC (permalink / raw)
  To: p.donadeo, caml-list


Paolo Donadeo wrote: 
> The problem is that, for several good reasons, I need a copy, or a
> reference, to the OCaml value representing the lua_State *inside* the
> Lua state (I mean the C data structure).

Data structures that mix OCaml values with non-OCaml (unboxed) values
do indeed seem to be needed. I had to emulate such data structures
because a captured native-code delimited continuation, which is a part
of the C stack, is exactly such a mixed value. Leaking memory cannot be
tolerated, since continuations may be captured at a high rate (the
delimcc distribution has several tests for that).

The solution was a custom GC scanning function. The mixed value is
allocated off the OCaml heap. All allocated values are linked together.
The pointer to the allocated value is wrapped in an OCaml custom block
with a finalizer, which unlinks the value being finalized. The links
are NOT heap pointers; they are not known to the GC and so do not
prevent finalization. The tricky part is registering a GC scan
callback. The GC invokes the registered callbacks at the end of its
root scan. The callback should go through all linked mixed values and
apply the GC scanning action (passed as the argument to the callback)
to all OCaml values within the mixed values.
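
The bookkeeping described above can be sketched as a stand-alone C
model. This is only an illustration, not the delimcc code: every name
below (model_value, model_scan_action, mixed, toy_action, ...) is
hypothetical and chosen to avoid clashing with the real caml/*.h
headers. In the real runtime the callback would, as far as I know, be
installed through the caml_scan_roots_hook variable, the way the
systhreads code does; its exact declaration depends on the OCaml
version.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for the runtime's types (hypothetical, not from caml/*.h). */
typedef long model_value;
typedef void (*model_scan_action)(model_value v, model_value *slot);

/* A mixed record: one GC-visible slot next to raw C data. */
typedef struct mixed {
    model_value ocaml_field;   /* must be shown to the GC on every scan */
    void *raw_data;            /* opaque C data, never scanned */
    struct mixed *prev, *next; /* plain C links, invisible to the GC */
} mixed;

static mixed *all_mixed = NULL;   /* head of the list of live records */

/* Allocate a record off the heap and link it into the list. */
static mixed *mixed_alloc(model_value v, void *raw)
{
    mixed *m = malloc(sizeof *m);
    assert(m != NULL);
    m->ocaml_field = v;
    m->raw_data = raw;
    m->prev = NULL;
    m->next = all_mixed;
    if (all_mixed != NULL) all_mixed->prev = m;
    all_mixed = m;
    return m;
}

/* What the custom block's finalizer would do: unlink, then free. */
static void mixed_finalize(mixed *m)
{
    if (m->prev != NULL) m->prev->next = m->next; else all_mixed = m->next;
    if (m->next != NULL) m->next->prev = m->prev;
    free(m);
}

/* The scan callback: apply the GC's action to each embedded value. */
static void mixed_scan_roots(model_scan_action action)
{
    for (mixed *m = all_mixed; m != NULL; m = m->next)
        action(m->ocaml_field, &m->ocaml_field);
}

/* A toy action standing in for the collector: it "moves" each value. */
static int scanned_count = 0;
static void toy_action(model_value v, model_value *slot)
{
    *slot = v + 1000;
    scanned_count++;
}
```

The point of the design is that the list links keep nothing alive:
only the custom block that wraps each record decides its lifetime,
while the scan callback keeps the values embedded in still-linked
records up to date.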

This solution is inspired by the implementation of thread contexts
in the native thread implementation, see the file
	otherlibs/systhreads/posix.c
in the OCaml distribution. The delimcc implementation is in the file
stacks-native.c of the delimcc distribution. 

The solution does not seem to be very efficient; it would be great if
OCaml supported mixed values directly (that is, permitted a custom
block with a custom GC scanning function). When the GC scanned such a
custom value, it would invoke the user-provided function like a GC
callback. That function would know which parts of the mixed value are
OCaml values and would apply the GC action to those values.




* C binding and GC interaction: storing values outside the heap
@ 2010-09-07 20:58 Paolo Donadeo
From: Paolo Donadeo @ 2010-09-07 20:58 UTC (permalink / raw)
  To: OCaml mailing list, OCaml-Lua devel ML


I'm writing a Lua API binding <http://ocaml-lua.forge.ocamlcore.org/> and I
have a problem regarding the interaction with the garbage collector. The
situation is rather canonical: a particular C data type, the Lua
state <http://www.lua.org/manual/5.1/manual.html#lua_state>,
is used as an argument in all the C functions of the API. A pointer to a
lua_State is wrapped inside an OCaml custom block, in the very same way
presented in the official documentation (in the ncurses example). Like the
WINDOW* example, the lua_State is allocated via caml_stat_alloc and the
resulting pointer is wrapped in a value obtained by caml_alloc_custom using
a macro:

#define lua_State_val(L) (*((lua_State **) Data_custom_val(L))) /* also l-value */
... ... ...
/* the actual allocation is made by caml_stat_resize */
lua_State *L = lua_newstate(custom_alloc, NULL);
... ... ...
v_L = caml_alloc_custom(&lua_State_ops, sizeof(lua_State *), 1, 10);
lua_State_val(v_L) = L;
CAMLreturn(v_L);

So far so good.

The problem is that, for several good reasons, I need a copy of, or a
reference to, the OCaml value representing the lua_State (v_L in the code
above) *inside* the Lua state (I mean the C data structure). This is
possible because the Lua API provides a way to bind user data inside
the state. So I wrote:

typedef struct ocaml_data
{
 value state_value;
 value panic_callback;
} ocaml_data;

CAMLprim
value luaL_newstate__stub (value unit)
{
   CAMLparam1(unit);
   CAMLlocal1(v_L);

   value *default_panic_v = caml_named_value("default_panic");

   /* create a fresh new Lua state */
   lua_State *L = lua_newstate(custom_alloc, NULL);
   lua_atpanic(L, &default_panic);

   /* alloc space for the register entry */
   ocaml_data *data = (ocaml_data*)caml_stat_alloc(sizeof(ocaml_data));
   caml_register_global_root(&(data->panic_callback));
   data->panic_callback = *default_panic_v;

   /* create a new Lua table for binding information */
   set_ocaml_data(L, data);  // puts "data" inside L

   /* wrap the lua_State* in a custom object */
   v_L = caml_alloc_custom(&lua_State_ops, sizeof(lua_State *), 1, 10);
   lua_State_val(v_L) = L;
   data->state_value = v_L;  // also v_L inside L but BIG PROBLEM HERE!!!

   /* return the lua_State value */
   CAMLreturn(v_L);
}

The problem here is that I'm storing an OCaml value (v_L) inside a malloc-ed
area. Result: segfault.

Is there a safe way to store a reference to a value outside the heap?
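
The post does not diagnose the crash, so the following is only one
plausible reading: the collector is free to move or reclaim a block,
and it neither updates nor honors a copy of the value kept in memory
it does not scan. The stand-alone toy model below shows the "moved"
case. None of its names (obj, toy_alloc, toy_register_root,
toy_minor_collection) come from the OCaml runtime, and the collector
is deliberately simplistic.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names come from the OCaml runtime. */
typedef struct { int payload; } obj;

static obj young_heap[4];            /* objects start here */
static obj old_heap[4];              /* survivors are copied here */
static size_t young_used = 0, old_used = 0;

#define MAX_ROOTS 8
static obj **roots[MAX_ROOTS];       /* slots the collector knows about */
static size_t n_roots = 0;

/* Allocate a fresh object in the young area. */
static obj *toy_alloc(int payload)
{
    assert(young_used < 4);
    young_heap[young_used].payload = payload;
    return &young_heap[young_used++];
}

/* Tell the collector that *slot holds a pointer it must track. */
static void toy_register_root(obj **slot)
{
    assert(n_roots < MAX_ROOTS);
    roots[n_roots++] = slot;
}

/* Copy every rooted object to the old heap and update the rooted slot.
   Slots that were never registered keep pointing at the old address. */
static void toy_minor_collection(void)
{
    for (size_t i = 0; i < n_roots; i++) {
        obj *from = *roots[i];
        assert(old_used < 4);
        old_heap[old_used] = *from;
        *roots[i] = &old_heap[old_used++];
        from->payload = -1;          /* the vacated cell is now garbage */
    }
    young_used = 0;
}
```

In this model, a pointer copied into an unregistered slot before
toy_minor_collection() runs still refers to the vacated cell
afterwards, while the registered slot follows the object. In the real
runtime, caml_register_global_root is the usual way to make such a
slot known; note, though, that a registered root also keeps the value
alive, so a state that holds a root to its own wrapper would
presumably never be finalized.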

As a temporary workaround I removed the "value state_value" field from the
ocaml_data struct, replacing it with a reference counter:

typedef struct ocaml_data
{
   value panic_callback;
   int ref_counter;
} ocaml_data;

and the previous "luaL_newstate__stub" function sets the counter to 1:

... ... ...
   /* alloc space for the register entry */
   ocaml_data *data = (ocaml_data*)caml_stat_alloc(sizeof(ocaml_data));
   caml_register_global_root(&(data->panic_callback));
   data->panic_callback = *default_panic_v;
   data->ref_counter = 1;
... ... ...

In other parts of the code, where I have the original lua_State pointer but
need the corresponding OCaml value, and where I previously retrieved it
from the lua_State, I now create *another* OCaml value with the
same lua_State, incrementing the reference counter, for example:

static int panic_wrapper(lua_State *L)
{
   CAMLparam0();  /* CAMLlocal1 must be preceded by a CAMLparam macro */
   CAMLlocal1(v_L);
   ocaml_data *data = get_ocaml_data(L);

   /* wrap the lua_State* in a custom object */
   v_L = caml_alloc_custom(&lua_State_ops, sizeof(lua_State *), 1, 10);
   lua_State_val(v_L) = L;
   data->ref_counter++;

   /* leave through CAMLreturnT so the local roots are unregistered */
   CAMLreturnT(int, Int_val(caml_callback(data->panic_callback, v_L)));
}

In the finalization function I free() the C data structures only if
ref_counter reaches 0:

static void finalize_lua_State(value L)
{
   lua_State *state = lua_State_val(L);
   ocaml_data *data = get_ocaml_data(state);

   if (data->ref_counter == 1)
   {
       caml_remove_global_root(&(data->panic_callback));
       caml_stat_free(data);
       lua_close(state);  // this calls free()
   }
   else
   {
       data->ref_counter--;
   }
}

What I don't like here is that several OCaml values, representing the same C
data structure, are simultaneously present in the program, and the reference
counting is not exactly the best way to collect memory garbage.

Any ideas or suggestions?


-- 
*Paolo*
