Keeping local types local?

caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed

* Keeping local types local?
@ 2008-09-16 10:12 oleg
  2008-09-16 16:47 ` David Rajchenbach-Teller
  0 siblings, 1 reply; 6+ messages in thread
From: oleg @ 2008-09-16 10:12 UTC (permalink / raw)
  To: David.Teller, caml-list

David Rajchenbach-Teller wrote:

> One way of doing so would be to use monads but my idea is to use local
> modules and local types and take advantage of the fact that values of
> that type cannot escape the scope of the local module. Unfortunately,
> as it turns out, sometimes, values with a local type can escape their
> scope -- and I'm looking for an idea on how to plug the leak.

I don't think there is much hope. Without monads (or, better,
parameterized monads) and without the effect type system, there is
nothing to prevent the `escape'. The type system of OCaml is
powerless: it can prevent the value 'v' of some type to be used
outside the local module that declares that local type. However,
nothing prevents us from forming a closure
	fun () -> consumer v; ()
and using that closure anywhere we see fit. The closure has the
`effect' of consuming the value, regardless of any bracketing
boundaries. And yet its type is unit->unit. The problem with ML in
general is precisely that the type of the function says absolutely
nothing about the function's effects. It gets worse: there is a
similar way to consume the value 'v' of a `local' type
inappropriately: via an assignment. The consumer code may include the
following:

	let thunk = ref (fun () -> ())

	let process () =
	     !thunk ();
	     thunk := fun () -> ignore (set v);
	     ...

The next invocation of the the process function of the consumer will
execute set v, probably outside of the transaction
boundaries. One should also worry about returning thunks via
exceptions... 

Section 4.3 of
	http://okmij.org/ftp/Computation/resource-aware-prog/region-io.pdf
as well as the accompanying code shows several attempts to break out
of the region boundaries. As far as the Region-IO paper is concerned,
all attempts failed. Because we do use monads, rank-2 polymorphism and
because the type of the monad specifically reflects its effects. For
fancier guarantees, one needs parameterized monads (which aren't
actually monads), see Section 6 of the paper.

Rank-2 polymorphism is available in OCaml too. In fact, MetaOCaml uses
this technique to prevent `escaping' of free variables, to make sure
that only closed code could be run. Mutations and exceptions still
pose the problem however.

Perhaps the best approach in OCaml now is sand-boxing. Although
it is not advertised much, OCaml system is quite `reflective'. One can
trivially adapt the main compiler driver so to compile a source code
file in a `reduced' environment -- the environment that does not have
the ref type, for example. By controlling the available operations and
available types, we can restrict the effects; we should also define an
appropriate variant (Int, Bool, pair, array -- but not a closure) and
insist the return value be of that variant type. The return value
must be serializable, and we should only permit assignments of
serializable values. That may be sufficient for safety.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Keeping local types local?
  2008-09-16 10:12 Keeping local types local? oleg
@ 2008-09-16 16:47 ` David Rajchenbach-Teller
  2008-09-17  8:07   ` oleg
  0 siblings, 1 reply; 6+ messages in thread
From: David Rajchenbach-Teller @ 2008-09-16 16:47 UTC (permalink / raw)
  To: oleg; +Cc: caml-list

Thanks for the answer.

My hope is that, since OCaml does already perform some scope escape
analysis, this analysis could be hijacked into something more
restrictive with the help of some simple rewriting/staged compilation.

Now, I realise that monads (I'm unfamiliar with parametrised monads,
although I have the intuition that they are what I call "indexed
families of monads" in my own work -- I'll be sure to read your paper)
are a solution to the problem, possibly the only solution. My issue with
this is that monads are a quite un-ocamlish solution, which requires
rewriting every single line of code, largely losing the ability to
compose functions and affecting performances too much for my taste, not
to mention the problem of monad transformers.

So I'm looking for another way out. As far as both your examples and my
experiments seem to indicate, the only way of escaping scope is to
return a continuation which calls one of the protected functions and
ignores the result. So there is a relatively simple way of preventing
escapes, which can probably be done syntactically: forbidding
continuations. Now, that's a harsh measure so perhaps we can alleviate
it. Assuming we allow rewriting, we can possibly replace every single
function definition inside the scope with something a tad more drastic
-- I don't know what yet, but there may be a way.

Now, I have the feeling that this may require also modifying the
language, to introduce something which, to my intuition, looks like
implicit parameters. I'm not sure how deep such a modification should
run (perhaps Camlp4 is sufficient) but I have the feeling that such a
modification would also prove quite useful for adding type-classes to
OCaml, so if the depth is limited, it may be a good idea to achieve both
changes with one stroke.

What do you think?

Cheers,
 David

P.S.: I'll be sure to read your paper -- we seem to have many common
preoccupations of late.
P.P.S.: Sandboxing? Interesting idea. I was planning to get a few
students to implement dynamic  evaluation in OCaml but I hadn't thought
of using that for the purpose of sandboxing.

On Tue, 2008-09-16 at 03:12 -0700, oleg@okmij.org wrote:
> David Rajchenbach-Teller wrote:
> 
> > One way of doing so would be to use monads but my idea is to use local
> > modules and local types and take advantage of the fact that values of
> > that type cannot escape the scope of the local module. Unfortunately,
> > as it turns out, sometimes, values with a local type can escape their
> > scope -- and I'm looking for an idea on how to plug the leak.
> 
> I don't think there is much hope. Without monads (or, better,
> parameterized monads) and without the effect type system, there is
> nothing to prevent the `escape'. The type system of OCaml is
> powerless: it can prevent the value 'v' of some type to be used
> outside the local module that declares that local type. However,
> nothing prevents us from forming a closure
> 	fun () -> consumer v; ()
> and using that closure anywhere we see fit. The closure has the
> `effect' of consuming the value, regardless of any bracketing
> boundaries. And yet its type is unit->unit. The problem with ML in
> general is precisely that the type of the function says absolutely
> nothing about the function's effects. It gets worse: there is a
> similar way to consume the value 'v' of a `local' type
> inappropriately: via an assignment. The consumer code may include the
> following:
> 
> 	let thunk = ref (fun () -> ())
> 
> 	let process () =
> 	     !thunk ();
> 	     thunk := fun () -> ignore (set v);
> 	     ...
> 
> The next invocation of the the process function of the consumer will
> execute set v, probably outside of the transaction
> boundaries. One should also worry about returning thunks via
> exceptions... 
> 
> Section 4.3 of
> 	http://okmij.org/ftp/Computation/resource-aware-prog/region-io.pdf
> as well as the accompanying code shows several attempts to break out
> of the region boundaries. As far as the Region-IO paper is concerned,
> all attempts failed. Because we do use monads, rank-2 polymorphism and
> because the type of the monad specifically reflects its effects. For
> fancier guarantees, one needs parameterized monads (which aren't
> actually monads), see Section 6 of the paper.
> 
> Rank-2 polymorphism is available in OCaml too. In fact, MetaOCaml uses
> this technique to prevent `escaping' of free variables, to make sure
> that only closed code could be run. Mutations and exceptions still
> pose the problem however.
> 
> Perhaps the best approach in OCaml now is sand-boxing. Although
> it is not advertised much, OCaml system is quite `reflective'. One can
> trivially adapt the main compiler driver so to compile a source code
> file in a `reduced' environment -- the environment that does not have
> the ref type, for example. By controlling the available operations and
> available types, we can restrict the effects; we should also define an
> appropriate variant (Int, Bool, pair, array -- but not a closure) and
> insist the return value be of that variant type. The return value
> must be serializable, and we should only permit assignments of
> serializable values. That may be sufficient for safety.
> 
> 
-- 
David Teller-Rajchenbach
 Security of Distributed Systems
  http://www.univ-orleans.fr/lifo/Members/David.Teller
 Angry researcher: French Universities need reforms, but the LRU act brings liquidations. 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Keeping local types local?
  2008-09-16 16:47 ` David Rajchenbach-Teller
@ 2008-09-17  8:07   ` oleg
  2008-09-19 13:55     ` [Caml-list] " David Rajchenbach-Teller
  0 siblings, 1 reply; 6+ messages in thread
From: oleg @ 2008-09-17  8:07 UTC (permalink / raw)
  To: David.Teller; +Cc: caml-list

> So I'm looking for another way out. As far as both your examples and my
> experiments seem to indicate, the only way of escaping scope is to
> return a continuation which calls one of the protected functions and
> ignores the result.

I'm afraid this is worse than it seems. Returning any closure (not
necessarily a continuation) can defeat the security. To summarize, the
security of the framework is defeated if

	-- one returns a closure that calls a protected function
	-- one returns an object whose method calls a protected function
	-- one returns a polymorphic record that `abstracts'
	over the abstract type guarding the protected function

	-- one assigns to the mutable variable: a closure, an object,
or a polymorphic record

	-- one throws or returns an exception that contains a closure,
an object, or a polymorphic record

Here is the illustrating code. We start with the baseline
let f0 () =
  let result =
    let module A =
        struct
          type 'a t = Guard of 'a (*Used only to prevent scope escape.*)
                  (** Local primitives, usage guarded by [Guard] *)
          let set v =
            print_endline "set has been called"; Guard ()
          let return x = Guard x
          let result =
	    print_endline "Initialization";
            match
	      (* Good client *)
	      set 1
            with Guard x -> print_endline "Clean-up"; x
        end in A.result
  in result
;;
let test1 = f0 ();;

We see that set has been called after initialization and _before_ the
clean-up. In the code below, we replace the client line (which above
was just `set 1') with something else. The stupid bad client

let f1 () = ...
	return (fun () -> set 1)
	...

is easily caught. The type checker rejects the code with the message
  This `let module' expression has type unit -> unit A.t
  In this type, the locally bound module name A escapes its scope

Alas, a bit more cunning client, as you have observed,

let f2 () = ...
	      return (fun () -> ignore (set 1); ())
	...
let test2 = f2 () ();;

manages to call the setter _after_ the clean-up. But that is not the
only cunning client. Here is another one

let f3 () = ...
	      return (object val mutable guarded = return ()
		            method call_setter = guarded <- set ()
		     end)
	...
let test3 = (f3 ())#call_setter;;

and another one

exception Foo of (unit -> unit);;
let f4 () = ...
	      return (Foo (fun () -> ignore (set 1); ()))
let test4 = try raise (f4 ()) with Foo e -> e ();;

and another one

let cunning_ref = ref (fun () -> ())
let f5 () = ...
	      cunning_ref := (fun () -> ignore (set 1); ()); return ()
	...
let test5 = f5 (); !cunning_ref ();;

To ensure security, one should prohibit returning, assigning or
throwing any values other than the values of simple types: numbers,
strings, pairs, arrays and lists of those. In short, only easily
serializable values may be returned, assigned and thrown.

I fully agree with your assessment of monads. I should remark that
type-based assurances work well for data dependencies, but not so for
control dependencies (that's why we need a so-called type-state).
Monads convert control dependency into data dependency.

You do know of FlowCaml, right? It doesn't seem to be actively
maintained though...

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] Re: Keeping local types local?
  2008-09-17  8:07   ` oleg
@ 2008-09-19 13:55     ` David Rajchenbach-Teller
  0 siblings, 0 replies; 6+ messages in thread
From: David Rajchenbach-Teller @ 2008-09-19 13:55 UTC (permalink / raw)
  To: oleg; +Cc: caml-list


On Wed, 2008-09-17 at 01:07 -0700, oleg@okmij.org wrote: 
> > So I'm looking for another way out. As far as both your examples and my
> > experiments seem to indicate, the only way of escaping scope is to
> > return a continuation which calls one of the protected functions and
> > ignores the result.
> 
> I'm afraid this is worse than it seems. Returning any closure (not
> necessarily a continuation) can defeat the security. To summarize, the
> security of the framework is defeated if

You're right, I actually meant "closure" rather than "continuation". 

> I fully agree with your assessment of monads. I should remark that
> type-based assurances work well for data dependencies, but not so for
> control dependencies (that's why we need a so-called type-state).
> Monads convert control dependency into data dependency.

"Type-state"? I'm not familiar with the term although it sounds exactly
like what I have in mind (and which I hoped to be able to emulate with
OCaml-compatible dynamic scoping, i.e. implicit arguments).

> You do know of FlowCaml, right? It doesn't seem to be actively
> maintained though...
Yes, FlowCaml is my plan C (plan A was monads).
> 
Thanks,
 David
> 
-- 
David Teller-Rajchenbach
 Security of Distributed Systems
  http://www.univ-orleans.fr/lifo/Members/David.Teller
 Angry researcher: French Universities need reforms, but the LRU act
brings liquidations. 


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Keeping local types local?
@ 2008-09-18 23:34 oleg
  0 siblings, 0 replies; 6+ messages in thread
From: oleg @ 2008-09-18 23:34 UTC (permalink / raw)
  To: David.Teller; +Cc: caml-list

It seems the following section might be relevant to the present
discussion. It deals with a related problem: a function yield ()
should _not_ be invoked within the dynamic extent of the function
irq_disable. The latter takes a thunk and executes it with disabled
interrupts. This is a so-called yield problem in operating systems:

	Hao Chen and Jonathan S. Shapiro,
	Using Build-Integrated Static Checking to Preserve Correctness 
	Invariants, CCS2004, pp. 288--297
	http://www.eros-os.org/papers/ccs04.pdf

The problem is quite like the one we've discussed, up to the inverse:
you wish to enforce that set() and get() are only invoked within the
dynamic extent of the initialization function. The yield problem is to
prevent yield() from being invoked, within the specified dynamic extent.

The section below was present in an earlier draft of the regions IO
paper written with Chung-chieh Shan. The section discusses two other
OCaml solutions to the problem, which haven't been mentioned yet. The
yield.ml code discussed at the end is available here:

	http://okmij.org/ftp/Computation/resource-aware-prog/yield.ml

I should also point out the file yield.elf in the same directory. It
formally proves the safety of the monadic solution to the problem. The
safety is the corollary to the Progress theorem. The dynamic semantics
does not define the transition for yield() executed with any level of
disabled interrupts. Thus the evaluation of yield() in that context
gets stuck. The formally proven Progress theorem states that the
evaluation of the well-typed term does not get stuck.

Begin excerpt:
\subsection{Types versus effects}

This approach cannot be embedded in direct style in an impure language
such as OCaml to provide the same static safety guarantees. We can
easily define a higher-order function |irq_disable| that takes a thunk
as argument and executes it with interrupts disabled. However, using ML
facilities alone without external tools, we cannot statically prevent
invoking this function as 
\begin{code}
irq_disable (fun () -> yield (); ())
\end{code}
and thus executing |yield| with interrupts disabled.  This is because
the safe expression |()| and the unsafe one |(yield (); ())| have the same
type no matter what type we assign to |yield|. 

We need an effect system, such as \citearound{'s exception checker for
OCaml}\citet{leroy-type-based}. If the desired safety property were to
invoke |yield| only when interrupts are disabled (which is the opposite
of our problem), we could encode the property in terms of exceptions by
pretending that |yield| might raise an exception.
\begin{code}
let yield () = 
  if false then raise YieldExc else do_yield ()
\end{code}
The |irq_disable| function would then catch the exception.  Alas, our
problem is different, and the exception checker is external to OCaml.
An unappealing alternative is to abandon direct style and program
instead in continuation\hyp passing or monadic style, perhaps using some
syntactic sugar analogous to Haskell's |do| notation.  A third approach
is to use MetaOCaml to synthesize the interrupt handling code in direct
style, ensuring the yield property along the way.  Some microkernels
\citep{gabber-pebble} use this idea of run-time code generation, but it
is too static for our needs (see |yield.ml|).

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Keeping local types local?
@ 2008-09-15 12:37 David Rajchenbach-Teller
  0 siblings, 0 replies; 6+ messages in thread
From: David Rajchenbach-Teller @ 2008-09-15 12:37 UTC (permalink / raw)
  To: Caml

        Hi everyone,

 Recently, I've been thinking about a small technique which I would like
to use to design some libraries. Essentially, I'd like to define values
which can't escape a given scope. One way of doing so would be to use
monads but my idea is to use local modules and local types and take
advantage of the fact that values of that type cannot escape the scope
of the local module. Unfortunately, as it turns out, sometimes, values
with a local type can escape their scope -- and I'm looking for an idea
on how to plug the leak.

Let's try and make this clear with an example. In the following code, we
have two operations [set] and [get], which should only be called between
initialization and clean-up. To enforce this guarantee, we use a local
module (added by Camlp4), with a local type ['a t] and we make sure that
our two operations have a return type ['a t]. We also have some client
code (actually written by the user), which may use [get], [set] and
[return] (note: 

let f () =
  let result = 
    let module A =
	struct
	  type 'a t = Guard of 'a (*Used only to prevent scope escape.*)

                  (** Local primitives, usage guarded by [Guard] *)

	  let set v =
	    (*perform some side-effect*)
	    Guard ()

                  let get () =
                    (*perform some side-effect*)
                    Guard 42(*or some other value*)

                  let return x =
                    Guard x

                  (** Infrastructure *)

	  let result =
                    (*initialize some stuff*)
                     match
                      (*start of client code*)
                            (*client code doesn't know about [Guard]*)
                      (*end of client code*)
                     with Guard x -> 
                      (*do some clean-up*)
                      x

	end in A.result
  in result

Now, any use of [set] or [get] will yield a result of type ['a A.t]. By
definition of local modules, this type can't escape the scope of [A],
which means that we can't store references to either value in an outside
reference, we can't put it into a continuation and it can't cross module
boundaries. Which means that we have safely used our resources. Yeah.

Unfortunately that's not always true, as there is a way for the client
code to sneak a continuation which can call [set]:

(*start of client code*)
   return (fun () -> ignore (set 99))
(*end of client code*)

I can then write [f () ()] and avoid the whole resource-protection
infrastructure :( 

Now, as far as I can tell, the only way of leaking unsafe operations is
to return a continuation which perform one of our local operations,
trigger the continuation and somehow ignore the value of the result.

Does anyone have ideas regarding how to prevent this kind of leaks? I'm
willing to resort to Camlp4 and/or advanced type hackery but not to
write my own type system.

Thanks in advance,
 David

P.S.:
 In case you're interested, such a design pattern would permit
definition of fast checked local and polymorphic exceptions without
monads, and might be applicable to some cases of types-and-effects.

-- 
David Teller-Rajchenbach
 Security of Distributed Systems
  http://www.univ-orleans.fr/lifo/Members/David.Teller
 Angry researcher: French Universities need reforms, but the LRU act brings liquidations. 

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2008-09-19 13:55 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-09-16 10:12 Keeping local types local? oleg
2008-09-16 16:47 ` David Rajchenbach-Teller
2008-09-17  8:07   ` oleg
2008-09-19 13:55     ` [Caml-list] " David Rajchenbach-Teller
  -- strict thread matches above, loose matches on Subject: below --
2008-09-18 23:34 oleg
2008-09-15 12:37 David Rajchenbach-Teller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).