caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* cyclic value construction ("let rec")
@ 2000-03-30 20:12 Benoit Deboursetty
  2000-04-03 12:57 ` Markus Mottl
  0 siblings, 1 reply; 8+ messages in thread
From: Benoit Deboursetty @ 2000-03-30 20:12 UTC (permalink / raw)
  To: caml-list

Following the "let rec" thread, let me sum up a few of the problems which
occur frequently during value creation, as this list showed it.

 * Currently, O'CaML does not allow the creation of arbitrarily big cycles
in a non-mutable value. For instance, you cannot write a program that
computes a cyclic list ln of size n
   l_n = 1 :: 2 :: 3 :: ... :: n :: l_n
using only the built-in list type.

 * This is annoying because O'CaML handles immutable lists and trees so
well sometimes you wish it could do the same with anything else.

 * I have seen three basic classes of solutions when you want to deal
with values possibly showing cycles. Here I will assume that the values
you're working on is an arbitrary graph and its nodes carry data .

    1) index each node of the value in an array and instead of pointing at
a data, point at its index.

type 'a graph = ('a * int list) array

let circle n =
  Array.init n (fun i -> (i, [(i+1) mod n]))

circle : int -> int graph

  Pros: by externally describing links, you are able to "see what you're
doing". In particular, when you traverse the value, you can know in
constant time whether you are on a data node that you have already seen.
(you can mark where you have been in an array with the same indices). You
need this whenever you want to implement something like "map" on the data
carried by nodes of your graph. The majority of graph algorithms also
require this "outsight" of the value.

  Cons: joining two such graphs is complex, thus this construction is only
good for graphs you want to build at once. Arrays are mutable (with all
the problems possibly induced by passing around mutable arguments);
invalid values exist (for instance when they refer out-of-bound indices);
access to neighbours is indirect (you first go to the index in the table).
You don't have access to neighbours in patterns.

    2) use mutable fields, then build the data incrementally.

type 'a node = { data: 'a; mutable next: 'a node list }
let circle n =
  let temp = Array.init n (fun i -> { data = i; next = [] }) in
  for i = 0 to n - 1 do
    temp.(i).next <- [temp.((i+1) mod n)];
  done;
  temp.(0)

circle : int -> int node

  Pros: resembles C and its "do-what-I'm-telling-you" coding style.
Sometimes it's good. Sometimes you don't have the extra indirection caused
by the index. Values are always valid. You have access to neighbours in
patterns.

  Cons: resembles C and its "do-what-I'm-telling-you" coding style.
Program-wide mutability is a design problem when you only need it at value
building time. Also, sometimes, you are obliged to use an option for the
type of your mutable field. In the end, you find yourself with a complex
data structure with 'Some _'s everywhere and not a single 'None'. In those
cases when after build you expect every option of a given field to be a
Some, this represents another extra indirection (*) and the source of
invalid values.

   3) use lazy evaluation for the successors of each node.

type 'a node = { data: 'a ; next: 'a node list Lazy.t }

let circle n =
  let links = Array.make n [] in
  let nodes = Array.init n
    (fun i -> { data = i; next = lazy links.(i) })
  in
  for i = 0 to n - 1 do
    links.(i) <- [nodes.((i+1) mod n)]
  done;
  (* just to show the resulting structure is finite *)
  Array.iter (fun x -> ignore (Lazy.force x.next)) nodes;
  nodes.(0)

  Pros: gives the same power as mutable fields and more, with a less
imperative style. The only way of handling infinite structures. Depending
on the situation, you may have access to neighbours in patterns, but not
surely.

  Cons: correct graph building with lazy evaluation is not easy. Infinite
recursion is a common pitfall; building an infinite value when you wanted
it finite is another one. Also, with current O'CaML version, Lazy.t is not
abstract, and is implemented as a reference, hence it is mutable. Finally,
even if your were so careful as to keep your graph finite, you still have
to use Lazy.force each time you want to access a node's neighbour even
when you could have force'd everything in advance. You have no means of
assessing that a value is finite, you cannot tell whether marshalling
such a value will work.



I must have omitted many other solutions which I don't know. From this
build difficulty come many frustrations like "if only I could freeze all
mutable fields of this value at once" or things like that. I am in no way
criticizing O'CaML for being unable to handle things C, C++ and Java do
with "null pointers"! Just trying to figure out what is at the basis of
all these questions.

I have this little hack that will allow you to freeze mutable fields, but
it relies on a current exposed unsafeness with Marshalling. You can freeze
a value with mutable fields by marshalling it into a string, then
un-marshalling it back into a value of the same type with nonmutable
fields... This makes basically a copy of a value and changes the type on
the way.

It would be interesting to know whether people have other workarounds.
Also, a generic graph module which would implement "map", "iter" etc. in a 
cycle-safe way (like Marshal) would perhaps be interesting in the
'user-contributed stdlib extension'...

What do you think?

Benoît de Boursetty.

(*) I don't know how options are implemented, but at least at
language-level this adds an indirection



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cyclic value construction ("let rec")
  2000-03-30 20:12 cyclic value construction ("let rec") Benoit Deboursetty
@ 2000-04-03 12:57 ` Markus Mottl
  2000-04-06 13:18   ` Pierre Weis
  0 siblings, 1 reply; 8+ messages in thread
From: Markus Mottl @ 2000-04-03 12:57 UTC (permalink / raw)
  To: Benoit Deboursetty; +Cc: OCAML

> It would be interesting to know whether people have other workarounds.
> Also, a generic graph module which would implement "map", "iter" etc. in a 
> cycle-safe way (like Marshal) would perhaps be interesting in the
> 'user-contributed stdlib extension'...

You will probably always have to rely on "unsafe" things for such
purposes.  My quick hack for problems like this looks as follows:

  type 'a node = { data : 'a; next : 'a node }

  let make_circle n : 'a node =
    let module Dummy = struct
      type 'a t = { data : 'a; mutable next : 'a t } end in
    let res = { Dummy.data = n; Dummy.next = Obj.magic 0 } in
    res.Dummy.next <- res;
    Obj.magic res

  let rec iter f node = f node.data; iter f node.next

  let _ = iter (fun x -> print_int x; print_newline ()) (make_circle 42)

Type "'a node" is immutable from the outside. However, exploiting the
undocumented (good reason for this ;-) type casting features of OCaml,
one can cheat and define an identical type within a local module -
which is mutable!

Initialisation, too, relies on "unsafe" features - we simply put a
reference to a dummy value into the slot until it can be filled with the
"right" value. The return value is cast back again to the "global type"
as indicated in the type restriction in the definition of "make_circle"
- this is important, because otherwise you can crash your program (it
would have "any" type).

As far as I know, this should be safe, because the GC doesn't care
about the static types of the values - it only looks at the value tags
at runtime. Of course, the above will only work as long as the memory
layout of the "local" type agrees with the one of the "global type".

Thus, you can confine the "unsafe" part to specific functions (e.g.
"make_circle"). The iteration function "iter" does not change anything
and can therefore work on the "global" type.

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cyclic value construction ("let rec")
  2000-04-03 12:57 ` Markus Mottl
@ 2000-04-06 13:18   ` Pierre Weis
  2000-04-06 13:34     ` Markus Mottl
  0 siblings, 1 reply; 8+ messages in thread
From: Pierre Weis @ 2000-04-06 13:18 UTC (permalink / raw)
  To: Markus Mottl; +Cc: caml-list

> Type "'a node" is immutable from the outside. However, exploiting the
> undocumented (good reason for this ;-) type casting features of OCaml,
> one can cheat and define an identical type within a local module -
> which is mutable!

This certainly suggests to allow the export of an immutable view of a
record type with mutable fields. This way you could do the
initialization in a safe way (no magic) using side effects, and still
export a safe immutable type to the external world.

-- 
Pierre Weis

INRIA, Projet Cristal, http://pauillac.inria.fr/~weis



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cyclic value construction ("let rec")
  2000-04-06 13:18   ` Pierre Weis
@ 2000-04-06 13:34     ` Markus Mottl
  2000-04-06 14:25       ` Xavier Leroy
  0 siblings, 1 reply; 8+ messages in thread
From: Markus Mottl @ 2000-04-06 13:34 UTC (permalink / raw)
  To: Pierre Weis; +Cc: OCAML

> This certainly suggests to allow the export of an immutable view of a
> record type with mutable fields. This way you could do the
> initialization in a safe way (no magic) using side effects, and still
> export a safe immutable type to the external world.

Sounds like a good idea! Using powerful "magic" is probably too dangerous
for "everyday"-use and definitely not in accordance with the "zero defect"
ambitions of the type system...

Although it would sometimes be nice to even hide specific fields of the
record, this would probably not work well together with separate
compilation.  However, the memory layout of the fields does not change by
just omitting the "mutable" declaration, so this should not do any harm.

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cyclic value construction ("let rec")
  2000-04-06 13:34     ` Markus Mottl
@ 2000-04-06 14:25       ` Xavier Leroy
  2000-04-06 15:12         ` Markus Mottl
  2000-04-10  0:03         ` Pierre Weis
  0 siblings, 2 replies; 8+ messages in thread
From: Xavier Leroy @ 2000-04-06 14:25 UTC (permalink / raw)
  To: Markus Mottl, Pierre Weis; +Cc: OCAML

> > [Pierre Weis:]
> > This certainly suggests to allow the export of an immutable view of a
> > record type with mutable fields. This way you could do the
> > initialization in a safe way (no magic) using side effects, and still
> > export a safe immutable type to the external world.
> 
> [Markus Mottl:]
> Sounds like a good idea! Using powerful "magic" is probably too dangerous
> for "everyday"-use and definitely not in accordance with the "zero defect"
> ambitions of the type system...
> Although it would sometimes be nice to even hide specific fields of the
> record, this would probably not work well together with separate
> compilation.  However, the memory layout of the fields does not change by
> just omitting the "mutable" declaration, so this should not do any harm.

Alas, it can do a lot of harm.  For one thing, you could break type
safety this way, just like with polymorphic references:

A.ml:
        type 'a t = { mutable contents: 'a }
        let assign t v = t.contents <- v

A.mli:
        type 'a t = { contents: 'a}
        val assign: 'a t -> 'a -> unit

Client.ml:
        open A
        ...
          let x = { contents = [] } in
          assign x [1];
          x.contents = [true]

When typing Client.ml, since "contents" is assumed immutable, the
definition of x is a syntactic value, hence x receives type
        forall 'a.  'a list t
But of course this typing is invalidated by the call to "assign",
and you end up comparing an int list to a bool list -- a typing violation.

Some compiler optimisations, specific to immutable structures, could
similarly be broken.

So, no, we can't allow exporting a record with different mutability
annotations than in its definition.

- Xavier Leroy



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cyclic value construction ("let rec")
  2000-04-06 14:25       ` Xavier Leroy
@ 2000-04-06 15:12         ` Markus Mottl
  2000-04-10  0:03         ` Pierre Weis
  1 sibling, 0 replies; 8+ messages in thread
From: Markus Mottl @ 2000-04-06 15:12 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: OCAML

> Alas, it can do a lot of harm.  For one thing, you could break type
> safety this way, just like with polymorphic references:

Oh, right... - I could have thought earlier of this eternal problem of
destructive update and polymorphic typing! Once again it seems to me that
there is little place to "wiggle" at the current implementation of the
typing discipline without losing its important properties...

So my "quick hack" version is indeed potentially explosive and will only
work correctly (= type safe) as long as the user does not put polymorphic
values into the datastructure and does not do "evil" things then.

It would be possible to allow this "const cast" with records that have a
non-parameterized type - which would be pretty boring and might possibly
make the language less regular: beginners might be confused why "casting
away" mutability is allowed in some cases but not in others.

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cyclic value construction ("let rec")
  2000-04-06 14:25       ` Xavier Leroy
  2000-04-06 15:12         ` Markus Mottl
@ 2000-04-10  0:03         ` Pierre Weis
  2000-04-10  1:41           ` Markus Mottl
  1 sibling, 1 reply; 8+ messages in thread
From: Pierre Weis @ 2000-04-10  0:03 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: caml-list

> > > This certainly suggests to allow the export of an immutable view of a
> > > record type with mutable fields. This way you could do the

> Alas, it can do a lot of harm.  For one thing, you could break type
> safety this way, just like with polymorphic references:
[...]
> So, no, we can't allow exporting a record with different mutability
> annotations than in its definition.

This is perfectly right, and as always with mutable values, you cannot
implement a feature without adding some safeguard conditions.

That's why I suggested to export an immutable VIEW of a record type
with mutable fields, not simply to export it as an immutable type!

I was thinking of something like:

type 'a loc = {mutable contents : 'a} as {contents : 'a};;

This way the compiler knows everything about the type, the original
and ``view'' definitions, so that nothing can be invalidated. The gain
is that now, direct assignment as

    x.contents <- value

is forbidden, and values of loc cannot be assigned outside their
definition module, if this module carefully avoids to export any
assignment function.

Evidently, values of type ref should be considered mutable by the
compiler, for instance during the constant propagation pass.

This view conception of type exportation could also be generalized to
hide some other information about a type to export: hide some
constructors or labels...
-- 
Pierre Weis

INRIA, Projet Cristal, http://pauillac.inria.fr/~weis



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: cyclic value construction ("let rec")
  2000-04-10  0:03         ` Pierre Weis
@ 2000-04-10  1:41           ` Markus Mottl
  0 siblings, 0 replies; 8+ messages in thread
From: Markus Mottl @ 2000-04-10  1:41 UTC (permalink / raw)
  To: Pierre Weis; +Cc: OCAML

> This is perfectly right, and as always with mutable values, you cannot
> implement a feature without adding some safeguard conditions.
[snip] 
> type 'a loc = {mutable contents : 'a} as {contents : 'a};;
[snip]
> This view conception of type exportation could also be generalized to
> hide some other information about a type to export: hide some
> constructors or labels...

Oh, my - if I could already think as fast ahead...

This looks indeed not bad and wouldn't even clutter the language! Programs
with elaborate record fields would surely benefit - no more read
access functions required (saves work both in the implementation and
the interface!)...

I am now waiting for Xavier to use his feared backhand to play back the
ball with another overlooked problem... ;-)

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl



^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2000-04-11 17:49 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2000-03-30 20:12 cyclic value construction ("let rec") Benoit Deboursetty
2000-04-03 12:57 ` Markus Mottl
2000-04-06 13:18   ` Pierre Weis
2000-04-06 13:34     ` Markus Mottl
2000-04-06 14:25       ` Xavier Leroy
2000-04-06 15:12         ` Markus Mottl
2000-04-10  0:03         ` Pierre Weis
2000-04-10  1:41           ` Markus Mottl

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).