caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] Cyclic data structures: internal representation
@ 2012-11-12  6:44 Jean-Baptiste Jeannin
  2012-11-12  8:55 ` Basile Starynkevitch
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Jean-Baptiste Jeannin @ 2012-11-12  6:44 UTC (permalink / raw)
  To: caml-list

Good morning,

One can create cyclic data structures in OCaml, using the keyword let rec. For example:

  # let rec l0 = 1 :: 2 :: l0;;
  var l0 : int list =
    [1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; ...]

I am wondering what the internal representation (in the AST) of such a data structure is. It could be a cyclic data structure on the AST itself, but I doubt it because it would make it difficult to manipulate. Another solution I see is keeping a data structure with free variables inside (or references, for that matter) and binding these free variables in some other place. l0 would then bounded directly to  [1 ; 2 ; l0 ] in the AST. Any idea what the internals of OCaml do?

Related questions:
- is there any easy way to check if a list is cyclic or not? The only way I see is to go down the list, checking at every step if this particular sublist has already been seen. But it's rather heavy.
- the documentation on the = sign (http://caml.inria.fr/pub/docs/manual-ocaml/libref/Pervasives.html) mentions that "Equality between cyclic data structures may not terminate." It seems to terminate if one or the other of the data structures is not cyclic. Does it ever terminate when both data sstructures are cyclic, or does it always loop?
  # let l1 = [1; 2] in l0 = l1;;
  - : bool = false
  # l0 = l0;;
  (* doesn't finish *)

Thank you,
Jean-Baptiste Jeannin

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] Cyclic data structures: internal representation
  2012-11-12  6:44 [Caml-list] Cyclic data structures: internal representation Jean-Baptiste Jeannin
@ 2012-11-12  8:55 ` Basile Starynkevitch
  2012-11-12  9:43 ` Gabriel Scherer
  2012-11-12 11:24 ` Dmitry Grebeniuk
  2 siblings, 0 replies; 4+ messages in thread
From: Basile Starynkevitch @ 2012-11-12  8:55 UTC (permalink / raw)
  To: Jean-Baptiste Jeannin; +Cc: caml-list

On Mon, Nov 12, 2012 at 01:44:55AM -0500, Jean-Baptiste Jeannin wrote:
> Good morning,
> 
> One can create cyclic data structures in OCaml, using the keyword let rec. For example:
> 
>   # let rec l0 = 1 :: 2 :: l0;;
>   var l0 : int list =
>     [1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; ...]
> 
> I am wondering what the internal representation (in the AST) of such a data structure is. 

It is related to the notion of statically constructive expressions, see section 7.3 of the ocaml manual
http://caml.inria.fr/pub/docs/manual-ocaml-4.00/manual021.html

statically constructive expressions (which *exclude* function applications, and  conditionals) can be first constructed (as yet-unfilled values) then filled. In that case, the compiler emit first the code for all the constructions, then the code for all the filling (taking into account the recursively bound variables).

This is not Ocaml specific; any Lisp or Scheme (or related, e.g. http://gcc-melt.org/ for example) 
language also do that to implement let-rec kind of mutually recursive definitions.

A very good explanation about that happens in Christian Queinnec "Lisp In Small Pieces" books. 
If you read french, you could read the latest variant in Franch on 
http://paracamplus.com/?CGIRunMode=book&urn=Cours/LiSP/4 
Principes d'implantation de Scheme et Lisp 
ISBN 	978-2-916466-03-3

Regards
-- 
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] Cyclic data structures: internal representation
  2012-11-12  6:44 [Caml-list] Cyclic data structures: internal representation Jean-Baptiste Jeannin
  2012-11-12  8:55 ` Basile Starynkevitch
@ 2012-11-12  9:43 ` Gabriel Scherer
  2012-11-12 11:24 ` Dmitry Grebeniuk
  2 siblings, 0 replies; 4+ messages in thread
From: Gabriel Scherer @ 2012-11-12  9:43 UTC (permalink / raw)
  To: Jean-Baptiste Jeannin; +Cc: caml-list

> I am wondering what the internal representation (in the AST) of such a data structure is.
> It could be a cyclic data structure on the AST itself, but I doubt it because it would make
> it difficult to manipulate. Another solution I see is keeping a data structure with free variables
> inside (or references, for that matter) and binding these free variables in some other place.
> l0 would then bounded directly to  [1 ; 2 ; l0 ] in the AST. Any idea what the internals of OCaml do?

During compilation, the syntactic AST explicitly represent the
recursion through the "let rec", with something to the effect of
(simplified form) Let(Recursive, "l0", Cons(1, Cons(2, Var "l0")).
At runtime this is represented as any OCaml list, but with a cyclic
reference. You have a memory organization that looks a bit like:

cell1: Block(variant_tag, int 1, addr cell2)
cell2: Block(variant_tag, int 2, addr cell1)

This is implemented by first allocation a dummy node for l0, then
evaluating the "real definition" of l0 (with recursive references
pointing to the dummy node), and backpatching the dummy node with the
result of the evaluation. Again in simplified form, this is translated
to:
  let l0 = dummy
  update l0 (1 :: 2 :: dummy)

 See for yourself:
  $ echo "let rec l0 = 1 :: 2 :: l0" > test.ml
  $ ocamlopt -dcmm -c test.ml
    (data int 1024 global "camlTest" "camlTest": skip 8)
    (function camlTest__entry ()
     (let l0/1008 (extcall "caml_alloc_dummy" 5 addr)
       (extcall "caml_update_dummy" l0/1008 (alloc 2048 3 (alloc 2048
5 l0/1008))
         unit)
       (store "camlTest" l0/1008))
     1a)

This is correct because, while "l0" may appear recursively in its
definition, its evaluation is never "forced": it appears only in
places that return its adress directly without accessing its
(yet-undefined) value.
  let rec l0 = List.length l0 :: l0
would be incorrect and, in the general case of a function application,
"let rec l0 = f l0" is rejected because the compiler cannot in general
inspect "f" to check that it never "forces" its argument.

The full definition of which recursive value definitions are accepted
(as an under-approximation of the set of definitions that are actually valid)
is described by the manual as an "extension of the OCaml language": the
core language supports only recursively-defined functions, which have
a well-defined semantics (you know the identifiers are not forced, as
a function definition of the form (fun x -> ...) performs no direct
evaluation).
  http://caml.inria.fr/pub/docs/manual-ocaml-4.00/manual021.html#toc70


On Mon, Nov 12, 2012 at 7:44 AM, Jean-Baptiste Jeannin
<jeannin@cs.cornell.edu> wrote:
> - is there any easy way to check if a list is cyclic or not? The only way I see is to go down the list, checking at every step if this particular sublist has already been seen. But it's rather heavy.

You have to use some cycle detection algorithm. Some of them are
rather funny (the Tortoise and Hare algorithm) and they are described
on Wikipedia:
  http://en.wikipedia.org/wiki/Cycle_detection

However, I think there is something wrong with your application if you
need such a test. Manipulating such cyclic values is dangerous
(because of the non-termination problems you mention in your question,
that are not limited to equality check: you will not like the result
of List.map (fun n -> n) l0), and I would rather advise you to have
value cyclicity appears explicitly in your program, if it needs to.
That is, define a type of "wrapped lists" that represent cyclic lists
by having just the cyclic part as data, and re-defining your
manipulating functions to do the wrapping/cycling explicitly on this
representation. It's better when you are forced to be aware of where
the loops are.

> - the documentation on the = sign (http://caml.inria.fr/pub/docs/manual-ocaml/libref/Pervasives.html) mentions that "Equality between cyclic data structures may not terminate." It seems to terminate if one or the other of the data structures is not cyclic. Does it ever terminate when both data sstructures are cyclic, or does it always loop?

I don't think that is specified, but I would expect it to always loop
(implementing a performance-hurting cycle detection for the few mad
scientists to play with infinite lists would be unreasonable). If you
have any form of implicit cyclicity in your data, you must avoid the
default comparison and hashing functions. One more reason to handle
cyclicity explicitly.


The Lisp world, where cons cells have long been mutable by default,
has played with the fire of delicately handling implicit cyclicity in
values. Matías Giovannini has some fairly interesting examples of
doing this in a dialect of OCaml extended with unsafe features (namely
the Obj module). I think it's a bad idea to actually use these games
in a real program, but it's still a good read.
  http://alaska-kamtchatka.blogspot.fr/2007/10/in-place-reversal-of-trees.html
  http://alaska-kamtchatka.blogspot.fr/2007/11/unsafe-clasp.html

On Mon, Nov 12, 2012 at 7:44 AM, Jean-Baptiste Jeannin
<jeannin@cs.cornell.edu> wrote:
> Good morning,
>
> One can create cyclic data structures in OCaml, using the keyword let rec. For example:
>
>   # let rec l0 = 1 :: 2 :: l0;;
>   var l0 : int list =
>     [1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; ...]
>
> I am wondering what the internal representation (in the AST) of such a data structure is. It could be a cyclic data structure on the AST itself, but I doubt it because it would make it difficult to manipulate. Another solution I see is keeping a data structure with free variables inside (or references, for that matter) and binding these free variables in some other place. l0 would then bounded directly to  [1 ; 2 ; l0 ] in the AST. Any idea what the internals of OCaml do?
>
> Related questions:
> - is there any easy way to check if a list is cyclic or not? The only way I see is to go down the list, checking at every step if this particular sublist has already been seen. But it's rather heavy.
> - the documentation on the = sign (http://caml.inria.fr/pub/docs/manual-ocaml/libref/Pervasives.html) mentions that "Equality between cyclic data structures may not terminate." It seems to terminate if one or the other of the data structures is not cyclic. Does it ever terminate when both data sstructures are cyclic, or does it always loop?
>   # let l1 = [1; 2] in l0 = l1;;
>   - : bool = false
>   # l0 = l0;;
>   (* doesn't finish *)
>
> Thank you,
> Jean-Baptiste Jeannin
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] Cyclic data structures: internal representation
  2012-11-12  6:44 [Caml-list] Cyclic data structures: internal representation Jean-Baptiste Jeannin
  2012-11-12  8:55 ` Basile Starynkevitch
  2012-11-12  9:43 ` Gabriel Scherer
@ 2012-11-12 11:24 ` Dmitry Grebeniuk
  2 siblings, 0 replies; 4+ messages in thread
From: Dmitry Grebeniuk @ 2012-11-12 11:24 UTC (permalink / raw)
  To: Jean-Baptiste Jeannin

Hello.

> - is there any easy way to check if a list is cyclic or not? The only way I see is to go down the list, checking at every step if this particular sublist has already been seen. But it's rather heavy.

> - the documentation on the = sign (http://caml.inria.fr/pub/docs/manual-ocaml/libref/Pervasives.html) mentions that "Equality between cyclic data structures may not terminate." It seems to terminate if one or the other of the data structures is not cyclic. Does it ever terminate when both data sstructures are cyclic, or does it always loop?

  Both these questions are solved with my library ocaml-cyclist:
https://forge.ocamlcore.org/projects/ocaml-cyclist/
  I don't remember exact details, but generally I use
"tortoise and hare" algorithm.

  Also note that lists with a cycle can also contain some prefix
that doesn't appear in the cycle (it happens when list with cycle
is appended to "linear" list).  That's also covered by ocaml-cyclist:

        value length : list 'a -> (int * int);
        (** Returns [(prefix_len, cycle_len)] of the argument.
            (0, 0) for empty list, (n, 0) for linear list,
            (0, n) for circular list, (n, m) for generic-shaped
            cyclic list.  (n, m > 0)
        *)

  As for equality, you can use

        value for_all2 : ?strict:bool ->
          ('a -> 'b -> bool) -> list 'a -> list 'b -> bool;

to write the code like

    let list_eq a b = Cyclist.for_all2 ~strict:true ( = ) a b

which will run correctly.  However, the following lists will be
considered equal: [{1; 2; 3}] and [1; 2; {3; 1; 2; 3; 1; 2}] (curly braces
denote the cycle of list; it's for illustration purposes only).
  Using other library functions you can strenghten your equality
relation.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2012-11-12 11:24 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-12  6:44 [Caml-list] Cyclic data structures: internal representation Jean-Baptiste Jeannin
2012-11-12  8:55 ` Basile Starynkevitch
2012-11-12  9:43 ` Gabriel Scherer
2012-11-12 11:24 ` Dmitry Grebeniuk

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).