From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by sympa.inria.fr (Postfix) with ESMTPS id 6B4307EE20 for ; Mon, 12 Nov 2012 10:44:06 +0100 (CET) Received-SPF: None (mail1-smtp-roc.national.inria.fr: no sender authenticity information available from domain of gabriel.scherer@gmail.com) identity=pra; client-ip=209.85.210.182; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="gabriel.scherer@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail1-smtp-roc.national.inria.fr: domain of gabriel.scherer@gmail.com designates 209.85.210.182 as permitted sender) identity=mailfrom; client-ip=209.85.210.182; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="gabriel.scherer@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail1-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-ia0-f182.google.com) identity=helo; client-ip=209.85.210.182; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="postmaster@mail-ia0-f182.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AtsBAIvEoFDRVdK2m2dsb2JhbABEDrEjkh0IIwEBAQEBCAkLCRQngh4BAQQBJxkBGxILAQMBCwYFCw0NDAEBEyIBEQEFAQoSBhMICgmHXAEDCQYLnD6MMoJ4hBgKGScDClmIdQEFDIwJARODHASDFgOVfIEcjUYWKYNUPoFaCRc X-IronPort-AV: E=Sophos;i="4.80,761,1344204000"; d="scan'208";a="181084889" Received: from mail-ia0-f182.google.com ([209.85.210.182]) by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 12 Nov 2012 10:43:54 +0100 Received: by mail-ia0-f182.google.com with SMTP id k10so7572163iag.27 for ; Mon, 12 Nov 2012 01:43:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; bh=rC5NrD5uVChEbDG6J1/JllM6p/ybx0kikRdTJI40RjY=; b=GWzo7sNt3vOIpq5vERiRUFNYNhCXlKBIvrT8/fKFSIpMGuQonMTeyEm5cPtfwa4ZSO z36b70TC5hymncQFTT9NjG5916Z3of1Lrw3aOS+g/NQ4NcrWxgK3d/tvgwEgqTtYEIdF K1hAtP1atf2I1z3wkeX2DcM0Zb/MpUPCEE/+16y1cpj3flsVnQLVskFCXfsJtu2r5/Qd guPeRXDwq7qXReLN+KoJ9bIM6YLmPRFmx4y9aWC9kHP73tek/L8kCrVgwsfpnkvipHGv wUCpAIy3qAOmLtNYrTEgvOGtCcgV+5YeV6Ejf8cfmjpX4gOaDKHyswJ+K/CNNZtrURVV q8bw== Received: by 10.43.7.132 with SMTP id oo4mr18037392icb.6.1352713433287; Mon, 12 Nov 2012 01:43:53 -0800 (PST) MIME-Version: 1.0 Received: by 10.50.184.199 with HTTP; Mon, 12 Nov 2012 01:43:13 -0800 (PST) In-Reply-To: References: From: Gabriel Scherer Date: Mon, 12 Nov 2012 10:43:13 +0100 Message-ID: To: Jean-Baptiste Jeannin Cc: "caml-list@inria.fr" Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Subject: Re: [Caml-list] Cyclic data structures: internal representation > I am wondering what the internal representation (in the AST) of such a da= ta structure is. > It could be a cyclic data structure on the AST itself, but I doubt it bec= ause it would make > it difficult to manipulate. Another solution I see is keeping a data stru= cture with free variables > inside (or references, for that matter) and binding these free variables = in some other place. > l0 would then bounded directly to [1 ; 2 ; l0 ] in the AST. Any idea wha= t the internals of OCaml do? During compilation, the syntactic AST explicitly represent the recursion through the "let rec", with something to the effect of (simplified form) Let(Recursive, "l0", Cons(1, Cons(2, Var "l0")). At runtime this is represented as any OCaml list, but with a cyclic reference. You have a memory organization that looks a bit like: cell1: Block(variant_tag, int 1, addr cell2) cell2: Block(variant_tag, int 2, addr cell1) This is implemented by first allocation a dummy node for l0, then evaluating the "real definition" of l0 (with recursive references pointing to the dummy node), and backpatching the dummy node with the result of the evaluation. Again in simplified form, this is translated to: let l0 =3D dummy update l0 (1 :: 2 :: dummy) See for yourself: $ echo "let rec l0 =3D 1 :: 2 :: l0" > test.ml $ ocamlopt -dcmm -c test.ml (data int 1024 global "camlTest" "camlTest": skip 8) (function camlTest__entry () (let l0/1008 (extcall "caml_alloc_dummy" 5 addr) (extcall "caml_update_dummy" l0/1008 (alloc 2048 3 (alloc 2048 5 l0/1008)) unit) (store "camlTest" l0/1008)) 1a) This is correct because, while "l0" may appear recursively in its definition, its evaluation is never "forced": it appears only in places that return its adress directly without accessing its (yet-undefined) value. let rec l0 =3D List.length l0 :: l0 would be incorrect and, in the general case of a function application, "let rec l0 =3D f l0" is rejected because the compiler cannot in general inspect "f" to check that it never "forces" its argument. The full definition of which recursive value definitions are accepted (as an under-approximation of the set of definitions that are actually vali= d) is described by the manual as an "extension of the OCaml language": the core language supports only recursively-defined functions, which have a well-defined semantics (you know the identifiers are not forced, as a function definition of the form (fun x -> ...) performs no direct evaluation). http://caml.inria.fr/pub/docs/manual-ocaml-4.00/manual021.html#toc70 On Mon, Nov 12, 2012 at 7:44 AM, Jean-Baptiste Jeannin wrote: > - is there any easy way to check if a list is cyclic or not? The only way= I see is to go down the list, checking at every step if this particular su= blist has already been seen. But it's rather heavy. You have to use some cycle detection algorithm. Some of them are rather funny (the Tortoise and Hare algorithm) and they are described on Wikipedia: http://en.wikipedia.org/wiki/Cycle_detection However, I think there is something wrong with your application if you need such a test. Manipulating such cyclic values is dangerous (because of the non-termination problems you mention in your question, that are not limited to equality check: you will not like the result of List.map (fun n -> n) l0), and I would rather advise you to have value cyclicity appears explicitly in your program, if it needs to. That is, define a type of "wrapped lists" that represent cyclic lists by having just the cyclic part as data, and re-defining your manipulating functions to do the wrapping/cycling explicitly on this representation. It's better when you are forced to be aware of where the loops are. > - the documentation on the =3D sign (http://caml.inria.fr/pub/docs/manual= -ocaml/libref/Pervasives.html) mentions that "Equality between cyclic data = structures may not terminate." It seems to terminate if one or the other of= the data structures is not cyclic. Does it ever terminate when both data s= structures are cyclic, or does it always loop? I don't think that is specified, but I would expect it to always loop (implementing a performance-hurting cycle detection for the few mad scientists to play with infinite lists would be unreasonable). If you have any form of implicit cyclicity in your data, you must avoid the default comparison and hashing functions. One more reason to handle cyclicity explicitly. The Lisp world, where cons cells have long been mutable by default, has played with the fire of delicately handling implicit cyclicity in values. Mat=EDas Giovannini has some fairly interesting examples of doing this in a dialect of OCaml extended with unsafe features (namely the Obj module). I think it's a bad idea to actually use these games in a real program, but it's still a good read. http://alaska-kamtchatka.blogspot.fr/2007/10/in-place-reversal-of-trees.h= tml http://alaska-kamtchatka.blogspot.fr/2007/11/unsafe-clasp.html On Mon, Nov 12, 2012 at 7:44 AM, Jean-Baptiste Jeannin wrote: > Good morning, > > One can create cyclic data structures in OCaml, using the keyword let rec= . For example: > > # let rec l0 =3D 1 :: 2 :: l0;; > var l0 : int list =3D > [1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; 1; 2; ...] > > I am wondering what the internal representation (in the AST) of such a da= ta structure is. It could be a cyclic data structure on the AST itself, but= I doubt it because it would make it difficult to manipulate. Another solut= ion I see is keeping a data structure with free variables inside (or refere= nces, for that matter) and binding these free variables in some other place= . l0 would then bounded directly to [1 ; 2 ; l0 ] in the AST. Any idea wha= t the internals of OCaml do? > > Related questions: > - is there any easy way to check if a list is cyclic or not? The only way= I see is to go down the list, checking at every step if this particular su= blist has already been seen. But it's rather heavy. > - the documentation on the =3D sign (http://caml.inria.fr/pub/docs/manual= -ocaml/libref/Pervasives.html) mentions that "Equality between cyclic data = structures may not terminate." It seems to terminate if one or the other of= the data structures is not cyclic. Does it ever terminate when both data s= structures are cyclic, or does it always loop? > # let l1 =3D [1; 2] in l0 =3D l1;; > - : bool =3D false > # l0 =3D l0;; > (* doesn't finish *) > > Thank you, > Jean-Baptiste Jeannin > -- > Caml-list mailing list. Subscription management and archives: > https://sympa.inria.fr/sympa/arc/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs