From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id NAA02991 for caml-red; Mon, 12 Feb 2001 13:29:49 +0100 (MET) Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id KAA09760 for ; Sun, 11 Feb 2001 10:15:08 +0100 (MET) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.11.1/8.10.0) with ESMTP id f1B9F1n21112; Sun, 11 Feb 2001 10:15:02 +0100 (MET) Received: (from xleroy@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id KAA26605; Sun, 11 Feb 2001 10:15:01 +0100 (MET) Date: Sun, 11 Feb 2001 10:15:01 +0100 From: Xavier Leroy To: Mattias Waldau Cc: caml-list@inria.fr Subject: Re: Print arbitrary value Message-ID: <20010211101501.B129@pauillac.inria.fr> References: <20010126163821.A4855@quincy.inria.fr> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Mailer: Mutt 1.0i In-Reply-To: ; from mattias.waldau@tacton.se on Wed, Jan 31, 2001 at 01:00:37PM +0100 Sender: weis@pauillac.inria.fr > Is there a simple way to print how a complex structure? > Essentially I would like to use the toplevel print functionality within my > program for debugging purposes. The short answer is: no. Here is a more detailed answer (this should be addressed in the FAQ at some point). - It is easy to write a generic "print" function that simply decodes the internal representation of its argument, using the same "tag" information that the GC or the structural equality function (=) use. The problem is that the internal representations contain much less information than needed for printing. For instance, integer, characters, booleans, and constant constructors are all represented internally as integers; thus, the values 0, '\000', false and [] would all be printed as "0". Similarly, the names of sum constructors and record labels are lost: constructors become integer tags and labels are encoded positionally. So, the list [1;2] would be printed something like "C0(1, C0(2, 0))". Totally useless. - For better printing, it is necessary to have a run-time representation of the static ML type of the value to print, in addition to the value itself. The ML type acts as a "decoder" for the low-level representation of the value, allowing precise printing. This is how the toplevel proceeds to print results of evaluations. This works fine for the toplevel because it has easy access to all the required typing information -- the toplevel integrates the type-checker and compiler. But normal programs do not: at run-time, they do not have access to their static types and type environments. This is just compile-time information that is not transmitted to run-time. Several schemes have been designed to allow some amount of run-time type information to be available to running programs, such as "dynamics" and the Weis-Furuse "generics". The toplevel print facility is hard to fit within these schemes, because it does not suffice to have "some amount of RTTI": the full ML types plus the full typing environment (with all type definitions, module signatures and whatnot) are required, and this raises rather serious efficiency issues. So, it is unlikely that the exact equivalent of the toplevel generic printer will ever be available at run-time. Currently, one has to write printing functions by hand for debugging purposes. The ocamlc/ocamlopt compilers, for instance, contain about 10 pretty-printers for the various intermediate representations. Writing pretty-printers is a lot of boring work, but for complex enough data structures, the output of the toplevel generic printer is too verbose anyway (all the constructor names, record labels, parentheses and braces clutter the output), and a hand-written printer is useful to print the data in a more compact, problem-specific notation. - Xavier Leroy