From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id A0EC0BB84 for ; Mon, 17 Apr 2006 04:09:36 +0200 (CEST) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id k3H29Z7Q013320 for ; Mon, 17 Apr 2006 04:09:36 +0200 Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id EAA21805 for ; Mon, 17 Apr 2006 04:09:35 +0200 (MET DST) Received: from selenite.metnet.navy.mil (selenite.metnet.navy.mil [192.16.167.32]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id k3H29W9b013313 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL) for ; Mon, 17 Apr 2006 04:09:34 +0200 Received: (qmail 3857 invoked by uid 107); 17 Apr 2006 02:09:29 -0000 Received: from vwmetnet.metnet.navy.mil (192.16.167.21) by selenite.metnet.navy.mil with SMTP; 17 Apr 2006 02:09:29 -0000 Received: from Adric.metnet.navy.mil ([152.80.48.103]) by vwmetnet.metnet.navy.mil (NAVGW 2.5.2.9) with SMTP id M2006041702181027312 for ; Mon, 17 Apr 2006 02:18:10 GMT Received: by Adric.metnet.navy.mil (Postfix, from userid 760) id BB844AC2A; Sun, 16 Apr 2006 19:09:10 -0700 (PDT) To: caml-list@inria.fr Subject: ANN: Generic print function Reply-To: oleg@pobox.com Errors-To: oleg@okmij.org From: oleg@pobox.com Message-Id: <20060417020910.BB844AC2A@Adric.metnet.navy.mil> Date: Sun, 16 Apr 2006 19:09:10 -0700 (PDT) X-Miltered: at concorde with ID 4442F8DF.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at concorde with ID 4442F8DC.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; oleg:01 metaocaml:01 metaocaml:01 ocaml:01 typechecker:01 specifier:01 printf:01 variants:01 arrays:01 ocaml:01 debugging:01 val:01 printf:01 bool:01 oleg:01 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.2 required=5.0 tests=NO_REAL_NAME autolearn=disabled version=3.0.3 The facility that prints results and types of expressions evaluated at the top-level is now available anywhere in the program -- in bytecode- or natively compiled programs. Generic printing is a (perhaps unintentional) `side-effect' of MetaOCaml -- of the fact that a code value is not merely AST; the code value also captures the type and the type environment of variables and other values. Generic printing is a library that works with the unmodified MetaOCaml (which is _fully_ compatible with the regular OCaml). MOTIVATION First of all, there is an, arguably small, matter of print_int, print_char, etc. Most of the time the typechecker knows exactly what is the type of the data to print. Why should we spell it out in the suffix of 'print' (or as a format specifier in Printf). This small annoyance gets bigger if we deal with a complex data structure, such as a list of records whose elements are variants and arrays. There is no built-in print_xxx function for it: we _have to_ write our own. What is annoying is that OCaml knows darn well how to print the structure, if the structure is the result of an expression evaluated at top level. Alas, such printing is _not_ available in standalone programs, or if we want to use the print function somewhere in the code, where the structure is produced as an intermediate result. Such a printing is a useful debugging aid. We may also want to use the printing facility to write out the data structure, in a human-readable form, into various files. The top-level output is quite pretty and is useful beyond the top level. INTERFACE and SIMPLE EXAMPLE The core function is val fprint : Format.formatter -> ('a,'b) code -> string which takes a code value of any type, and pretty-prints it on the given formatter. The printed result is exactly the same as that by the top-level value printing. The function [fprint] returns the representation of the expression's type, as a string. The latter is arguably a frill, but it was easy to do, so just as well. For example, let pr_type et = Format.printf "\n%s@." et let () = let x = Some ([|(10,true);(11,false)|]) in pr_type (print ..) prints the following two lines: Some [|(10, true); (11, false)|] (int * bool) array option The first line is the value, and the latter (printed by pr_type) is the type. There was no need to define any custom printer for the value or its components. A more involved examples is 65 lines down in this message. AVAILABILITY http://pobox.com/~oleg/ftp/ML/gprint/ The included Makefile builds bytecode and native gprint libraries, and runs the validation test vgprint.ml -- at the top-level (no need to compile any library), in a byte-code executable, and a native code executable. The implementation depends on the unmodified MetaOCaml. To compile the library, MetaOCaml distribution is required. The implementation is surprisingly simple and can be easily integrated with MetaOCaml. http://www.metaocaml.org/ DESIGN MetaOCaml lets us manipulate pieces of code as values. Whereas 1 is an int value, .<1>. is a code value, of the type ('a,int) code. MetaOCaml can print those code values: # let x = 1 in Trx.npc ..;; .<1>. # let x = 'a' in Trx.npc ..;; .<'a'>. so far, so good. However, # let x = "a" in Trx.npc ..;; .<(* cross-stage persistent value (as id: x) *)>. And here we hit the snag. If we use a different printing function, # let x = "a" in Trx.printcode ..;; expression ([0,0+-1]..[0,0+-1]) ghost Pexp_cspval (as id: "x") we see that the code value is internally an AST, Parsetree.expression. We also see that aside from a few simple cases, MetaOCaml does not inline the values from the captured variables; rather, MetaOCaml incorporates references to such variables (so-called, cross-stage persistent (CSP) variables). We need the second observation: the code value is intended to be evaluated (i.e., `run'). The compilation of a code value generally requires its type. For example, to compile [match x with Foo -> ...] we need to know the type of [x]. In particular, we need to know if [Foo] is the only variant. If so, the above match is exhaustive and we do not need to compile the default case: [_ -> raise Match_error]. Therefore, when MetaOCaml captures the reference to a CSP variable, it has to, in general, capture the type as well. And it does, in a special AST node, which contains the corresponding Typedtree.expression. The latter includes the type and the type environment with declarations, etc. These are exactly the data that top-level's generic print function needs. The common, and correct, reply to the frequently asked question as to why OCaml does not have generic print is follows: printing a value requires the knowledge of its type. Indeed, a machine integer '1' may represent, inter alia, both an integer 0 and a boolean 'false'. The type information is not preserved in the compiled code. Fortunately, MetaOCaml's code values do preserve the necessary type information. COMPLEX EXAMPLE Let us first define the following complex data type: module C = struct type 'a color = Blue | Green | Rgb of 'a end;; type 'a image = {title : string; pixels : 'a C.color array};; type big = int image list;; let v = [ {title = "im1"; pixels = [| C.Blue; C.Rgb 10 |]}; {title = "im2"; pixels = [| C.Green |]}; ] The following expression let () = pr_type (print ..) prints exactly what we expect. Before continuing the example, we should note a drawback of the current lack of integration of the generic print facility with MetaOCaml. When doing [print ..] where x is of a variant type and its current value is a constant constructor (e.g., None), we see the output '(* cross-stage persistent value (as id: x) *)'. This is a drawback of some optimizations in MetaOCaml, and will be fixed if this code is integrated into MetaOCaml. Fortunately, there is an easy workaround: replace [print ..] with [print (let z = [x] in ..)]. We now continue the example: open C let some_processing ims = let brighten px = let new_px = match px with Blue -> Green | Green -> Rgb 10 | Rgb x -> Rgb (x+1) in let () = Format.printf "@.pixel: %a -> %a @." (fun ppf v -> ignore (fprint ppf v)) (let x = [px] in ..) (fun ppf v -> ignore (fprint ppf v)) (let x = [new_px] in ..) in new_px in let process im = let () = Format.printf "Processing: " in let _ = print .. in {im with pixels = Array.map brighten im.pixels} in let res = List.map process ims in let _ = print .. in Format.printf "@." let () = some_processing v;; The list of images, an image itself, and a single pixel were all printed generically. We did not have to define any custom printers. Here's the output: Processing: {title = "im1"; pixels = [|Blue; Rgb 10|]} pixel: [Blue] -> [Green] pixel: [Rgb 10] -> [Rgb 11] Processing: {title = "im2"; pixels = [|Green|]} pixel: [Green] -> [Rgb 10] [{title = "im1"; pixels = [|Green; Rgb 11|]}; {title = "im2"; pixels = [|Rgb 10|]}] POLYMORPHISM vs. GENERIC The function print is generic but not polymorphic. For example, if we define let pr x = print ..; x and invoke the function as "pr [10]", we see the printed output "". The function 'pr' has the type 'a->'a -- that is, it promises to take the value of any type, regardless of its structure. The function does not even need to know what is the exact type of its argument, because it is irrelevant. Informally, an OCaml function of the type ['a-> ...] corresponds to the Haskell function [a -> ...]. OTH, an OCaml function of the type [('a,'b) code -> ...] corresponds to Haskell's [Typeable b => b -> ...]. The latter enables generic programming.