From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id 6C88F7EEAF for ; Thu, 31 Jan 2013 08:49:06 +0100 (CET) Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of oleg@okmij.org) identity=pra; client-ip=66.39.3.115; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="oleg@okmij.org"; x-sender="oleg@okmij.org"; x-conformance=sidf_compatible Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of oleg@okmij.org designates 66.39.3.115 as permitted sender) identity=mailfrom; client-ip=66.39.3.115; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="oleg@okmij.org"; x-sender="oleg@okmij.org"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@www1.g3.pair.com) identity=helo; client-ip=66.39.3.115; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="oleg@okmij.org"; x-sender="postmaster@www1.g3.pair.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: At8eAOcgClFCJwNzdGdsb2JhbABFiyShTwGSQw4BDAcECQcWJ4JMTVSBHId3DKA6oSmNIoNqA5YMAYEciiCIBg X-IPAS-Result: At8eAOcgClFCJwNzdGdsb2JhbABFiyShTwGSQw4BDAcECQcWJ4JMTVSBHId3DKA6oSmNIoNqA5YMAYEciiCIBg X-IronPort-AV: E=Sophos;i="4.84,574,1355094000"; d="scan'208";a="761034" Received: from www1.g3.pair.com ([66.39.3.115]) by mail2-smtp-roc.national.inria.fr with SMTP; 31 Jan 2013 08:49:05 +0100 Received: (qmail 16893 invoked by uid 9370); 31 Jan 2013 07:49:03 -0000 Date: 31 Jan 2013 07:49:03 -0000 Message-ID: <20130131074903.16892.qmail@www1.g3.pair.com> From: oleg@okmij.org To: caml-list@inria.fr X-Validation-by: oleg@okmij.org Subject: [Caml-list] ANN: Brand-new BER MetaOCaml for OCaml 4.00.1 BER MetaOCaml N100 is now available. It is a strict superset of OCaml 4.00.1, extending it with staging annotations to construct and run typed code values. BER MetaOCaml has been completely re-implemented and thus caught up with OCaml. For those who don't know what staging or MetaOCaml is, a short introduction follows the news summary. BER MetaOCaml is in the spirit of the original MetaOCaml by Taha and Calcagno, but has been completely re-written using different algorithms and techniques. BER MetaOCaml has been re-structured to minimize the number of the changes to the OCaml type-checker and to separate the `kernel' (type-checking and constructing code values) from the `user-level'. Various ways of running the code -- compiling to byte-code or machine instructions and executing them, or translating code values to C or LLVM, or printing them -- can be done in `user-level' libraries, without any need to hack into (Meta)OCaml. (Printing and byte-code execution are included in BER N100.) This release of BER MetaOCaml is meant to incite future research into type-safe meta-programming. To the user, the two major differences of BER N100 from the old MetaOCaml are: -- constructor restriction: all data constructors and record labels used within brackets must come from the types that are declared in separately compiled modules. -- scope extrusion check: attempting to build code values with unbound or mistakenly bound variables (which is possible with mutation or other effects) is caught early, raising an exception with good diagnostics. Both are explained after the short introduction to staging. Smaller visible differences are better printing of cross-stage persistent values and the full support for labeled arguments. A long-standing problem with records with polymorphic fields has fixed itself. The BER N100 code is now extensively commented, and has a regression test suite. BER N100 is much less invasive into OCaml: compare the size of the patch to the main OCaml type-checker typing/typecore.ml, which contains the bulk of the changes for type checking the staging constructs. In the previous version BER N004, the patch had 564 lines of additions, deletions and context; now, only 328 lines. (The core MetaOCaml kernel is trx.ml, with 1800 lines.) BER MetaOCaml N100 is available: -- as a set of patches to the OCaml 4.00.1 distribution. http://okmij.org/ftp/ML/ber-metaocaml-100.tar.gz See the INSTALL document in that archive. You need the source distribution of OCaml 4.00.1, see the following URL for details. http://ocaml.org/install.html -- as a GIT bundle containing the changes relative to OCaml 4.00.1 http://okmij.org/ftp/ML/metaocaml.bundle First, you have to obtain the base git clone https://github.com/ocaml/ocaml.git -b 4.00 ometa4 then switch to tag 4.00.1 and then apply the bundle. Jacques Carette has extensively re-written the printing of code values, and is currently maintaining this part. I'm grateful to him for encouragement and discussions. Introduction to staging and MetaOCaml The standard example of meta-programming -- the running example of A.P.Ershov's 1977 paper that begat meta-programming -- is the power function, computing x^n. In OCaml: let square x = x * x let rec power n x = if n = 0 then 1 else if n mod 2 = 0 then square (power (n/2) x) else x * (power (n-1) x) (* val power : int -> int -> int = *) Suppose our program has to compute x^7 many times. We may define let power7 x = power 7 x In MetaOCaml, we may also specialize the power function to a particular value n, obtaining the code which will later receive x and compute x^n. We re-write power n x annotating expressions as computed `now' (when n becomes known) and `later' (when x is given). let rec spower n x = if n = 0 then .<1>. else if n mod 2 = 0 then .. else .<.~x * .~(spower (n-1) x)>. (* val spower : int -> ('cl, int) code -> ('cl, int) code = *) The two annotations, or staging constructs, are brackets .< e >. and escape .~e . Brackets .< e >. `quasi-quote' the expression e, annotating it as computed later. Escape .~e, which must be used within brackets, tells that e is computed now but produces the result for later. That result, the code, is spliced-in the containing bracket. The inferred type of spower is different. The result is no longer an int, but ('cl, int) code -- the code of expressions that compute an int. The first type argument to 'code', often named 'cl, is a so-called environment classifier and can be skipped on first reading. The type of spower spells out which argument is received now, and which later. To specialize spower to 7, we define let spower7_code = . .~(spower 7 ..)>.;; (* val spower7_code : ('cl, int -> int) code = .< fun x_1 -> (x_1 * (((* cross-stage persistent value (id: square) *)) (x_1 * (((* cross-stage persistent value (id: square) *)) (x_1 * 1)))))>. *) and obtain the code of a function that will receive x and return x^7. Code, even of functions, can be printed, which is what MetaOCaml toplevel did. The print-out contains so-called cross-stage persistent values, or CSP, which `quote' present-stage values such as square to be used later. One may think of CSP as references to `external libraries' -- only in our case the program acts as a library for the code it generates. If we want to use thus specialized x^7 now, in our code, we should compile spower7_code and link it back to our program. This is called `running the code' let spower7 = .! spower7_code (* val spower7 : int -> int = *) The specialized spower7 has the same type as the partially applied power7 above. They behave identically. However, power7 x will do recursion on n, checking n's parity, etc. In contrast, specialized spower7 has no recursion (as can be seen from spower7_code). All operations on n have been done when the spower7_code was computed, producing the straight-lined code spower7 that operates only on x. MetaOCaml supports arbitrary number of later stages, letting us write not only code generators but also generators of code generators, etc. Data constructor restriction BER MetaOCaml N100 imposes the restriction that all data constructors and record labels used within brackets must come from the types that are declared in separately compiled modules. For example, the following all work: ..;; ..;; ..;; .<{Complex.re = 1.0; im = 2.0}>.;; let open Complex in .<{re = 1.0; im = 2.0}>.;; ..;; because data types bool, option, list, Complex are either Pervasive or defined in the (separately compiled) standard library. However, the following are not allowed and flagged as compile-time error: type foo = Bar;; .. module Foo = struct exception E end;; .. The type declaration foo or the module declaration Foo must be moved into a separate file. The corresponding .cmi file must also be available at run-time: either placed into the same directory as the executable, or somewhere within the OCaml library search path. Scope extrusion test Although MetaOCaml permits manipulation and splicing of open code, its type system statically ensures that only closed code can be printed or run: We can't run the code we haven't finished constructing. For example, . .~(let y = .! .. in ..)>.;; gives a type error since .. is obviously open code. This static guarantee holds only for pure code. Effects such as storing code values in mutable cells void the guarantee. Here is the example using the _old_ MetaOCaml from 2006 (version 3.09.1 alpha 030). (The problem can be illustrated simpler, but the following example is more realistic and devious.) let c = let r = ref .z>. in let f = . .~(r := . x>.; .<0>.)>. in . .~f (.~(!r) 1)>. ;; (* val c : ('a, '_b -> int) code = . ((fun x_2 -> 0) ((fun y_3 -> x_2) 1))>. *) One must look hard to see that x_2 is actually unbound in the resulting code. The problem is revealed when we attempt to run that code: .! c;; (* Characters 77-78: Unbound value x_2 Exception: Trx.TypeCheckingError. *) Since we get the error anyway (without much diagnostics though), one may discount the problem. Alas, sometimes scope extrusion results in no error -- just in wrong results. let c1 = let r = ref .z>. in let _ = . .~(r := . x>.; .<0>.)>. in !r;; (* val c1 : ('a, '_b -> '_b) code = . x_2>. *) we then use c1 to construct the code c2: let c2 = . fun x -> .~c1>.;; (* val c2 : ('a, 'b -> 'c -> '_d -> '_d) code = . fun x_2 -> fun y_3 -> x_2>. *) which contains no unbound variables and can be run without problems. (.! c2) 1 2 3;; (* - : int = 2 *) It is most likely that the user did not intend for 'fun x ->' in c2 to bind x in c1. This is the blatant violation of lexical scope. And yet we get no error or other overt indication that something went wrong. BER MetaOCaml N100 has none of this. Although the type system still permits code values with escaped variables, attempting to use such code in any way -- splice, print or run -- immediately raises an exception. For example, entering the expression c in the top-level BER MetaOCaml N100 gives Exception: Failure Scope extrusion at Characters 89-99: let f = . .~(r := . x>.; .<0>.)>. in ^^^^^^^^^^ for the identifier x_7 bound at Characters 74-75: let f = . .~(r := . x>.; .<0>.)>. in ^ The exception message tells which variable got away, where it was bound and where it eloped. The file NOTES.txt in the BER MetaOCaml distribution describes the features of BER MetaOCaml in more detail and outlines directions for further development. Hopefully the release of BER MetaOCaml N100 would stimulate using and researching typed meta-programming.