From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <garrigue@math.nagoya-u.ac.jp>
X-Original-To: caml-list@sympa.inria.fr
Delivered-To: caml-list@sympa.inria.fr
Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83])
	by sympa.inria.fr (Postfix) with ESMTPS id 03DDD7EE51
	for <caml-list@sympa.inria.fr>; Fri,  3 May 2013 03:28:29 +0200 (CEST)
Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender
  authenticity information available from domain of
  garrigue@math.nagoya-u.ac.jp) identity=pra;
  client-ip=133.6.130.5;
  receiver=mail2-smtp-roc.national.inria.fr;
  envelope-from="garrigue@math.nagoya-u.ac.jp";
  x-sender="garrigue@math.nagoya-u.ac.jp";
  x-conformance=sidf_compatible
Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender
  authenticity information available from domain of
  garrigue@math.nagoya-u.ac.jp) identity=mailfrom;
  client-ip=133.6.130.5;
  receiver=mail2-smtp-roc.national.inria.fr;
  envelope-from="garrigue@math.nagoya-u.ac.jp";
  x-sender="garrigue@math.nagoya-u.ac.jp";
  x-conformance=sidf_compatible
Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender
  authenticity information available from domain of
  postmaster@mailhost.math.nagoya-u.ac.jp) identity=helo;
  client-ip=133.6.130.5;
  receiver=mail2-smtp-roc.national.inria.fr;
  envelope-from="garrigue@math.nagoya-u.ac.jp";
  x-sender="postmaster@mailhost.math.nagoya-u.ac.jp";
  x-conformance=sidf_compatible
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AoYEAAMSg1GFBoIFdGdsb2JhbABSwmaBFg4BDBUIPIIfAQEEASccATUCAwsLNBIhNhkbh18DCQWvJoRFAoUbDUuHNQeMT4EUCxNyMweCcmGJGopFgWuBZIwRiD2BWgIe
X-IPAS-Result: AoYEAAMSg1GFBoIFdGdsb2JhbABSwmaBFg4BDBUIPIIfAQEEASccATUCAwsLNBIhNhkbh18DCQWvJoRFAoUbDUuHNQeMT4EUCxNyMweCcmGJGopFgWuBZIwRiD2BWgIe
X-IronPort-AV: E=Sophos;i="4.87,600,1363129200"; 
   d="scan'208";a="15868880"
Received: from rabbit.math.nagoya-u.ac.jp (HELO mailhost.math.nagoya-u.ac.jp) ([133.6.130.5])
  by mail2-smtp-roc.national.inria.fr with ESMTP; 03 May 2013 03:28:26 +0200
Received: from mailhost.math.nagoya-u.ac.jp (localhost [127.0.0.1])
	by mailhost.math.nagoya-u.ac.jp (Postfix) with ESMTP id E63EC6387;
	Fri,  3 May 2013 10:28:22 +0900 (JST)
Received: from mailhost.math.nagoya-u.ac.jp (localhost [127.0.0.1])
	by mailhost.math.nagoya-u.ac.jp (Postfix) with ESMTP id 6F9F7395D;
	Fri,  3 May 2013 10:28:22 +0900 (JST)
DomainKey-Signature: h=Received:Subject:Mime-Version:Content-Type:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=; c=nofws; d=math.nagoya-u.ac.jp; q=; s=alpha
Received: from tanpopo.garrigue.jp (58x158x128x157.ap58.ftth.ucom.ne.jp [58.158.128.157])
	by mailhost.math.nagoya-u.ac.jp (Postfix) with ESMTPSA id 2430E2517;
	Fri,  3 May 2013 10:28:22 +0900 (JST)
Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\))
Content-Type: text/plain; charset=iso-8859-1
From: Jacques Garrigue <garrigue@math.nagoya-u.ac.jp>
In-Reply-To: <CADKNfh+c85f1R3BFwabbznEBCxbVnamQzdunkawm8B_UnhndWQ@mail.gmail.com>
Date: Fri, 3 May 2013 10:28:30 +0900
Cc: caml-list@inria.fr
Content-Transfer-Encoding: quoted-printable
Message-Id: <7EC5410C-F995-4353-8DC1-BA7A3EABC101@math.nagoya-u.ac.jp>
References: <CADKNfh+c85f1R3BFwabbznEBCxbVnamQzdunkawm8B_UnhndWQ@mail.gmail.com>
To: yminsky@gmail.com
X-Mailer: Apple Mail (2.1503)
Subject: Re: [Caml-list] -principal

On 2013/05/03, at 6:16, Yaron Minsky <yminsky@gmail.com> wrote:

> Can anyone describe the pluses and minuses of turning on -principal?
> Is it considered to be good practice for codebases that are compatible
> with it?  Can it change behavior of currently working code?

Turning on -principal allows detecting "risky" uses of type information to
help type inference.
By risky I mean that success or failure of type inference may depend
on the order in which subexpressions are typed.
This kind of type information is only used by a limited number of features:
* polymorphic methods
* using different order of labeled arguments in a function
* some discarding of optional arguments
* choice of the source type in a coercion (i.e. automatic upgrading of
   (expr :> t2) into (expr : t1 :> t2) when the type of expr is closed).
  This is particularly useful for private type abbreviations.
* GADTs
* disambiguation of record field and constructor names (new in trunk)

The technical definition is based on polymorphism, but the intuition
is that type information follows the (functional) flow of data, even going
through polymorphic functions (being conservative when merging flows).
So a simple way to understand is: supposing that type annotation physically
adds type information to the expression they decorate, will all values
reaching the use point be decorated?
It also flows backward (the expected type is propagated), but does not
go through functions in that case.

# type t =3D < id: 'a. 'a -> 'a >;;
type t =3D < id : 'a. 'a -> 'a >
# let f (x : t) =3D x, x#id;;     (* safe code: the type of x is known at a=
ll its use points *)
val f : t -> t * ('a -> 'a) =3D <fun>
# let f x =3D (x : t), x#id;;     (* unsafe code: the type is only known be=
cause typing goes from left to right *)
Warning 18: this use of a polymorphic method is not principal.
val f : t -> t * ('a -> 'a) =3D <fun>
# let f x =3D x#id, (x : t);;    (* just exchanging the members of the pair=
 causes a failure *)
Error: This expression has type < id : 'a; .. >
       but an expression was expected of type t
       The universal variable 'a0 would escape its scope

In an ideal world, I would suggest systematically using -principal, as it m=
akes
the notion of "known type" well-defined.
However, there are drawbacks to -principal
* type inference is slower (more copying occurs)
* cmi's become bigger
The first point is mainly a problem if you use objects.
The second one if you have already problems with cmi size (again, mostly ob=
jects)

As a result the suggested approach is to only compile with -principal once =
in a while.
This is reasonable because if compiling with -principal works, in theory it=
 is guaranteed
that the program will be accepted without -principal too. Of course the sem=
antics do
not change.

However there is a difficulty: since the generated cmi's may differ, in gen=
eral one
has to maintain completely separate builds.
If you have a complete set of mli's, then a solution is to always compile m=
li's with
(or without) -principal. (Compiling the mli without principal may in theory=
 cause
propagation to fail in some cases, but you are on the safe side, and I have
never observed it)

So yes, it is good practice. But I'm afraid few invested the effort in doin=
g it.
Looking at the features you use in Core, this may be a good idea to try.
(My understanding is that Lexifi tried, but they had problems because they
use big object types).
By the way, the ocaml compiler itself doesn't use any of the above features,
so it is not directly concerned.

Jacques Garrigue