From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <garrigue@math.nagoya-u.ac.jp>
X-Original-To: caml-list@sympa.inria.fr
Delivered-To: caml-list@sympa.inria.fr
Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104])
	by sympa.inria.fr (Postfix) with ESMTPS id A0CDA7EE51
	for <caml-list@sympa.inria.fr>; Sun, 28 Apr 2013 02:02:49 +0200 (CEST)
Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender
  authenticity information available from domain of
  garrigue@math.nagoya-u.ac.jp) identity=pra;
  client-ip=133.6.130.5;
  receiver=mail3-smtp-sop.national.inria.fr;
  envelope-from="garrigue@math.nagoya-u.ac.jp";
  x-sender="garrigue@math.nagoya-u.ac.jp";
  x-conformance=sidf_compatible
Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender
  authenticity information available from domain of
  garrigue@math.nagoya-u.ac.jp) identity=mailfrom;
  client-ip=133.6.130.5;
  receiver=mail3-smtp-sop.national.inria.fr;
  envelope-from="garrigue@math.nagoya-u.ac.jp";
  x-sender="garrigue@math.nagoya-u.ac.jp";
  x-conformance=sidf_compatible
Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender
  authenticity information available from domain of
  postmaster@mailhost.math.nagoya-u.ac.jp) identity=helo;
  client-ip=133.6.130.5;
  receiver=mail3-smtp-sop.national.inria.fr;
  envelope-from="garrigue@math.nagoya-u.ac.jp";
  x-sender="postmaster@mailhost.math.nagoya-u.ac.jp";
  x-conformance=sidf_compatible
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AgcCABpmfFGFBoIFnGdsb2JhbAA5GoM9v10OAQEBAQEIFAk8gk0TCQE3TW8uh3oNL6t+hEQChR6HageNYoQtYYkTjg6BJpMkgVs
X-IPAS-Result: AgcCABpmfFGFBoIFnGdsb2JhbAA5GoM9v10OAQEBAQEIFAk8gk0TCQE3TW8uh3oNL6t+hEQChR6HageNYoQtYYkTjg6BJpMkgVs
X-IronPort-AV: E=Sophos;i="4.87,565,1363129200"; 
   d="scan'208";a="12460607"
Received: from rabbit.math.nagoya-u.ac.jp (HELO mailhost.math.nagoya-u.ac.jp) ([133.6.130.5])
  by mail3-smtp-sop.national.inria.fr with ESMTP; 28 Apr 2013 02:02:47 +0200
Received: from mailhost.math.nagoya-u.ac.jp (localhost [127.0.0.1])
	by mailhost.math.nagoya-u.ac.jp (Postfix) with ESMTP id DE4F36395
	for <caml-list@inria.fr>; Sun, 28 Apr 2013 09:02:43 +0900 (JST)
Received: from mailhost.math.nagoya-u.ac.jp (localhost [127.0.0.1])
	by mailhost.math.nagoya-u.ac.jp (Postfix) with ESMTP id A3FAF3A0E
	for <caml-list@inria.fr>; Sun, 28 Apr 2013 09:02:43 +0900 (JST)
DomainKey-Signature: h=Received:From:Content-Type:Content-Transfer-Encoding:Subject:Message-Id:Date:To:Mime-Version:X-Mailer; b=; c=nofws; d=math.nagoya-u.ac.jp; q=; s=alpha
Received: from tet.garrigue.jp (58x158x128x157.ap58.ftth.ucom.ne.jp [58.158.128.157])
	by mailhost.math.nagoya-u.ac.jp (Postfix) with ESMTPSA id 6C45C39A3
	for <caml-list@inria.fr>; Sun, 28 Apr 2013 09:02:43 +0900 (JST)
From: Jacques Garrigue <garrigue@math.nagoya-u.ac.jp>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Message-Id: <00C57DF0-C6F0-4EDE-8607-2155F3A17146@math.nagoya-u.ac.jp>
Date: Sun, 28 Apr 2013 09:02:56 +0900
To: OCaML List Mailing <caml-list@inria.fr>
Mime-Version: 1.0 (Mac OS X Mail 6.3 \(1503\))
X-Mailer: Apple Mail (2.1503)
Subject: [Caml-list] Request for feedback: A problem with injectivity and
 GADTs

Some of you may already be aware of PR#5985
http://caml.inria.fr/mantis/view.php?id=3D5985

To explain very shortly what is happening, a bug was found in the ocaml
compiler, both 4.00 and the development version, such that one that
convert a value from any type to any other type, bypassing all checks.
This is what we call a soundness bug, a hole in the type system.

Such problems happen at times, and we are usually very fast to fix them.
However this time the problem is bit more subtle, because fixing it
straightforwardly may make impossible to write a whole category of
GADTs. Hence this request for feedback.

Practical description of the problem:

It is quite usual to write type witnesses using a GADT:

type _ ty =3D
  | Int : int ty
  | Pair : 'a ty * 'b ty -> ('a * 'b) ty
  | Hashtbl : 'a ty * 'b ty -> ('a, 'b) Hashtbl.t ty

This looks fine, but now suppose that we write the following code:

module F (X : sig type 'a t end) =3D struct
   type _ ty =3D
    | Int : int ty
    | Pair : 'a ty * 'b ty -> ('a * 'b) ty
    | T : 'a ty -> 'a X.t ty
end

The idea is to create a type witness with the
type constructor X.t instead of Hashtbl.t.
Now I can use it this way:

module U =3D struct type 'a t =3D unit end
module M =3D F(U)
let w =3D M.T M.Int
(* val w : int U.t M.ty *)
let w' =3D match w with M.T x -> x | _ -> assert false
(* val w' : '_a M.ty =3D M.Int *)

What this means is that we can now give an arbitrary type
to a witness for int. You can easily use it to build a function
from int to any  type:

let f : type a. a M.ty -> int -> a =3D
  fun w x -> match w with M.Int -> x | _ -> assert false;;
let g : int -> float =3D f w';;

If we look at the source of the problem, it lies in the fact U.t is not inj=
ective.
I.e. when we define a GADT, we assume that all the type variables in
the return type can be determined from the type parameter.
That is, from (int U.t M.ty) we assume that the type of the argument of T i=
s (int M.ty).
However, since U.t is not injective, int U.t is actually equal to any type =
'a U.t,
and we can convert.
Note that, for known types, injectivity was already checked.
I.e., the following definition is not accepted:

type _ ty =3D
  | Int : int ty
  | Pair : 'a ty * 'b ty -> ('a * 'b) ty
  | T : 'a ty -> 'a U.t ty

However, abstract types were assumed to be injective.
Here we see that you cannot take this for granted.

The problem is of course not limited to type witnesses, and not even GADTs:
types with constrained parameters (introduced at the very beginning of ocam=
l),
can also trigger it.
And you don't need a functor to trigger it: once you know that two types ar=
e equal
in some scope, there are many ways to leak that information.

=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D
Here comes the request for feedback:

The fix is simple enough: we should track injectivity, and assume that abst=
ract
types are not injective by default.
However, this also means that the first type I defined above (using Hashtbl=
.t)
will be refused.

How many of you are using GADTs in this way, and how dependent are you on
abstract types ?


If we need to be able to define GADTs relying on the injectivity of some ab=
stract
types, a straightforward solution is to add injectivity annotations on type=
 parameters,
the same way it was done for variance:

  type #'a t

would mean that t is injective.
This is implemented in branches/non-vanishing, in the subversion repository.
	svn checkout http://caml.inria.fr/svn/branches/non-vanishing
Note that in that branch Hashtbl.t in the standard library is marked as inj=
ective.
This introduces some new syntax, and libraries have to be modified to benef=
it,
so the question is do we really need it?


Jacques Garrigue

P.S.
If you are curious about other difficulties of GADTs, the is also branches/=
abstract-new
which introduces a notion of new types, either concrete or abstract, which =
are guaranteed
to be distinct from each other, and any other concrete type. This is useful=
 when
checking the exhaustiveness of pattern-matching, as we can then discard imp=
ossible
branches. That branch is completely experimental.