From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gabriel.scherer@gmail.com>
X-Original-To: caml-list@sympa.inria.fr
Delivered-To: caml-list@sympa.inria.fr
Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104])
	by sympa.inria.fr (Postfix) with ESMTPS id 1ABEE7EE51
	for <caml-list@sympa.inria.fr>; Mon, 29 Apr 2013 10:00:41 +0200 (CEST)
Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender
  authenticity information available from domain of
  gabriel.scherer@gmail.com) identity=pra;
  client-ip=209.85.214.52;
  receiver=mail3-smtp-sop.national.inria.fr;
  envelope-from="gabriel.scherer@gmail.com";
  x-sender="gabriel.scherer@gmail.com";
  x-conformance=sidf_compatible
Received-SPF: Pass (mail3-smtp-sop.national.inria.fr: domain of
  gabriel.scherer@gmail.com designates 209.85.214.52 as
  permitted sender) identity=mailfrom; client-ip=209.85.214.52;
  receiver=mail3-smtp-sop.national.inria.fr;
  envelope-from="gabriel.scherer@gmail.com";
  x-sender="gabriel.scherer@gmail.com";
  x-conformance=sidf_compatible; x-record-type="v=spf1"
Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender
  authenticity information available from domain of
  postmaster@mail-bk0-f52.google.com) identity=helo;
  client-ip=209.85.214.52;
  receiver=mail3-smtp-sop.national.inria.fr;
  envelope-from="gabriel.scherer@gmail.com";
  x-sender="postmaster@mail-bk0-f52.google.com";
  x-conformance=sidf_compatible
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AvUDAJknflHRVdY0k2dsb2JhbAA4GoM9rC6SEQQEAXYIFg4BAQEBBwsLCRQEJIIfAQEEAScZARsSCwEDAQsGBQsNDSEhAQERAQUBChIGExIJh2gBAwkGDC+fbowxgn2DZgoZJwMKWYdeAQUMjDiBHoEFMweDTwOVOoFkgSaKZYM6FimEOTqBLg
X-IPAS-Result: AvUDAJknflHRVdY0k2dsb2JhbAA4GoM9rC6SEQQEAXYIFg4BAQEBBwsLCRQEJIIfAQEEAScZARsSCwEDAQsGBQsNDSEhAQERAQUBChIGExIJh2gBAwkGDC+fbowxgn2DZgoZJwMKWYdeAQUMjDiBHoEFMweDTwOVOoFkgSaKZYM6FimEOTqBLg
X-IronPort-AV: E=Sophos;i="4.87,572,1363129200"; 
   d="scan'208";a="12551311"
Received: from mail-bk0-f52.google.com ([209.85.214.52])
  by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 29 Apr 2013 10:00:40 +0200
Received: by mail-bk0-f52.google.com with SMTP id je9so1613128bkc.39
        for <caml-list@inria.fr>; Mon, 29 Apr 2013 01:00:40 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20120113;
        h=x-received:mime-version:in-reply-to:references:from:date:message-id
         :subject:to:cc:content-type:content-transfer-encoding;
        bh=bVR7zdl6rfIzJGlvA4euX0Z+noJepfim8gIgJ/xIClY=;
        b=sgjGeLz7PGIds/zj3r6OsLJtunWPp+WfpBSi5Ai7vOdUwTNx9j+gK2p1HHRh2ASkF4
         D0RGTk2HFRt/ttgZWKGDurnOJuWVEV0NiS0g5o3pft8w/eE5uuryClfSVyQHps9kiakL
         2nXpIU17bdBRu2nTDlX+oUxama0qJ2n9PydG9Ou2MiRRNx3Obnu0uWpL972RHq4e8Qb6
         E6HgLCfgxGXH7f5oOTgsvWOJ4Yg2UNxeBvOi6bmOV2xqS+k9BxbE+4h5RAKnRLzhRscE
         xGrZr+RwYSk3yfUIfmWbCxmZuZYeaCr6oB3N2B8/yuUzyfj5VOIjNtGpjlrYp5jiTMJQ
         vEog==
X-Received: by 10.204.74.138 with SMTP id u10mr1384405bkj.113.1367222440046;
 Mon, 29 Apr 2013 01:00:40 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.205.83.144 with HTTP; Mon, 29 Apr 2013 00:59:59 -0700 (PDT)
In-Reply-To: <CAK0y-36uPAAXGanrhZKQOQAPsELjTFxM1Av820da-mU4kqTVfg@mail.gmail.com>
References: <00C57DF0-C6F0-4EDE-8607-2155F3A17146@math.nagoya-u.ac.jp> <CAK0y-36uPAAXGanrhZKQOQAPsELjTFxM1Av820da-mU4kqTVfg@mail.gmail.com>
From: Gabriel Scherer <gabriel.scherer@gmail.com>
Date: Mon, 29 Apr 2013 09:59:59 +0200
Message-ID: <CAPFanBFrqwPmBRHjuagYdgbowMiGyxwh5dDe4-2yRk3AJJGgug@mail.gmail.com>
To: Jacques Le Normand <rathereasy@gmail.com>
Cc: Jacques Garrigue <garrigue@math.nagoya-u.ac.jp>, OCaML List Mailing <caml-list@inria.fr>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Subject: Re: [Caml-list] Request for feedback: A problem with injectivity
 and GADTs

On Mon, Apr 29, 2013 at 7:17 AM, Jacques Le Normand
<rathereasy@gmail.com> wrote:
> Fully injective types are the norm. It would have been nice to have a "non
> injectivity" or "phantom type" annotation. Since phantom types are seldom
> used, beginners won't get confused. It might even help beginners understa=
nd
> the concept of a phantom type.

In practice most definitions using what we call a "phantom type" are
in fact injective (because sum and record types are injective in all
parameters if your notion of type equality is unification), so it
would be more dangerous than helpful to link the two concepts.

You are right however to remark that in a sense the "right default" is
to be injective rather than be potentially non-injective. If we were
starting from void, it would make sense to pick (type 'a t) as meaning
that 'a is injective, and a different syntax for maybe-non-injective=B9
types (this was suggested by other people in the discussion). However,
this would break backward compatibility as code that is working now,
providing a non-injective type to a signature (type 'a t), would then
fail to type-check.

There are two different design principles in tension here:
- the least path of resistance: the shortest syntax should have the
most common and useful meaning, and it is suboptimal to add a
complication that would apply in almost all cases
- pay (only) for what you use: if you don't use GADTs, you most
probably don't care about injectivity, so it would be wrong to break
existing code to facilitate a feature that authors of said code do not
use

I tend to think that having an explicit syntax for
potentially-non-injective types would be the best choice starting from
scratch (users will be surprised if their non-injective type fail to
satisfy a signature, but the error message can easily guide them
towards the right, local fix). But breaking compatibility is generally
not considered an option when evolving the OCaml language.

=B9: there is a difference between "I guarantee you that this type is
not injective in this parameter" and "You cannot assume that this type
is injective in this parameter", and the latter is "more abstract"
than the former; it is the better notion for an abstract type about
which we know nothing at all (so that any type declaration may be an
instance of it). The former, stronger notion seems less useful, and so
far nobody has suggested to allow to express it.


> On Sat, Apr 27, 2013 at 5:02 PM, Jacques Garrigue
> <garrigue@math.nagoya-u.ac.jp> wrote:
>>
>> Some of you may already be aware of PR#5985
>> http://caml.inria.fr/mantis/view.php?id=3D5985
>>
>> To explain very shortly what is happening, a bug was found in the ocaml
>> compiler, both 4.00 and the development version, such that one that
>> convert a value from any type to any other type, bypassing all checks.
>> This is what we call a soundness bug, a hole in the type system.
>>
>> Such problems happen at times, and we are usually very fast to fix them.
>> However this time the problem is bit more subtle, because fixing it
>> straightforwardly may make impossible to write a whole category of
>> GADTs. Hence this request for feedback.
>>
>> Practical description of the problem:
>>
>> It is quite usual to write type witnesses using a GADT:
>>
>> type _ ty =3D
>>   | Int : int ty
>>   | Pair : 'a ty * 'b ty -> ('a * 'b) ty
>>   | Hashtbl : 'a ty * 'b ty -> ('a, 'b) Hashtbl.t ty
>>
>> This looks fine, but now suppose that we write the following code:
>>
>> module F (X : sig type 'a t end) =3D struct
>>    type _ ty =3D
>>     | Int : int ty
>>     | Pair : 'a ty * 'b ty -> ('a * 'b) ty
>>     | T : 'a ty -> 'a X.t ty
>> end
>>
>> The idea is to create a type witness with the
>> type constructor X.t instead of Hashtbl.t.
>> Now I can use it this way:
>>
>> module U =3D struct type 'a t =3D unit end
>> module M =3D F(U)
>> let w =3D M.T M.Int
>> (* val w : int U.t M.ty *)
>> let w' =3D match w with M.T x -> x | _ -> assert false
>> (* val w' : '_a M.ty =3D M.Int *)
>>
>> What this means is that we can now give an arbitrary type
>> to a witness for int. You can easily use it to build a function
>> from int to any  type:
>>
>> let f : type a. a M.ty -> int -> a =3D
>>   fun w x -> match w with M.Int -> x | _ -> assert false;;
>> let g : int -> float =3D f w';;
>>
>> If we look at the source of the problem, it lies in the fact U.t is not
>> injective.
>> I.e. when we define a GADT, we assume that all the type variables in
>> the return type can be determined from the type parameter.
>> That is, from (int U.t M.ty) we assume that the type of the argument of T
>> is (int M.ty).
>> However, since U.t is not injective, int U.t is actually equal to any ty=
pe
>> 'a U.t,
>> and we can convert.
>> Note that, for known types, injectivity was already checked.
>> I.e., the following definition is not accepted:
>>
>> type _ ty =3D
>>   | Int : int ty
>>   | Pair : 'a ty * 'b ty -> ('a * 'b) ty
>>   | T : 'a ty -> 'a U.t ty
>>
>> However, abstract types were assumed to be injective.
>> Here we see that you cannot take this for granted.
>>
>> The problem is of course not limited to type witnesses, and not even
>> GADTs:
>> types with constrained parameters (introduced at the very beginning of
>> ocaml),
>> can also trigger it.
>> And you don't need a functor to trigger it: once you know that two types
>> are equal
>> in some scope, there are many ways to leak that information.
>>
>> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
=3D=3D=3D=3D=3D
>> Here comes the request for feedback:
>>
>> The fix is simple enough: we should track injectivity, and assume that
>> abstract
>> types are not injective by default.
>> However, this also means that the first type I defined above (using
>> Hashtbl.t)
>> will be refused.
>>
>> How many of you are using GADTs in this way, and how dependent are you on
>> abstract types ?
>>
>>
>> If we need to be able to define GADTs relying on the injectivity of some
>> abstract
>> types, a straightforward solution is to add injectivity annotations on
>> type parameters,
>> the same way it was done for variance:
>>
>>   type #'a t
>>
>> would mean that t is injective.
>> This is implemented in branches/non-vanishing, in the subversion
>> repository.
>>         svn checkout http://caml.inria.fr/svn/branches/non-vanishing
>> Note that in that branch Hashtbl.t in the standard library is marked as
>> injective.
>> This introduces some new syntax, and libraries have to be modified to
>> benefit,
>> so the question is do we really need it?
>>
>>
>> Jacques Garrigue
>>
>> P.S.
>> If you are curious about other difficulties of GADTs, the is also
>> branches/abstract-new
>> which introduces a notion of new types, either concrete or abstract, whi=
ch
>> are guaranteed
>> to be distinct from each other, and any other concrete type. This is
>> useful when
>> checking the exhaustiveness of pattern-matching, as we can then discard
>> impossible
>> branches. That branch is completely experimental.
>>
>>
>> --
>> Caml-list mailing list.  Subscription management and archives:
>> https://sympa.inria.fr/sympa/arc/caml-list
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>
>