From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <yallop@gmail.com>
X-Original-To: caml-list@sympa.inria.fr
Delivered-To: caml-list@sympa.inria.fr
Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83])
	by sympa.inria.fr (Postfix) with ESMTPS id C5A5E7EF10
	for <caml-list@sympa.inria.fr>; Tue, 24 Feb 2015 20:52:35 +0100 (CET)
Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender
  authenticity information available from domain of
  yallop@gmail.com) identity=pra; client-ip=209.85.216.48;
  receiver=mail2-smtp-roc.national.inria.fr;
  envelope-from="yallop@gmail.com";
  x-sender="yallop@gmail.com"; x-conformance=sidf_compatible
Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of
  yallop@gmail.com designates 209.85.216.48 as permitted
  sender) identity=mailfrom; client-ip=209.85.216.48;
  receiver=mail2-smtp-roc.national.inria.fr;
  envelope-from="yallop@gmail.com";
  x-sender="yallop@gmail.com"; x-conformance=sidf_compatible;
  x-record-type="v=spf1"
Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender
  authenticity information available from domain of
  postmaster@mail-qa0-f48.google.com) identity=helo;
  client-ip=209.85.216.48;
  receiver=mail2-smtp-roc.national.inria.fr;
  envelope-from="yallop@gmail.com";
  x-sender="postmaster@mail-qa0-f48.google.com";
  x-conformance=sidf_compatible
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: A0CrAQDE1exUmzDYVdFRCoNYWgSDBMAOAoV4AoEnB0MBAQEBAQEQAQEBAQEGCwsJFC6EDwEBAQMBAQIPEQQZARsWBwEDAQsGBQsDCgICJgICIgERAQUBHAYTCBIIh3gBAwkIr3g+MYsugWuCd44VChknDVSEWwEBAQEGAQEBARgBBQ6BE4lyhBJcB4JogUMBBJM2ghaDT4EbgxWLSYF0EiOBDAmEED4xgQIkgR0BAQE
X-IPAS-Result: A0CrAQDE1exUmzDYVdFRCoNYWgSDBMAOAoV4AoEnB0MBAQEBAQEQAQEBAQEGCwsJFC6EDwEBAQMBAQIPEQQZARsWBwEDAQsGBQsDCgICJgICIgERAQUBHAYTCBIIh3gBAwkIr3g+MYsugWuCd44VChknDVSEWwEBAQEGAQEBARgBBQ6BE4lyhBJcB4JogUMBBJM2ghaDT4EbgxWLSYF0EiOBDAmEED4xgQIkgR0BAQE
X-IronPort-AV: E=Sophos;i="5.09,640,1418079600"; 
   d="scan'208";a="123232567"
Received: from mail-qa0-f48.google.com ([209.85.216.48])
  by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 24 Feb 2015 20:52:35 +0100
Received: by mail-qa0-f48.google.com with SMTP id dc16so28735383qab.7
        for <caml-list@inria.fr>; Tue, 24 Feb 2015 11:52:34 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20120113;
        h=mime-version:in-reply-to:references:date:message-id:subject:from:to
         :cc:content-type;
        bh=g79lQCkfjRGSDLWT0zo6lhFWzZrglh7yeQqU+2vAffY=;
        b=MFwajEReQOIsxzbdAygfZ5QyFAAUEd2EHaQGcQL0SpRCoCyoZ6dyYQW4FDv93yBqKi
         LnOmrZJpiguD3B7UbwJDRAz64qgp1+OedGNatC2MVpJs4kzih7Hb/ZPX/hnddKf4YgKF
         nSKVCWJO94Ly58ctg0h0GODLmroAfna77ThCW1aWvdPMq2VvmfADaP3t3SgsNuKo99ta
         6vDajVpugHENbNe+1V7vrtyEjoNEwqoUsFoEMSooZ73C6lFVgSczef03lfKf2oHbjskR
         I18bwGEtvaFiCdawcuJ9rgq5VVIPtPsr79rCcQ0nVr0NqFBDFbAwLu4QheuNbpRdOGh3
         GkOg==
MIME-Version: 1.0
X-Received: by 10.140.149.4 with SMTP id 4mr13201122qhv.38.1424807553849; Tue,
 24 Feb 2015 11:52:33 -0800 (PST)
Received: by 10.229.160.11 with HTTP; Tue, 24 Feb 2015 11:52:33 -0800 (PST)
In-Reply-To: <E51C5B015DBD1348A1D85763337FB6D9E994424B@Remus.metastack.local>
References: <000401d05056$ed605f70$c8211e50$@metastack.com>
	<87wq372noc.fsf@study.localdomain>
	<E51C5B015DBD1348A1D85763337FB6D9E994424B@Remus.metastack.local>
Date: Tue, 24 Feb 2015 19:52:33 +0000
Message-ID: <CAAxsn=GsxouTxpsptw_Kk930721iZuhFAY2BObYmjV0Rbk5wkw@mail.gmail.com>
From: Jeremy Yallop <yallop@gmail.com>
To: David Allsopp <dra-news@metastack.com>
Cc: Leo White <lpw25@cam.ac.uk>, OCaml List <caml-list@inria.fr>
Content-Type: text/plain; charset=UTF-8
Subject: Re: [Caml-list] Match error with abstract types in modules

On 24 February 2015 at 18:26, David Allsopp <dra-news@metastack.com> wrote:
> Leo White wrote:
>> > Please could someone remind me what it is about types in modules which
>> > means that
>> >
>> >   module Foo = struct
>> >     type 'a foo
>> >   end
>> >
>> >   type _ t = A : int Foo.foo t
>> >            | B : float Foo.foo t
>> >
>> >   let foo A = ()
>> >
>> > is non-exhaustive pattern matching, but:
>> >
>> >   type 'a bar
>> >
>> >   type _ u = C : int bar u
>> >            | D : float bar u
>> >
>> >   let bar C = ()
>> >
>> > is exhaustive? Or is it actually a bug (given that foo B is a type
>> error)?
>> >
>>
>> If Foo is replaced with:
>>
>>     module Foo : sig
>>       type 'a foo
>>     end = struct
>>       type 'a foo = unit
>>     end
>>
>> so that `int foo` is equal to `float foo`, then your match isn't really
>> exhaustive.
>
> How so?

In order to understand what's going on here it helps to consider two
relations between types: compatibility and equality.

Equality's simple enough: two types are equal if they are the same
(after resolving all alises).  For example,

   float and float are equal

   int and float are not equal

   string -> bool and string -> bool are equal

   int t and int t are equal for any unary type constructor t

   int list and int option are not equal

   float t and int s might be equal if t and s are aliases.  For
example if we have type 'a t = int and type 'a s = 'a then float t and
int s are both equal to int.

Compatibility's a tiny bit more complicated.  Two types are compatible
if it's *possible* to make them equal by some instantiation of
abstract types.  For example:

   equal types are always compatible

   int list and int option are not compatible

   float t and int s are compatible if t and s are abstract types,
since there is a way to define the abtract types that makes float t
and int s equal.

GADTs are mostly about equality.  The fundamental role of GADTs is to
transport type equalities from one part of a program to another.
However, checking compatibility of GADT indexes is sometimes useful as
well.  In particular, checking the compatibility of indexes during
pattern matches allows the compiler to more precisely identify
redundant cases or inexhaustive matches.

We might imagine three possible strategies that the compiler could
take in checking a pattern match involving GADTs for exhaustiveness
and redundancy.  Suppose we have a function that matches over GADTs:

  let f : type a. (a, int) s -> unit = function ...

Then

  (1) The compiler could simply ignore type information and assume
that all cases could potentially be matched.  This was the approach
used in GHC until recently.

  (2) The compiler could check whether there is some instantiation of
locally abstract types that makes the declared type of the case
variable equal to the type of the pattern for each case.  For example,
if s is defined as follows:

      type (_, _) s =
         X : (float, int) s
       | Y : ('b, 'b) s
       | Z : (bool, float) s

   then there are choices for the variable "a" that make (a, int)
equal to the types of X and Y, but no way to choose a type for "a"
that makes (a, int) equal to the type of Z.

      Unfortunately an approach based on equality breaks down when
abstract types are involved.  (I'll show why below.)

   (3) The compiler could take the same approach as in (2), but check
for a way to make types compatible rather than equal.  This is the
approach taken in OCaml.

Let's look at your example.  You have an abstract type Foo.foo and the
following type definition:

    type _ t = A : int Foo.foo t
             | B : float Foo.foo t

Now, suppose you have a function declared like this:

    let f : int t -> unit = function

Then with approach (1) both A and B would be allowed as patterns, and
leaving them off would lead to a warning about exhaustiveness.  With
approach (2) neither A nor B would be allowed, since int t is not
equal to either int Foo.foo t or float Foo.foo t.  With approach (3)
both A and B are allowed as patterns, since int t is compatible with
both int Foo.foo t and float Foo.foo t.  Allowing both A and B as
patterns is the correct option here, since it's possible to call f
with either A or B if foo is instantiated appropriately:

   module F(Foo: sig type _ foo end) =
   struct
       type _ t = A : int Foo.foo t
                    | B : float Foo.foo t
         let f : int t -> unit = function
           A -> ()
         | B -> ()
  end

  include F(struct type _ foo = int end)

  let () = begin f A; f B end

> But you get no warning about an unusable match case with:
>
>    let foo = function A -> () | B -> ()

The reason is that the types of A and B are compatible with each other.

>> Personally, I dislike this behaviour and would prefer both cases to give
>> an error.
>
> What's to dislike? It's rather useful - the constructors of a GADT are partionable by type.

It is indeed useful, and a nice bonus is that it leads to more
efficient code.  However, one (reasonable) point of view is that there
shouldn't be a difference in behaviour between defining something
directly and having a temporary abstraction barrier in between the
definition and the use.  I tend to agree, except that I think the
desirable fix here is not to stop using the compatibility check to
improve pattern matching compilation but to add to the language a way
to express that int Foo.foo and float Foo.foo are incompatible (i.e.
that Foo.foo is injective).

More generally, I think OCaml's whole approach to exhaustiveness
checking for pattern matching with GADTs is an interesting story, and
I'd like to see it written up somewhere (assuming it isn't already).
The Haskell folk are writing a paper about new approach to
exhaustiveness checking for GADT patterns, but the issues that arise
when you have modular abstraction, as in OCaml, are quite different.

Jeremy.