From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id C2EFA7EF10 for ; Tue, 24 Feb 2015 21:04:01 +0100 (CET) Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of bmillwood@janestreet.com) identity=pra; client-ip=38.105.200.112; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="bmillwood@janestreet.com"; x-sender="bmillwood@janestreet.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail3-smtp-sop.national.inria.fr: domain of bmillwood@janestreet.com designates 38.105.200.112 as permitted sender) identity=mailfrom; client-ip=38.105.200.112; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="bmillwood@janestreet.com"; x-sender="bmillwood@janestreet.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail3-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@mxout1.mail.janestreet.com) identity=helo; client-ip=38.105.200.112; receiver=mail3-smtp-sop.national.inria.fr; envelope-from="bmillwood@janestreet.com"; x-sender="postmaster@mxout1.mail.janestreet.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: A0C+AQCz2OxUnHDIaSZRCoNYWgSDBLA4jXKBZAEJhXACgScHQwEBAQEBARABAQEBAQYWCUKEDwEBAQMBAQIPEQQZAQEsBAcBBAsJAgsNDR0CAiEBEgEFAQoSBhMSCAiHeQMJCAMKnxeQTz4xij5whGIBBY5dAwqFLwEBAQEBBQEBAQEBAQEBAQEBEQYKiwmCRIFOWAQHgmiBQ4RkCokAhVKCFoIJgUaBGzmCYIkAgkmBdBIjgRWEEG+BAiSBHQEBAQ X-IPAS-Result: A0C+AQCz2OxUnHDIaSZRCoNYWgSDBLA4jXKBZAEJhXACgScHQwEBAQEBARABAQEBAQYWCUKEDwEBAQMBAQIPEQQZAQEsBAcBBAsJAgsNDR0CAiEBEgEFAQoSBhMSCAiHeQMJCAMKnxeQTz4xij5whGIBBY5dAwqFLwEBAQEBBQEBAQEBAQEBAQEBEQYKiwmCRIFOWAQHgmiBQ4RkCokAhVKCFoIJgUaBGzmCYIkAgkmBdBIjgRWEEG+BAiSBHQEBAQ X-IronPort-AV: E=Sophos;i="5.09,640,1418079600"; d="scan'208";a="101196516" Received: from mxout1.mail.janestreet.com ([38.105.200.112]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-SHA; 24 Feb 2015 21:04:00 +0100 Received: from tot-qpr-mailcore1.delacy.com ([172.27.56.68] helo=tot-qpr-mailcore1) by mxout1.mail.janestreet.com with smtp (Exim 4.82) (envelope-from ) id 1YQLi2-0002cI-Rq for caml-list@inria.fr; Tue, 24 Feb 2015 15:03:58 -0500 X-JS-Flow: external Received: by tot-qpr-mailcore1 with JS-mailcore (0.1) (envelope-from ) id BU7Nku-AAACfJ-ai; 2015-02-24 15:03:58.849826-05:00 Received: from mail-qa0-f51.google.com ([209.85.216.51]) by mxgoog1.mail.janestreet.com with esmtps (UNKNOWN:AES128-GCM-SHA256:128) (Exim 4.72) (envelope-from ) id 1YQLi2-0004b7-ML for caml-list@inria.fr; Tue, 24 Feb 2015 15:03:58 -0500 Received: by mail-qa0-f51.google.com with SMTP id i13so28814310qae.10 for ; Tue, 24 Feb 2015 12:03:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=janestreet.com; s=google; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=BTLHeA5e0TURmyLccCEc8MDt0p+ZixoJDj2i9nxmKf8=; b=jkD9nFwl6zjeUhBhXzO9nfSIaDQ2jagmJg2fAVy6njh/35S3hgSi5ewFoCYixZKtia p8d5e89KeuteITLl2E+yaP0Si+jTL0jBJrhixnhlqSOBstbdPGcLzZJFkIk+qDxznS1h zSNlgPsNAVPFM+WPFBIXFeWhlyqvjD2bjGqzc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=BTLHeA5e0TURmyLccCEc8MDt0p+ZixoJDj2i9nxmKf8=; b=BYbEyOU9qatugdZYExkq5+wLvjnNTdHVaAN+pkelaZoVLe7D/alonZMy3EzzPmuIog 7QLkNkjJbAR3QFW5hZrkXvIAoNIM9NpWPMRqrHUBLHdnQbX7LO7ARjgMUTDvMk2bOrVg 8hq1kHa6y84wvOKmHh/mHaOQ/OQ6ZwJR9zg0fru7ugCGVT/AY+x1BYi2VDuu3I0OzL0+ QhbJgDup8nEXbDHUz9SSxbyGE5LEys2Hq6xUaRxdlUhSYf3lJHRqRLtVJENdv4LyFS06 zeYzIgB2pPjnNTyEyuDfKVMJjbjy6J84D58X+0A10cu6za6zy8ZjUBJ7N6c3MD50J2iN MPYg== X-Gm-Message-State: ALoCoQkc4E54JOIEfSYf19rlbBCfRswMhEL8ifoYIL2rDS1CNKpsaPKu8i0P89SeIdtK5by/qfkJi85LlrvlfzqtO3AYm7SQumUS8zH1/shRfAKx0nJ+21fggwsVN4sPyIoH6AaewaPd X-Received: by 10.140.41.169 with SMTP id z38mr37902142qgz.56.1424808238493; Tue, 24 Feb 2015 12:03:58 -0800 (PST) MIME-Version: 1.0 X-Received: by 10.140.41.169 with SMTP id z38mr37902126qgz.56.1424808238396; Tue, 24 Feb 2015 12:03:58 -0800 (PST) Received: by 10.96.81.100 with HTTP; Tue, 24 Feb 2015 12:03:58 -0800 (PST) In-Reply-To: References: <000401d05056$ed605f70$c8211e50$@metastack.com> <87wq372noc.fsf@study.localdomain> Date: Tue, 24 Feb 2015 20:03:58 +0000 Message-ID: From:Ben Millwood To:Jeremy Yallop Cc:David Allsopp , Leo White , OCaml List Content-Type: multipart/alternative; boundary=001a11c11dbaec504b050fdb0415 X-JS-Processed-by: mailcore Subject: Re: [Caml-list] Match error with abstract types in modules --001a11c11dbaec504b050fdb0415 Content-Type: text/plain; charset=UTF-8 I think the issue would be substantially mitigated were it not for the simple fact that [type 'a t] in a signature means "abstract type", whereas [type 'a t] in a structure means "empty type". The fact that there is no syntactic distinction, and indeed *no way to make one* caused me a great deal of confusion some time ago when I had a problem similar to David's. If I had my way, [type 'a t] would mean "abstract type", and would therefore be disallowed in structures, and there'd be a separate syntax for empty types (although I admit not having any great ideas for what it should be). Then if there were two empty types in a signature, they'd be incompatible (and injective), just as two types with exposed constructors are incompatible. On 24 February 2015 at 19:52, Jeremy Yallop wrote: > On 24 February 2015 at 18:26, David Allsopp > wrote: > > Leo White wrote: > >> > Please could someone remind me what it is about types in modules which > >> > means that > >> > > >> > module Foo = struct > >> > type 'a foo > >> > end > >> > > >> > type _ t = A : int Foo.foo t > >> > | B : float Foo.foo t > >> > > >> > let foo A = () > >> > > >> > is non-exhaustive pattern matching, but: > >> > > >> > type 'a bar > >> > > >> > type _ u = C : int bar u > >> > | D : float bar u > >> > > >> > let bar C = () > >> > > >> > is exhaustive? Or is it actually a bug (given that foo B is a type > >> error)? > >> > > >> > >> If Foo is replaced with: > >> > >> module Foo : sig > >> type 'a foo > >> end = struct > >> type 'a foo = unit > >> end > >> > >> so that `int foo` is equal to `float foo`, then your match isn't really > >> exhaustive. > > > > How so? > > In order to understand what's going on here it helps to consider two > relations between types: compatibility and equality. > > Equality's simple enough: two types are equal if they are the same > (after resolving all alises). For example, > > float and float are equal > > int and float are not equal > > string -> bool and string -> bool are equal > > int t and int t are equal for any unary type constructor t > > int list and int option are not equal > > float t and int s might be equal if t and s are aliases. For > example if we have type 'a t = int and type 'a s = 'a then float t and > int s are both equal to int. > > Compatibility's a tiny bit more complicated. Two types are compatible > if it's *possible* to make them equal by some instantiation of > abstract types. For example: > > equal types are always compatible > > int list and int option are not compatible > > float t and int s are compatible if t and s are abstract types, > since there is a way to define the abtract types that makes float t > and int s equal. > > GADTs are mostly about equality. The fundamental role of GADTs is to > transport type equalities from one part of a program to another. > However, checking compatibility of GADT indexes is sometimes useful as > well. In particular, checking the compatibility of indexes during > pattern matches allows the compiler to more precisely identify > redundant cases or inexhaustive matches. > > We might imagine three possible strategies that the compiler could > take in checking a pattern match involving GADTs for exhaustiveness > and redundancy. Suppose we have a function that matches over GADTs: > > let f : type a. (a, int) s -> unit = function ... > > Then > > (1) The compiler could simply ignore type information and assume > that all cases could potentially be matched. This was the approach > used in GHC until recently. > > (2) The compiler could check whether there is some instantiation of > locally abstract types that makes the declared type of the case > variable equal to the type of the pattern for each case. For example, > if s is defined as follows: > > type (_, _) s = > X : (float, int) s > | Y : ('b, 'b) s > | Z : (bool, float) s > > then there are choices for the variable "a" that make (a, int) > equal to the types of X and Y, but no way to choose a type for "a" > that makes (a, int) equal to the type of Z. > > Unfortunately an approach based on equality breaks down when > abstract types are involved. (I'll show why below.) > > (3) The compiler could take the same approach as in (2), but check > for a way to make types compatible rather than equal. This is the > approach taken in OCaml. > > Let's look at your example. You have an abstract type Foo.foo and the > following type definition: > > type _ t = A : int Foo.foo t > | B : float Foo.foo t > > Now, suppose you have a function declared like this: > > let f : int t -> unit = function > > Then with approach (1) both A and B would be allowed as patterns, and > leaving them off would lead to a warning about exhaustiveness. With > approach (2) neither A nor B would be allowed, since int t is not > equal to either int Foo.foo t or float Foo.foo t. With approach (3) > both A and B are allowed as patterns, since int t is compatible with > both int Foo.foo t and float Foo.foo t. Allowing both A and B as > patterns is the correct option here, since it's possible to call f > with either A or B if foo is instantiated appropriately: > > module F(Foo: sig type _ foo end) = > struct > type _ t = A : int Foo.foo t > | B : float Foo.foo t > let f : int t -> unit = function > A -> () > | B -> () > end > > include F(struct type _ foo = int end) > > let () = begin f A; f B end > > > But you get no warning about an unusable match case with: > > > > let foo = function A -> () | B -> () > > The reason is that the types of A and B are compatible with each other. > > >> Personally, I dislike this behaviour and would prefer both cases to give > >> an error. > > > > What's to dislike? It's rather useful - the constructors of a GADT are > partionable by type. > > It is indeed useful, and a nice bonus is that it leads to more > efficient code. However, one (reasonable) point of view is that there > shouldn't be a difference in behaviour between defining something > directly and having a temporary abstraction barrier in between the > definition and the use. I tend to agree, except that I think the > desirable fix here is not to stop using the compatibility check to > improve pattern matching compilation but to add to the language a way > to express that int Foo.foo and float Foo.foo are incompatible (i.e. > that Foo.foo is injective). > > More generally, I think OCaml's whole approach to exhaustiveness > checking for pattern matching with GADTs is an interesting story, and > I'd like to see it written up somewhere (assuming it isn't already). > The Haskell folk are writing a paper about new approach to > exhaustiveness checking for GADT patterns, but the issues that arise > when you have modular abstraction, as in OCaml, are quite different. > > Jeremy. > > -- > Caml-list mailing list. Subscription management and archives: > https://sympa.inria.fr/sympa/arc/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs > --001a11c11dbaec504b050fdb0415 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
I think the issue would be substantially mitigated were it= not for the simple fact that [type 'a t] in a signature means "ab= stract type", whereas [type 'a t] in a structure means "empty= type". The fact that there is no syntactic distinction, and indeed *n= o way to make one* caused me a great deal of confusion some time ago when I= had a problem similar to David's.

If I had my way, = [type 'a t] would mean "abstract type", and would therefore b= e disallowed in structures, and there'd be a separate syntax for empty = types (although I admit not having any great ideas for what it should be). = Then if there were two empty types in a signature, they'd be incompatib= le (and injective), just as two types with exposed constructors are incompa= tible.

On 24 February 2015 at 19:52, Jeremy Yallop <yallop@gmail.com> wrote:
On 24 February 2015 at 18:26, David Allsopp <dra-news@metastack.com> wrote:
> Leo White wrote:
>> > Please could someone remind me what it is about types in modu= les which
>> > means that
>> >
>> >=C2=A0 =C2=A0module Foo =3D struct
>> >=C2=A0 =C2=A0 =C2=A0type 'a foo
>> >=C2=A0 =C2=A0end
>> >
>> >=C2=A0 =C2=A0type _ t =3D A : int Foo.foo t
>> >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | B : float Foo.foo = t
>> >
>> >=C2=A0 =C2=A0let foo A =3D ()
>> >
>> > is non-exhaustive pattern matching, but:
>> >
>> >=C2=A0 =C2=A0type 'a bar
>> >
>> >=C2=A0 =C2=A0type _ u =3D C : int bar u
>> >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | D : float bar u
>> >
>> >=C2=A0 =C2=A0let bar C =3D ()
>> >
>> > is exhaustive? Or is it actually a bug (given that foo B is a= type
>> error)?
>> >
>>
>> If Foo is replaced with:
>>
>>=C2=A0 =C2=A0 =C2=A0module Foo : sig
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0type 'a foo
>>=C2=A0 =C2=A0 =C2=A0end =3D struct
>>=C2=A0 =C2=A0 =C2=A0 =C2=A0type 'a foo =3D unit
>>=C2=A0 =C2=A0 =C2=A0end
>>
>> so that `int foo` is equal to `float foo`, then your match isn'= ;t really
>> exhaustive.
>
> How so?

In order to understand what's going on here it helps to con= sider two
relations between types: compatibility and equality.

Equality's simple enough: two types are equal if they are the same
(after resolving all alises).=C2=A0 For example,

=C2=A0 =C2=A0float and float are equal

=C2=A0 =C2=A0int and float are not equal

=C2=A0 =C2=A0string -> bool and string -> bool are equal

=C2=A0 =C2=A0int t and int t are equal for any unary type constructor t

=C2=A0 =C2=A0int list and int option are not equal

=C2=A0 =C2=A0float t and int s might be equal if t and s are aliases.=C2=A0= For
example if we have type 'a t =3D int and type 'a s =3D 'a then = float t and
int s are both equal to int.

Compatibility's a tiny bit more complicated.=C2=A0 Two types are compat= ible
if it's *possible* to make them equal by some instantiation of
abstract types.=C2=A0 For example:

=C2=A0 =C2=A0equal types are always compatible

=C2=A0 =C2=A0int list and int option are not compatible

=C2=A0 =C2=A0float t and int s are compatible if t and s are abstract types= ,
since there is a way to define the abtract types that makes float t
and int s equal.

GADTs are mostly about equality.=C2=A0 The fundamental role of GADTs is to<= br> transport type equalities from one part of a program to another.
However, checking compatibility of GADT indexes is sometimes useful as
well.=C2=A0 In particular, checking the compatibility of indexes during
pattern matches allows the compiler to more precisely identify
redundant cases or inexhaustive matches.

We might imagine three possible strategies that the compiler could
take in checking a pattern match involving GADTs for exhaustiveness
and redundancy.=C2=A0 Suppose we have a function that matches over GADTs:
=C2=A0 let f : type a. (a, int) s -> unit =3D function ...

Then

=C2=A0 (1) The compiler could simply ignore type information and assume
that all cases could potentially be matched.=C2=A0 This was the approach
used in GHC until recently.

=C2=A0 (2) The compiler could check whether there is some instantiation of<= br> locally abstract types that makes the declared type of the case
variable equal to the type of the pattern for each case.=C2=A0 For example,=
if s is defined as follows:

=C2=A0 =C2=A0 =C2=A0 type (_, _) s =3D
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0X : (float, int) s
=C2=A0 =C2=A0 =C2=A0 =C2=A0| Y : ('b, 'b) s
=C2=A0 =C2=A0 =C2=A0 =C2=A0| Z : (bool, float) s

=C2=A0 =C2=A0then there are choices for the variable "a" that mak= e (a, int)
equal to the types of X and Y, but no way to choose a type for "a"= ;
that makes (a, int) equal to the type of Z.

=C2=A0 =C2=A0 =C2=A0 Unfortunately an approach based on equality breaks dow= n when
abstract types are involved.=C2=A0 (I'll show why below.)

=C2=A0 =C2=A0(3) The compiler could take the same approach as in (2), but c= heck
for a way to make types compatible rather than equal.=C2=A0 This is the
approach taken in OCaml.

Let's look at your example.=C2=A0 You have an abstract type Foo.foo and= the
following type definition:

=C2=A0 =C2=A0 type _ t =3D A : int Foo.foo t
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| B : float Foo.foo t

Now, suppose you have a function declared like this:

=C2=A0 =C2=A0 let f : int t -> unit =3D function

Then with approach (1) both A and B would be allowed as patterns, and
leaving them off would lead to a warning about exhaustiveness.=C2=A0 With approach (2) neither A nor B would be allowed, since int t is not
equal to either int Foo.foo t or float Foo.foo t.=C2=A0 With approach (3) both A and B are allowed as patterns, since int t is compatible with
both int Foo.foo t and float Foo.foo t.=C2=A0 Allowing both A and B as
patterns is the correct option here, since it's possible to call f
with either A or B if foo is instantiated appropriately:

=C2=A0 =C2=A0module F(Foo: sig type _ foo end) =3D
=C2=A0 =C2=A0struct
=C2=A0 =C2=A0 =C2=A0 =C2=A0type _ t =3D A : int Foo.foo t<= br> =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | B := float Foo.foo t
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0let f : int t -> unit =3D funct= ion
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0A -> ()
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| B -> ()
=C2=A0 end

=C2=A0 include F(struct type _ foo =3D int end)

=C2=A0 let () =3D begin f A; f B end

> But you get no warning about an unusable match case with:
>
>=C2=A0 =C2=A0 let foo =3D function A -> () | B -> ()

The reason is that the types of A and B are compatible with each oth= er.

>> Personally, I dislike this behaviour and would prefer both cases t= o give
>> an error.
>
> What's to dislike? It's rather useful - the constructors of a = GADT are partionable by type.

It is indeed useful, and a nice bonus is that it leads to more
efficient code.=C2=A0 However, one (reasonable) point of view is that there=
shouldn't be a difference in behaviour between defining something
directly and having a temporary abstraction barrier in between the
definition and the use.=C2=A0 I tend to agree, except that I think the
desirable fix here is not to stop using the compatibility check to
improve pattern matching compilation but to add to the language a way
to express that int Foo.foo and float Foo.foo are incompatible (i.e.
that Foo.foo is injective).

More generally, I think OCaml's whole approach to exhaustiveness
checking for pattern matching with GADTs is an interesting story, and
I'd like to see it written up somewhere (assuming it isn't already)= .
The Haskell folk are writing a paper about new approach to
exhaustiveness checking for GADT patterns, but the issues that arise
when you have modular abstraction, as in OCaml, are quite different.

Jeremy.

--
Caml-list mailing list.=C2=A0 Subscription management and archives:
ht= tps://sympa.inria.fr/sympa/arc/caml-list
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs

--001a11c11dbaec504b050fdb0415--