Re: [Caml-list] Match error with abstract types in modules

caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed

From: Ben Millwood <bmillwood@janestreet.com>
To: Jeremy Yallop <yallop@gmail.com>
Cc: David Allsopp <dra-news@metastack.com>,
	Leo White <lpw25@cam.ac.uk>, OCaml List <caml-list@inria.fr>
Subject: Re: [Caml-list] Match error with abstract types in modules
Date: Tue, 24 Feb 2015 20:03:58 +0000	[thread overview]
Message-ID: <CA+MHO50VZUavWk2A-shNqUEV8uEw-dtFuFP=uGZ+_MMRTgFjDg@mail.gmail.com> (raw)
In-Reply-To: <CAAxsn=GsxouTxpsptw_Kk930721iZuhFAY2BObYmjV0Rbk5wkw@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 7278 bytes --]

I think the issue would be substantially mitigated were it not for the
simple fact that [type 'a t] in a signature means "abstract type", whereas
[type 'a t] in a structure means "empty type". The fact that there is no
syntactic distinction, and indeed *no way to make one* caused me a great
deal of confusion some time ago when I had a problem similar to David's.

If I had my way, [type 'a t] would mean "abstract type", and would
therefore be disallowed in structures, and there'd be a separate syntax for
empty types (although I admit not having any great ideas for what it should
be). Then if there were two empty types in a signature, they'd be
incompatible (and injective), just as two types with exposed constructors
are incompatible.

On 24 February 2015 at 19:52, Jeremy Yallop <yallop@gmail.com> wrote:

> On 24 February 2015 at 18:26, David Allsopp <dra-news@metastack.com>
> wrote:
> > Leo White wrote:
> >> > Please could someone remind me what it is about types in modules which
> >> > means that
> >> >
> >> >   module Foo = struct
> >> >     type 'a foo
> >> >   end
> >> >
> >> >   type _ t = A : int Foo.foo t
> >> >            | B : float Foo.foo t
> >> >
> >> >   let foo A = ()
> >> >
> >> > is non-exhaustive pattern matching, but:
> >> >
> >> >   type 'a bar
> >> >
> >> >   type _ u = C : int bar u
> >> >            | D : float bar u
> >> >
> >> >   let bar C = ()
> >> >
> >> > is exhaustive? Or is it actually a bug (given that foo B is a type
> >> error)?
> >> >
> >>
> >> If Foo is replaced with:
> >>
> >>     module Foo : sig
> >>       type 'a foo
> >>     end = struct
> >>       type 'a foo = unit
> >>     end
> >>
> >> so that `int foo` is equal to `float foo`, then your match isn't really
> >> exhaustive.
> >
> > How so?
>
> In order to understand what's going on here it helps to consider two
> relations between types: compatibility and equality.
>
> Equality's simple enough: two types are equal if they are the same
> (after resolving all alises).  For example,
>
>    float and float are equal
>
>    int and float are not equal
>
>    string -> bool and string -> bool are equal
>
>    int t and int t are equal for any unary type constructor t
>
>    int list and int option are not equal
>
>    float t and int s might be equal if t and s are aliases.  For
> example if we have type 'a t = int and type 'a s = 'a then float t and
> int s are both equal to int.
>
> Compatibility's a tiny bit more complicated.  Two types are compatible
> if it's *possible* to make them equal by some instantiation of
> abstract types.  For example:
>
>    equal types are always compatible
>
>    int list and int option are not compatible
>
>    float t and int s are compatible if t and s are abstract types,
> since there is a way to define the abtract types that makes float t
> and int s equal.
>
> GADTs are mostly about equality.  The fundamental role of GADTs is to
> transport type equalities from one part of a program to another.
> However, checking compatibility of GADT indexes is sometimes useful as
> well.  In particular, checking the compatibility of indexes during
> pattern matches allows the compiler to more precisely identify
> redundant cases or inexhaustive matches.
>
> We might imagine three possible strategies that the compiler could
> take in checking a pattern match involving GADTs for exhaustiveness
> and redundancy.  Suppose we have a function that matches over GADTs:
>
>   let f : type a. (a, int) s -> unit = function ...
>
> Then
>
>   (1) The compiler could simply ignore type information and assume
> that all cases could potentially be matched.  This was the approach
> used in GHC until recently.
>
>   (2) The compiler could check whether there is some instantiation of
> locally abstract types that makes the declared type of the case
> variable equal to the type of the pattern for each case.  For example,
> if s is defined as follows:
>
>       type (_, _) s =
>          X : (float, int) s
>        | Y : ('b, 'b) s
>        | Z : (bool, float) s
>
>    then there are choices for the variable "a" that make (a, int)
> equal to the types of X and Y, but no way to choose a type for "a"
> that makes (a, int) equal to the type of Z.
>
>       Unfortunately an approach based on equality breaks down when
> abstract types are involved.  (I'll show why below.)
>
>    (3) The compiler could take the same approach as in (2), but check
> for a way to make types compatible rather than equal.  This is the
> approach taken in OCaml.
>
> Let's look at your example.  You have an abstract type Foo.foo and the
> following type definition:
>
>     type _ t = A : int Foo.foo t
>              | B : float Foo.foo t
>
> Now, suppose you have a function declared like this:
>
>     let f : int t -> unit = function
>
> Then with approach (1) both A and B would be allowed as patterns, and
> leaving them off would lead to a warning about exhaustiveness.  With
> approach (2) neither A nor B would be allowed, since int t is not
> equal to either int Foo.foo t or float Foo.foo t.  With approach (3)
> both A and B are allowed as patterns, since int t is compatible with
> both int Foo.foo t and float Foo.foo t.  Allowing both A and B as
> patterns is the correct option here, since it's possible to call f
> with either A or B if foo is instantiated appropriately:
>
>    module F(Foo: sig type _ foo end) =
>    struct
>        type _ t = A : int Foo.foo t
>                     | B : float Foo.foo t
>          let f : int t -> unit = function
>            A -> ()
>          | B -> ()
>   end
>
>   include F(struct type _ foo = int end)
>
>   let () = begin f A; f B end
>
> > But you get no warning about an unusable match case with:
> >
> >    let foo = function A -> () | B -> ()
>
> The reason is that the types of A and B are compatible with each other.
>
> >> Personally, I dislike this behaviour and would prefer both cases to give
> >> an error.
> >
> > What's to dislike? It's rather useful - the constructors of a GADT are
> partionable by type.
>
> It is indeed useful, and a nice bonus is that it leads to more
> efficient code.  However, one (reasonable) point of view is that there
> shouldn't be a difference in behaviour between defining something
> directly and having a temporary abstraction barrier in between the
> definition and the use.  I tend to agree, except that I think the
> desirable fix here is not to stop using the compatibility check to
> improve pattern matching compilation but to add to the language a way
> to express that int Foo.foo and float Foo.foo are incompatible (i.e.
> that Foo.foo is injective).
>
> More generally, I think OCaml's whole approach to exhaustiveness
> checking for pattern matching with GADTs is an interesting story, and
> I'd like to see it written up somewhere (assuming it isn't already).
> The Haskell folk are writing a paper about new approach to
> exhaustiveness checking for GADT patterns, but the issues that arise
> when you have modular abstraction, as in OCaml, are quite different.
>
> Jeremy.
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>

[-- Attachment #2: Type: text/html, Size: 9358 bytes --]

next prev parent reply	other threads:[~2015-02-24 20:04 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-24 17:25 David Allsopp
2015-02-24 18:02 ` Leo White
2015-02-24 18:26   ` David Allsopp
2015-02-24 19:11     ` Leo White
2015-02-25  8:41       ` David Allsopp
2015-02-24 19:52     ` Jeremy Yallop
2015-02-24 20:03       ` Ben Millwood [this message]
2015-02-24 21:11         ` Leo White
2015-02-25  8:41           ` David Allsopp
2015-02-25  9:48             ` Gabriel Scherer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CA+MHO50VZUavWk2A-shNqUEV8uEw-dtFuFP=uGZ+_MMRTgFjDg@mail.gmail.com' \
    --to=bmillwood@janestreet.com \
    --cc=caml-list@inria.fr \
    --cc=dra-news@metastack.com \
    --cc=lpw25@cam.ac.uk \
    --cc=yallop@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).