On Mon, Oct 25, 2010 at 6:44 PM, bluestorm <bluestorm.dylc@gmail.com> wrote:
It's very interesting.

First, I'm curious of the "historical" aspects of this work : where does it come from ? Apparently there is work from you and Jacques Garrigue, but it's hard to tell. Is it new, or a long-running experiment ?


The history: the algorithm was developed, in part, for my PhD research. I've been working on it with Jacques Garrigue for the last two months.
 
In your "intuition" section (btw. there is a typo here, it should be (type s) (x : s t)), you seem to present GADT as directly related to the new (type s) construct. It's a bit surprising because it's difficult to know the difference between this and classic type variables. I suppose it is because only (type s) guarantee that the variable remains polymorphic, and you use that to ensure that branch-local unifications don't "escape" to the outer level ? Could you be a bit more explicit on this ?

 
I don't know what you mean by "remains polymorphic". However, (type s) and polymorphism are quite distinct concepts. Consider the following examples:

# let rec f (type s) (x : s) : s = f x;;
Error: This expression has type s but an expression was expected of type s
       The type constructor s would escape its scope

# fun (type s) ( f : s -> s) ( x : s) -> f x;;
- : ('_a -> '_a) -> '_a -> '_a = <fun>


The reason I chose to use newtypes, ie (type s), is that I needed a type variable which did not change (I believe the Haskell people call it rigid), so I decided to use type constructors. Another option, previously implemented, was to use polymorphic variables, ie:

let rec foo : 'a. 'a t -> t = 
    function 
        | IntLit x -> x


However, this has several disadvantages, the biggest of which is that  the variable 'a cannot be referenced inside the expression since its scope is the type in which it was introduced. 


 
It's also a bit difficult to know what's the big deal about "exhaustiveness checks". As I understand it, you remark that with GADTs some case cannot happen due to typing reasons, but the exhaustive check doesn't know about it. Could you be a bit more explicit about what the exhaustiveness checker does :
- is it exactly the same behavior as before, ignoring GADT specificities ? (ie. you haven't changed anything)
- if not, what have you changed and how can we try to predict its reaction to a given code ?
- what can we do when it doesn't detect an impossible case ? I suppose we can't a pattern clause for it, as the type checker would reject it.


This problem is not new in O'Caml. For example:

# type t = { x : 'a . 'a list } ;;
type t = { x : 'a. 'a list; }
# let _ = function { x = [] } -> 5;;
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
{x=_::_}

however, try creating a value of type ('a. 'a list) satisfying the pattern _ :: _

What I've done is written a second pass to the exhaustiveness checker. The first pass does the same thing as before, but ignores GADTs completely. The second pass exhaustively checks every possible generalized constructor combination. 

For example, in the code

type 'a t = Foo : int t | Bar : bool t | Baz : float t

let f : type s. s t * s t * s t -> s = 
    function
         Foo, Foo, Foo
      | ....

My code will check all 9 possible patterns and will output any which were missed. The pattern returned by my algorithm is a valid pattern.
 
I'm not sure I understand the example of the "Variance" section.
Why is the cast in that direction ? It seems to me that even if we could force t to be covariant, this cast (to a less general type) shouldn't work :

  # type +'a t = T of 'a;;
  # let a = T (object method a = 1 end);;
  # (a :> < m : int; n : bool > t);;
  Error: Type < a : int > t is not a subtype of < m : int; n : bool > t 


I apologize, that should be: 

type -'a t = C : < m : int > -> < m : int > t

or, as a constraint:

type -'a t = C of 'a constraint 'a = < m : int > 
 
Again, you "Objects and variants" and "Propagation" subsections are a bit vague. Could you give example of code exhibiting possible problems ?


Propagation:

Currently, this code doesn't compile:

    let rec baz : type s . s t -> s = 
      fun (type z) ->
function
   IntLit x -> x : s
 | BoolLit y -> y : s

so you need to add the annotation:

    let rec baz : type s . s t -> s = 
      fun (type z) ->
((function
   IntLit x -> x 
 | BoolLit y -> y) : s t -> s)

objects (and polymorphic variants):

the following will not compile:

    let rec eval : type s . s t -> s = 
      function
| IntLit x -> ignore (object method m : s = failwith "foo" end : < m : int; ..>) ; x

polymorphic variants in patterns:

the following will not compile:

    let rec eval : type s . [`A] * s t -> unit = 
      function
| `A, IntLit _ -> ()
| `A, BoolLit _ -> ()
 
Finally, a few syntax trolls, in decreasing order of constructivity :

- is it a good idea to reproduce the "implicit quantification" of ordinary types ? It seems it could be particularly dangerous here.
  for example, changing
    type _ t = Id : 'a -> 'a t
  to 
    type 'a t = Id : 'a -> 'a t | Foo of 'a
  introduce, if I'm not mistaken, a semantic-changing variable captures.
  (I thought other dark corners of the type declarations already had this behavior, but right now I can't remember which ones)

type 'a t = Id : 'a -> 'a t | Foo of 'a

is the same as

type 'b t = Id : 'a -> 'a t | Foo of 'b

In other words, the type variables in generalized constructor definitions are distinct from the type parameters.
 

- if I understand it correctly, (type a . a t -> a) is yet another syntax for type quantification. Why ? I thought (type a) was used to force generalization, but ('a . ...)-style annotation already force polymorphism (or don't they ?). Is it a semantic difference with ('a . 'a t -> 'a), other than its interaction with gadts ? Can we use (type a . a t -> a) in all places where we used ('a . 'a t -> 'a) before ?

(type s) does not force generalization (see above); this is why this new syntax is needed. You can use (type a . a t -> a) anywhere you used ('a. 'a t -> 'a) could before, assuming that you don't have any types a that you don't want hidden. This syntax extension is purely syntactic sugar.
 

- is there a rationale for choosing Coq-style variant syntax instead of just adding a blurb to the existing syntax, such as
    | IntLit of int : int t
  or
    | IntList of int return int t
  ?


The only rationale is that I want to make it clear that the type variables found inside generalized constructor definitions are distinct from the type parameters. In your second example, return is not a keyword in O'Caml. I could very well have chosen your first example. If there is a consensus on some alternate syntax, I have no qualms about changing it.
 
Thank you for the feedback. I will add some of these things to my webpage.

Sincerely,

Jacques Le Normand


Thanks. 

On Mon, Oct 25, 2010 at 10:39 AM, Jacques Le Normand <rathereasy@gmail.com> wrote:
Dear Caml list,

I am pleased to announce an experimental branch of the O'Caml compiler: O'Caml extended with Generalized Algebraic Datatypes. You can find more information on this webpage:



And you can grab the latest release here:

svn checkout https://yquem.inria.fr/caml/svn/ocaml/branches/gadts

Any feedback would be very much appreciated.

Sincerely,

Jacques Le Normand


_______________________________________________
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs