From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id 0A9EABBAF for ; Tue, 26 Oct 2010 07:30:15 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AuQBAHIExkzRVaE0kGdsb2JhbACBRZFrhhsBhzZLCBUBAQEBCQkMBxEDH6NyiViCF4ZFLohXAQEDBYVDBIFagniFe4Vy X-IronPort-AV: E=Sophos;i="4.58,239,1286143200"; d="scan'208";a="63922397" Received: from mail-fx0-f52.google.com ([209.85.161.52]) by mail3-smtp-sop.national.inria.fr with ESMTP; 26 Oct 2010 07:30:14 +0200 Received: by fxm12 with SMTP id 12so3018657fxm.39 for ; Mon, 25 Oct 2010 22:30:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=bsSyIRpg61xAuBqlWFOVpc1PUFXGmjLiEosbRrErVbA=; b=ZIA4e27x7yikvsAJh4ZTJ/EIAqcYRZeq/VYlm25/nMoBMEGuktL6cV9gzrTbqVRzU9 qUIeBkmCM+73JhxmtnWNoNmY0C4N7AXM5s7oL0SNpXzMK8nStjd3TTvevByPT8IcO+5H oJZSAglPbCnX0H1RdDkPPfcyCji+MlJbkQHww= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=J5t3DWFatiA0DJxGAkndkz7/DPkfIA7aXQe3YzZM50PAT2Jj2Tiy14ruGQOauBTFAX utzRR/rwggmeePgTD7Clfl+g+khxBCyllbDCNBLXhogT/rcgocYIoXej/2l3rDdK3zAj YwKUFyTNSGijo4ETbZqcAslw0aTRewuBKT7Po= MIME-Version: 1.0 Received: by 10.239.189.2 with SMTP id r2mr1902420hbh.50.1288071013584; Mon, 25 Oct 2010 22:30:13 -0700 (PDT) Received: by 10.239.185.79 with HTTP; Mon, 25 Oct 2010 22:30:13 -0700 (PDT) In-Reply-To: References: Date: Tue, 26 Oct 2010 14:30:13 +0900 Message-ID: Subject: Re: [Caml-list] Generalized Algebraic Datatypes From: Jacques Le Normand To: bluestorm Cc: caml-list caml-list Content-Type: multipart/alternative; boundary=001485f6db5035d17104937e678c X-Spam: no; 0.00; datatypes:01 unifications:01 polymorphism:01 haskell:01 constructors:01 foo:01 checker:01 checker:01 o'caml:01 foo:01 bool:01 baz:01 bool:01 subtype:01 -'a:01 --001485f6db5035d17104937e678c Content-Type: text/plain; charset=ISO-8859-1 On Mon, Oct 25, 2010 at 6:44 PM, bluestorm wrote: > It's very interesting. > > First, I'm curious of the "historical" aspects of this work : where does it > come from ? Apparently there is work from you and Jacques Garrigue, but it's > hard to tell. Is it new, or a long-running experiment ? > > The history: the algorithm was developed, in part, for my PhD research. I've been working on it with Jacques Garrigue for the last two months. > In your "intuition" section (btw. there is a typo here, it should be (type > s) (x : s t)), you seem to present GADT as directly related to the new (type > s) construct. It's a bit surprising because it's difficult to know the > difference between this and classic type variables. I suppose it is because > only (type s) guarantee that the variable remains polymorphic, and you use > that to ensure that branch-local unifications don't "escape" to the outer > level ? Could you be a bit more explicit on this ? > > I don't know what you mean by "remains polymorphic". However, (type s) and polymorphism are quite distinct concepts. Consider the following examples: # let rec f (type s) (x : s) : s = f x;; Error: This expression has type s but an expression was expected of type s The type constructor s would escape its scope # fun (type s) ( f : s -> s) ( x : s) -> f x;; - : ('_a -> '_a) -> '_a -> '_a = The reason I chose to use newtypes, ie (type s), is that I needed a type variable which did not change (I believe the Haskell people call it rigid), so I decided to use type constructors. Another option, previously implemented, was to use polymorphic variables, ie: let rec foo : 'a. 'a t -> t = function | IntLit x -> x However, this has several disadvantages, the biggest of which is that the variable 'a cannot be referenced inside the expression since its scope is the type in which it was introduced. > It's also a bit difficult to know what's the big deal about "exhaustiveness > checks". As I understand it, you remark that with GADTs some case cannot > happen due to typing reasons, but the exhaustive check doesn't know about > it. Could you be a bit more explicit about what the exhaustiveness checker > does : > - is it exactly the same behavior as before, ignoring GADT specificities ? > (ie. you haven't changed anything) > - if not, what have you changed and how can we try to predict its reaction > to a given code ? > - what can we do when it doesn't detect an impossible case ? I suppose we > can't a pattern clause for it, as the type checker would reject it. > > This problem is not new in O'Caml. For example: # type t = { x : 'a . 'a list } ;; type t = { x : 'a. 'a list; } # let _ = function { x = [] } -> 5;; Warning 8: this pattern-matching is not exhaustive. Here is an example of a value that is not matched: {x=_::_} however, try creating a value of type ('a. 'a list) satisfying the pattern _ :: _ What I've done is written a second pass to the exhaustiveness checker. The first pass does the same thing as before, but ignores GADTs completely. The second pass exhaustively checks every possible generalized constructor combination. For example, in the code type 'a t = Foo : int t | Bar : bool t | Baz : float t let f : type s. s t * s t * s t -> s = function Foo, Foo, Foo | .... My code will check all 9 possible patterns and will output any which were missed. The pattern returned by my algorithm is a valid pattern. > I'm not sure I understand the example of the "Variance" section. > Why is the cast in that direction ? It seems to me that even if we could > force t to be covariant, this cast (to a less general type) shouldn't work : > > # type +'a t = T of 'a;; > # let a = T (object method a = 1 end);; > # (a :> < m : int; n : bool > t);; > Error: Type < a : int > t is not a subtype of < m : int; n : bool > t > > I apologize, that should be: type -'a t = C : < m : int > -> < m : int > t or, as a constraint: type -'a t = C of 'a constraint 'a = < m : int > > Again, you "Objects and variants" and "Propagation" subsections are a bit > vague. Could you give example of code exhibiting possible problems ? > > Propagation: Currently, this code doesn't compile: let rec baz : type s . s t -> s = fun (type z) -> function IntLit x -> x : s | BoolLit y -> y : s so you need to add the annotation: let rec baz : type s . s t -> s = fun (type z) -> ((function IntLit x -> x | BoolLit y -> y) : s t -> s) objects (and polymorphic variants): the following will not compile: let rec eval : type s . s t -> s = function | IntLit x -> ignore (object method m : s = failwith "foo" end : < m : int; ..>) ; x polymorphic variants in patterns: the following will not compile: let rec eval : type s . [`A] * s t -> unit = function | `A, IntLit _ -> () | `A, BoolLit _ -> () > Finally, a few syntax trolls, in decreasing order of constructivity : > > - is it a good idea to reproduce the "implicit quantification" of ordinary > types ? It seems it could be particularly dangerous here. > for example, changing > type _ t = Id : 'a -> 'a t > to > type 'a t = Id : 'a -> 'a t | Foo of 'a > introduce, if I'm not mistaken, a semantic-changing variable captures. > (I thought other dark corners of the type declarations already had this > behavior, but right now I can't remember which ones) > type 'a t = Id : 'a -> 'a t | Foo of 'a is the same as type 'b t = Id : 'a -> 'a t | Foo of 'b In other words, the type variables in generalized constructor definitions are distinct from the type parameters. > > - if I understand it correctly, (type a . a t -> a) is yet another syntax > for type quantification. Why ? I thought (type a) was used to force > generalization, but ('a . ...)-style annotation already force polymorphism > (or don't they ?). Is it a semantic difference with ('a . 'a t -> 'a), other > than its interaction with gadts ? Can we use (type a . a t -> a) in all > places where we used ('a . 'a t -> 'a) before ? > (type s) does not force generalization (see above); this is why this new syntax is needed. You can use (type a . a t -> a) anywhere you used ('a. 'a t -> 'a) could before, assuming that you don't have any types a that you don't want hidden. This syntax extension is purely syntactic sugar. > > - is there a rationale for choosing Coq-style variant syntax instead of > just adding a blurb to the existing syntax, such as > | IntLit of int : int t > or > | IntList of int return int t > ? > > The only rationale is that I want to make it clear that the type variables found inside generalized constructor definitions are distinct from the type parameters. In your second example, return is not a keyword in O'Caml. I could very well have chosen your first example. If there is a consensus on some alternate syntax, I have no qualms about changing it. Thank you for the feedback. I will add some of these things to my webpage. Sincerely, Jacques Le Normand > Thanks. > > On Mon, Oct 25, 2010 at 10:39 AM, Jacques Le Normand > wrote: > >> Dear Caml list, >> >> I am pleased to announce an experimental branch of the O'Caml compiler: >> O'Caml extended with Generalized Algebraic Datatypes. You can find more >> information on this webpage: >> >> https://sites.google.com/site/ocamlgadt/ >> >> >> And you can grab the latest release here: >> >> svn checkout https://yquem.inria.fr/caml/svn/ocaml/branches/gadts >> >> >> >> >> Any feedback would be very much appreciated. >> >> Sincerely, >> >> Jacques Le Normand >> >> >> _______________________________________________ >> Caml-list mailing list. Subscription management: >> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list >> Archives: http://caml.inria.fr >> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners >> Bug reports: http://caml.inria.fr/bin/caml-bugs >> >> --001485f6db5035d17104937e678c Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable

On Mon, Oct 25, 2010 at 6:44 PM, bluesto= rm <bluest= orm.dylc@gmail.com> wrote:
It's very interesting.

First, I'm curi= ous of the "historical" aspects of this work : where does it come= from ? Apparently there is work from you and Jacques Garrigue, but it'= s hard to tell. Is it new, or a long-running experiment ?


The history: t= he algorithm was developed, in part, for my PhD research. I've been wor= king on it with Jacques Garrigue for the last two months.
=A0
In your &quo= t;intuition" section (btw. there is a typo here, it should be (type s)= (x : s t)), you seem to present GADT as directly related to the new (type = s) construct. It's a bit surprising because it's difficult to know = the difference between this and classic type variables. I suppose it is bec= ause only (type s) guarantee that the variable remains polymorphic, and you= use that to ensure that branch-local unifications don't "escape&q= uot; to the outer level ? Could you be a bit more explicit on this ?

=A0
I don'= t know what you mean by "remains polymorphic". However, (type s) = and polymorphism are quite distinct concepts. Consider the following exampl= es:

# let rec f (type s) (x : s) : s =3D f x;;
Error: This expression has type s but an expression was expected of type= s
=A0=A0 =A0 =A0 The type constructor s would escape its scope

# fun (type s) ( f : s -> s) ( x : s) -> f x= ;;
- : ('_a -> '_a) -> '_a -> '_a =3D &l= t;fun>


The reason I chose = to use newtypes, ie (type s), is that I needed a type variable which did no= t change (I believe the Haskell people call it rigid), so I decided to use = type constructors. Another option, previously implemented, was to use polym= orphic variables, ie:

let rec foo : 'a. 'a t -> t =3D=A0
=A0=A0 =A0function=A0
=A0=A0 =A0 =A0 =A0| IntLit x -> x
=


However, this has several disadvantages,= the biggest of which is that=A0=A0the variable 'a cannot be referenced= inside the expression since its scope is the type in which it was introduc= ed.=A0


=A0
It's also a bit difficult to know what&= #39;s the big deal about "exhaustiveness checks". As I understand= it, you remark that with GADTs some case cannot happen due to typing reaso= ns, but the exhaustive check doesn't know about it. Could you be a bit = more explicit about what the exhaustiveness checker does :
- is it exactly the same behavior as before, ignoring GADT specificiti= es ? (ie. you haven't changed anything)
- if not, what have y= ou changed and how can we try to predict its reaction to a given code ?
- what can we do when it doesn't detect an impossible case ? I sup= pose we can't a pattern clause for it, as the type checker would reject= it.


This problem is not new in O'Caml. For example:

# type t =3D { x : 'a . 'a list } ;;
type t = =3D { x : 'a. 'a list; }
# let _ =3D function { x =3D [] = } -> 5;;
Warning 8: this pattern-matching is not exhaustive.
Here is = an example of a value that is not matched:
{x=3D_::_}
=

however, try creating a value of type ('a. 'a l= ist) satisfying the pattern _ :: _

What I've done is written a second pass to the exha= ustiveness checker. The first pass does the same thing as before, but ignor= es GADTs completely. The second pass exhaustively checks every possible gen= eralized constructor combination.=A0

For example, in the code

type = 'a t =3D Foo : int t | Bar : bool t | Baz : float t

let f : type s. s t * s t * s t -> s =3D=A0
=A0=A0 =A0fu= nction
=A0=A0 =A0 =A0 =A0 Foo, Foo, Foo
=A0=A0 =A0 =A0| ....
<= div>
My code will check all 9 possible patterns and will outp= ut any which were missed.=A0The pattern returned by my algorithm is a valid= pattern.
=A0
I&#= 39;m not sure I understand the example of the "Variance" section.=
Why is the cast in that direction ? It seems to me that even if we cou= ld force t to be covariant, this cast (to a less general type) shouldn'= t work :

=A0=A0# type +'a t =3D T of 'a;;<= /div>
=A0=A0# let a =3D T (object method a =3D 1 end);;
=A0= =A0# (a :> < m : int; n : bool > t);;
=A0=A0Error: Type = < a : int > t is not a subtype of < m : int; n : bool > t=A0


I apologize, that = should be:=A0

type -'a t =3D C : < m : int > -> < m : int &g= t; t

or, as a constraint:

type -&= #39;a t =3D C of 'a constraint 'a =3D < m : int >=A0
=A0
Again, you "Objects and variants" and "Propagatio= n" subsections are a bit vague. Could you give example of code exhibit= ing possible problems ?


Propagation:

Currently, this c= ode doesn't compile:

=A0=A0 =A0let rec ba= z : type s . s t -> s =3D=A0
=A0=A0 =A0 =A0fun (type z) -><= /div>
= function
=A0 = =A0IntLit x -> x : s
=A0| BoolLit y -> y : s

so you need to add the annotation:

=A0=A0 =A0let rec baz : type s . s t -> s =3D= =A0
=A0=A0 =A0 =A0fun (type z) ->
((function
=A0 =A0IntLit x = -> x=A0
=A0|= BoolLit y -> y) : s t -> s)

objects (= and polymorphic variants):

the following will not = compile:

=A0=A0 =A0let rec eval : type s . s t -> s =3D= =A0
=A0=A0 =A0 =A0function
| IntLit x -> ignore (object method= m : s =3D failwith "foo" end : < m : int; ..>) ; x

polymorphic variants in patterns:

<= /div>
the following will not compile:

=A0= =A0 =A0let rec eval : type s . [`A] * s t -> unit =3D=A0
=A0= =A0 =A0 =A0function
| `A,= IntLit _ -> ()
| `A, BoolLit _ -> ()
=A0
Finally, a few syntax trolls, in decreasing = order of constructivity :

- is it a good idea to reproduce the "implicit qua= ntification" of ordinary types ? It seems it could be particularly dan= gerous here.
=A0=A0for example, changing
=A0=A0 =A0type= _ t =3D Id : 'a -> 'a t
=A0=A0to=A0
=A0=A0 =A0type 'a t =3D Id : 'a -> &#= 39;a t | Foo of 'a
=A0=A0introduce, if I'm not mistaken, = a semantic-changing variable captures.
=A0=A0(I thought other dar= k corners of the type declarations already had this behavior, but right now= I can't remember which ones)

= type 'a t =3D Id : 'a -> 'a t | Foo of 'a

=
is the same as

type 'b t =3D Id : &= #39;a -> 'a t | Foo of 'b

In other words, the type variables in generalized const= ructor definitions are distinct from the type parameters.
=A0

- if I understand it correctly, (type a . a t -> a) = is yet another syntax for type quantification. Why ? I thought (type a) was= used to force generalization, but ('a . ...)-style annotation already = force polymorphism (or don't they ?). Is it a semantic difference with = ('a . 'a t -> 'a), other than its interaction with gadts ? C= an we use (type a . a t -> a) in all places where we used ('a . '= ;a t -> 'a) before ?

(type s) does not force = generalization (see above); this is why this new syntax is needed. You can = use (type a . a t -> a) anywhere you used ('a. 'a t -> 'a= ) could before, assuming that you don't have any types a that you don&#= 39;t want hidden. This syntax extension is purely syntactic sugar.
=A0

- is there a rationale for choosing Coq-style variant s= yntax instead of just adding a blurb to the existing syntax, such as
<= div>=A0=A0 =A0| IntLit of int : int t
=A0=A0or
=A0=A0 = =A0| IntList of int return int t
=A0=A0?


The only rationale is that I want to make it clear that the type v= ariables found inside generalized constructor definitions are distinct from= the type parameters. In your second example, return is not a keyword in O&= #39;Caml. I could very well have chosen your first example. If there is a c= onsensus on some alternate syntax, I have no qualms about changing it.
=A0
Thank you for the feedback. I will add some of these thi= ngs to my webpage.

Sincerely,

=
Jacques Le Normand


Thanks.=A0
<= /div>

<= /div>
On Mon, Oct 25, 2010 at 10:39 AM, Jacques L= e Normand=A0<rathereasy@gmail.com>=A0wrote:
Dear Caml list,

I am pleased to announce an experimental= branch of the O'Caml compiler: O'Caml extended with Generalized Al= gebraic Datatypes. You can find more information on this webpage:


=

And you can grab the latest release here:

svn checkout https:=
//yquem.inria.fr/caml/svn/ocaml/branches/gadts

<= span style=3D"font-family:Times;font-size:medium">
Any feedback would be very much appreciated.

Since= rely,

Jacques Le Normand


______________________________________= _________
Caml-list mailing list. Subscription management:
http://yque= m.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives:=A0http://caml.inria.fr
Beginner's list:=A0http://groups.yahoo.com/group/ocaml_beginnersBug reports:=A0http://caml.inria.fr/bin/caml-bugs


--001485f6db5035d17104937e678c--