There is a direct relation between how type evolve during GADT typing and how queries evolve during logic/relational programming. It can help to think of a GADT declaration as the set of rules of a Prolog program. (I you have never looked at Prolog, here is an occasion to learn something new). That does *not* mean that GADTs let you do prolog-style programming in types (the compiler will not automatically search for a proof, the user has to write it down explicitly as a GADT term, so this work is not done implicitly, as with type-class search for example), but it does give a way to think about GADT declarations. Your declaration (notice that I renamed the Z variable, for a good reason that will appear clear below): type (_, _) t = | Z : ('f, 'f) t | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t becomes t F F. t F R :- t F (A -> R), (unit -> A). In any case, your declaration has the following effect: - by the Z rule, all types of the form ('a1 -> 'a2 -> ... -> 'an -> 'r, 'a -> 'a2 -> ... -> 'an -> 'r) t are inhabited - by the S rule, you can "hide" an element on the right (and you have to provide a thunk for it) , moving from ('a1 -> ... -> 'an -> r, 'ak -> 'a{k+1} -> .. -> 'an -> r) t to ('a1 -> ... -> 'an -> r, 'a{k+1} -> .. -> 'an -> r) t If you do this repeatedly you end up on what you want: ('a1 -> .. -> 'an -> r, r) t. The way you write about it, it sounds like you had the following *different* definition in mind type (_, _) t = | Z : ('r, 'r) t | S : ('f, 'r) t * (unit -> 'a) -> ('a -> 'f, 'r) t let test = S (S (Z, (fun () -> 3l)), (fun () -> "foo")) corresponding to the following Prolog program that I do find easier to think about: t R R. t (A -> F) R :- t F R, (unit -> A). It turns out that with this other definition, the first implementation that you had written actually type-check fines. (You could try to write conversion functions between those two definitions to convince yourself that they are morally equivalent, but this is actually a difficult exercise.) On Fri, Sep 20, 2019 at 10:11 AM Malcolm Matalka wrote: > > Gabriel Scherer writes: > > > Note: the reason why you get _weak type variables (which are not > > polymorphic, just not-inferred-yet) is that the type-checker cannot > detect > > that (z // (fun () -> ...)) is a value: it would have to unfold the call > to > > (//) for this, which it doesn't do in general, and here certainly could > not > > do given that its definition is hidden behind an abstraction boundary. > > I would recommend experimenting *without* the module interface and the > > auxiliary function, with the constructors directly, you would get > slightly > > better types. > > > > # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));; > > - : (int32 -> string -> 'a, 'a) t = S (S (Z, ), ) > > > > Historically we have used module interfaces to implement "phantom types" > -- > > because the type information there is only present in the interface, not > in > > the definition. With GADTs, the type constraints are built into the > > definition itself, which is precisely what makes them more powerful; no > > need for an abstract interface on top. > > > > The first part of your question is about understanding how the > > type-inference work (how variables are manipulated by the type-checker > and > > then "propagated back up"). This sounds like a difficult way to > understand > > GADTs: you want to learn the typing rules *and* the type-inference > > algorithm at once. But only the first part is necessary to use the > feature > > productively (with a few tips on when to use annotations, which are easy > to > > explain and in fact explained in the manual: > > http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So instead > of > > asking: "how did the compiler get this type?", I would ask: "why is this > > the right type"? I think you could convince yourself that (1) it is a > > correct type and (2) any other valid type would be a specialization of > this > > type, there is no simpler solution. > > > > The second part: you wrote: > > > > let rec bind : type f r. f -> (f, r) t -> r = fun f -> function > > | Z -> > > f > > | S (t, v) -> > > bind (f (v ())) t > > > > Let's focus on the second clause: > > > > | S (t, v) -> > > bind (f (v ())) t > > > > we know that (f : f) holds, and that the pattern-matching is on a value > of > > type (f, r) t, and we must return r. > > When pattern-matching on S (t, v), we learn extra type information from > the > > typing rule of S: > > > > | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t > > > > if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit -> > a)) > > for *some* unknown/existential type a. So within the branch we have > > > > bind : type f r. f -> (f, r) t -> r > > f : (f, a -> r) t > > v : unit -> a > > expected return type: r > > > > f does *not* have a function type here, so your idea of applying (f (v > ())) > > cannot work (v does have a function type, so (v ()) is valid). > > > > The only thing you can do on f is call (bind) recursively (with what > > arguments?), and then you will get an (a -> r) as result. > > > > Do you see how to write a correct program from there? > > Ah-ha! I was close but my thinking was at the "wrong level" (I'm not > sure how to put it). The working implementation of bind is: > > let rec bind : type f r. f -> (f, r) t -> r = fun f -> function > | Z -> > f > | S (t, v) -> > let f' = bind f t in > f' (v ()) > > And now I see why you wanted me to look at it not through the module > interface. Looking at how the recursion would work, of course the > most-nested S will have the function corresponding to the first > parameter of the function. I was thinking that I would consume the > parameters on the way down, but it's actually on the way up. > > I am still working out in my head the answer to "why is this the right > type", I think it has to slosh around in there a bit before it truly > makes sense to me. > > Thank you! > > > > > > > > > On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka > wrote: > > > >> I have been using GADTs to make type-safe versions of APIs that are kind > >> of like printf. I've been using this pattern by rote and now I'm > >> getting to trying to understand how it works. > >> > >> I have the following code: > >> > >> module F : sig > >> type ('f, 'r) t > >> > >> val z : ('r, 'r) t > >> val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t > >> end = struct > >> type ('f, 'r) t = > >> | Z : ('r, 'r) t > >> | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t > >> > >> let z = Z > >> let (//) t f = S (t, f) > >> end > >> > >> And the following usage: > >> > >> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));; > >> - : (int32 -> string -> '_weak1, '_weak1) F.t = > >> > >> I understand how 'r is '_weak1 (I think), but I'm still trying to figure > >> out how 'f gets its type by applying (//). I think I have an > >> explanation below, and I'm hoping someone can say if it's correct and/or > >> simplify it. > >> > >> Explanation: > >> > >> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t). > >> The constructor S says that the (_, _) t that it takes has some type 'f, > >> but that the second type variable must be of the form 'a -> 'r, sort of > >> deconstructing pattern match on the type (if that makes sense). And if > >> that (_, _) t is a Z, where we have stated the two type variables have > >> the same value, that means 'f must also be ('a -> 'r). So for the > >> first application of (//) it makes sense. If we apply (//) again, the > >> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that > >> must actually mean '_weak1 in this case is string -> '_weak, and then > >> this is where things become jumbled in my head. If the passed in (_, _) > >> t must be (string -> '_weak), and the inner type actually must be (int32 > >> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r > >> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the > >> same type for Z, 'f must also be (string -> int32 -> '_weak), and that > >> knowledge propagates back up? But that doesn't feel quite right to me, > >> either. > >> > >> With the help of reading other code, I've figured out how to make a > >> function that uses a type like this that works like kprintf, however > >> where I'm going in this case is that I want to take a function that > >> matches 'f and apply it to the result of my functions. > >> > >> Something like: > >> > >> let rec bind : type f r. f -> (f, r) t -> r = fun f -> function > >> | Z -> > >> f > >> | S (t, v) -> > >> bind (f (v ())) t > >> > >> However, this does not compile given: > >> > >> Error: This expression has type f > >> This is not a function; it cannot be applied. > >> > >> My understanding was if I have Z, and an input that has the type f, then > >> that is just the return type since at Z, f and r are the same type. And > >> than at S, I can peel away the type of f by applying the function f and > >> then recursively calling. But my understanding is missing something. > >> > >> Does this make sense? > >> > >> What am I not understanding? > >> > >> Thank you, > >> /Malcolm > >> > >