> (You could try to write conversion functions between those two definitions
> to convince yourself that they are morally equivalent,
> but this is actually a difficult exercise.)

type (_, _) t1 =
| Z : ('f, 'f) t1
| S : (('f, 'a -> 'r) t1 * (unit -> 'a)) -> ('f, 'r) t1

type (_, _) t2 =
| Z : ('r, 'r) t2
| S : ('f, 'r) t2 * (unit -> 'a) -> ('a -> 'f, 'r) t2

t2 is t1 "upside down" (and conversely). To go from t1 to t2 is thus a "reverse" operation.
Just like for lists, writing "reverse" directly is difficult, but one can use an auxiliary function

let rec rev_append
  : type a b c. (b, c) t2 -> (a, b) t1 -> (a, c) t2


On Fri, Sep 20, 2019 at 10:35 AM Gabriel Scherer <gabriel.scherer@gmail.com> wrote:
There is a direct relation between how type evolve during GADT typing and how queries evolve during logic/relational programming. It can help to think of a GADT declaration as the set of rules of a Prolog program. (I you have never looked at Prolog, here is an occasion to learn something new). That does *not* mean that GADTs let you do prolog-style programming in types (the compiler will not automatically search for a proof, the user has to write it down explicitly as a GADT term, so this work is not done implicitly, as with type-class search for example), but it does give a way to think about GADT declarations.

Your declaration (notice that I renamed the Z variable, for a good reason that will appear clear below):

  type (_, _) t =
    | Z : ('f, 'f) t
    | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t

becomes

  t F F.
  t F R :- t F (A -> R), (unit -> A).

In any case, your declaration has the following effect:
- by the Z rule, all types of the form
  ('a1 -> 'a2 -> ... -> 'an -> 'r, 'a -> 'a2 -> ... -> 'an -> 'r) t
  are inhabited
- by the S rule, you can "hide" an element on the right
  (and you have to provide a thunk for it)
, moving from
  ('a1 -> ... -> 'an -> r, 'ak -> 'a{k+1} -> .. -> 'an -> r) t
  to
  ('a1 -> ... -> 'an -> r, 'a{k+1} -> .. -> 'an -> r) t
 
If you do this repeatedly you end up on what you want: ('a1 -> .. -> 'an -> r, r) t.

The way you write about it, it sounds like you had the following *different* definition in mind

type (_, _) t =
| Z : ('r, 'r) t
| S : ('f, 'r) t * (unit -> 'a) -> ('a -> 'f, 'r) t

let test = S (S (Z, (fun () -> 3l)), (fun () -> "foo"))

corresponding to the following Prolog program that I do find easier to think about:

  t R R.
  t (A -> F) R :- t F R, (unit -> A).

It turns out that with this other definition, the first implementation that you had written actually type-check fines.

(You could try to write conversion functions between those two definitions to convince yourself that they are morally equivalent, but this is actually a difficult exercise.)

On Fri, Sep 20, 2019 at 10:11 AM Malcolm Matalka <mmatalka@gmail.com> wrote:

Gabriel Scherer <gabriel.scherer@gmail.com> writes:

> Note: the reason why you get _weak type variables (which are not
> polymorphic, just not-inferred-yet)  is that the type-checker cannot detect
> that (z // (fun () -> ...)) is a value: it would have to unfold the call to
> (//) for this, which it doesn't do in general, and here certainly could not
> do given that its definition is hidden behind an abstraction boundary.
> I would recommend experimenting *without* the module interface and the
> auxiliary function, with the constructors directly, you would get slightly
> better types.
>
> # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));;
> - : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>)
>
> Historically we have used module interfaces to implement "phantom types" --
> because the type information there is only present in the interface, not in
> the definition. With GADTs, the type constraints are built into the
> definition itself, which is precisely what makes them more powerful; no
> need for an abstract interface on top.
>
> The first part of your question is about understanding how the
> type-inference work (how variables are manipulated by the type-checker and
> then "propagated back up"). This sounds like a difficult way to understand
> GADTs: you want to learn the typing rules *and* the type-inference
> algorithm at once. But only the first part is necessary to use the feature
> productively (with a few tips on when to use annotations, which are easy to
> explain and in fact explained in the manual:
> http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So instead of
> asking: "how did the compiler get this type?", I would ask: "why is this
> the right type"? I think you could convince yourself that (1) it is a
> correct type and (2) any other valid type would be a specialization of this
> type, there is no simpler solution.
>
> The second part: you wrote:
>
>   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
>     | Z ->
>       f
>     | S (t, v) ->
>       bind (f (v ())) t
>
> Let's focus on the second clause:
>
>     | S (t, v) ->
>       bind (f (v ())) t
>
> we know that (f : f) holds, and that the pattern-matching is on a value of
> type (f, r) t, and we must return r.
> When pattern-matching on S (t, v), we learn extra type information from the
> typing rule of S:
>
>     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
>
> if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit -> a))
> for *some* unknown/existential type a. So within the branch we have
>
>   bind : type f r. f -> (f, r) t -> r
>   f : (f, a -> r) t
>   v : unit -> a
>   expected return type: r
>
> f does *not* have a function type here, so your idea of applying (f (v ()))
> cannot work (v does have a function type, so (v ()) is valid).
>
> The only thing you can do on f is call (bind) recursively (with what
> arguments?), and then you will get an (a -> r) as result.
>
> Do you see how to write a correct program from there?

Ah-ha!  I was close but my thinking was at the "wrong level" (I'm not
sure how to put it).  The working implementation of bind is:

  let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
    | Z ->
      f
    | S (t, v) ->
      let f' = bind f t in
      f' (v ())

And now I see why you wanted me to look at it not through the module
interface.  Looking at how the recursion would work, of course the
most-nested S will have the function corresponding to the first
parameter of the function.  I was thinking that I would consume the
parameters on the way down, but it's actually on the way up.

I am still working out in my head the answer to "why is this the right
type", I think it has to slosh around in there a bit before it truly
makes sense to me.

Thank you!

>
>
>
> On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <mmatalka@gmail.com> wrote:
>
>> I have been using GADTs to make type-safe versions of APIs that are kind
>> of like printf.  I've been using this pattern by rote and now I'm
>> getting to trying to understand how it works.
>>
>> I have the following code:
>>
>> module F : sig
>>   type ('f, 'r) t
>>
>>   val z : ('r, 'r) t
>>   val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t
>> end = struct
>>   type ('f, 'r) t =
>>     | Z : ('r, 'r) t
>>     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
>>
>>   let z = Z
>>   let (//) t f = S (t, f)
>> end
>>
>> And the following usage:
>>
>> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));;
>> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr>
>>
>> I understand how 'r is '_weak1 (I think), but I'm still trying to figure
>> out how 'f gets its type by applying (//).  I think I have an
>> explanation below, and I'm hoping someone can say if it's correct and/or
>> simplify it.
>>
>> Explanation:
>>
>> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t).
>> The constructor S says that the (_, _) t that it takes has some type 'f,
>> but that the second type variable must be of the form 'a -> 'r, sort of
>> deconstructing pattern match on the type (if that makes sense).  And if
>> that (_, _) t is a Z, where we have stated the two type variables have
>> the same value, that means 'f must also be ('a -> 'r). So for the
>> first application of (//) it makes sense.  If we apply (//) again, the
>> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that
>> must actually mean '_weak1 in this case is string -> '_weak, and then
>> this is where things become jumbled in my head.  If the passed in (_, _)
>> t must be (string -> '_weak), and the inner type actually must be (int32
>> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r
>> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the
>> same type for Z, 'f must also be (string -> int32 -> '_weak), and that
>> knowledge propagates back up?  But that doesn't feel quite right to me,
>> either.
>>
>> With the help of reading other code, I've figured out how to make a
>> function that uses a type like this that works like kprintf, however
>> where I'm going in this case is that I want to take a function that
>> matches 'f and apply it to the result of my functions.
>>
>> Something like:
>>
>>   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
>>     | Z ->
>>       f
>>     | S (t, v) ->
>>       bind (f (v ())) t
>>
>> However, this does not compile given:
>>
>> Error: This expression has type f
>>        This is not a function; it cannot be applied.
>>
>> My understanding was if I have Z, and an input that has the type f, then
>> that is just the return type since at Z, f and r are the same type.  And
>> than at S, I can peel away the type of f by applying the function f and
>> then recursively calling.  But my understanding is missing something.
>>
>> Does this make sense?
>>
>> What am I not understanding?
>>
>> Thank you,
>> /Malcolm
>>