*[Caml-list] How is this type inferred (GADTs)@ 2019-09-20 3:42 Malcolm Matalka2019-09-20 6:35 ` Gabriel Scherer 2019-09-20 14:25 ` Oleg 0 siblings, 2 replies; 6+ messages in thread From: Malcolm Matalka @ 2019-09-20 3:42 UTC (permalink / raw) To: caml-list I have been using GADTs to make type-safe versions of APIs that are kind of like printf. I've been using this pattern by rote and now I'm getting to trying to understand how it works. I have the following code: module F : sig type ('f, 'r) t val z : ('r, 'r) t val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t end = struct type ('f, 'r) t = | Z : ('r, 'r) t | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t let z = Z let (//) t f = S (t, f) end And the following usage: utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));; - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr> I understand how 'r is '_weak1 (I think), but I'm still trying to figure out how 'f gets its type by applying (//). I think I have an explanation below, and I'm hoping someone can say if it's correct and/or simplify it. Explanation: The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t). The constructor S says that the (_, _) t that it takes has some type 'f, but that the second type variable must be of the form 'a -> 'r, sort of deconstructing pattern match on the type (if that makes sense). And if that (_, _) t is a Z, where we have stated the two type variables have the same value, that means 'f must also be ('a -> 'r). So for the first application of (//) it makes sense. If we apply (//) again, the first parameter has the type (int32 -> '_weak1, '_weak1) t, but that must actually mean '_weak1 in this case is string -> '_weak, and then this is where things become jumbled in my head. If the passed in (_, _) t must be (string -> '_weak), and the inner type actually must be (int32 -> '_weak), then, the inner inner type must be ('_weak), the type of 'r at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the same type for Z, 'f must also be (string -> int32 -> '_weak), and that knowledge propagates back up? But that doesn't feel quite right to me, either. With the help of reading other code, I've figured out how to make a function that uses a type like this that works like kprintf, however where I'm going in this case is that I want to take a function that matches 'f and apply it to the result of my functions. Something like: let rec bind : type f r. f -> (f, r) t -> r = fun f -> function | Z -> f | S (t, v) -> bind (f (v ())) t However, this does not compile given: Error: This expression has type f This is not a function; it cannot be applied. My understanding was if I have Z, and an input that has the type f, then that is just the return type since at Z, f and r are the same type. And than at S, I can peel away the type of f by applying the function f and then recursively calling. But my understanding is missing something. Does this make sense? What am I not understanding? Thank you, /Malcolm ^ permalink raw reply [flat|nested] 6+ messages in thread

*Re: [Caml-list] How is this type inferred (GADTs)2019-09-20 3:42 [Caml-list] How is this type inferred (GADTs) Malcolm Matalka@ 2019-09-20 6:35 ` Gabriel Scherer2019-09-20 8:11 ` Malcolm Matalka 2019-09-20 14:25 ` Oleg 1 sibling, 1 reply; 6+ messages in thread From: Gabriel Scherer @ 2019-09-20 6:35 UTC (permalink / raw) To: Malcolm Matalka;+Cc:caml-list [-- Attachment #1: Type: text/plain, Size: 5898 bytes --] Note: the reason why you get _weak type variables (which are not polymorphic, just not-inferred-yet) is that the type-checker cannot detect that (z // (fun () -> ...)) is a value: it would have to unfold the call to (//) for this, which it doesn't do in general, and here certainly could not do given that its definition is hidden behind an abstraction boundary. I would recommend experimenting *without* the module interface and the auxiliary function, with the constructors directly, you would get slightly better types. # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));; - : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>) Historically we have used module interfaces to implement "phantom types" -- because the type information there is only present in the interface, not in the definition. With GADTs, the type constraints are built into the definition itself, which is precisely what makes them more powerful; no need for an abstract interface on top. The first part of your question is about understanding how the type-inference work (how variables are manipulated by the type-checker and then "propagated back up"). This sounds like a difficult way to understand GADTs: you want to learn the typing rules *and* the type-inference algorithm at once. But only the first part is necessary to use the feature productively (with a few tips on when to use annotations, which are easy to explain and in fact explained in the manual: http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So instead of asking: "how did the compiler get this type?", I would ask: "why is this the right type"? I think you could convince yourself that (1) it is a correct type and (2) any other valid type would be a specialization of this type, there is no simpler solution. The second part: you wrote: let rec bind : type f r. f -> (f, r) t -> r = fun f -> function | Z -> f | S (t, v) -> bind (f (v ())) t Let's focus on the second clause: | S (t, v) -> bind (f (v ())) t we know that (f : f) holds, and that the pattern-matching is on a value of type (f, r) t, and we must return r. When pattern-matching on S (t, v), we learn extra type information from the typing rule of S: | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit -> a)) for *some* unknown/existential type a. So within the branch we have bind : type f r. f -> (f, r) t -> r f : (f, a -> r) t v : unit -> a expected return type: r f does *not* have a function type here, so your idea of applying (f (v ())) cannot work (v does have a function type, so (v ()) is valid). The only thing you can do on f is call (bind) recursively (with what arguments?), and then you will get an (a -> r) as result. Do you see how to write a correct program from there? On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <mmatalka@gmail.com> wrote: > I have been using GADTs to make type-safe versions of APIs that are kind > of like printf. I've been using this pattern by rote and now I'm > getting to trying to understand how it works. > > I have the following code: > > module F : sig > type ('f, 'r) t > > val z : ('r, 'r) t > val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t > end = struct > type ('f, 'r) t = > | Z : ('r, 'r) t > | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t > > let z = Z > let (//) t f = S (t, f) > end > > And the following usage: > > utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));; > - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr> > > I understand how 'r is '_weak1 (I think), but I'm still trying to figure > out how 'f gets its type by applying (//). I think I have an > explanation below, and I'm hoping someone can say if it's correct and/or > simplify it. > > Explanation: > > The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t). > The constructor S says that the (_, _) t that it takes has some type 'f, > but that the second type variable must be of the form 'a -> 'r, sort of > deconstructing pattern match on the type (if that makes sense). And if > that (_, _) t is a Z, where we have stated the two type variables have > the same value, that means 'f must also be ('a -> 'r). So for the > first application of (//) it makes sense. If we apply (//) again, the > first parameter has the type (int32 -> '_weak1, '_weak1) t, but that > must actually mean '_weak1 in this case is string -> '_weak, and then > this is where things become jumbled in my head. If the passed in (_, _) > t must be (string -> '_weak), and the inner type actually must be (int32 > -> '_weak), then, the inner inner type must be ('_weak), the type of 'r > at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the > same type for Z, 'f must also be (string -> int32 -> '_weak), and that > knowledge propagates back up? But that doesn't feel quite right to me, > either. > > With the help of reading other code, I've figured out how to make a > function that uses a type like this that works like kprintf, however > where I'm going in this case is that I want to take a function that > matches 'f and apply it to the result of my functions. > > Something like: > > let rec bind : type f r. f -> (f, r) t -> r = fun f -> function > | Z -> > f > | S (t, v) -> > bind (f (v ())) t > > However, this does not compile given: > > Error: This expression has type f > This is not a function; it cannot be applied. > > My understanding was if I have Z, and an input that has the type f, then > that is just the return type since at Z, f and r are the same type. And > than at S, I can peel away the type of f by applying the function f and > then recursively calling. But my understanding is missing something. > > Does this make sense? > > What am I not understanding? > > Thank you, > /Malcolm > [-- Attachment #2: Type: text/html, Size: 7529 bytes --] <div dir="ltr"><div>Note: the reason why you get _weak type variables (which are not polymorphic, just not-inferred-yet) is that the type-checker cannot detect that (z // (fun () -> ...)) is a value: it would have to unfold the call to (//) for this, which it doesn't do in general, and here certainly could not do given that its definition is hidden behind an abstraction boundary.</div><div>I would recommend experimenting *without* the module interface and the auxiliary function, with the constructors directly, you would get slightly better types.</div><div><br></div><div># S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));;<br>- : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>)</div><div><br></div><div>Historically we have used module interfaces to implement "phantom types" -- because the type information there is only present in the interface, not in the definition. With GADTs, the type constraints are built into the definition itself, which is precisely what makes them more powerful; no need for an abstract interface on top.</div><div><br></div><div>The first part of your question is about understanding how the type-inference work (how variables are manipulated by the type-checker and then "propagated back up"). This sounds like a difficult way to understand GADTs: you want to learn the typing rules *and* the type-inference algorithm at once. But only the first part is necessary to use the feature productively (with a few tips on when to use annotations, which are easy to explain and in fact explained in the manual: <a href="http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html">http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html</a> ). So instead of asking: "how did the compiler get this type?", I would ask: "why is this the right type"? I think you could convince yourself that (1) it is a correct type and (2) any other valid type would be a specialization of this type, there is no simpler solution.</div><div><br></div><div>The second part: you wrote:<br></div><div> <br> let rec bind : type f r. f -> (f, r) t -> r = fun f -> function<br> | Z -><br> f<br> | S (t, v) -><br> bind (f (v ())) t</div><div><br></div><div>Let's focus on the second clause:<br></div><div><br></div><div> | S (t, v) -><br></div><div><div> bind (f (v ())) t</div></div><div><br></div><div>we know that (f : f) holds, and that the pattern-matching is on a value of type (f, r) t, and we must return r.</div><div>When pattern-matching on S (t, v), we learn extra type information from the typing rule of S:</div><div><br> | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t</div><div><br></div><div>if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit -> a)) for *some* unknown/existential type a. So within the branch we have</div><div><br></div><div> bind : type f r. f -> (f, r) t -> r </div><div> f : (f, a -> r) t<br></div><div> v : unit -> a</div><div> expected return type: r</div><div><br></div><div>f does *not* have a function type here, so your idea of applying (f (v ())) cannot work (v does have a function type, so (v ()) is valid).</div><div><br></div><div>The only thing you can do on f is call (bind) recursively (with what arguments?), and then you will get an (a -> r) as result.</div><div><br></div><div>Do you see how to write a correct program from there?<br></div><div><div><br></div><div><br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <<a href="mailto:mmatalka@gmail.com">mmatalka@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">I have been using GADTs to make type-safe versions of APIs that are kind<br> of like printf. I've been using this pattern by rote and now I'm<br> getting to trying to understand how it works.<br> <br> I have the following code:<br> <br> module F : sig<br> type ('f, 'r) t<br> <br> val z : ('r, 'r) t<br> val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t<br> end = struct<br> type ('f, 'r) t =<br> | Z : ('r, 'r) t<br> | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t<br> <br> let z = Z<br> let (//) t f = S (t, f)<br> end<br> <br> And the following usage:<br> <br> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));;<br> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr><br> <br> I understand how 'r is '_weak1 (I think), but I'm still trying to figure<br> out how 'f gets its type by applying (//). I think I have an<br> explanation below, and I'm hoping someone can say if it's correct and/or<br> simplify it.<br> <br> Explanation:<br> <br> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t).<br> The constructor S says that the (_, _) t that it takes has some type 'f,<br> but that the second type variable must be of the form 'a -> 'r, sort of<br> deconstructing pattern match on the type (if that makes sense). And if<br> that (_, _) t is a Z, where we have stated the two type variables have<br> the same value, that means 'f must also be ('a -> 'r). So for the<br> first application of (//) it makes sense. If we apply (//) again, the<br> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that<br> must actually mean '_weak1 in this case is string -> '_weak, and then<br> this is where things become jumbled in my head. If the passed in (_, _)<br> t must be (string -> '_weak), and the inner type actually must be (int32<br> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r<br> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the<br> same type for Z, 'f must also be (string -> int32 -> '_weak), and that<br> knowledge propagates back up? But that doesn't feel quite right to me,<br> either.<br> <br> With the help of reading other code, I've figured out how to make a<br> function that uses a type like this that works like kprintf, however<br> where I'm going in this case is that I want to take a function that<br> matches 'f and apply it to the result of my functions.<br> <br> Something like:<br> <br> let rec bind : type f r. f -> (f, r) t -> r = fun f -> function<br> | Z -><br> f<br> | S (t, v) -><br> bind (f (v ())) t<br> <br> However, this does not compile given:<br> <br> Error: This expression has type f<br> This is not a function; it cannot be applied.<br> <br> My understanding was if I have Z, and an input that has the type f, then<br> that is just the return type since at Z, f and r are the same type. And<br> than at S, I can peel away the type of f by applying the function f and<br> then recursively calling. But my understanding is missing something.<br> <br> Does this make sense?<br> <br> What am I not understanding?<br> <br> Thank you,<br> /Malcolm<br> </blockquote></div> ^ permalink raw reply [flat|nested] 6+ messages in thread

*Re: [Caml-list] How is this type inferred (GADTs)2019-09-20 6:35 ` Gabriel Scherer@ 2019-09-20 8:11 ` Malcolm Matalka2019-09-20 8:35 ` Gabriel Scherer 0 siblings, 1 reply; 6+ messages in thread From: Malcolm Matalka @ 2019-09-20 8:11 UTC (permalink / raw) To: Gabriel Scherer;+Cc:caml-list Gabriel Scherer <gabriel.scherer@gmail.com> writes: > Note: the reason why you get _weak type variables (which are not > polymorphic, just not-inferred-yet) is that the type-checker cannot detect > that (z // (fun () -> ...)) is a value: it would have to unfold the call to > (//) for this, which it doesn't do in general, and here certainly could not > do given that its definition is hidden behind an abstraction boundary. > I would recommend experimenting *without* the module interface and the > auxiliary function, with the constructors directly, you would get slightly > better types. > > # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));; > - : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>) > > Historically we have used module interfaces to implement "phantom types" -- > because the type information there is only present in the interface, not in > the definition. With GADTs, the type constraints are built into the > definition itself, which is precisely what makes them more powerful; no > need for an abstract interface on top. > > The first part of your question is about understanding how the > type-inference work (how variables are manipulated by the type-checker and > then "propagated back up"). This sounds like a difficult way to understand > GADTs: you want to learn the typing rules *and* the type-inference > algorithm at once. But only the first part is necessary to use the feature > productively (with a few tips on when to use annotations, which are easy to > explain and in fact explained in the manual: > http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So instead of > asking: "how did the compiler get this type?", I would ask: "why is this > the right type"? I think you could convince yourself that (1) it is a > correct type and (2) any other valid type would be a specialization of this > type, there is no simpler solution. > > The second part: you wrote: > > let rec bind : type f r. f -> (f, r) t -> r = fun f -> function > | Z -> > f > | S (t, v) -> > bind (f (v ())) t > > Let's focus on the second clause: > > | S (t, v) -> > bind (f (v ())) t > > we know that (f : f) holds, and that the pattern-matching is on a value of > type (f, r) t, and we must return r. > When pattern-matching on S (t, v), we learn extra type information from the > typing rule of S: > > | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t > > if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit -> a)) > for *some* unknown/existential type a. So within the branch we have > > bind : type f r. f -> (f, r) t -> r > f : (f, a -> r) t > v : unit -> a > expected return type: r > > f does *not* have a function type here, so your idea of applying (f (v ())) > cannot work (v does have a function type, so (v ()) is valid). > > The only thing you can do on f is call (bind) recursively (with what > arguments?), and then you will get an (a -> r) as result. > > Do you see how to write a correct program from there? Ah-ha! I was close but my thinking was at the "wrong level" (I'm not sure how to put it). The working implementation of bind is: let rec bind : type f r. f -> (f, r) t -> r = fun f -> function | Z -> f | S (t, v) -> let f' = bind f t in f' (v ()) And now I see why you wanted me to look at it not through the module interface. Looking at how the recursion would work, of course the most-nested S will have the function corresponding to the first parameter of the function. I was thinking that I would consume the parameters on the way down, but it's actually on the way up. I am still working out in my head the answer to "why is this the right type", I think it has to slosh around in there a bit before it truly makes sense to me. Thank you! > > > > On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <mmatalka@gmail.com> wrote: > >> I have been using GADTs to make type-safe versions of APIs that are kind >> of like printf. I've been using this pattern by rote and now I'm >> getting to trying to understand how it works. >> >> I have the following code: >> >> module F : sig >> type ('f, 'r) t >> >> val z : ('r, 'r) t >> val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t >> end = struct >> type ('f, 'r) t = >> | Z : ('r, 'r) t >> | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t >> >> let z = Z >> let (//) t f = S (t, f) >> end >> >> And the following usage: >> >> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));; >> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr> >> >> I understand how 'r is '_weak1 (I think), but I'm still trying to figure >> out how 'f gets its type by applying (//). I think I have an >> explanation below, and I'm hoping someone can say if it's correct and/or >> simplify it. >> >> Explanation: >> >> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t). >> The constructor S says that the (_, _) t that it takes has some type 'f, >> but that the second type variable must be of the form 'a -> 'r, sort of >> deconstructing pattern match on the type (if that makes sense). And if >> that (_, _) t is a Z, where we have stated the two type variables have >> the same value, that means 'f must also be ('a -> 'r). So for the >> first application of (//) it makes sense. If we apply (//) again, the >> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that >> must actually mean '_weak1 in this case is string -> '_weak, and then >> this is where things become jumbled in my head. If the passed in (_, _) >> t must be (string -> '_weak), and the inner type actually must be (int32 >> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r >> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the >> same type for Z, 'f must also be (string -> int32 -> '_weak), and that >> knowledge propagates back up? But that doesn't feel quite right to me, >> either. >> >> With the help of reading other code, I've figured out how to make a >> function that uses a type like this that works like kprintf, however >> where I'm going in this case is that I want to take a function that >> matches 'f and apply it to the result of my functions. >> >> Something like: >> >> let rec bind : type f r. f -> (f, r) t -> r = fun f -> function >> | Z -> >> f >> | S (t, v) -> >> bind (f (v ())) t >> >> However, this does not compile given: >> >> Error: This expression has type f >> This is not a function; it cannot be applied. >> >> My understanding was if I have Z, and an input that has the type f, then >> that is just the return type since at Z, f and r are the same type. And >> than at S, I can peel away the type of f by applying the function f and >> then recursively calling. But my understanding is missing something. >> >> Does this make sense? >> >> What am I not understanding? >> >> Thank you, >> /Malcolm >> ^ permalink raw reply [flat|nested] 6+ messages in thread

*Re: [Caml-list] How is this type inferred (GADTs)2019-09-20 8:11 ` Malcolm Matalka@ 2019-09-20 8:35 ` Gabriel Scherer2019-09-20 8:56 ` Gabriel Scherer 0 siblings, 1 reply; 6+ messages in thread From: Gabriel Scherer @ 2019-09-20 8:35 UTC (permalink / raw) To: Malcolm Matalka;+Cc:caml-list [-- Attachment #1: Type: text/plain, Size: 9450 bytes --] There is a direct relation between how type evolve during GADT typing and how queries evolve during logic/relational programming. It can help to think of a GADT declaration as the set of rules of a Prolog program. (I you have never looked at Prolog, here is an occasion to learn something new). That does *not* mean that GADTs let you do prolog-style programming in types (the compiler will not automatically search for a proof, the user has to write it down explicitly as a GADT term, so this work is not done implicitly, as with type-class search for example), but it does give a way to think about GADT declarations. Your declaration (notice that I renamed the Z variable, for a good reason that will appear clear below): type (_, _) t = | Z : ('f, 'f) t | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t becomes t F F. t F R :- t F (A -> R), (unit -> A). In any case, your declaration has the following effect: - by the Z rule, all types of the form ('a1 -> 'a2 -> ... -> 'an -> 'r, 'a -> 'a2 -> ... -> 'an -> 'r) t are inhabited - by the S rule, you can "hide" an element on the right (and you have to provide a thunk for it) , moving from ('a1 -> ... -> 'an -> r, 'ak -> 'a{k+1} -> .. -> 'an -> r) t to ('a1 -> ... -> 'an -> r, 'a{k+1} -> .. -> 'an -> r) t If you do this repeatedly you end up on what you want: ('a1 -> .. -> 'an -> r, r) t. The way you write about it, it sounds like you had the following *different* definition in mind type (_, _) t = | Z : ('r, 'r) t | S : ('f, 'r) t * (unit -> 'a) -> ('a -> 'f, 'r) t let test = S (S (Z, (fun () -> 3l)), (fun () -> "foo")) corresponding to the following Prolog program that I do find easier to think about: t R R. t (A -> F) R :- t F R, (unit -> A). It turns out that with this other definition, the first implementation that you had written actually type-check fines. (You could try to write conversion functions between those two definitions to convince yourself that they are morally equivalent, but this is actually a difficult exercise.) On Fri, Sep 20, 2019 at 10:11 AM Malcolm Matalka <mmatalka@gmail.com> wrote: > > Gabriel Scherer <gabriel.scherer@gmail.com> writes: > > > Note: the reason why you get _weak type variables (which are not > > polymorphic, just not-inferred-yet) is that the type-checker cannot > detect > > that (z // (fun () -> ...)) is a value: it would have to unfold the call > to > > (//) for this, which it doesn't do in general, and here certainly could > not > > do given that its definition is hidden behind an abstraction boundary. > > I would recommend experimenting *without* the module interface and the > > auxiliary function, with the constructors directly, you would get > slightly > > better types. > > > > # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));; > > - : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>) > > > > Historically we have used module interfaces to implement "phantom types" > -- > > because the type information there is only present in the interface, not > in > > the definition. With GADTs, the type constraints are built into the > > definition itself, which is precisely what makes them more powerful; no > > need for an abstract interface on top. > > > > The first part of your question is about understanding how the > > type-inference work (how variables are manipulated by the type-checker > and > > then "propagated back up"). This sounds like a difficult way to > understand > > GADTs: you want to learn the typing rules *and* the type-inference > > algorithm at once. But only the first part is necessary to use the > feature > > productively (with a few tips on when to use annotations, which are easy > to > > explain and in fact explained in the manual: > > http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So instead > of > > asking: "how did the compiler get this type?", I would ask: "why is this > > the right type"? I think you could convince yourself that (1) it is a > > correct type and (2) any other valid type would be a specialization of > this > > type, there is no simpler solution. > > > > The second part: you wrote: > > > > let rec bind : type f r. f -> (f, r) t -> r = fun f -> function > > | Z -> > > f > > | S (t, v) -> > > bind (f (v ())) t > > > > Let's focus on the second clause: > > > > | S (t, v) -> > > bind (f (v ())) t > > > > we know that (f : f) holds, and that the pattern-matching is on a value > of > > type (f, r) t, and we must return r. > > When pattern-matching on S (t, v), we learn extra type information from > the > > typing rule of S: > > > > | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t > > > > if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit -> > a)) > > for *some* unknown/existential type a. So within the branch we have > > > > bind : type f r. f -> (f, r) t -> r > > f : (f, a -> r) t > > v : unit -> a > > expected return type: r > > > > f does *not* have a function type here, so your idea of applying (f (v > ())) > > cannot work (v does have a function type, so (v ()) is valid). > > > > The only thing you can do on f is call (bind) recursively (with what > > arguments?), and then you will get an (a -> r) as result. > > > > Do you see how to write a correct program from there? > > Ah-ha! I was close but my thinking was at the "wrong level" (I'm not > sure how to put it). The working implementation of bind is: > > let rec bind : type f r. f -> (f, r) t -> r = fun f -> function > | Z -> > f > | S (t, v) -> > let f' = bind f t in > f' (v ()) > > And now I see why you wanted me to look at it not through the module > interface. Looking at how the recursion would work, of course the > most-nested S will have the function corresponding to the first > parameter of the function. I was thinking that I would consume the > parameters on the way down, but it's actually on the way up. > > I am still working out in my head the answer to "why is this the right > type", I think it has to slosh around in there a bit before it truly > makes sense to me. > > Thank you! > > > > > > > > > On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <mmatalka@gmail.com> > wrote: > > > >> I have been using GADTs to make type-safe versions of APIs that are kind > >> of like printf. I've been using this pattern by rote and now I'm > >> getting to trying to understand how it works. > >> > >> I have the following code: > >> > >> module F : sig > >> type ('f, 'r) t > >> > >> val z : ('r, 'r) t > >> val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t > >> end = struct > >> type ('f, 'r) t = > >> | Z : ('r, 'r) t > >> | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t > >> > >> let z = Z > >> let (//) t f = S (t, f) > >> end > >> > >> And the following usage: > >> > >> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));; > >> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr> > >> > >> I understand how 'r is '_weak1 (I think), but I'm still trying to figure > >> out how 'f gets its type by applying (//). I think I have an > >> explanation below, and I'm hoping someone can say if it's correct and/or > >> simplify it. > >> > >> Explanation: > >> > >> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t). > >> The constructor S says that the (_, _) t that it takes has some type 'f, > >> but that the second type variable must be of the form 'a -> 'r, sort of > >> deconstructing pattern match on the type (if that makes sense). And if > >> that (_, _) t is a Z, where we have stated the two type variables have > >> the same value, that means 'f must also be ('a -> 'r). So for the > >> first application of (//) it makes sense. If we apply (//) again, the > >> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that > >> must actually mean '_weak1 in this case is string -> '_weak, and then > >> this is where things become jumbled in my head. If the passed in (_, _) > >> t must be (string -> '_weak), and the inner type actually must be (int32 > >> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r > >> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the > >> same type for Z, 'f must also be (string -> int32 -> '_weak), and that > >> knowledge propagates back up? But that doesn't feel quite right to me, > >> either. > >> > >> With the help of reading other code, I've figured out how to make a > >> function that uses a type like this that works like kprintf, however > >> where I'm going in this case is that I want to take a function that > >> matches 'f and apply it to the result of my functions. > >> > >> Something like: > >> > >> let rec bind : type f r. f -> (f, r) t -> r = fun f -> function > >> | Z -> > >> f > >> | S (t, v) -> > >> bind (f (v ())) t > >> > >> However, this does not compile given: > >> > >> Error: This expression has type f > >> This is not a function; it cannot be applied. > >> > >> My understanding was if I have Z, and an input that has the type f, then > >> that is just the return type since at Z, f and r are the same type. And > >> than at S, I can peel away the type of f by applying the function f and > >> then recursively calling. But my understanding is missing something. > >> > >> Does this make sense? > >> > >> What am I not understanding? > >> > >> Thank you, > >> /Malcolm > >> > > [-- Attachment #2: Type: text/html, Size: 12567 bytes --] <div dir="ltr"><div>There is a direct relation between how type evolve during GADT typing and how queries evolve during logic/relational programming. It can help to think of a GADT declaration as the set of rules of a Prolog program. (I you have never looked at Prolog, here is an occasion to learn something new). That does *not* mean that GADTs let you do prolog-style programming in types (the compiler will not automatically search for a proof, the user has to write it down explicitly as a GADT term, so this work is not done implicitly, as with type-class search for example), but it does give a way to think about GADT declarations.</div><div><br></div><div>Your declaration (notice that I renamed the Z variable, for a good reason that will appear clear below):<br></div><div><br></div><div> type (_, _) t =<br> | Z : ('f, 'f) t<br> | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t</div><div><br></div><div>becomes<br></div><div><br></div><div> t F F.<br></div><div> t F R :- t F (A -> R), (unit -> A).</div><div><br></div><div>In any case, your declaration has the following effect:</div><div>- by the Z rule, all types of the form</div><div> ('a1 -> 'a2 -> ... -> 'an -> 'r, 'a -> 'a2 -> ... -> 'an -> 'r) t <br></div><div> are inhabited<br></div><div>- by the S rule, you can "hide" an element on the right</div><div> (and you have to provide a thunk for it)<br></div><div>, moving from</div><div> ('a1 -> ... -> 'an -> r, 'ak -> 'a{k+1} -> .. -> 'an -> r) t<br></div><div> to</div><div> ('a1 -> ... -> 'an -> r, 'a{k+1} -> .. -> 'an -> r) t<br><div> <br></div><div>If you do this repeatedly you end up on what you want: ('a1 -> .. -> 'an -> r, r) t.</div><div><br></div><div>The way you write about it, it sounds like you had the following *different* definition in mind</div><div><br></div><div>type (_, _) t =<br>| Z : ('r, 'r) t<br>| S : ('f, 'r) t * (unit -> 'a) -> ('a -> 'f, 'r) t<br><br>let test = S (S (Z, (fun () -> 3l)), (fun () -> "foo"))</div><div><br></div><div>corresponding to the following Prolog program that I do find easier to think about:<br></div><div><br></div><div><div> t R R.<br></div><div> t (A -> F) R :- t F R, (unit -> A).</div><div><br></div></div><div>It turns out that with this other definition, the first implementation that you had written actually type-check fines.</div><div><br></div><div>(You could try to write conversion functions between those two definitions to convince yourself that they are morally equivalent, but this is actually a difficult exercise.)<br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Sep 20, 2019 at 10:11 AM Malcolm Matalka <<a href="mailto:mmatalka@gmail.com">mmatalka@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br> Gabriel Scherer <<a href="mailto:gabriel.scherer@gmail.com" target="_blank">gabriel.scherer@gmail.com</a>> writes:<br> <br> > Note: the reason why you get _weak type variables (which are not<br> > polymorphic, just not-inferred-yet) is that the type-checker cannot detect<br> > that (z // (fun () -> ...)) is a value: it would have to unfold the call to<br> > (//) for this, which it doesn't do in general, and here certainly could not<br> > do given that its definition is hidden behind an abstraction boundary.<br> > I would recommend experimenting *without* the module interface and the<br> > auxiliary function, with the constructors directly, you would get slightly<br> > better types.<br> ><br> > # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));;<br> > - : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>)<br> ><br> > Historically we have used module interfaces to implement "phantom types" --<br> > because the type information there is only present in the interface, not in<br> > the definition. With GADTs, the type constraints are built into the<br> > definition itself, which is precisely what makes them more powerful; no<br> > need for an abstract interface on top.<br> ><br> > The first part of your question is about understanding how the<br> > type-inference work (how variables are manipulated by the type-checker and<br> > then "propagated back up"). This sounds like a difficult way to understand<br> > GADTs: you want to learn the typing rules *and* the type-inference<br> > algorithm at once. But only the first part is necessary to use the feature<br> > productively (with a few tips on when to use annotations, which are easy to<br> > explain and in fact explained in the manual:<br> > <a href="http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html" rel="noreferrer" target="_blank">http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html</a> ). So instead of<br> > asking: "how did the compiler get this type?", I would ask: "why is this<br> > the right type"? I think you could convince yourself that (1) it is a<br> > correct type and (2) any other valid type would be a specialization of this<br> > type, there is no simpler solution.<br> ><br> > The second part: you wrote:<br> ><br> > let rec bind : type f r. f -> (f, r) t -> r = fun f -> function<br> > | Z -><br> > f<br> > | S (t, v) -><br> > bind (f (v ())) t<br> ><br> > Let's focus on the second clause:<br> ><br> > | S (t, v) -><br> > bind (f (v ())) t<br> ><br> > we know that (f : f) holds, and that the pattern-matching is on a value of<br> > type (f, r) t, and we must return r.<br> > When pattern-matching on S (t, v), we learn extra type information from the<br> > typing rule of S:<br> ><br> > | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t<br> ><br> > if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit -> a))<br> > for *some* unknown/existential type a. So within the branch we have<br> ><br> > bind : type f r. f -> (f, r) t -> r<br> > f : (f, a -> r) t<br> > v : unit -> a<br> > expected return type: r<br> ><br> > f does *not* have a function type here, so your idea of applying (f (v ()))<br> > cannot work (v does have a function type, so (v ()) is valid).<br> ><br> > The only thing you can do on f is call (bind) recursively (with what<br> > arguments?), and then you will get an (a -> r) as result.<br> ><br> > Do you see how to write a correct program from there?<br> <br> Ah-ha! I was close but my thinking was at the "wrong level" (I'm not<br> sure how to put it). The working implementation of bind is:<br> <br> let rec bind : type f r. f -> (f, r) t -> r = fun f -> function<br> | Z -><br> f<br> | S (t, v) -><br> let f' = bind f t in<br> f' (v ())<br> <br> And now I see why you wanted me to look at it not through the module<br> interface. Looking at how the recursion would work, of course the<br> most-nested S will have the function corresponding to the first<br> parameter of the function. I was thinking that I would consume the<br> parameters on the way down, but it's actually on the way up.<br> <br> I am still working out in my head the answer to "why is this the right<br> type", I think it has to slosh around in there a bit before it truly<br> makes sense to me.<br> <br> Thank you!<br> <br> ><br> ><br> ><br> > On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <<a href="mailto:mmatalka@gmail.com" target="_blank">mmatalka@gmail.com</a>> wrote:<br> ><br> >> I have been using GADTs to make type-safe versions of APIs that are kind<br> >> of like printf. I've been using this pattern by rote and now I'm<br> >> getting to trying to understand how it works.<br> >><br> >> I have the following code:<br> >><br> >> module F : sig<br> >> type ('f, 'r) t<br> >><br> >> val z : ('r, 'r) t<br> >> val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t<br> >> end = struct<br> >> type ('f, 'r) t =<br> >> | Z : ('r, 'r) t<br> >> | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t<br> >><br> >> let z = Z<br> >> let (//) t f = S (t, f)<br> >> end<br> >><br> >> And the following usage:<br> >><br> >> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));;<br> >> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr><br> >><br> >> I understand how 'r is '_weak1 (I think), but I'm still trying to figure<br> >> out how 'f gets its type by applying (//). I think I have an<br> >> explanation below, and I'm hoping someone can say if it's correct and/or<br> >> simplify it.<br> >><br> >> Explanation:<br> >><br> >> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t).<br> >> The constructor S says that the (_, _) t that it takes has some type 'f,<br> >> but that the second type variable must be of the form 'a -> 'r, sort of<br> >> deconstructing pattern match on the type (if that makes sense). And if<br> >> that (_, _) t is a Z, where we have stated the two type variables have<br> >> the same value, that means 'f must also be ('a -> 'r). So for the<br> >> first application of (//) it makes sense. If we apply (//) again, the<br> >> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that<br> >> must actually mean '_weak1 in this case is string -> '_weak, and then<br> >> this is where things become jumbled in my head. If the passed in (_, _)<br> >> t must be (string -> '_weak), and the inner type actually must be (int32<br> >> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r<br> >> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the<br> >> same type for Z, 'f must also be (string -> int32 -> '_weak), and that<br> >> knowledge propagates back up? But that doesn't feel quite right to me,<br> >> either.<br> >><br> >> With the help of reading other code, I've figured out how to make a<br> >> function that uses a type like this that works like kprintf, however<br> >> where I'm going in this case is that I want to take a function that<br> >> matches 'f and apply it to the result of my functions.<br> >><br> >> Something like:<br> >><br> >> let rec bind : type f r. f -> (f, r) t -> r = fun f -> function<br> >> | Z -><br> >> f<br> >> | S (t, v) -><br> >> bind (f (v ())) t<br> >><br> >> However, this does not compile given:<br> >><br> >> Error: This expression has type f<br> >> This is not a function; it cannot be applied.<br> >><br> >> My understanding was if I have Z, and an input that has the type f, then<br> >> that is just the return type since at Z, f and r are the same type. And<br> >> than at S, I can peel away the type of f by applying the function f and<br> >> then recursively calling. But my understanding is missing something.<br> >><br> >> Does this make sense?<br> >><br> >> What am I not understanding?<br> >><br> >> Thank you,<br> >> /Malcolm<br> >><br> <br> </blockquote></div> ^ permalink raw reply [flat|nested] 6+ messages in thread

*Re: [Caml-list] How is this type inferred (GADTs)2019-09-20 8:35 ` Gabriel Scherer@ 2019-09-20 8:56 ` Gabriel Scherer0 siblings, 0 replies; 6+ messages in thread From: Gabriel Scherer @ 2019-09-20 8:56 UTC (permalink / raw) To: Malcolm Matalka;+Cc:caml-list [-- Attachment #1: Type: text/plain, Size: 10489 bytes --] > (You could try to write conversion functions between those two definitions > to convince yourself that they are morally equivalent, > but this is actually a difficult exercise.) type (_, _) t1 = | Z : ('f, 'f) t1 | S : (('f, 'a -> 'r) t1 * (unit -> 'a)) -> ('f, 'r) t1 type (_, _) t2 = | Z : ('r, 'r) t2 | S : ('f, 'r) t2 * (unit -> 'a) -> ('a -> 'f, 'r) t2 t2 is t1 "upside down" (and conversely). To go from t1 to t2 is thus a "reverse" operation. Just like for lists, writing "reverse" directly is difficult, but one can use an auxiliary function let rec rev_append : type a b c. (b, c) t2 -> (a, b) t1 -> (a, c) t2 On Fri, Sep 20, 2019 at 10:35 AM Gabriel Scherer <gabriel.scherer@gmail.com> wrote: > There is a direct relation between how type evolve during GADT typing and > how queries evolve during logic/relational programming. It can help to > think of a GADT declaration as the set of rules of a Prolog program. (I you > have never looked at Prolog, here is an occasion to learn something new). > That does *not* mean that GADTs let you do prolog-style programming in > types (the compiler will not automatically search for a proof, the user has > to write it down explicitly as a GADT term, so this work is not done > implicitly, as with type-class search for example), but it does give a way > to think about GADT declarations. > > Your declaration (notice that I renamed the Z variable, for a good reason > that will appear clear below): > > type (_, _) t = > | Z : ('f, 'f) t > | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t > > becomes > > t F F. > t F R :- t F (A -> R), (unit -> A). > > In any case, your declaration has the following effect: > - by the Z rule, all types of the form > ('a1 -> 'a2 -> ... -> 'an -> 'r, 'a -> 'a2 -> ... -> 'an -> 'r) t > are inhabited > - by the S rule, you can "hide" an element on the right > (and you have to provide a thunk for it) > , moving from > ('a1 -> ... -> 'an -> r, 'ak -> 'a{k+1} -> .. -> 'an -> r) t > to > ('a1 -> ... -> 'an -> r, 'a{k+1} -> .. -> 'an -> r) t > > If you do this repeatedly you end up on what you want: ('a1 -> .. -> 'an > -> r, r) t. > > The way you write about it, it sounds like you had the following > *different* definition in mind > > type (_, _) t = > | Z : ('r, 'r) t > | S : ('f, 'r) t * (unit -> 'a) -> ('a -> 'f, 'r) t > > let test = S (S (Z, (fun () -> 3l)), (fun () -> "foo")) > > corresponding to the following Prolog program that I do find easier to > think about: > > t R R. > t (A -> F) R :- t F R, (unit -> A). > > It turns out that with this other definition, the first implementation > that you had written actually type-check fines. > > (You could try to write conversion functions between those two definitions > to convince yourself that they are morally equivalent, but this is actually > a difficult exercise.) > > On Fri, Sep 20, 2019 at 10:11 AM Malcolm Matalka <mmatalka@gmail.com> > wrote: > >> >> Gabriel Scherer <gabriel.scherer@gmail.com> writes: >> >> > Note: the reason why you get _weak type variables (which are not >> > polymorphic, just not-inferred-yet) is that the type-checker cannot >> detect >> > that (z // (fun () -> ...)) is a value: it would have to unfold the >> call to >> > (//) for this, which it doesn't do in general, and here certainly could >> not >> > do given that its definition is hidden behind an abstraction boundary. >> > I would recommend experimenting *without* the module interface and the >> > auxiliary function, with the constructors directly, you would get >> slightly >> > better types. >> > >> > # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));; >> > - : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>) >> > >> > Historically we have used module interfaces to implement "phantom >> types" -- >> > because the type information there is only present in the interface, >> not in >> > the definition. With GADTs, the type constraints are built into the >> > definition itself, which is precisely what makes them more powerful; no >> > need for an abstract interface on top. >> > >> > The first part of your question is about understanding how the >> > type-inference work (how variables are manipulated by the type-checker >> and >> > then "propagated back up"). This sounds like a difficult way to >> understand >> > GADTs: you want to learn the typing rules *and* the type-inference >> > algorithm at once. But only the first part is necessary to use the >> feature >> > productively (with a few tips on when to use annotations, which are >> easy to >> > explain and in fact explained in the manual: >> > http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So >> instead of >> > asking: "how did the compiler get this type?", I would ask: "why is this >> > the right type"? I think you could convince yourself that (1) it is a >> > correct type and (2) any other valid type would be a specialization of >> this >> > type, there is no simpler solution. >> > >> > The second part: you wrote: >> > >> > let rec bind : type f r. f -> (f, r) t -> r = fun f -> function >> > | Z -> >> > f >> > | S (t, v) -> >> > bind (f (v ())) t >> > >> > Let's focus on the second clause: >> > >> > | S (t, v) -> >> > bind (f (v ())) t >> > >> > we know that (f : f) holds, and that the pattern-matching is on a value >> of >> > type (f, r) t, and we must return r. >> > When pattern-matching on S (t, v), we learn extra type information from >> the >> > typing rule of S: >> > >> > | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t >> > >> > if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit -> >> a)) >> > for *some* unknown/existential type a. So within the branch we have >> > >> > bind : type f r. f -> (f, r) t -> r >> > f : (f, a -> r) t >> > v : unit -> a >> > expected return type: r >> > >> > f does *not* have a function type here, so your idea of applying (f (v >> ())) >> > cannot work (v does have a function type, so (v ()) is valid). >> > >> > The only thing you can do on f is call (bind) recursively (with what >> > arguments?), and then you will get an (a -> r) as result. >> > >> > Do you see how to write a correct program from there? >> >> Ah-ha! I was close but my thinking was at the "wrong level" (I'm not >> sure how to put it). The working implementation of bind is: >> >> let rec bind : type f r. f -> (f, r) t -> r = fun f -> function >> | Z -> >> f >> | S (t, v) -> >> let f' = bind f t in >> f' (v ()) >> >> And now I see why you wanted me to look at it not through the module >> interface. Looking at how the recursion would work, of course the >> most-nested S will have the function corresponding to the first >> parameter of the function. I was thinking that I would consume the >> parameters on the way down, but it's actually on the way up. >> >> I am still working out in my head the answer to "why is this the right >> type", I think it has to slosh around in there a bit before it truly >> makes sense to me. >> >> Thank you! >> >> > >> > >> > >> > On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <mmatalka@gmail.com> >> wrote: >> > >> >> I have been using GADTs to make type-safe versions of APIs that are >> kind >> >> of like printf. I've been using this pattern by rote and now I'm >> >> getting to trying to understand how it works. >> >> >> >> I have the following code: >> >> >> >> module F : sig >> >> type ('f, 'r) t >> >> >> >> val z : ('r, 'r) t >> >> val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t >> >> end = struct >> >> type ('f, 'r) t = >> >> | Z : ('r, 'r) t >> >> | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t >> >> >> >> let z = Z >> >> let (//) t f = S (t, f) >> >> end >> >> >> >> And the following usage: >> >> >> >> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));; >> >> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr> >> >> >> >> I understand how 'r is '_weak1 (I think), but I'm still trying to >> figure >> >> out how 'f gets its type by applying (//). I think I have an >> >> explanation below, and I'm hoping someone can say if it's correct >> and/or >> >> simplify it. >> >> >> >> Explanation: >> >> >> >> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t). >> >> The constructor S says that the (_, _) t that it takes has some type >> 'f, >> >> but that the second type variable must be of the form 'a -> 'r, sort of >> >> deconstructing pattern match on the type (if that makes sense). And if >> >> that (_, _) t is a Z, where we have stated the two type variables have >> >> the same value, that means 'f must also be ('a -> 'r). So for the >> >> first application of (//) it makes sense. If we apply (//) again, the >> >> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that >> >> must actually mean '_weak1 in this case is string -> '_weak, and then >> >> this is where things become jumbled in my head. If the passed in (_, >> _) >> >> t must be (string -> '_weak), and the inner type actually must be >> (int32 >> >> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r >> >> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the >> >> same type for Z, 'f must also be (string -> int32 -> '_weak), and that >> >> knowledge propagates back up? But that doesn't feel quite right to me, >> >> either. >> >> >> >> With the help of reading other code, I've figured out how to make a >> >> function that uses a type like this that works like kprintf, however >> >> where I'm going in this case is that I want to take a function that >> >> matches 'f and apply it to the result of my functions. >> >> >> >> Something like: >> >> >> >> let rec bind : type f r. f -> (f, r) t -> r = fun f -> function >> >> | Z -> >> >> f >> >> | S (t, v) -> >> >> bind (f (v ())) t >> >> >> >> However, this does not compile given: >> >> >> >> Error: This expression has type f >> >> This is not a function; it cannot be applied. >> >> >> >> My understanding was if I have Z, and an input that has the type f, >> then >> >> that is just the return type since at Z, f and r are the same type. >> And >> >> than at S, I can peel away the type of f by applying the function f and >> >> then recursively calling. But my understanding is missing something. >> >> >> >> Does this make sense? >> >> >> >> What am I not understanding? >> >> >> >> Thank you, >> >> /Malcolm >> >> >> >> [-- Attachment #2: Type: text/html, Size: 13875 bytes --] <div dir="ltr"><div>> (You could try to write conversion functions between those two definitions</div><div>> to convince yourself that they are morally equivalent,</div><div>> but this is actually a difficult exercise.)</div><div><br></div><div>type (_, _) t1 =<br>| Z : ('f, 'f) t1<br>| S : (('f, 'a -> 'r) t1 * (unit -> 'a)) -> ('f, 'r) t1<br><br>type (_, _) t2 =<br>| Z : ('r, 'r) t2<br>| S : ('f, 'r) t2 * (unit -> 'a) -> ('a -> 'f, 'r) t2<br></div><div><br></div><div>t2 is t1 "upside down" (and conversely). To go from t1 to t2 is thus a "reverse" operation.</div><div>Just like for lists, writing "reverse" directly is difficult, but one can use an auxiliary function<br></div><div><br></div><div>let rec rev_append<br> : type a b c. (b, c) t2 -> (a, b) t1 -> (a, c) t2 <br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Sep 20, 2019 at 10:35 AM Gabriel Scherer <<a href="mailto:gabriel.scherer@gmail.com">gabriel.scherer@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>There is a direct relation between how type evolve during GADT typing and how queries evolve during logic/relational programming. It can help to think of a GADT declaration as the set of rules of a Prolog program. (I you have never looked at Prolog, here is an occasion to learn something new). That does *not* mean that GADTs let you do prolog-style programming in types (the compiler will not automatically search for a proof, the user has to write it down explicitly as a GADT term, so this work is not done implicitly, as with type-class search for example), but it does give a way to think about GADT declarations.</div><div><br></div><div>Your declaration (notice that I renamed the Z variable, for a good reason that will appear clear below):<br></div><div><br></div><div> type (_, _) t =<br> | Z : ('f, 'f) t<br> | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t</div><div><br></div><div>becomes<br></div><div><br></div><div> t F F.<br></div><div> t F R :- t F (A -> R), (unit -> A).</div><div><br></div><div>In any case, your declaration has the following effect:</div><div>- by the Z rule, all types of the form</div><div> ('a1 -> 'a2 -> ... -> 'an -> 'r, 'a -> 'a2 -> ... -> 'an -> 'r) t <br></div><div> are inhabited<br></div><div>- by the S rule, you can "hide" an element on the right</div><div> (and you have to provide a thunk for it)<br></div><div>, moving from</div><div> ('a1 -> ... -> 'an -> r, 'ak -> 'a{k+1} -> .. -> 'an -> r) t<br></div><div> to</div><div> ('a1 -> ... -> 'an -> r, 'a{k+1} -> .. -> 'an -> r) t<br><div> <br></div><div>If you do this repeatedly you end up on what you want: ('a1 -> .. -> 'an -> r, r) t.</div><div><br></div><div>The way you write about it, it sounds like you had the following *different* definition in mind</div><div><br></div><div>type (_, _) t =<br>| Z : ('r, 'r) t<br>| S : ('f, 'r) t * (unit -> 'a) -> ('a -> 'f, 'r) t<br><br>let test = S (S (Z, (fun () -> 3l)), (fun () -> "foo"))</div><div><br></div><div>corresponding to the following Prolog program that I do find easier to think about:<br></div><div><br></div><div><div> t R R.<br></div><div> t (A -> F) R :- t F R, (unit -> A).</div><div><br></div></div><div>It turns out that with this other definition, the first implementation that you had written actually type-check fines.</div><div><br></div><div>(You could try to write conversion functions between those two definitions to convince yourself that they are morally equivalent, but this is actually a difficult exercise.)<br></div></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Sep 20, 2019 at 10:11 AM Malcolm Matalka <<a href="mailto:mmatalka@gmail.com" target="_blank">mmatalka@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br> Gabriel Scherer <<a href="mailto:gabriel.scherer@gmail.com" target="_blank">gabriel.scherer@gmail.com</a>> writes:<br> <br> > Note: the reason why you get _weak type variables (which are not<br> > polymorphic, just not-inferred-yet) is that the type-checker cannot detect<br> > that (z // (fun () -> ...)) is a value: it would have to unfold the call to<br> > (//) for this, which it doesn't do in general, and here certainly could not<br> > do given that its definition is hidden behind an abstraction boundary.<br> > I would recommend experimenting *without* the module interface and the<br> > auxiliary function, with the constructors directly, you would get slightly<br> > better types.<br> ><br> > # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));;<br> > - : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>)<br> ><br> > Historically we have used module interfaces to implement "phantom types" --<br> > because the type information there is only present in the interface, not in<br> > the definition. With GADTs, the type constraints are built into the<br> > definition itself, which is precisely what makes them more powerful; no<br> > need for an abstract interface on top.<br> ><br> > The first part of your question is about understanding how the<br> > type-inference work (how variables are manipulated by the type-checker and<br> > then "propagated back up"). This sounds like a difficult way to understand<br> > GADTs: you want to learn the typing rules *and* the type-inference<br> > algorithm at once. But only the first part is necessary to use the feature<br> > productively (with a few tips on when to use annotations, which are easy to<br> > explain and in fact explained in the manual:<br> > <a href="http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html" rel="noreferrer" target="_blank">http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html</a> ). So instead of<br> > asking: "how did the compiler get this type?", I would ask: "why is this<br> > the right type"? I think you could convince yourself that (1) it is a<br> > correct type and (2) any other valid type would be a specialization of this<br> > type, there is no simpler solution.<br> ><br> > The second part: you wrote:<br> ><br> > let rec bind : type f r. f -> (f, r) t -> r = fun f -> function<br> > | Z -><br> > f<br> > | S (t, v) -><br> > bind (f (v ())) t<br> ><br> > Let's focus on the second clause:<br> ><br> > | S (t, v) -><br> > bind (f (v ())) t<br> ><br> > we know that (f : f) holds, and that the pattern-matching is on a value of<br> > type (f, r) t, and we must return r.<br> > When pattern-matching on S (t, v), we learn extra type information from the<br> > typing rule of S:<br> ><br> > | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t<br> ><br> > if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit -> a))<br> > for *some* unknown/existential type a. So within the branch we have<br> ><br> > bind : type f r. f -> (f, r) t -> r<br> > f : (f, a -> r) t<br> > v : unit -> a<br> > expected return type: r<br> ><br> > f does *not* have a function type here, so your idea of applying (f (v ()))<br> > cannot work (v does have a function type, so (v ()) is valid).<br> ><br> > The only thing you can do on f is call (bind) recursively (with what<br> > arguments?), and then you will get an (a -> r) as result.<br> ><br> > Do you see how to write a correct program from there?<br> <br> Ah-ha! I was close but my thinking was at the "wrong level" (I'm not<br> sure how to put it). The working implementation of bind is:<br> <br> let rec bind : type f r. f -> (f, r) t -> r = fun f -> function<br> | Z -><br> f<br> | S (t, v) -><br> let f' = bind f t in<br> f' (v ())<br> <br> And now I see why you wanted me to look at it not through the module<br> interface. Looking at how the recursion would work, of course the<br> most-nested S will have the function corresponding to the first<br> parameter of the function. I was thinking that I would consume the<br> parameters on the way down, but it's actually on the way up.<br> <br> I am still working out in my head the answer to "why is this the right<br> type", I think it has to slosh around in there a bit before it truly<br> makes sense to me.<br> <br> Thank you!<br> <br> ><br> ><br> ><br> > On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <<a href="mailto:mmatalka@gmail.com" target="_blank">mmatalka@gmail.com</a>> wrote:<br> ><br> >> I have been using GADTs to make type-safe versions of APIs that are kind<br> >> of like printf. I've been using this pattern by rote and now I'm<br> >> getting to trying to understand how it works.<br> >><br> >> I have the following code:<br> >><br> >> module F : sig<br> >> type ('f, 'r) t<br> >><br> >> val z : ('r, 'r) t<br> >> val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t<br> >> end = struct<br> >> type ('f, 'r) t =<br> >> | Z : ('r, 'r) t<br> >> | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t<br> >><br> >> let z = Z<br> >> let (//) t f = S (t, f)<br> >> end<br> >><br> >> And the following usage:<br> >><br> >> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));;<br> >> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr><br> >><br> >> I understand how 'r is '_weak1 (I think), but I'm still trying to figure<br> >> out how 'f gets its type by applying (//). I think I have an<br> >> explanation below, and I'm hoping someone can say if it's correct and/or<br> >> simplify it.<br> >><br> >> Explanation:<br> >><br> >> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t).<br> >> The constructor S says that the (_, _) t that it takes has some type 'f,<br> >> but that the second type variable must be of the form 'a -> 'r, sort of<br> >> deconstructing pattern match on the type (if that makes sense). And if<br> >> that (_, _) t is a Z, where we have stated the two type variables have<br> >> the same value, that means 'f must also be ('a -> 'r). So for the<br> >> first application of (//) it makes sense. If we apply (//) again, the<br> >> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that<br> >> must actually mean '_weak1 in this case is string -> '_weak, and then<br> >> this is where things become jumbled in my head. If the passed in (_, _)<br> >> t must be (string -> '_weak), and the inner type actually must be (int32<br> >> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r<br> >> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the<br> >> same type for Z, 'f must also be (string -> int32 -> '_weak), and that<br> >> knowledge propagates back up? But that doesn't feel quite right to me,<br> >> either.<br> >><br> >> With the help of reading other code, I've figured out how to make a<br> >> function that uses a type like this that works like kprintf, however<br> >> where I'm going in this case is that I want to take a function that<br> >> matches 'f and apply it to the result of my functions.<br> >><br> >> Something like:<br> >><br> >> let rec bind : type f r. f -> (f, r) t -> r = fun f -> function<br> >> | Z -><br> >> f<br> >> | S (t, v) -><br> >> bind (f (v ())) t<br> >><br> >> However, this does not compile given:<br> >><br> >> Error: This expression has type f<br> >> This is not a function; it cannot be applied.<br> >><br> >> My understanding was if I have Z, and an input that has the type f, then<br> >> that is just the return type since at Z, f and r are the same type. And<br> >> than at S, I can peel away the type of f by applying the function f and<br> >> then recursively calling. But my understanding is missing something.<br> >><br> >> Does this make sense?<br> >><br> >> What am I not understanding?<br> >><br> >> Thank you,<br> >> /Malcolm<br> >><br> <br> </blockquote></div> </blockquote></div> ^ permalink raw reply [flat|nested] 6+ messages in thread

*2019-09-20 3:42 [Caml-list] How is this type inferred (GADTs) Malcolm Matalka 2019-09-20 6:35 ` Gabriel SchererRe: [Caml-list] How is this type inferred (GADTs)@ 2019-09-20 14:25 ` Oleg1 sibling, 0 replies; 6+ messages in thread From: Oleg @ 2019-09-20 14:25 UTC (permalink / raw) To: mmatalka;+Cc:caml-list Gabriel has given an excellent explanation, it is hard to add anything. Yet I will still try. > module F : sig > type ('f, 'r) t > val z : ('r, 'r) t > val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t > end = struct > type ('f, 'r) t = > | Z : ('r, 'r) t > | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t > let z = Z > let (//) t f = S (t, f) > end Perhaps it will be easier to work out the types if we don't use GADTs. They are not necessary in your example (and not needed for the typed printf either). Here is the adjusted code module F : sig type ('f, 'r) t val z : ('r, 'r) t val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t val bind : 'f -> ('f, 'r) t -> 'r end = struct type ('f, 'r) t = 'f -> 'r let z = fun x -> x (* the type annotations are all optional. They are given to show which subexpressions have which types, and let the compiler confirm that. Alas, attaching type annotations requires too many parentheses *) let (//) : type a r f. (f, a -> r) t -> (unit -> a) -> (f, r) t = fun t th -> fun (f:f) -> ((((t f):(a->r)) (th () : a)) : r) let bind f t = t f end let exp1 = F.(z // (fun () -> Int32.zero) // (fun () -> ""));; (* val exp1 : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr> *) Just for reference, here is the implementation for the opposite order, suggested by Gabriel. module G : sig type ('f, 'r) t val z : ('r, 'r) t val (//) : ('f, 'r) t -> (unit -> 'a) -> ('a -> 'f, 'r) t val bind : 'f -> ('f, 'r) t -> 'r end = struct type ('f, 'r) t = {e: 'w. ('r -> 'w) -> ('f -> 'w)} let z = {e = fun x -> x} (* the type annotations are all optional. They are given to show which subexpressions have which types, and let the compiler confirm that Alas, attaching type annotations requires too many parentheses *) let (//) : type a r f. (f, r) t -> (unit -> a) -> (a->f, r) t = fun t th -> {e = fun (type w) (k:(r->w)) -> fun (af:a->f) -> ((((t.e k):(f -> w)) ((af (th ())):f)):w)} let bind f t = t.e (fun x -> x) f end let test = G.(z // (fun () -> 3l) // (fun () -> "foo"));; (* val test : (string -> int32 -> '_weak2, '_weak2) G.t = <abstr> *) ;; ^ permalink raw reply [flat|nested] 6+ messages in thread

end of thread, back to indexThread overview:6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-09-20 3:42 [Caml-list] How is this type inferred (GADTs) Malcolm Matalka 2019-09-20 6:35 ` Gabriel Scherer 2019-09-20 8:11 ` Malcolm Matalka 2019-09-20 8:35 ` Gabriel Scherer 2019-09-20 8:56 ` Gabriel Scherer 2019-09-20 14:25 ` Oleg

caml-list - the Caml user's mailing list Archives are clonable: git clone --mirror http://inbox.vuxu.org/caml-list git clone --mirror https://inbox.ocaml.org/caml-list Example config snippet for mirrors Newsgroup available over NNTP: nntp://inbox.vuxu.org/vuxu.archive.caml-list AGPL code for this site: git clone https://public-inbox.org/public-inbox.git