caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Gabriel Scherer <gabriel.scherer@gmail.com>
To: Malcolm Matalka <mmatalka@gmail.com>
Cc: caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] How is this type inferred (GADTs)
Date: Fri, 20 Sep 2019 10:56:27 +0200	[thread overview]
Message-ID: <CAPFanBEXXy6A-L13_-awFn+trFfDTkHKQ8PgE2_MZHeY1M3Tdw@mail.gmail.com> (raw)
In-Reply-To: <CAPFanBFAr5LOEXTdRJtrX6FUQ2_LVjbVgZqiZHLMyE6KkUPL8A@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 10489 bytes --]

> (You could try to write conversion functions between those two definitions
> to convince yourself that they are morally equivalent,
> but this is actually a difficult exercise.)

type (_, _) t1 =
| Z : ('f, 'f) t1
| S : (('f, 'a -> 'r) t1 * (unit -> 'a)) -> ('f, 'r) t1

type (_, _) t2 =
| Z : ('r, 'r) t2
| S : ('f, 'r) t2 * (unit -> 'a) -> ('a -> 'f, 'r) t2

t2 is t1 "upside down" (and conversely). To go from t1 to t2 is thus a
"reverse" operation.
Just like for lists, writing "reverse" directly is difficult, but one can
use an auxiliary function

let rec rev_append
  : type a b c. (b, c) t2 -> (a, b) t1 -> (a, c) t2


On Fri, Sep 20, 2019 at 10:35 AM Gabriel Scherer <gabriel.scherer@gmail.com>
wrote:

> There is a direct relation between how type evolve during GADT typing and
> how queries evolve during logic/relational programming. It can help to
> think of a GADT declaration as the set of rules of a Prolog program. (I you
> have never looked at Prolog, here is an occasion to learn something new).
> That does *not* mean that GADTs let you do prolog-style programming in
> types (the compiler will not automatically search for a proof, the user has
> to write it down explicitly as a GADT term, so this work is not done
> implicitly, as with type-class search for example), but it does give a way
> to think about GADT declarations.
>
> Your declaration (notice that I renamed the Z variable, for a good reason
> that will appear clear below):
>
>   type (_, _) t =
>     | Z : ('f, 'f) t
>     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
>
> becomes
>
>   t F F.
>   t F R :- t F (A -> R), (unit -> A).
>
> In any case, your declaration has the following effect:
> - by the Z rule, all types of the form
>   ('a1 -> 'a2 -> ... -> 'an -> 'r, 'a -> 'a2 -> ... -> 'an -> 'r) t
>   are inhabited
> - by the S rule, you can "hide" an element on the right
>   (and you have to provide a thunk for it)
> , moving from
>   ('a1 -> ... -> 'an -> r, 'ak -> 'a{k+1} -> .. -> 'an -> r) t
>   to
>   ('a1 -> ... -> 'an -> r, 'a{k+1} -> .. -> 'an -> r) t
>
> If you do this repeatedly you end up on what you want: ('a1 -> .. -> 'an
> -> r, r) t.
>
> The way you write about it, it sounds like you had the following
> *different* definition in mind
>
> type (_, _) t =
> | Z : ('r, 'r) t
> | S : ('f, 'r) t * (unit -> 'a) -> ('a -> 'f, 'r) t
>
> let test = S (S (Z, (fun () -> 3l)), (fun () -> "foo"))
>
> corresponding to the following Prolog program that I do find easier to
> think about:
>
>   t R R.
>   t (A -> F) R :- t F R, (unit -> A).
>
> It turns out that with this other definition, the first implementation
> that you had written actually type-check fines.
>
> (You could try to write conversion functions between those two definitions
> to convince yourself that they are morally equivalent, but this is actually
> a difficult exercise.)
>
> On Fri, Sep 20, 2019 at 10:11 AM Malcolm Matalka <mmatalka@gmail.com>
> wrote:
>
>>
>> Gabriel Scherer <gabriel.scherer@gmail.com> writes:
>>
>> > Note: the reason why you get _weak type variables (which are not
>> > polymorphic, just not-inferred-yet)  is that the type-checker cannot
>> detect
>> > that (z // (fun () -> ...)) is a value: it would have to unfold the
>> call to
>> > (//) for this, which it doesn't do in general, and here certainly could
>> not
>> > do given that its definition is hidden behind an abstraction boundary.
>> > I would recommend experimenting *without* the module interface and the
>> > auxiliary function, with the constructors directly, you would get
>> slightly
>> > better types.
>> >
>> > # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));;
>> > - : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>)
>> >
>> > Historically we have used module interfaces to implement "phantom
>> types" --
>> > because the type information there is only present in the interface,
>> not in
>> > the definition. With GADTs, the type constraints are built into the
>> > definition itself, which is precisely what makes them more powerful; no
>> > need for an abstract interface on top.
>> >
>> > The first part of your question is about understanding how the
>> > type-inference work (how variables are manipulated by the type-checker
>> and
>> > then "propagated back up"). This sounds like a difficult way to
>> understand
>> > GADTs: you want to learn the typing rules *and* the type-inference
>> > algorithm at once. But only the first part is necessary to use the
>> feature
>> > productively (with a few tips on when to use annotations, which are
>> easy to
>> > explain and in fact explained in the manual:
>> > http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So
>> instead of
>> > asking: "how did the compiler get this type?", I would ask: "why is this
>> > the right type"? I think you could convince yourself that (1) it is a
>> > correct type and (2) any other valid type would be a specialization of
>> this
>> > type, there is no simpler solution.
>> >
>> > The second part: you wrote:
>> >
>> >   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
>> >     | Z ->
>> >       f
>> >     | S (t, v) ->
>> >       bind (f (v ())) t
>> >
>> > Let's focus on the second clause:
>> >
>> >     | S (t, v) ->
>> >       bind (f (v ())) t
>> >
>> > we know that (f : f) holds, and that the pattern-matching is on a value
>> of
>> > type (f, r) t, and we must return r.
>> > When pattern-matching on S (t, v), we learn extra type information from
>> the
>> > typing rule of S:
>> >
>> >     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
>> >
>> > if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit ->
>> a))
>> > for *some* unknown/existential type a. So within the branch we have
>> >
>> >   bind : type f r. f -> (f, r) t -> r
>> >   f : (f, a -> r) t
>> >   v : unit -> a
>> >   expected return type: r
>> >
>> > f does *not* have a function type here, so your idea of applying (f (v
>> ()))
>> > cannot work (v does have a function type, so (v ()) is valid).
>> >
>> > The only thing you can do on f is call (bind) recursively (with what
>> > arguments?), and then you will get an (a -> r) as result.
>> >
>> > Do you see how to write a correct program from there?
>>
>> Ah-ha!  I was close but my thinking was at the "wrong level" (I'm not
>> sure how to put it).  The working implementation of bind is:
>>
>>   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
>>     | Z ->
>>       f
>>     | S (t, v) ->
>>       let f' = bind f t in
>>       f' (v ())
>>
>> And now I see why you wanted me to look at it not through the module
>> interface.  Looking at how the recursion would work, of course the
>> most-nested S will have the function corresponding to the first
>> parameter of the function.  I was thinking that I would consume the
>> parameters on the way down, but it's actually on the way up.
>>
>> I am still working out in my head the answer to "why is this the right
>> type", I think it has to slosh around in there a bit before it truly
>> makes sense to me.
>>
>> Thank you!
>>
>> >
>> >
>> >
>> > On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <mmatalka@gmail.com>
>> wrote:
>> >
>> >> I have been using GADTs to make type-safe versions of APIs that are
>> kind
>> >> of like printf.  I've been using this pattern by rote and now I'm
>> >> getting to trying to understand how it works.
>> >>
>> >> I have the following code:
>> >>
>> >> module F : sig
>> >>   type ('f, 'r) t
>> >>
>> >>   val z : ('r, 'r) t
>> >>   val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t
>> >> end = struct
>> >>   type ('f, 'r) t =
>> >>     | Z : ('r, 'r) t
>> >>     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
>> >>
>> >>   let z = Z
>> >>   let (//) t f = S (t, f)
>> >> end
>> >>
>> >> And the following usage:
>> >>
>> >> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));;
>> >> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr>
>> >>
>> >> I understand how 'r is '_weak1 (I think), but I'm still trying to
>> figure
>> >> out how 'f gets its type by applying (//).  I think I have an
>> >> explanation below, and I'm hoping someone can say if it's correct
>> and/or
>> >> simplify it.
>> >>
>> >> Explanation:
>> >>
>> >> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t).
>> >> The constructor S says that the (_, _) t that it takes has some type
>> 'f,
>> >> but that the second type variable must be of the form 'a -> 'r, sort of
>> >> deconstructing pattern match on the type (if that makes sense).  And if
>> >> that (_, _) t is a Z, where we have stated the two type variables have
>> >> the same value, that means 'f must also be ('a -> 'r). So for the
>> >> first application of (//) it makes sense.  If we apply (//) again, the
>> >> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that
>> >> must actually mean '_weak1 in this case is string -> '_weak, and then
>> >> this is where things become jumbled in my head.  If the passed in (_,
>> _)
>> >> t must be (string -> '_weak), and the inner type actually must be
>> (int32
>> >> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r
>> >> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the
>> >> same type for Z, 'f must also be (string -> int32 -> '_weak), and that
>> >> knowledge propagates back up?  But that doesn't feel quite right to me,
>> >> either.
>> >>
>> >> With the help of reading other code, I've figured out how to make a
>> >> function that uses a type like this that works like kprintf, however
>> >> where I'm going in this case is that I want to take a function that
>> >> matches 'f and apply it to the result of my functions.
>> >>
>> >> Something like:
>> >>
>> >>   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
>> >>     | Z ->
>> >>       f
>> >>     | S (t, v) ->
>> >>       bind (f (v ())) t
>> >>
>> >> However, this does not compile given:
>> >>
>> >> Error: This expression has type f
>> >>        This is not a function; it cannot be applied.
>> >>
>> >> My understanding was if I have Z, and an input that has the type f,
>> then
>> >> that is just the return type since at Z, f and r are the same type.
>> And
>> >> than at S, I can peel away the type of f by applying the function f and
>> >> then recursively calling.  But my understanding is missing something.
>> >>
>> >> Does this make sense?
>> >>
>> >> What am I not understanding?
>> >>
>> >> Thank you,
>> >> /Malcolm
>> >>
>>
>>

[-- Attachment #2: Type: text/html, Size: 13875 bytes --]

  reply	other threads:[~2019-09-20  8:54 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-20  3:42 Malcolm Matalka
2019-09-20  6:35 ` Gabriel Scherer
2019-09-20  8:11   ` Malcolm Matalka
2019-09-20  8:35     ` Gabriel Scherer
2019-09-20  8:56       ` Gabriel Scherer [this message]
2019-09-20 14:25 ` Oleg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPFanBEXXy6A-L13_-awFn+trFfDTkHKQ8PgE2_MZHeY1M3Tdw@mail.gmail.com \
    --to=gabriel.scherer@gmail.com \
    --cc=caml-list@inria.fr \
    --cc=mmatalka@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).