Re: [Caml-list] How is this type inferred (GADTs)

caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed

From: Gabriel Scherer <gabriel.scherer@gmail.com>
To: Malcolm Matalka <mmatalka@gmail.com>
Cc: caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] How is this type inferred (GADTs)
Date: Fri, 20 Sep 2019 10:35:06 +0200	[thread overview]
Message-ID: <CAPFanBFAr5LOEXTdRJtrX6FUQ2_LVjbVgZqiZHLMyE6KkUPL8A@mail.gmail.com> (raw)
In-Reply-To: <86v9tnfkum.fsf@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 9450 bytes --]

There is a direct relation between how type evolve during GADT typing and
how queries evolve during logic/relational programming. It can help to
think of a GADT declaration as the set of rules of a Prolog program. (I you
have never looked at Prolog, here is an occasion to learn something new).
That does *not* mean that GADTs let you do prolog-style programming in
types (the compiler will not automatically search for a proof, the user has
to write it down explicitly as a GADT term, so this work is not done
implicitly, as with type-class search for example), but it does give a way
to think about GADT declarations.

Your declaration (notice that I renamed the Z variable, for a good reason
that will appear clear below):

  type (_, _) t =
    | Z : ('f, 'f) t
    | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t

becomes

  t F F.
  t F R :- t F (A -> R), (unit -> A).

In any case, your declaration has the following effect:
- by the Z rule, all types of the form
  ('a1 -> 'a2 -> ... -> 'an -> 'r, 'a -> 'a2 -> ... -> 'an -> 'r) t
  are inhabited
- by the S rule, you can "hide" an element on the right
  (and you have to provide a thunk for it)
, moving from
  ('a1 -> ... -> 'an -> r, 'ak -> 'a{k+1} -> .. -> 'an -> r) t
  to
  ('a1 -> ... -> 'an -> r, 'a{k+1} -> .. -> 'an -> r) t

If you do this repeatedly you end up on what you want: ('a1 -> .. -> 'an ->
r, r) t.

The way you write about it, it sounds like you had the following
*different* definition in mind

type (_, _) t =
| Z : ('r, 'r) t
| S : ('f, 'r) t * (unit -> 'a) -> ('a -> 'f, 'r) t

let test = S (S (Z, (fun () -> 3l)), (fun () -> "foo"))

corresponding to the following Prolog program that I do find easier to
think about:

  t R R.
  t (A -> F) R :- t F R, (unit -> A).

It turns out that with this other definition, the first implementation that
you had written actually type-check fines.

(You could try to write conversion functions between those two definitions
to convince yourself that they are morally equivalent, but this is actually
a difficult exercise.)

On Fri, Sep 20, 2019 at 10:11 AM Malcolm Matalka <mmatalka@gmail.com> wrote:

>
> Gabriel Scherer <gabriel.scherer@gmail.com> writes:
>
> > Note: the reason why you get _weak type variables (which are not
> > polymorphic, just not-inferred-yet)  is that the type-checker cannot
> detect
> > that (z // (fun () -> ...)) is a value: it would have to unfold the call
> to
> > (//) for this, which it doesn't do in general, and here certainly could
> not
> > do given that its definition is hidden behind an abstraction boundary.
> > I would recommend experimenting *without* the module interface and the
> > auxiliary function, with the constructors directly, you would get
> slightly
> > better types.
> >
> > # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));;
> > - : (int32 -> string -> 'a, 'a) t = S (S (Z, <fun>), <fun>)
> >
> > Historically we have used module interfaces to implement "phantom types"
> --
> > because the type information there is only present in the interface, not
> in
> > the definition. With GADTs, the type constraints are built into the
> > definition itself, which is precisely what makes them more powerful; no
> > need for an abstract interface on top.
> >
> > The first part of your question is about understanding how the
> > type-inference work (how variables are manipulated by the type-checker
> and
> > then "propagated back up"). This sounds like a difficult way to
> understand
> > GADTs: you want to learn the typing rules *and* the type-inference
> > algorithm at once. But only the first part is necessary to use the
> feature
> > productively (with a few tips on when to use annotations, which are easy
> to
> > explain and in fact explained in the manual:
> > http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So instead
> of
> > asking: "how did the compiler get this type?", I would ask: "why is this
> > the right type"? I think you could convince yourself that (1) it is a
> > correct type and (2) any other valid type would be a specialization of
> this
> > type, there is no simpler solution.
> >
> > The second part: you wrote:
> >
> >   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
> >     | Z ->
> >       f
> >     | S (t, v) ->
> >       bind (f (v ())) t
> >
> > Let's focus on the second clause:
> >
> >     | S (t, v) ->
> >       bind (f (v ())) t
> >
> > we know that (f : f) holds, and that the pattern-matching is on a value
> of
> > type (f, r) t, and we must return r.
> > When pattern-matching on S (t, v), we learn extra type information from
> the
> > typing rule of S:
> >
> >     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
> >
> > if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit ->
> a))
> > for *some* unknown/existential type a. So within the branch we have
> >
> >   bind : type f r. f -> (f, r) t -> r
> >   f : (f, a -> r) t
> >   v : unit -> a
> >   expected return type: r
> >
> > f does *not* have a function type here, so your idea of applying (f (v
> ()))
> > cannot work (v does have a function type, so (v ()) is valid).
> >
> > The only thing you can do on f is call (bind) recursively (with what
> > arguments?), and then you will get an (a -> r) as result.
> >
> > Do you see how to write a correct program from there?
>
> Ah-ha!  I was close but my thinking was at the "wrong level" (I'm not
> sure how to put it).  The working implementation of bind is:
>
>   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
>     | Z ->
>       f
>     | S (t, v) ->
>       let f' = bind f t in
>       f' (v ())
>
> And now I see why you wanted me to look at it not through the module
> interface.  Looking at how the recursion would work, of course the
> most-nested S will have the function corresponding to the first
> parameter of the function.  I was thinking that I would consume the
> parameters on the way down, but it's actually on the way up.
>
> I am still working out in my head the answer to "why is this the right
> type", I think it has to slosh around in there a bit before it truly
> makes sense to me.
>
> Thank you!
>
> >
> >
> >
> > On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <mmatalka@gmail.com>
> wrote:
> >
> >> I have been using GADTs to make type-safe versions of APIs that are kind
> >> of like printf.  I've been using this pattern by rote and now I'm
> >> getting to trying to understand how it works.
> >>
> >> I have the following code:
> >>
> >> module F : sig
> >>   type ('f, 'r) t
> >>
> >>   val z : ('r, 'r) t
> >>   val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t
> >> end = struct
> >>   type ('f, 'r) t =
> >>     | Z : ('r, 'r) t
> >>     | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t
> >>
> >>   let z = Z
> >>   let (//) t f = S (t, f)
> >> end
> >>
> >> And the following usage:
> >>
> >> utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));;
> >> - : (int32 -> string -> '_weak1, '_weak1) F.t = <abstr>
> >>
> >> I understand how 'r is '_weak1 (I think), but I'm still trying to figure
> >> out how 'f gets its type by applying (//).  I think I have an
> >> explanation below, and I'm hoping someone can say if it's correct and/or
> >> simplify it.
> >>
> >> Explanation:
> >>
> >> The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t).
> >> The constructor S says that the (_, _) t that it takes has some type 'f,
> >> but that the second type variable must be of the form 'a -> 'r, sort of
> >> deconstructing pattern match on the type (if that makes sense).  And if
> >> that (_, _) t is a Z, where we have stated the two type variables have
> >> the same value, that means 'f must also be ('a -> 'r). So for the
> >> first application of (//) it makes sense.  If we apply (//) again, the
> >> first parameter has the type (int32 -> '_weak1, '_weak1) t, but that
> >> must actually mean '_weak1 in this case is string -> '_weak, and then
> >> this is where things become jumbled in my head.  If the passed in (_, _)
> >> t must be (string -> '_weak), and the inner type actually must be (int32
> >> -> '_weak), then, the inner inner type must be ('_weak), the type of 'r
> >> at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the
> >> same type for Z, 'f must also be (string -> int32 -> '_weak), and that
> >> knowledge propagates back up?  But that doesn't feel quite right to me,
> >> either.
> >>
> >> With the help of reading other code, I've figured out how to make a
> >> function that uses a type like this that works like kprintf, however
> >> where I'm going in this case is that I want to take a function that
> >> matches 'f and apply it to the result of my functions.
> >>
> >> Something like:
> >>
> >>   let rec bind : type f r. f -> (f, r) t -> r = fun f -> function
> >>     | Z ->
> >>       f
> >>     | S (t, v) ->
> >>       bind (f (v ())) t
> >>
> >> However, this does not compile given:
> >>
> >> Error: This expression has type f
> >>        This is not a function; it cannot be applied.
> >>
> >> My understanding was if I have Z, and an input that has the type f, then
> >> that is just the return type since at Z, f and r are the same type.  And
> >> than at S, I can peel away the type of f by applying the function f and
> >> then recursively calling.  But my understanding is missing something.
> >>
> >> Does this make sense?
> >>
> >> What am I not understanding?
> >>
> >> Thank you,
> >> /Malcolm
> >>
>
>

[-- Attachment #2: Type: text/html, Size: 12567 bytes --]

next prev parent reply	other threads:[~2019-09-20  8:33 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-20  3:42 Malcolm Matalka
2019-09-20  6:35 ` Gabriel Scherer
2019-09-20  8:11   ` Malcolm Matalka
2019-09-20  8:35     ` Gabriel Scherer [this message]
2019-09-20  8:56       ` Gabriel Scherer
2019-09-20 14:25 ` Oleg

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPFanBFAr5LOEXTdRJtrX6FUQ2_LVjbVgZqiZHLMyE6KkUPL8A@mail.gmail.com \
    --to=gabriel.scherer@gmail.com \
    --cc=caml-list@inria.fr \
    --cc=mmatalka@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).