From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by c5ff346549e7 (Postfix) with ESMTPS id BB5FE5D3 for ; Fri, 20 Sep 2019 06:33:27 +0000 (UTC) X-IronPort-AV: E=Sophos;i="5.64,527,1559512800"; d="scan'208,217";a="402610177" Received: from sympa.inria.fr ([193.51.193.213]) by mail2-relais-roc.national.inria.fr with ESMTP; 20 Sep 2019 08:33:26 +0200 Received: by sympa.inria.fr (Postfix, from userid 20132) id BA30F7EF9D; Fri, 20 Sep 2019 08:33:25 +0200 (CEST) Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by sympa.inria.fr (Postfix) with ESMTPS id E31107EE31 for ; Fri, 20 Sep 2019 08:33:20 +0200 (CEST) Authentication-Results: mail3-smtp-sop.national.inria.fr; spf=None smtp.pra=gabriel.scherer@gmail.com; spf=Pass smtp.mailfrom=gabriel.scherer@gmail.com; spf=None smtp.helo=postmaster@mail-qt1-f179.google.com IronPort-PHdr: =?us-ascii?q?9a23=3A+8C4ExCgsUVZenOOm0hnUyQJP3N1i/DPJgcQr6Af?= =?us-ascii?q?oPdwSPTyrsbcNUDSrc9gkEXOFd2Cra4d0KyM6uu5ATNIoc7Y9ixbKtoUD15NoP?= =?us-ascii?q?5VtjRoONSCB0z/IayiRA0BN+MGamVY+WqmO1NeAsf0ag6aiHSz6TkPBke3blIt?= =?us-ascii?q?daz6FYHIksu4yf259YHNbAVUnjq9Zq55IAmroQnLucQanIVvJrwvxhfUrXdEZv?= =?us-ascii?q?hayGF1Ll6Xgxrw+9288ZF+/yhOof4t69JMXaDndKkkULJUCygrPXoo78PxrxnD?= =?us-ascii?q?SgWP5noYUmoIlxdDHhbI4hLnUJrvqyX2ruVy1jWUMs3wVrA0RC+t77x3Rx/yiS?= =?us-ascii?q?cILCA2/WfKgcFtlq1boRahpxtiw47IZYyeKfRzcr/Bcd4cWGFMRdhaWTBfDYyg?= =?us-ascii?q?bosPF+sBMvher4nhvFsFsB+yCRCxCO/z1jNEg3n71rA43es8CwHLxAMvH9wMv3?= =?us-ascii?q?rUotv7N7ocX/6pw6TT1zrPc+lb1C3h5ITUcB0sp+yHU7JqccrWzEkiDwLLgU+L?= =?us-ascii?q?poz/PjOayOANuHWG4eV8VeKglXQnpB9rojOywcoshZPGiZkPylDF6yp5xJw5Jc?= =?us-ascii?q?akR057f9GkCoVftzuBOot5R8MtWWBouCIgxrIavp67eTEHxZI6zBDRbPyHdpKH?= =?us-ascii?q?4hPlVOuJIDd4gmhleLOliBqo/0ig0PXwVsiy0FZWoCdJjMPAt3ET2BzJ7ciKTO?= =?us-ascii?q?Z28ES52TuXyQzf9uVJLVo3mKfbMZIt3789moYJvUjeHCL7nEP7h7KMeEo+4Oin?= =?us-ascii?q?8eHnb63mppCCM490jRnzMqE0lcy+BeQ0KxEOX3SG9eil2r3v4E/0TbFQgv05la?= =?us-ascii?q?nZt5/aJcAFqaKjHwBV1YMj5w6+DzegztsYgWEKIExZdB+DlYTkOFHDLOrlAfq+?= =?us-ascii?q?n1igiipnyvLCM7H5B5XCNHnDkLPvfbZn7E5czRI+zd9F6J1PELEBIez8WkvruN?= =?us-ascii?q?zDEhA5MxK7w/z5B9VnzY4eVmePDbWYMKPWq1OH+uUvI+yUaI8PpDn9M+Ql5+Lp?= =?us-ascii?q?jXIhhVASZ6yp3Z8OZHC8H/RmOFmZbGH3gtYBFGcKphAxQPbriF2ESz5TZmy9U7?= =?us-ascii?q?gy5jEhW8qaCtLmT4Smh7iElAKyApRSZWkOXlWJGGvhc4aNc/gJYSOWZMRml2pX?= =?us-ascii?q?e6KmTtoO3Bu0tQL+g4FsLufO9zdQ4Z3q3sJ06umVjhoy+CZ5FeyS1miMSyd/mW?= =?us-ascii?q?ZeFGx+57x2vUEokgTL6qN/mfENUIULv6oYADd/DobVyqlBM/63XwvAetmTT1P/?= =?us-ascii?q?G4epBDgwSpQ6xNpcOh8hSeXntQjK2m+RO5FQj6aCXcVm/afV3ny3LMF4mS6fif?= =?us-ascii?q?sRymI+S84KDlWIw65y8w+JWdzMmkSd0r+2LOEShXGcsmiEymWKsQdTVwsiCag?= =?us-ascii?q?=3D?= X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: =?us-ascii?q?A0AhDACFcYRdf7OgVdFlHAEBAQQBAQcEA?= =?us-ascii?q?QGBZ4JARYEFFhSEIoEdgl6LGIIPmx0JAQMBDCMMAQGEPwKDBBsHAQQ0EwIMAQE?= =?us-ascii?q?EAQEBAgECAwQBEwEBCQsLCCeFNgyCOikBgmcBAQECARIRBBkBGx0BAwELBgMCC?= =?us-ascii?q?wM0AgIhAQERAQUBHAYTGweDAYFpAQMODw+PJo8MgQM8iyR/FgUBF4J+BYFIQYI?= =?us-ascii?q?5ChknDWIDgT0CBxKBIowJgVg/g3UuPoIaRwOEa4JYBKxKQQeCJWUEhhyKBIQBG?= =?us-ascii?q?4MolXyWLYIIjxMPI4EcKoF5MxojUDGCOwlHEBSBTgwXg0+BPokXQTABjysBAQ?= X-IPAS-Result: =?us-ascii?q?A0AhDACFcYRdf7OgVdFlHAEBAQQBAQcEAQGBZ4JARYEFFhS?= =?us-ascii?q?EIoEdgl6LGIIPmx0JAQMBDCMMAQGEPwKDBBsHAQQ0EwIMAQEEAQEBAgECAwQBE?= =?us-ascii?q?wEBCQsLCCeFNgyCOikBgmcBAQECARIRBBkBGx0BAwELBgMCCwM0AgIhAQERAQU?= =?us-ascii?q?BHAYTGweDAYFpAQMODw+PJo8MgQM8iyR/FgUBF4J+BYFIQYI5ChknDWIDgT0CB?= =?us-ascii?q?xKBIowJgVg/g3UuPoIaRwOEa4JYBKxKQQeCJWUEhhyKBIQBG4MolXyWLYIIjxM?= =?us-ascii?q?PI4EcKoF5MxojUDGCOwlHEBSBTgwXg0+BPokXQTABjysBAQ?= X-IronPort-AV: E=Sophos;i="5.64,527,1559512800"; d="scan'208,217";a="320048722" X-MGA-submission: =?us-ascii?q?MDF4r6QmTe8QrcQFmYlk8frzZxdv6D/t4P/a2w?= =?us-ascii?q?QkYP45BCJGbkWfuvDKLupZWFmUxXHdjFsvZ7+Hb3+rWKHBTzIXNWdJik?= =?us-ascii?q?K6Lq4Pn6cmmV4dB/pTqHRDm77609f7rD+PvUmuGFksTtiMN4IjhM8vXz?= =?us-ascii?q?L98qBMBFsKwj2ORatBhA6UiA=3D=3D?= Received: from mail-qt1-f179.google.com ([209.85.160.179]) by mail3-smtp-sop.national.inria.fr with ESMTP/TLS/AES128-GCM-SHA256; 20 Sep 2019 08:33:18 +0200 Received: by mail-qt1-f179.google.com with SMTP id 3so5008427qta.1 for ; Thu, 19 Sep 2019 23:33:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=1yIeG/IeYuHL31qRwovVIoRqMe9DMqftYcBYW0VquRM=; b=YYPmyEx4qlTKSowsWwlr8979RIbuz7OgozTLq/QEsa/FIgBPqiismsFjJRB1MN8XTZ PmUM/pvSvPyFzdIyUYk+n0ULMPUxA0+PnDkqaIzPzuA+lsOW90lh5Xfbq7CvPoDG0ZE4 0BtNtbexL+0zFVPZ06euRWiw1Z4uGvhigmioCFnpnfzM/oDDwdSh3oCzDfjcsJc90fnQ yrtnlQmDEFZD0CmZwk3NyOeSarwJ2zdSXq31INCqX+wgoaYjKaDBtxowMcsDQXRtwzTA XnQaRkQBTtLfn7YDSdUf4Num1TDJw9btOjc8+V3qKcvgIqUn4IkMmSDLXlNx4/H6rTSq vurg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=1yIeG/IeYuHL31qRwovVIoRqMe9DMqftYcBYW0VquRM=; b=ULsCEB7f1lK6ku+NVwHalu0KQ1XfV+KtZ2GJWdFC2iNXQVGw6JUczFzFsW8f+DBs3X EoOuMblN7jhCLw2awpB8uZOY4aF2+fyJ3G46QwkhsqLav+4Y0B7j8HwlUbpwyT3GNjGh QrrOH3t8X4/waiHmkZKbA1qPDij1B24mh66K6/QyddaZWyxRzBOgWGmaLhPgoKJ5ULA/ E1BcvjxvDnLJ5kdzY1FOQD9AVxz3x6wZnucz7fZp6DZNteFbNhwNAOLJCPl1dX83KIwO e5idP4ZIvOwNOq3LYapnDP4nNjSaZJVYspB1BTn2MPiMpg4ttqpo7A23EU+SP3+1ph+M iUpw== X-Gm-Message-State: APjAAAWWXbLnrS7h+7HgNm7CEo6ARGxeSQamfgeQkgkBzDeKCYuuK/z4 hLLkSB6btqgGLtjtaPCINr5NaT2wvM6wOhpDy6M= X-Google-Smtp-Source: APXvYqw3+UY6OTbOD0kVk0y8M0Tql7c2549snA2tuppQVR8t6TOUlsP09kan/vxFvZENrcW2iPYI90gHgDyjYGKH2EI= X-Received: by 2002:ac8:6c44:: with SMTP id z4mr1618329qtu.229.1568961197659; Thu, 19 Sep 2019 23:33:17 -0700 (PDT) MIME-Version: 1.0 References: <86o8zfpra2.fsf@gmail.com> In-Reply-To: <86o8zfpra2.fsf@gmail.com> From: Gabriel Scherer Date: Fri, 20 Sep 2019 08:35:05 +0200 Message-ID: To: Malcolm Matalka Cc: caml-list Content-Type: multipart/alternative; boundary="000000000000da59640592f63ec5" Subject: Re: [Caml-list] How is this type inferred (GADTs) Reply-To: Gabriel Scherer X-Loop: caml-list@inria.fr X-Sequence: 17813 Errors-to: caml-list-owner@inria.fr Precedence: list Precedence: bulk Sender: caml-list-request@inria.fr X-no-archive: yes List-Id: List-Archive: List-Help: List-Owner: List-Post: List-Subscribe: List-Unsubscribe: --000000000000da59640592f63ec5 Content-Type: text/plain; charset="UTF-8" Note: the reason why you get _weak type variables (which are not polymorphic, just not-inferred-yet) is that the type-checker cannot detect that (z // (fun () -> ...)) is a value: it would have to unfold the call to (//) for this, which it doesn't do in general, and here certainly could not do given that its definition is hidden behind an abstraction boundary. I would recommend experimenting *without* the module interface and the auxiliary function, with the constructors directly, you would get slightly better types. # S(S(Z, (fun () -> Int32.zero)), (fun () -> ""));; - : (int32 -> string -> 'a, 'a) t = S (S (Z, ), ) Historically we have used module interfaces to implement "phantom types" -- because the type information there is only present in the interface, not in the definition. With GADTs, the type constraints are built into the definition itself, which is precisely what makes them more powerful; no need for an abstract interface on top. The first part of your question is about understanding how the type-inference work (how variables are manipulated by the type-checker and then "propagated back up"). This sounds like a difficult way to understand GADTs: you want to learn the typing rules *and* the type-inference algorithm at once. But only the first part is necessary to use the feature productively (with a few tips on when to use annotations, which are easy to explain and in fact explained in the manual: http://caml.inria.fr/pub/docs/manual-ocaml/manual033.html ). So instead of asking: "how did the compiler get this type?", I would ask: "why is this the right type"? I think you could convince yourself that (1) it is a correct type and (2) any other valid type would be a specialization of this type, there is no simpler solution. The second part: you wrote: let rec bind : type f r. f -> (f, r) t -> r = fun f -> function | Z -> f | S (t, v) -> bind (f (v ())) t Let's focus on the second clause: | S (t, v) -> bind (f (v ())) t we know that (f : f) holds, and that the pattern-matching is on a value of type (f, r) t, and we must return r. When pattern-matching on S (t, v), we learn extra type information from the typing rule of S: | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t if r has type (f, r) t, then (t, v) has type ((f, a -> r) t * (unit -> a)) for *some* unknown/existential type a. So within the branch we have bind : type f r. f -> (f, r) t -> r f : (f, a -> r) t v : unit -> a expected return type: r f does *not* have a function type here, so your idea of applying (f (v ())) cannot work (v does have a function type, so (v ()) is valid). The only thing you can do on f is call (bind) recursively (with what arguments?), and then you will get an (a -> r) as result. Do you see how to write a correct program from there? On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka wrote: > I have been using GADTs to make type-safe versions of APIs that are kind > of like printf. I've been using this pattern by rote and now I'm > getting to trying to understand how it works. > > I have the following code: > > module F : sig > type ('f, 'r) t > > val z : ('r, 'r) t > val (//) : ('f, 'a -> 'r) t -> (unit -> 'a) -> ('f, 'r) t > end = struct > type ('f, 'r) t = > | Z : ('r, 'r) t > | S : (('f, 'a -> 'r) t * (unit -> 'a)) -> ('f, 'r) t > > let z = Z > let (//) t f = S (t, f) > end > > And the following usage: > > utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));; > - : (int32 -> string -> '_weak1, '_weak1) F.t = > > I understand how 'r is '_weak1 (I think), but I'm still trying to figure > out how 'f gets its type by applying (//). I think I have an > explanation below, and I'm hoping someone can say if it's correct and/or > simplify it. > > Explanation: > > The constructor Z says that 'f and 'r, are the same (Z : ('r, 'r) t). > The constructor S says that the (_, _) t that it takes has some type 'f, > but that the second type variable must be of the form 'a -> 'r, sort of > deconstructing pattern match on the type (if that makes sense). And if > that (_, _) t is a Z, where we have stated the two type variables have > the same value, that means 'f must also be ('a -> 'r). So for the > first application of (//) it makes sense. If we apply (//) again, the > first parameter has the type (int32 -> '_weak1, '_weak1) t, but that > must actually mean '_weak1 in this case is string -> '_weak, and then > this is where things become jumbled in my head. If the passed in (_, _) > t must be (string -> '_weak), and the inner type actually must be (int32 > -> '_weak), then, the inner inner type must be ('_weak), the type of 'r > at Z must be (string -> int32 -> '_weak), and since 'r, and 'f are the > same type for Z, 'f must also be (string -> int32 -> '_weak), and that > knowledge propagates back up? But that doesn't feel quite right to me, > either. > > With the help of reading other code, I've figured out how to make a > function that uses a type like this that works like kprintf, however > where I'm going in this case is that I want to take a function that > matches 'f and apply it to the result of my functions. > > Something like: > > let rec bind : type f r. f -> (f, r) t -> r = fun f -> function > | Z -> > f > | S (t, v) -> > bind (f (v ())) t > > However, this does not compile given: > > Error: This expression has type f > This is not a function; it cannot be applied. > > My understanding was if I have Z, and an input that has the type f, then > that is just the return type since at Z, f and r are the same type. And > than at S, I can peel away the type of f by applying the function f and > then recursively calling. But my understanding is missing something. > > Does this make sense? > > What am I not understanding? > > Thank you, > /Malcolm > --000000000000da59640592f63ec5 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Note: the reason why you get _weak type variables (wh= ich are not polymorphic, just not-inferred-yet)=C2=A0 is that the type-chec= ker cannot detect that (z // (fun () -> ...)) is a value: it would have = to unfold the call to (//) for this, which it doesn't do in general, an= d here certainly could not do given that its definition is hidden behind an= abstraction boundary.
I would recommend experimenting *without* = the module interface and the auxiliary function, with the constructors dire= ctly, you would get slightly better types.

# S(S(Z= , (fun () -> Int32.zero)), (fun () -> ""));;
- : (int32 = -> string -> 'a, 'a) t =3D S (S (Z, <fun>), <fun>= )

Historically we have used module interfaces to i= mplement "phantom types" -- because the type information there is= only present in the interface, not in the definition. With GADTs, the type= constraints are built into the definition itself, which is precisely what = makes them more powerful; no need for an abstract interface on top.

The first part of your question is about understanding ho= w the type-inference work (how variables are manipulated by the type-checke= r and then "propagated back up"). This sounds like a difficult wa= y to understand GADTs: you want to learn the typing rules *and* the type-in= ference algorithm at once. But only the first part is necessary to use the = feature productively (with a few tips on when to use annotations, which are= easy to explain and in fact explained in the manual: http://caml.inria.fr/pub/d= ocs/manual-ocaml/manual033.html ). So instead of asking: "how did = the compiler get this type?", I would ask: "why is this the right= type"? I think you could convince yourself that (1) it is a correct t= ype and (2) any other valid type would be a specialization of this type, th= ere is no simpler solution.

The second part: you w= rote:

=C2=A0 let rec bind : type f r. f -> (f, r) t -> r =3D fun f -> fu= nction
=C2=A0 =C2=A0 | Z ->
=C2=A0 =C2=A0 =C2=A0 f
=C2=A0 =C2=A0 | S (t, v) ->
=C2=A0 =C2=A0 =C2=A0 bind (f (v ())) t

Let's f= ocus on the second clause:

=C2=A0 =C2=A0 | S (= t, v) ->
=C2=A0 =C2=A0 =C2=A0 bind (f (v ())) t

we kn= ow that (f : f) holds, and that the pattern-matching is on a value of type = (f, r) t, and we must return r.
When pattern-matching on S (t, v)= , we learn extra type information from the typing rule of S:

=C2=A0 =C2=A0 | S : (('f, 'a -> 'r) t * (unit -> 'a))= -> ('f, 'r) t

if r has type (f, r) t, = then (t, v) has type ((f, a -> r) t * (unit -> a)) for *some* unknown= /existential type a. So within the branch we have

= =C2=A0 bind : type f r. f -> (f, r) t -> r
=C2=A0 f : (f,= a -> r) t
=C2=A0 v : unit -> a
=C2=A0 expect= ed return type: r

f does *not* have a function typ= e here, so your idea of applying (f (v ())) cannot work (v does have a func= tion type, so (v ()) is valid).

The only thing you= can do on f is call (bind) recursively (with what arguments?), and then yo= u will get an (a -> r) as result.

Do you see ho= w to write a correct program from there?

=

On Fri, Sep 20, 2019 at 5:42 AM Malcolm Matalka <mmatalka@gmail.com> wrote:
<= blockquote class=3D"gmail_quote" style=3D"margin:0px 0px 0px 0.8ex;border-l= eft:1px solid rgb(204,204,204);padding-left:1ex">I have been using GADTs to= make type-safe versions of APIs that are kind
of like printf.=C2=A0 I've been using this pattern by rote and now I= 9;m
getting to trying to understand how it works.

I have the following code:

module F : sig
=C2=A0 type ('f, 'r) t

=C2=A0 val z : ('r, 'r) t
=C2=A0 val (//) : ('f, 'a -> 'r) t -> (unit -> 'a)= -> ('f, 'r) t
end =3D struct
=C2=A0 type ('f, 'r) t =3D
=C2=A0 =C2=A0 | Z : ('r, 'r) t
=C2=A0 =C2=A0 | S : (('f, 'a -> 'r) t * (unit -> 'a))= -> ('f, 'r) t

=C2=A0 let z =3D Z
=C2=A0 let (//) t f =3D S (t, f)
end

And the following usage:

utop # F.(z // (fun () -> Int32.zero) // (fun () -> ""));;<= br> - : (int32 -> string -> '_weak1, '_weak1) F.t =3D <abstr&g= t;

I understand how 'r is '_weak1 (I think), but I'm still trying = to figure
out how 'f gets its type by applying (//).=C2=A0 I think I have an
explanation below, and I'm hoping someone can say if it's correct a= nd/or
simplify it.

Explanation:

The constructor Z says that 'f and 'r, are the same (Z : ('r, &= #39;r) t).
The constructor S says that the (_, _) t that it takes has some type 'f= ,
but that the second type variable must be of the form 'a -> 'r, = sort of
deconstructing pattern match on the type (if that makes sense).=C2=A0 And i= f
that (_, _) t is a Z, where we have stated the two type variables have
the same value, that means 'f must also be ('a -> 'r). So fo= r the
first application of (//) it makes sense.=C2=A0 If we apply (//) again, the=
first parameter has the type (int32 -> '_weak1, '_weak1) t, but = that
must actually mean '_weak1 in this case is string -> '_weak, and= then
this is where things become jumbled in my head.=C2=A0 If the passed in (_, = _)
t must be (string -> '_weak), and the inner type actually must be (i= nt32
-> '_weak), then, the inner inner type must be ('_weak), the typ= e of 'r
at Z must be (string -> int32 -> '_weak), and since 'r, and &= #39;f are the
same type for Z, 'f must also be (string -> int32 -> '_weak),= and that
knowledge propagates back up?=C2=A0 But that doesn't feel quite right t= o me,
either.

With the help of reading other code, I've figured out how to make a
function that uses a type like this that works like kprintf, however
where I'm going in this case is that I want to take a function that
matches 'f and apply it to the result of my functions.

Something like:

=C2=A0 let rec bind : type f r. f -> (f, r) t -> r =3D fun f -> fu= nction
=C2=A0 =C2=A0 | Z ->
=C2=A0 =C2=A0 =C2=A0 f
=C2=A0 =C2=A0 | S (t, v) ->
=C2=A0 =C2=A0 =C2=A0 bind (f (v ())) t

However, this does not compile given:

Error: This expression has type f
=C2=A0 =C2=A0 =C2=A0 =C2=A0This is not a function; it cannot be applied.
My understanding was if I have Z, and an input that has the type f, then that is just the return type since at Z, f and r are the same type.=C2=A0 A= nd
than at S, I can peel away the type of f by applying the function f and
then recursively calling.=C2=A0 But my understanding is missing something.<= br>
Does this make sense?

What am I not understanding?

Thank you,
/Malcolm
--000000000000da59640592f63ec5--