Re: [Caml-list] Question on the covariance of a GADT (polymorphic

caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed

From: Josh Berdine <josh@berdine.net>
To: Oleg <oleg@okmij.org>
Cc: francois.pottier@inria.fr, caml-list@inria.fr
Subject: Re: [Caml-list] Question on the covariance of a GADT (polymorphic
Date: Tue, 6 Oct 2020 20:04:56 +0100	[thread overview]
Message-ID: <FEA35C98-1E7A-402E-9B5D-9585371E4009@berdine.net> (raw)
In-Reply-To: <20201006160519.GA2217@Melchior.localnet>

> On Oct 6, 2020, at 5:05 PM, Oleg <oleg@okmij.org> wrote:
> 
>> type 'a t =
>>   | Wine: [> `Wine ] t
>>   | Food: [> `Food ] t
>>   | Box : 'a t * 'a t -> 'a t
>>   | Fridge: [< `Wine | `Food ] t -> [> `Fridge ] t
>> 
>> In this definition, the type parameter 'a is a phantom parameter. It is not
>> used to describe the type of a value; it is used only to restrict the set of
>> values of type "t" that can be constructed.
>> 
>> The goal here is to allow storing wine and food (and boxes containing
>> boxes containing wine and food) in a fridge, but forbid storing a
>> fridge in a fridge (or a fridge in a box in a fridge, etc.).
> 
> This is indeed a very `popular' problem. On June 21 this year Josh
> Berdine posed almost the same question (without wine and food, alas).
> On 28 June 2020 I sent the reply with three solutions. Perhaps it
> would be easier to point to the code, which also has many comments and
> explanations:
> 
>        http://okmij.org/ftp/tagless-final/subtyping.ml
> 
> The gist of the first two solutions is to exploit the fact that
> tagless-final is sort of isomorphic to GADTs. That is, lots of things
> that can be done with GADTs can be done without (or with GADTs hidden
> away). That hiding has benefit of no longer bringing the problems with
> variance. No GADTs (at least, not in the part facing the user), no
> variance problems, at least, not for the end user. The library author
> may use regular ADT, which are also non-problematic. A regular ADT
> doesn't have the same typing guarantees -- but the typing is enforced
> by the signature (at the module level), so there is no loss.
> 
> I was going to write an article for my web site explaining this and a
> few earlier answers to the similar problems. Maybe I will eventually
> get to it.

Francois, your examples is way better than mine! :-)

Apologies for dropping the ball before, I got busy and there was some chaos going on.

But Oleg, thank you very much for digging into this so thoroughly and making such concrete suggestions!

There are some points I'm unsure of. My understanding of `Reducer` is that it is a way of implementing a particular (reduction) function on expressions from this little language, where rather than the "initial" approach of defining an explicit datatype for the expressions and then reducing them, you use constructor functions that eagerly perform the reduction, and extract the result with `observe`.

What I'm unsure of is how this generalizes to additional functions on expressions. For instance, suppose I also wanted to define a function which computed the set of int literals that appear in the (unreduced) expression? One possibility would seem to be to revise your second solution (say) so that `'a Reducer.t` is `'a * int list` instead of `'a`, and revise the constructor functions to compute this list of ints eagerly.

My working assumption is that which operation on the expression will be performed isn't known at the time when they are constructed. Otherwise a variant of the whole `Reducer` module could be defined for each operation, each with its own `'a t` type defined as appropriate.

More pressingly, I think, is what operations such as `map_ints : (int -> int) -> 'a exp -> 'a exp` would look like. With an initial style representation, this could be defined by traversing the abstract syntax and rebuilding it with the updated `int`s. One approach would be to change `Reducer.t` to a sort of abstract syntax, but this seems to me to lead straight back to the original `'a exp` type. This is, as far as I understand, coming up against the usual situation where the "initial" representation is syntactic in nature and the "final" representation is semantic. I don't know how much is known about handling such cases in the final style though.

Does it sound like I simply have not understood your solution? :-)

Cheers, Josh


> On Jun 27, 2020, at 4:36 PM, Oleg <oleg@okmij.org> wrote:
> 
> Josh Berdine has posed a problem: 
>> As a concrete example, consider something like:
>> ```
>> type _ exp =
>>  | Integer : int -> [> `Integer] exp
>>  | Float : float -> [> `Float] exp
>>  | Add : ([< `Integer | `Float] as 'n) exp * 'n exp -> 'n exp
>>  | String : string -> [> `String] exp
>> ```
>> The intent here is to use polymorphic variants to represent a small
>> (upper semi-) lattice, where basically each point corresponds to a
>> subset of the constructors. The type definition above is admitted, but
>> while the index types allow subtyping ... 
>> this does not extend to the base type:
>> ```
>> # let widen_exp x = (x : [`Integer] exp :> [> `Integer | `Float] exp);;
>> Line 1, characters 18-67:
>> 1 | let widen_exp x = (x : [`Integer] exp :> [> `Integer | `Float] exp);;
>>                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> Error: Type [ `Integer ] exp is not a subtype of
>>         ([> `Float | `Integer ] as 'a) exp 
>>       The first variant type does not allow tag(s) `Float
>> ```
>> This makes sense since `type _ exp` is not covariant. But trying to
>> declare it to be covariant doesn't fly, yielding:
>> ```
>> Error: In this GADT definition, the variance of some parameter
>>       cannot be checked
>> ```
> 
> We demonstrate three solutions to this problem.  The first has an
> imperfection, but is easier to explain. The second removes the
> imperfection. The third one is verbose, but requires the least of
> language features and can be implemented in SML or perhaps even
> Java. All three solutions give the end user the same static
> guarantees, statically preventing building of senseless
> expressions. All three rely on the tagless-final style, in effect
> viewing the problem as a language design problem -- designing a DSL
> with subtyping. It has been my experience that the tagless-final,
> algebraic point of view typically requires less fancy
> typing. Incidentally, the third solution, of explicit coercions, can
> be combined with GADTs and is hence the general solution to the posed
> problem, showing how to do phantom index subtyping with GADTs.
> 
> Presenting all three solutions is too long for a message, so one
> should look at the complete code with many comments:
> 
>        http://okmij.org/ftp/tagless-final/subtyping.ml
> 
> Below is the outline of the first solution, the easiest to
> explain. (The others just add slight variations; the main
> example looks almost the same in all three).
> 
> We will think of the problem as a DSL with subtyping.
> Let's define its syntax and the type system, as an OCaml signature:
> 
> module type exp = sig
>  type +'a t
>  val int    : int   -> [> `int] t
>  val float  : float -> [> `float] t
>  val add    : ([< `int | `float] as 'n) t -> 'n t -> [> `int | `float] t
>  val string : string -> [> `string] t
> end
> 
> The language expressions are indexed with a subset of tags `int,
> `float, `string. The language has the obvious subtyping: [> `int]
> can be weakened to [> `int | `float]. This weakening, and general typechecking
> of our DSL is performed by OCaml's type checker for us. Let's see an
> example
> 
> module Ex1(I:exp) = struct
>  open I
> 
>  (* We can put expressions of different sorts into the same list *)
>  let l1 = [int 1; float 2.0; add (add (int 3) (float 1.0)) (int 4)]
>  (* val l1 : [> `float | `int ] I.t list *)
> 
>  let l2 = string "str" :: l1
>  (* val l2 : [> `float | `int | `string ] I.t list *)
> 
>  (* An attempt to build an invalid expression fails, with a good error message:
>  let x = add (int 1) (string "s")
>                      ^^^^^^^^^^^^
>  Error: This expression has type [> `string ] t
>       but an expression was expected of type [< `float | `int > `int ] t
>       The second variant type does not allow tag(s) `string
>  *)
> 
>  (* We can define a function to sum up elements of a list of expressions,
>     returning a single expression with addition
>  *)
>  let rec sum l = List.fold_left add (int 0) l
>  (* val sum : [ `float | `int ] I.t list -> [ `float | `int ] I.t *)
> 
>  (* We can sum up l1 *)
>  let l1v = sum l1
>  (* val l1v : [ `float | `int ] I.t *)
> 
>  (* but, predictably, not l2
>  let l2v = sum l2
>  *)
> end
> 
> Now, what we can do with the expressions? We can print/show them
> 
> module Show : (exp with type 'a t = string) = struct
>   type 'a t = string
>   ... elided ...
> end
> 
> let _ = let module M = Ex1(Show) in M.l1
> (* - : string list = ["1"; "2."; "Add(Add(3,1.),4)"] *)
> 
> We can also reduce them. The following is the reducer -- although
> in reality it does the normalization-by-evaluation (NBE)
> 
> A subset of expressions: values
> 
> module type vals = sig
>  type +'a t
>  val int    : int    -> [> `int] t
>  val float  : float  -> [> `float] t
>  val string : string -> [> `string] t
> end
> 
> module Reducer(I:vals) : 
>   (sig include exp val observe : ([> `float | `int ] as 'a) t -> 'a I.t end) 
>  = struct
>   type 'a t = Unk of 'a I.t | Int of int | Float of float
>   let observe = function
>   | Unk x -> x
>   | Int x -> I.int x
>   | Float x -> I.float x
> 
>   let int x    = Int x
>   let float x  = Float x
>   let string x = Unk (I.string x)
> 
>   let add x y = match (x,y) with
>   | (Int x, Int y) -> Int (x+y)
>   | (Float x, Float y) -> Float (x+.y)
>   | (Int x, Float y) | (Float y, Int x) -> Float (float_of_int x +. y)
>   (* This is the imperfection. We know that the case cannot happen,
>      but this is not clear from the type system.
>      We should stress that this imperfection does not compromise the guarantees
>      offered to the end user, who never sees the internals of Reducer,
>      and can't even access them.
>    *)
>   | _ -> assert false
> end
> 
> (* We can now evaluate exp to values. We need a way to show them though.
>   Luckily, exp is a superset of values, and we already wrote the show
>   interpreter for exp
> *)
> let _ = let module I = Reducer(Show) in
>        let module M = Ex1(I) in List.map I.observe M.l1
> (* - : string list = ["1"; "2."; "8."] *)
> 
> The second solutions removes the imperfection in Reduce, making
> Reduce.add total, by making the types more phantom. The third solution
> does the obvious thing: implementing subtyping using explicit
> coercions. In the given example, it doesn't add too much annoyance to
> the user. It does however remove the reliance on variance and so
> is applicable to GADTs in general. One can imagine more solutions
> along the shown lines.
>

next prev parent reply	other threads:[~2020-10-06 19:05 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-06 11:57 [Caml-list] Question on the covariance of a GADT (polymorphic variants involved) François Pottier
2020-10-06 16:05 ` [Caml-list] Question on the covariance of a GADT (polymorphic Oleg
2020-10-06 16:45   ` François Pottier
2020-10-06 19:04   ` Josh Berdine [this message]
2020-10-06 19:18     ` [Caml-list] Question on the covariance of a GADT (polymorphic variants involved) François Pottier
2020-10-06 20:10       ` Josh Berdine
2020-10-07 13:17       ` [Caml-list] Question on the covariance of a GADT (polymorphic Oleg
2020-10-07 13:17     ` Oleg
2020-10-07  8:10   ` David Allsopp

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=FEA35C98-1E7A-402E-9B5D-9585371E4009@berdine.net \
    --to=josh@berdine.net \
    --cc=caml-list@inria.fr \
    --cc=francois.pottier@inria.fr \
    --cc=oleg@okmij.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).