caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] Calling a single function on every member of a GADT?
@ 2020-01-07 19:24 rixed
  2020-01-07 20:21 ` Ivan Gotovchits
  0 siblings, 1 reply; 7+ messages in thread
From: rixed @ 2020-01-07 19:24 UTC (permalink / raw)
  To: caml-list

I'm basically trying to do the equivalent of this simple `fold` function:

---
module Simple =
struct
  type term =
     | Int of int
     | Add
     | App of term * term

  let rec fold i f = function
    | Int _ as t -> f i t
    | Add -> f i Add
    | App (x, y) as t -> f (fold (fold i f x) f y) t
end
---

... but using a GADT:

---
module Gadt =
struct
  type _ term =
     | Int : int -> int term
     | Add : (int -> int -> int) term
     | App : ('b -> 'a) term * 'b term -> 'a term

  let rec fold : type a. 'r -> ('r -> _ term -> 'r) -> 'r = fun i f -> function
    | Int _ as t -> f i t
    | Add -> f i Add
(*
     ^ Error: This pattern matches values of type (int -> int -> int) term
        but a pattern was expected which matches values of type int term
        Type int -> int -> int is not compatible with type int
*)
    | App (x, y) as t -> f (fold (fold i f x) f y) t
end
---

I've tried other variants of the syntax and got many encouragements but no green flag from the type-checker.
Why is the compiler expecting an int term in there? I though the whole point of the `type a. ...` syntax was to allow the matched type to vary from one pattern to the next?
Is there a way to do this?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Calling a single function on every member of a GADT?
  2020-01-07 19:24 [Caml-list] Calling a single function on every member of a GADT? rixed
@ 2020-01-07 20:21 ` Ivan Gotovchits
  2020-01-08  6:54   ` rixed
  0 siblings, 1 reply; 7+ messages in thread
From: Ivan Gotovchits @ 2020-01-07 20:21 UTC (permalink / raw)
  To: rixed; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 2097 bytes --]

On Tue, Jan 7, 2020 at 2:25 PM <rixed@happyleptic.org> wrote:

> ... but using a GADT:
>
> ---
> module Gadt =
> struct
>   type _ term =
>      | Int : int -> int term
>      | Add : (int -> int -> int) term
>      | App : ('b -> 'a) term * 'b term -> 'a term
>
>   let rec fold : type a. 'r -> ('r -> _ term -> 'r) -> 'r = fun i f ->
> function
>     | Int _ as t -> f i t
>     | Add -> f i Add
> (*
>      ^ Error: This pattern matches values of type (int -> int -> int) term
>         but a pattern was expected which matches values of type int term
>         Type int -> int -> int is not compatible with type int
> *)
>     | App (x, y) as t -> f (fold (fold i f x) f y) t
> end
> ---
>
> I've tried other variants of the syntax and got many encouragements but no
> green flag from the type-checker.
> Why is the compiler expecting an int term in there? I though the whole
> point of the `type a. ...` syntax was to allow the matched type to vary
> from one pattern to the next?
> Is there a way to do this?
>

It is the limitation of the let-bound polymorphism. A parameter of a
function is monomorphic in its body. The classical example doesn't even
reference any GADT,

    let example f  = f "hello", f 42

It won't compile even though we can provide a polymorphic function that can
applied both to integers and to strings, e.g., `exampe (fun x -> x)` should
be possible, but not, because of the let-bounded polymorphism. There are a
few solutions available in OCaml, the simplest is to use records, e.g.,

    type app = {apply : 'a. 'a -> 'a}

    let example {apply} = apply "hello", apply 42;;

    val example : app -> string * int = <fun>

Now we have `app` that is polymorphic.
In your case, I would define a visitor type, e.g.,

  type 'r visitor = {visit : 'a. 'a term -> 'r -> 'r}

  let rec fold : type a. 'r -> 'r visitor -> a term -> 'r =
    fun i f t -> match t with
      | Int _ as t -> f.visit i t
      | Add as t -> f.visit i t
      | App (x,y) as t ->
          let i = fold i f x in
          let i = fold i f y in
          f.visit i t


Cheers,
Ivan Gotovchits

[-- Attachment #2: Type: text/html, Size: 3077 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Calling a single function on every member of a GADT?
  2020-01-07 20:21 ` Ivan Gotovchits
@ 2020-01-08  6:54   ` rixed
  2020-01-08  9:43     ` Jacques Garrigue
  2020-01-08 20:32     ` Ivan Gotovchits
  0 siblings, 2 replies; 7+ messages in thread
From: rixed @ 2020-01-08  6:54 UTC (permalink / raw)
  To: Ivan Gotovchits; +Cc: caml-list

Hello and thank you for the answer.

On Tue, Jan 7, 2020, at 21:21, Ivan Gotovchits wrote:
> It is the limitation of the let-bound polymorphism. (...)
> In your case, I would define a visitor type, e.g.,
>  type 'r visitor = {visit : 'a. 'a term -> 'r -> 'r}

Oh I see. I've used this trick to force a function to be polymorphic, but I failed to see that this was the problem because to me `f` is not any more polymorphic when the `term` is a GADT than when it's not.

So there is no lighter syntax to specify that `f` should accept any member of a GADT than the syntax to specify that `f` should accept any type at all?

I wonder, is this just a limitation of the OCaml parser or is there some deep reason for these work-around (like is the case, from my understanding, for the value restriction)?

Sorry for asking such simple questions but I've build a mental model of how OCaml type checking works mostly from trials and errors, which used to work great until some versions ago :) 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Calling a single function on every member of a GADT?
  2020-01-08  6:54   ` rixed
@ 2020-01-08  9:43     ` Jacques Garrigue
  2020-01-08 20:32     ` Ivan Gotovchits
  1 sibling, 0 replies; 7+ messages in thread
From: Jacques Garrigue @ 2020-01-08  9:43 UTC (permalink / raw)
  To: caml-list

Actually, this is a rare case where using a polymorphic method may be 
handy too:

let rec fold : type a r. r -> <v : 'b. r -> 'b term -> r> -> a term -> r =
     fun i f -> function
     | Int _ as t -> f#v i t
     | Add -> f#v i Add
     | App (x, y) as t -> f#v (fold (fold i f x) f y) t

let v =
   object method v : type a. _ -> a Gadt.term -> _ =
     fun x -> function
       | Int n -> x+n
       | Add -> x+1
       | App _ -> x+2
   end

let r = Gadt.fold 0 v (App (App (Add, Int 3), Int 5))

The point being that to match on a Gadt you will anyway need to use the 
(type a) construct to allow refinement.

Jacques

On 08/01/2020 07:54, rixed@happyleptic.org wrote:
> Hello and thank you for the answer.
>
> On Tue, Jan 7, 2020, at 21:21, Ivan Gotovchits wrote:
>> It is the limitation of the let-bound polymorphism. (...)
>> In your case, I would define a visitor type, e.g.,
>>   type 'r visitor = {visit : 'a. 'a term -> 'r -> 'r}
> Oh I see. I've used this trick to force a function to be polymorphic, but I failed to see that this was the problem because to me `f` is not any more polymorphic when the `term` is a GADT than when it's not.
>
> So there is no lighter syntax to specify that `f` should accept any member of a GADT than the syntax to specify that `f` should accept any type at all?
>
> I wonder, is this just a limitation of the OCaml parser or is there some deep reason for these work-around (like is the case, from my understanding, for the value restriction)?
>
> Sorry for asking such simple questions but I've build a mental model of how OCaml type checking works mostly from trials and errors, which used to work great until some versions ago :)


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Calling a single function on every member of a GADT?
  2020-01-08  6:54   ` rixed
  2020-01-08  9:43     ` Jacques Garrigue
@ 2020-01-08 20:32     ` Ivan Gotovchits
  2020-01-10  9:49       ` Malcolm Matalka
  1 sibling, 1 reply; 7+ messages in thread
From: Ivan Gotovchits @ 2020-01-08 20:32 UTC (permalink / raw)
  To: rixed; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 6661 bytes --]

On Wed, Jan 8, 2020 at 1:54 AM <rixed@happyleptic.org> wrote:

> Hello and thank you for the answer.
>
> On Tue, Jan 7, 2020, at 21:21, Ivan Gotovchits wrote:
> > It is the limitation of the let-bound polymorphism. (...)
> > In your case, I would define a visitor type, e.g.,
> >  type 'r visitor = {visit : 'a. 'a term -> 'r -> 'r}
>
> Oh I see. I've used this trick to force a function to be polymorphic, but
> I failed to see that this was the problem because to me `f` is not any more
> polymorphic when the `term` is a GADT than when it's not.
>
> So there is no lighter syntax to specify that `f` should accept any member
> of a GADT than the syntax to specify that `f` should accept any type at all?
>

Only three methods of introducing rank-2 polymorphism are known to me:
1. records
2. objects
3. first-class modules

Jacques has demonstrated the solution with objects, which might be a little
bit more lightweight, at least as you don't need to define a new data type
beforehand. But the invocation is more verbose and requires an annotation
from the caller side, which could be confusing. The third solution relies
on first-class modules and is even more verbose, at least on the definition
side. Just for the sake of completeness,

  module type Visitor = sig
    type t
    val term : t -> 'a term -> t
  end

  let rec fold : type a r. r -> (module Visitor with type t = r) -> a term
-> r =
    fun i ((module Visit) as f) t -> match t with
      | Int _ as t -> Visit.term i t
      | Add as t -> Visit.term i t
      | App (x,y) as t ->
          let i = fold i f x in
          let i = fold i f y in
          Visit.term i t

  let s = fold 0 (module struct
      type t = int
      let term x _ = x + 1
    end)

And again, it is not about GADT. GADT act as a red herring here. As I've
demonstrated earlier, using a simple pair will suffice to display the
limitation of the prenex polymorphism. Even no ADT is required, just apply
one term to another two and you will get them unified, e.g.,

    let f g x y : unit = g x; g y

will have type

   val f : ('a -> unit) -> 'a -> 'a -> unit

because 'a is quantified on the scope of `f` not `g`, in other words, it
has type (not an OCaml syntax)

   val f : forall 'a. ('a -> unit) -> 'a -> 'a -> unit

while we would like to have a type

   val f : forall 'b, 'c. (forall 'a. 'a -> unit) -> 'b -> 'c -> unit

OCaml doesn't allow us to define types like `('a. 'a -> 'a)` and the reason
is not that it is hard to extend the parser it is...

I wonder, is this just a limitation of the OCaml parser or is there some
> deep reason for these work-around (like is the case, from my understanding,
> for the value restriction)?


Yep, good catch! It is because of the impurity. Indeed, Haskell has the
Rank2Types extension that lets us write types like `(forall a. a -> ()) ->
b -> c -> ()`, with no extra syntactic burden (modulo having to provide the
type annotation). But functions in Haskell are pure, therefore it is
possible. To make the story short and obvious, let me do a simple
demonstration of how things can go wrong in a language with side-effects.
Let's go back to the simple example of pairs and the identity function.
Consider the following nasty identity function,

  let bad_id () =
    let cache = ref None in
    fun x -> match cache.contents with
      | None -> cache := Some x; x
      | Some cache -> cache

It has type `unit -> 'a -> 'a` therefore, if we would have the rank-1
polymorphism enabled for functions, we could apply it to the function

     let map2 : fun ('a. 'a -> 'a) -> 'b -> 'c -> 'b * 'c = fun f (x,y) ->
f x, f y

as

   let x,y : string * int = map2 (bad_id ()) "hello", 42

and will get a segmentation fault, as `y` will now have type int but hold a
string.

And here comes the syntax as a savior as it lets us specify functions that
are guaranteed to be syntactic values. Indeed, all three solutions
syntactically guarantee that the provided argument is a function, not a
closure. Indeed, let's introduce the universal identity via a record,

   type id = { f : 'a. 'a -> 'a}

and we can see that our `bad_id` is not accepted due to the value
restriction, while good_id, defined as,

   let good_id x = x

is perfectly fine, e.g.,

  let id1 = {f = good_id} (*accepted *)
  let id2 = {f = bad_id}   (* rejected *)

moreover, even a fine, but not syntactic, identity is also rejected

  let fine_id () x = x
  let id3 = {f = fine_id ()} (* rejected *)

with the message

  This field value has type 'b -> 'b which is less general than 'a. 'a -> 'a

The same is true with modules,

  module type Id = sig
    val f : 'a -> 'a
  end
  module Id1 : Id = struct let f = good_id end   (* accepted *)
  module Id2 : Id = struct let f = bad_id () end (* rejected *)
  module Id3 : Id = struct let f = fine_id () end (* rejected *)

and with objects (left as an exercise).

To summarize, in order to enable rank2 polymorphism we need a special kind
of values to bear universal functions, as we can't rely on ordinary
functions, which could be constructed using partial application. OCaml
already had objects and records, which serve as a fine media for
universally quantified functions. Later first class modules were
introduced, which could also be used for the same purpose. Probably, one
could devise a special syntax (or rely on the new attributes and extensions
syntax, e.g., `map2 [%rank2 : fun x -> x] ("hello",42)` but probably this
will lead to an unnecessary bloating of the language and the
implementation, especially since we already have three solutions with a
more or less tolerable syntax (and are in the base language, not an
extension).  Besides, if we will use the `[@@unboxed]` annotation, or
visitor will have the same representation as a function, e.g.,

    type 'r visitor = {visit : 'a. 'r -> 'a term -> 'r} [@@unboxed]
    let count x _ = x + 1
    let counter = {visit=count}

and

  # Core_kernel.phys_same count counter;;
  - : bool = true

Concerning rank-n polymorphism, in OCaml is is achieved using functors.
Yes, they are a little bit syntactically heavy and force us to write
signatures, but this is necessary anyway as rank-n is undecidable
(non-inferrable). Finally, as a real-world example [1] of rank-2
polymorphism consider the universal WAVL tree that is a binary tree with
each element having a different type (aka heterogeneous map). We use it in
BAP as a backing store. You might find a few tricks there, especially using
continuation-passing in the recursive cases.

Cheers,
Ivan


[1]:
https://github.com/BinaryAnalysisPlatform/bap/blob/b40689e636607b977758af048b79d65684ce48c3/lib/knowledge/bap_knowledge.ml#L847-L1693

[-- Attachment #2: Type: text/html, Size: 9009 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Calling a single function on every member of a GADT?
  2020-01-08 20:32     ` Ivan Gotovchits
@ 2020-01-10  9:49       ` Malcolm Matalka
  2020-01-10 19:52         ` Ivan Gotovchits
  0 siblings, 1 reply; 7+ messages in thread
From: Malcolm Matalka @ 2020-01-10  9:49 UTC (permalink / raw)
  To: Ivan Gotovchits; +Cc: rixed, caml-list

Thank you for the explanation Ivan.  I have two questions inline.

Ivan Gotovchits <ivg@ieee.org> writes:

> It has type `unit -> 'a -> 'a` therefore, if we would have the rank-1
> polymorphism enabled for functions, we could apply it to the function
>
>      let map2 : fun ('a. 'a -> 'a) -> 'b -> 'c -> 'b * 'c = fun f (x,y) ->
> f x, f y

Small thing, but wouldn't the faux type be the following, based on your
usage (making sure I'm following):

fun ('a. 'a -> 'a) -> ('b * 'c) -> 'b * 'c

> as
>
>    let x,y : string * int = map2 (bad_id ()) "hello", 42
>
> and will get a segmentation fault, as `y` will now have type int but hold a
> string.
>
> And here comes the syntax as a savior as it lets us specify functions that
> are guaranteed to be syntactic values. Indeed, all three solutions
> syntactically guarantee that the provided argument is a function, not a
> closure. Indeed, let's introduce the universal identity via a record,
>
>    type id = { f : 'a. 'a -> 'a}
>
> and we can see that our `bad_id` is not accepted due to the value
> restriction, while good_id, defined as,
>
>    let good_id x = x
>
> is perfectly fine, e.g.,
>
>   let id1 = {f = good_id} (*accepted *)
>   let id2 = {f = bad_id}   (* rejected *)
>
> moreover, even a fine, but not syntactic, identity is also rejected
>
>   let fine_id () x = x
>   let id3 = {f = fine_id ()} (* rejected *)
>
> with the message
>
>   This field value has type 'b -> 'b which is less general than 'a. 'a -> 'a
>

Why is type checking creating a record different than type checking a
function argument?

If we had the syntax (or something like it):

let map2 : ('a. 'a -> 'a) -> ('b * 'c) -> ('b * 'c)

Why would the type checker not be able to see that

map2 good_id ("hi", 42)

is valid but

map2 (fine_id ()) ("hi", 32)

is not, using the same logic that is verifying creating the "id" record
is not valid?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] Calling a single function on every member of a GADT?
  2020-01-10  9:49       ` Malcolm Matalka
@ 2020-01-10 19:52         ` Ivan Gotovchits
  0 siblings, 0 replies; 7+ messages in thread
From: Ivan Gotovchits @ 2020-01-10 19:52 UTC (permalink / raw)
  To: Malcolm Matalka; +Cc: rixed, caml-list

[-- Attachment #1: Type: text/plain, Size: 3519 bytes --]

On Fri, Jan 10, 2020 at 4:50 AM Malcolm Matalka <mmatalka@gmail.com> wrote:

> Thank you for the explanation Ivan.  I have two questions inline.
>
> Ivan Gotovchits <ivg@ieee.org> writes:
>
> > It has type `unit -> 'a -> 'a` therefore, if we would have the rank-1
> > polymorphism enabled for functions, we could apply it to the function
> >
> >      let map2 : fun ('a. 'a -> 'a) -> 'b -> 'c -> 'b * 'c = fun f (x,y)
> ->
> > f x, f y
>
> Small thing, but wouldn't the faux type be the following, based on your
> usage (making sure I'm following):
>
> fun ('a. 'a -> 'a) -> ('b * 'c) -> 'b * 'c
>

Yep, sure :) That's the problem with faux-typing)


> Why is type checking creating a record different than type checking a
> function argument?
>
> If we had the syntax (or something like it):
>
> let map2 : ('a. 'a -> 'a) -> ('b * 'c) -> ('b * 'c)
>
> Why would the type checker not be able to see that
>
> map2 good_id ("hi", 42)
>
> is valid but
>
> map2 (fine_id ()) ("hi", 32)
>
> is not, using the same logic that is verifying creating the "id" record
> is not valid?
>

I believe it is possible, as it is possible in Haskell (with RankNTypes and
ScopedTypeVariables). The main (theoretical) difference is that in OCaml we
need to check whether an expression is expansive and use a specialized
generalization in case if it is (for the relaxed value restriction). It
will, however, complicate the type inference engine a lot, but most
importantly, changing the typing rule of functions will have a tremendous
impact on the language. So this would be a very impractical solution.
Especially, since we don't have the mechanism of language extensions,
enabling RankNTypes will make a lot of programs untypeable, as they will
now require type annotations (recall that RankN is undecidable in general).
It could probably be implemented as a compiler command line parameter, like
`-rectypes` but this will be still quite impractical since more often code
like `fun f -> f 1, f true` is a programmer error, rather than a true
request for universal polymorphism (the same as with rectypes, recursive
types a more often an error rather than a deliberate attempt). Therefore,
enabling RankN(^1) polymorphism will type too many programs (not that it is
unsound, just many programs won't have sense) at the cost of even more
obscure type errors. On the other hand, we have three syntactic constructs
that let us express non-prenex polymorphism of the necessary rank(^2)
without breaking anything else. So it looks like a good deal - we can have
rankN polymorphism and decidable type checker at the same time. Just think
of polymorphic records/methods as an embedded DSL for rankN polymorphism.

============
Footnotes:

1) An important point, that I forgot to notice, is that enabling scoped
type variables, will inevitably enable rankN polymorphism, e.g., since now
any type could be a polytype, then suppose we have type `'a. ('b.'b -> 'a)
-> 'a` could be instantiated to 'a = 'd. ('c. ->  'd) -> 'd, so that our
type is now `'d. ('b. 'b -> ('c. 'c -> 'd) -> 'd) -> ('c. 'c -> 'd) -> 'd`
which is now rank3. Therefore, enabling arbitrary quantification in the
arrow type will lead to rankN and immediately make undecidable most of the
type checker.

2) We can craft arbitrary rank using records with universally quantified
type variables, e.g., here is an example of rank3 polymoprhism:

  type 'a rank1 = {f1 : 's. 's -> 'a}
  type 'a rank2 = {f2 : 'r. 'r -> 'a rank1}

Indeed, `f2` has type `'a.('r. 'r -> ('s. 's -> 'a)`

[-- Attachment #2: Type: text/html, Size: 4704 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2020-01-10 19:51 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-07 19:24 [Caml-list] Calling a single function on every member of a GADT? rixed
2020-01-07 20:21 ` Ivan Gotovchits
2020-01-08  6:54   ` rixed
2020-01-08  9:43     ` Jacques Garrigue
2020-01-08 20:32     ` Ivan Gotovchits
2020-01-10  9:49       ` Malcolm Matalka
2020-01-10 19:52         ` Ivan Gotovchits

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).