caml-list - the Caml user's mailing list
From: Ivan Gotovchits <ivg@ieee.org>
To: rixed@happyleptic.org
Cc: caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] Calling a single function on every member of a GADT?
Date: Wed, 8 Jan 2020 15:32:58 -0500
Message-ID: <CALdWJ+wh0zWbv2ejzeFFPahbY0DKqMW8VJBYkeNSZzuAtSw3-Q@mail.gmail.com>
In-Reply-To: <41592816-f82e-43a4-b67f-02e69623fe23@www.fastmail.com>


On Wed, Jan 8, 2020 at 1:54 AM <rixed@happyleptic.org> wrote:

> Hello and thank you for the answer.
>
> On Tue, Jan 7, 2020, at 21:21, Ivan Gotovchits wrote:
> > It is the limitation of the let-bound polymorphism. (...)
> > In your case, I would define a visitor type, e.g.,
> >  type 'r visitor = {visit : 'a. 'a term -> 'r -> 'r}
>
> Oh I see. I've used this trick to force a function to be polymorphic, but
> I failed to see that this was the problem because to me `f` is not any more
> polymorphic when the `term` is a GADT than when it's not.
>
> So there is no lighter syntax to specify that `f` should accept any member
> of a GADT than the syntax to specify that `f` should accept any type at all?
>

Only three methods of introducing rank-2 polymorphism are known to me:
1. records
2. objects
3. first-class modules

Jacques has demonstrated the solution with objects, which might be a little
more lightweight, since you don't need to define a new data type
beforehand. But the invocation is more verbose and requires an annotation
on the caller side, which could be confusing. The third solution relies
on first-class modules and is even more verbose, at least on the definition
side. Just for the sake of completeness,

  module type Visitor = sig
    type t
    val term : t -> 'a term -> t
  end

  let rec fold : type a r. r -> (module Visitor with type t = r) -> a term -> r =
    fun i ((module Visit) as f) t -> match t with
      | Int _ as t -> Visit.term i t
      | Add as t -> Visit.term i t
      | App (x,y) as t ->
          let i = fold i f x in
          let i = fold i f y in
          Visit.term i t

  let s = fold 0 (module struct
      type t = int
      let term x _ = x + 1
    end)
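
For readers who want to run the first-class module variant, here is a self-contained sketch. The `term` definition is an assumption (it is not shown in this message), and `Count`, `example`, and `nodes` are names made up for the illustration. The module is packed with an explicit annotation and `fold` is applied to an actual term.

```ocaml
(* Assumed definition of the thread's [term] GADT; not shown in this message. *)
type _ term =
  | Int : int -> int term
  | Add : (int -> int -> int) term
  | App : ('b -> 'a) term * 'b term -> 'a term

module type Visitor = sig
  type t
  val term : t -> 'a term -> t
end

(* Polymorphic recursion: [fold] is used at a different [a] for each subterm. *)
let rec fold : type a r. r -> (module Visitor with type t = r) -> a term -> r =
  fun i ((module Visit) as f) t ->
    match t with
    | Int _ -> Visit.term i t
    | Add -> Visit.term i t
    | App (x, y) ->
      let i = fold i f x in
      let i = fold i f y in
      Visit.term i t

(* Hypothetical visitor that counts the nodes it is shown. *)
module Count = struct
  type t = int
  let term n _ = n + 1
end

(* (1 + 2) encoded as App (App (Add, Int 1), Int 2): five nodes in total. *)
let example = App (App (Add, Int 1), Int 2)

let nodes = fold 0 (module Count : Visitor with type t = int) example
```

Here `nodes` counts two `App` nodes, one `Add`, and two `Int` leaves.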

And again, it is not about GADTs. GADTs act as a red herring here. As I've
demonstrated earlier, a simple pair suffices to display the
limitation of prenex polymorphism. No ADT is even required: just apply
one function to two other terms and their types will be unified, e.g.,

    let f g x y : unit = g x; g y

will have type

   val f : ('a -> unit) -> 'a -> 'a -> unit

because 'a is quantified at the scope of `f`, not `g`. In other words, `f`
has the type (not OCaml syntax)

   val f : forall 'a. ('a -> unit) -> 'a -> 'a -> unit

while we would like to have a type

   val f : forall 'b, 'c. (forall 'a. 'a -> unit) -> 'b -> 'c -> unit
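
The desired type can be approximated today by wrapping `g` in a record with a polymorphic field. This is a minimal sketch; the record type `consumer`, the counter `calls`, and `count_calls` are hypothetical names introduced for the illustration.

```ocaml
(* The field [g] is universally quantified, so each use of [c.g] can be
   instantiated at a different type. *)
type consumer = { g : 'a. 'a -> unit }

(* Inferred type: consumer -> 'b -> 'c -> unit *)
let f c x y : unit = c.g x; c.g y

(* A consumer that ignores its argument and only counts the calls. *)
let calls = ref 0
let count_calls = { g = (fun _ -> incr calls) }

(* [x] and [y] now have different types, which the plain [f g x y] forbids. *)
let () = f count_calls "hello" 42
```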

OCaml doesn't allow us to define types like `('a. 'a -> 'a)` and the reason
is not that it is hard to extend the parser; it is...

> I wonder, is this just a limitation of the OCaml parser or is there some
> deep reason for these work-around (like is the case, from my understanding,
> for the value restriction)?


Yep, good catch! It is because of the impurity. Indeed, Haskell has the
Rank2Types extension that lets us write types like `(forall a. a -> ()) ->
b -> c -> ()`, with no extra syntactic burden (modulo having to provide the
type annotation). But functions in Haskell are pure, therefore it is
possible. To make the story short and obvious, let me do a simple
demonstration of how things can go wrong in a language with side-effects.
Let's go back to the simple example of pairs and the identity function.
Consider the following nasty identity function,

  let bad_id () =
    let cache = ref None in
    fun x -> match cache.contents with
      | None -> cache := Some x; x
      | Some cache -> cache

It has type `unit -> 'a -> 'a`. Therefore, if we had rank-2
polymorphism enabled for ordinary functions, we could pass it to the function
(hypothetical syntax, not valid OCaml)

     let map2 : ('a. 'a -> 'a) -> 'b * 'c -> 'b * 'c = fun f (x,y) -> f x, f y

as

   let x,y : string * int = map2 (bad_id ()) ("hello", 42)

and we would get a segmentation fault, as `y` would have type int but hold a
string.
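
The misbehaviour of `bad_id` can be observed safely at a single type. This sketch only shows that the closure keeps returning the first value it saw; the names `first` and `second` are made up for the illustration, and the cached binding is renamed to `cached` for readability.

```ocaml
let bad_id () =
  let cache = ref None in
  fun x -> match cache.contents with
    | None -> cache := Some x; x
    | Some cached -> cached

(* [f] gets a weak (non-generalized) type, which the first call fixes to int. *)
let f = bad_id ()
let first = f 1    (* the cache is empty, so 1 is stored and returned *)
let second = f 2   (* the cached 1 is returned instead of 2 *)
```

If `f` were allowed to be used at both `string` and `int`, the second call would hand back the cached string where an int is expected.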

And here syntax comes to the rescue, as it lets us specify functions that
are guaranteed to be syntactic values. Indeed, all three solutions
syntactically guarantee that the provided argument is a function, not a
closure. For example, let's introduce the universal identity via a record,

   type id = { f : 'a. 'a -> 'a}

and we can see that `bad_id ()` is not accepted due to the value
restriction, while good_id, defined as,

   let good_id x = x

is perfectly fine, e.g.,

  let id1 = {f = good_id}   (* accepted *)
  let id2 = {f = bad_id ()} (* rejected *)

moreover, even a fine, but not syntactic, identity is also rejected

  let fine_id () x = x
  let id3 = {f = fine_id ()} (* rejected *)

with the message

  This field value has type 'b -> 'b which is less general than 'a. 'a -> 'a
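
With the `id` record in place, the `map2` from the earlier hypothetical example can be written in plain OCaml. This is a sketch of the accepted case only; the result names `s` and `n` are made up for the illustration.

```ocaml
type id = { f : 'a. 'a -> 'a }

let good_id x = x

(* [r.f] is instantiated separately at each use, so the two components of
   the pair may have different types.
   Inferred type: id -> 'b * 'c -> 'b * 'c *)
let map2 r (x, y) = (r.f x, r.f y)

let s, n = map2 { f = good_id } ("hello", 42)
```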

The same is true with modules,

  module type Id = sig
    val f : 'a -> 'a
  end
  module Id1 : Id = struct let f = good_id end   (* accepted *)
  module Id2 : Id = struct let f = bad_id () end (* rejected *)
  module Id3 : Id = struct let f = fine_id () end (* rejected *)

and with objects (left as an exercise).
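
As a starting point for the object variant, here is a minimal sketch of the accepted case only, using a polymorphic method. The names `id_obj`, `s`, and `n` are hypothetical, and the annotation is placed on the parameter of `map2`.

```ocaml
let good_id x = x

(* The method type is annotated with an explicit universal quantifier. *)
let id_obj = object
  method f : 'a. 'a -> 'a = good_id
end

(* The parameter annotation tells the type checker that [o#f] is polymorphic. *)
let map2 (o : < f : 'a. 'a -> 'a >) (x, y) = (o#f x, o#f y)

let s, n = map2 id_obj ("hello", 42)
```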

To summarize, in order to enable rank-2 polymorphism we need a special kind
of value to bear universal functions, as we can't rely on ordinary
functions, which could be constructed using partial application. OCaml
already had objects and records, which serve as a fine medium for
universally quantified functions. Later, first-class modules were
introduced, which can also be used for the same purpose. Probably, one
could devise a special syntax (or rely on the new attributes and extensions
syntax, e.g., `map2 [%rank2 fun x -> x] ("hello",42)`), but this
would probably lead to unnecessary bloating of the language and the
implementation, especially since we already have three solutions with a
more or less tolerable syntax (and they are in the base language, not an
extension).  Besides, if we use the `[@@unboxed]` annotation, our
visitor will have the same representation as a function, e.g.,

    type 'r visitor = {visit : 'a. 'r -> 'a term -> 'r} [@@unboxed]
    let count x _ = x + 1
    let counter = {visit=count}

and

  # Core_kernel.phys_same count counter;;
  - : bool = true
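
The same check can be sketched without Core_kernel, using `Obj.repr` and `==` from the standard library. The `term` definition is again an assumption (it is not shown in this message), and `same` and `one` are names made up for the illustration.

```ocaml
(* Assumed definition of the thread's [term] GADT; not shown in this message. *)
type _ term =
  | Int : int -> int term
  | Add : (int -> int -> int) term
  | App : ('b -> 'a) term * 'b term -> 'a term

type 'r visitor = { visit : 'a. 'r -> 'a term -> 'r } [@@unboxed]

let count x _ = x + 1
let counter = { visit = count }

(* Physical equality on the untyped representations; this mirrors the
   [phys_same] check using only the standard library. *)
let same = Obj.repr count == Obj.repr counter

(* The unboxed record is used like any other record. *)
let one = counter.visit 0 (Int 1)
```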

Concerning rank-n polymorphism, in OCaml it is achieved using functors.
Yes, they are a little bit syntactically heavy and force us to write
signatures, but this is necessary anyway, as type inference for rank-n
polymorphism is undecidable. Finally, as a real-world example [1] of rank-2
polymorphism, consider the universal WAVL tree, which is a binary tree with
each element having a different type (aka a heterogeneous map). We use it in
BAP as a backing store. You might find a few tricks there, especially using
continuation-passing in the recursive cases.
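
To illustrate the remark about functors, here is a small sketch of how nesting functors raises the rank. All module names (`Poly`, `Map2`, `Map2_maker`, `Apply`, `R`) are hypothetical and chosen for the illustration; this is not taken from BAP.

```ocaml
module type Poly = sig
  val f : 'a -> 'a
end

(* The argument must itself be polymorphic: roughly the analogue of
   (forall a. a -> a) -> b * c -> b * c. *)
module Map2 (P : Poly) = struct
  let pair (x, y) = (P.f x, P.f y)
end

(* One level higher: a signature for functors shaped like [Map2]. *)
module type Map2_maker = functor (P : Poly) -> sig
  val pair : 'b * 'c -> 'b * 'c
end

(* A functor that takes such a functor as its argument. *)
module Apply (F : Map2_maker) = struct
  module M = F (struct let f x = x end)
  let result = M.pair ("hello", 42)
end

module R = Apply (Map2)
```

The signatures `Poly` and `Map2_maker` have to be written out by hand, which is the syntactic cost mentioned above.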

Cheers,
Ivan


[1]:
https://github.com/BinaryAnalysisPlatform/bap/blob/b40689e636607b977758af048b79d65684ce48c3/lib/knowledge/bap_knowledge.ml#L847-L1693


