Very interesting and thought-provoking writeup, thank you!
Incidentally, we're investigating the same venues, in our CMU BAP project, as we found out that we need the extensibility in the style of type classes/canonical structures to decouple complex dependencies which arise in the program analysis domain.
In fact, we build our new BAP 2.0 framework largely on your [tagless-final][1] style which, let's admit it, works much better with type classes. Therefore we ended up implementing extensible type representations along with registries for our type classes. Unfortunately, the idea of storing rules in the registry didn't visit us, but we're now thinking about how to incorporate it (the classes that we have are very nontrivial, usually having hundreds of methods, so we're currently using functors to manually derive on class from another, and registering the resulting structures - but using your approach we can register functors as well and automate the derivation). We also didn't generalize the type class instantiation, so our solutions do have some boilerplate (but I have to admit, that the total number of type classes that we need is not very big, so it really never bothered us). What could be surprising is that the universe of types actually grew quite large, that large that the linear search in the registry is not an option for us anymore. In fact, we have so many comparisons between treps, that instead of extracting the extension constructor number from an extensible variant we had to rely on our own freshly generated identifier. But I'm running in front of myself, an important lesson that we have learned is that treps should not only be equality comparable but also ordered (and even hashable) so that we can implement our registries as hash tables. It is also better to keep them abstract so that we can later extend them without breaking user code (to implement introspection as well as different resolution schemes). This is basically an elaboration of your approach (which is also could be commonly found in Janestreet's Core (Type_equal.Uid.t) and other implementations of existentials). In our case, we ended up with the following implementation
```
type 'a witness = ..
module type Witness = sig
type t
type _ witness += Id : t witness
end
type 'a typeid = (module Witness with type t = 'a)
type 'a key = {
ord : int;
key : 'a typeid;
name : string; (* for introspection *)
show : 'a -> Sexp.t; (* also for introspection *)
}
```
Now, we can use the `ord` field to order types, compare them, store in maps, hash tables, and even arrays. E.g., this is how our `teq` function looks like,
```
let same (type a b) x y : (a,b) Type_equal.t =
if
x.id =
y.id then
let module X = (val x.key : Witness with type t = a) in
let module Y = (val y.key : Witness with type t = b) in
match X.Id with
| Y.Id -> Type_equal.T
| _ -> failwith "broken type equality"
else failwith "types are not equal"
```
It is often used in the context where we already know that `
x.id =
y.id`, e.g., when we already found an entry, so we just need to obtain the equality witness (we use Janestreet's Type_equal.T, which is the same as yours eq type).
Concerning the representation of the registry, we also experimented with different approaches (since we have a few ways to make a type existential in OCaml), and found out the following to be the most efficient and easy to work with,
```
type ordered = {
order : 'a. 'a key -> 'a -> 'a -> int;
} [@@unboxed]
```
Notice, that thanks to `[@@unboxed]` we got a free unpacked existential. We will next store `ordered` in our registry, which is a hash table,
```
let ordered : ordered Hashtbl.M(Int).t = Hashtbl.create (module Int)
```
and register it as simple as,
```
let register: type p. p Key.t -> (p -> p -> int) -> unit = fun key order ->
Hashtbl.add_exn vtables ~key:(uid key) ~data:{
order = fun (type a) (k : a key) (x : a) (y : a) ->
let T = same k key in (* obtain the witness that we found the right structure *)
order x y
}
```
Instead of a hashtable, it is also possible to use `ordered array ref` (since our `ord` is just an integer which we increment every time a new class is declared). This will give us even faster lookup.
I hope that this was interesting. And if yes, I'm ready to elaborate more on our design decision or to hear suggestions and critics. Here are a few links:
Cheers,
Ivan Gotovchits
Research Scientist, CMU Cylab