caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] GADTs and associative container
@ 2013-07-09 20:43 Goswin von Brederlow
  2013-07-09 20:52 ` Lukasz Stafiniak
  2013-07-10  2:22 ` Jeremy Yallop
  0 siblings, 2 replies; 7+ messages in thread
From: Goswin von Brederlow @ 2013-07-09 20:43 UTC (permalink / raw)
  To: caml-list

Hi,

I'm wondering if one can have an ascociative container, like a Hashtbl.t
with dependent types (GADTs as the key, value depending on the key).
Something like this:

module H = struct
  type ('a, 'b) t = ('a, 'b) Hashtbl.t
  let create : type a b . int -> (a b, a) t =
		 fun x -> Hashtbl.create x
  let add : type a b . (a b, a) t -> a b -> a -> unit =
	      fun h k v -> Hashtbl.add h k v
  let find : type a b . (a b, a) t -> a b -> a =
	      fun h k -> Hashtbl.find h k
end
    
type one = ONE
type two = TWO

type _ g =
| ONEt : one g
| TWOt : two g

let () =
  let h = H.create 0
  in
  H.add h ONEt ONE;
  H.add h TWOt TWO;
  assert (H.find h ONEt = ONE);
  assert (H.find h TWOt = TWO);
  ()


BUT:

    let create : type a b . int -> (a b, a) t =
                                    ^^^
Error: Unbound type constructor b


Is there some special syntax I'm missing or is it simply impossible to
declare such a container in the abstract?

MfG
	Goswin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] GADTs and associative container
  2013-07-09 20:43 [Caml-list] GADTs and associative container Goswin von Brederlow
@ 2013-07-09 20:52 ` Lukasz Stafiniak
  2013-07-10  9:52   ` Leo White
  2013-07-10 10:16   ` Alain Frisch
  2013-07-10  2:22 ` Jeremy Yallop
  1 sibling, 2 replies; 7+ messages in thread
From: Lukasz Stafiniak @ 2013-07-09 20:52 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: Caml

[-- Attachment #1: Type: text/plain, Size: 1115 bytes --]

On Tue, Jul 9, 2013 at 10:43 PM, Goswin von Brederlow <goswin-v-b@web.de>wrote:

> Hi,
>
> I'm wondering if one can have an ascociative container, like a Hashtbl.t
> with dependent types (GADTs as the key, value depending on the key).
> Something like this:
>
> module H = struct
>   type ('a, 'b) t = ('a, 'b) Hashtbl.t
>   let create : type a b . int -> (a b, a) t =
>                  fun x -> Hashtbl.create x
>   let add : type a b . (a b, a) t -> a b -> a -> unit =
>               fun h k v -> Hashtbl.add h k v
>   let find : type a b . (a b, a) t -> a b -> a =
>               fun h k -> Hashtbl.find h k
> end
>


> BUT:
>
>     let create : type a b . int -> (a b, a) t =
>                                     ^^^
> Error: Unbound type constructor b
>
>
> Is there some special syntax I'm missing or is it simply impossible to
> declare such a container in the abstract?
>
> I think you need higher kinded types, not GADTs. Haskell has them, for
example you can write code that only depends on the type class of "b"
(which is parameterized by "a"), and "b" has signature "* -> *" or
something like that.

[-- Attachment #2: Type: text/html, Size: 1713 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] GADTs and associative container
  2013-07-09 20:43 [Caml-list] GADTs and associative container Goswin von Brederlow
  2013-07-09 20:52 ` Lukasz Stafiniak
@ 2013-07-10  2:22 ` Jeremy Yallop
  2013-07-11 14:17   ` Goswin von Brederlow
  1 sibling, 1 reply; 7+ messages in thread
From: Jeremy Yallop @ 2013-07-10  2:22 UTC (permalink / raw)
  To: Caml List; +Cc: Goswin von Brederlow

It is indeed possible to create associative containers where the types
of the values depend on the types of the keys.  Let's see what can to
be done to turn the standard hash table into such a container, using
GADTs for keys.

We'll start with the interface.  The standard hash table interface
(Hashtbl.S) looks like this, in part:

    module type S =
    sig
      type key
      type 'a t
      val create : int -> 'a t
      val add : 'a t -> key -> 'a -> unit
      val remove : 'a t -> key -> unit
      val find : 'a t -> key -> 'a
      val iter : (key -> 'a -> unit) -> 'a t -> unit
    end

The type of tables ('a t) is parameterized by the type of values,
since each table holds a single type of value.  We're aiming instead
to have value types depend on key types, so we'll move the type
parameter into the key type.  Making this change mechanically
throughout the interface gives us the following:

    module type GS =
    sig
      type 'a key
      type t
      val create : int -> t
      val add : t -> 'a key -> 'a -> unit
      val remove : t -> 'a key -> unit
      val find : t -> 'a key -> 'a
      val iter : < f: 'a. 'a key -> 'a -> unit > -> t -> unit
    end

Actually, I've made one additional change, in the type of iter.  In
the regular Hashtbl iter function we can get by with ML-style
polymorphism, where all the type variables are implicitly quantified
at the outermost point.  This constrains the function passed to iter
to be monomorphic, which is fine, since regular Hashtbls only support
a single value type.  In our revised interface, however, the function
argument must be polymorphic, since it needs to handle *any* suitable
pairing of keys and values.  The object type allows quantifier
nesting, giving us the polymorphism we need.

The type of iter is a hint of things to come: putting things *into* a
polymorphic hash table is a doddle, but there's a bit of a knack to
getting them *out* again intact, as we'll see further down.

Next up is the definition of keys.  The standard Hashtbl.Make functor
uses a definition of keys that bundles the key type together with
equality and hashing operations, like this:

    module type HashedType =
    sig
       type t
       val equal : t -> t -> bool
       val hash : t -> int
    end

Of course, it's no good having just any old definitions of equal and
hash.  It's essential that equal l r implies hash l = hash r, for
example, and there are additional fairly obvious constraints on equal.

Our analogue to HashedType, GHashedType, comes with some additional
operations (and so places additional demands on the creator of hash
tables).  The first part of the signature looks essentially the same
as HashedType: we've added a parameter to the key type, but it's not
used as yet, so we can replace it with the don't-care underscore.
(Note that this means that our equality is heterogeneous, happy to
accept keys of disparate types.)  The remainder of the signature deals
with packing up key-value pairs into existential boxes, and attempting
to get them out again; this will allow us to store different types of
key in a single table in our implementation.

    module type GHashedType =
    sig
      type _ key
      val equal : _ key -> _ key -> bool
      val hash : _ key -> int

      type pair = Pair : 'a key * 'a -> pair
      val unpack : 'a key -> pair -> 'a
    end

As with HashedType there are requirements not captured in the types.
In particular, we'd like unpack k (Pair (k', v)) = v whenever equal k
k' is true.

Time for the implementation.  This is mostly straightforward: after
some preliminaries (mostly about hiding the type parameter in keys by
boxing them up appropriately), there are two functions (add and
remove) that put keys and values in boxes and store them in a
monomorphic table, and two functions (find and iter) that unbox keys
and values to recover the parameterization.

    module GHashtbl (G : GHashedType) :
      GS with type 'a key = 'a G.key =
    struct
      include G
      type k = Key : 'a key -> k
      module H = Hashtbl.Make(struct
        type t = k
        let hash (Key k) = hash k
        let equal (Key l) (Key r) = equal l r
      end)

      type t = pair H.t

      let create n = H.create n

      let add tbl k v = H.add tbl (Key k) (Pair (k, v))

      let remove tbl k = H.remove tbl (Key k)

      let find tbl key = unpack key (H.find tbl (Key key))

      let iter (f : <f: 'a. 'a key -> 'a -> unit>) tbl =
        H.iter (fun _ (Pair (k, v)) -> f#f k v) tbl
    end

As is often the case, the unboxing is the interesting part.  The find
function reveals why we introduced the unpack operation for keys (and
hence the pair type, which could otherwise have been hidden away in
the body of GHashtbl), and shows the secondary purpose of keys as
unboxers of values.  The iter function makes use of the polymorphism
that we introduced in its signature earlier; when we unbox a pair we
have no idea how to instantiate the type variable in the contents, so
it's just as well we have a function to hand (f#f) that's polymorphic
enough to handle any possible instantiation.

Time to try it out.  Here's a sample implementation of GHashedType
that associates ints with int lists (which we'll use to store prime
factors) and strings with bools (which we'll use to indicate
capitalization).

    module KeyType (* : GHashedType *) =
    struct
      type _ key =
        Int : int -> int list key
      | String : string -> bool key

      let equal : type a b. a key -> b key -> bool =
        fun l r -> match l, r with
        | Int x, Int y -> x = y
        | String x, String y -> x = y
        | _ -> false

      let hash = Hashtbl.hash

      type pair = Pair : 'a key * 'a -> pair

      let rec unpack : type a. a key -> pair -> a =
        fun k p -> match k, p with
        | Int _, Pair (Int _, v) -> v
        | String _, Pair (String _, v) -> v
        | _ -> raise Not_found
    end

Using KeyType we can instantiate the functor and set about creating
heterogeneous hash tables:

    # module HT = GHashtbl(KeyType)
    ...
    # let ht = HT.create 10;;
    val ht : HT.t = <abstr>
    # begin HT.add ht (Int 10) [2; 5];
            HT.add ht (Int 12) [2; 2; 3];
            HT.add ht (Int 2) [2]; end;;
    - : unit = ()
    # begin HT.add ht (String "foo") false;
            HT.add ht (String "Foo") true;
            HT.add ht (String "bar") false;
            HT.add ht (String "Bar") true; end;;
          - : unit = ()
    # HT.find ht (Int 10), HT.find ht (String "Foo"), HT.find ht (Int 12);;
    - : int list * bool * int list = ([2; 5], true, [2; 2; 3])

    # let o = object
        method f : type a. a key -> a -> unit =
          fun k v -> match k with
          | Int i -> let s = String.concat "*" (List.map string_of_int v) in
                     Printf.printf "%d = %s\n" i s
          | String s -> Printf.printf "%s is%s capitalized\n"
                          s (if v then "" else " not")
      end;;
    val o : < f : 'a. 'a KeyType.key -> 'a -> unit > = <obj>
    # HT.iter o ht;;
    2 = 2
    12 = 2*2*3
    10 = 2*5
    foo is not capitalized
    bar is not capitalized
    Bar is capitalized
    Foo is capitalized
    - : unit = ()

Jeremy.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] GADTs and associative container
  2013-07-09 20:52 ` Lukasz Stafiniak
@ 2013-07-10  9:52   ` Leo White
  2013-07-10 10:16   ` Alain Frisch
  1 sibling, 0 replies; 7+ messages in thread
From: Leo White @ 2013-07-10  9:52 UTC (permalink / raw)
  To: Lukasz Stafiniak; +Cc: Goswin von Brederlow, Caml

Lukasz Stafiniak <lukstafi@gmail.com> writes:

> On Tue, Jul 9, 2013 at 10:43 PM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>
>     Hi,
>    
>     I'm wondering if one can have an ascociative container, like a Hashtbl.t
>     with dependent types (GADTs as the key, value depending on the key).
>     Something like this:
>    
>     module H = struct
>       type ('a, 'b) t = ('a, 'b) Hashtbl.t
>       let create : type a b . int -> (a b, a) t =
>                      fun x -> Hashtbl.create x
>       let add : type a b . (a b, a) t -> a b -> a -> unit =
>                   fun h k v -> Hashtbl.add h k v
>       let find : type a b . (a b, a) t -> a b -> a =
>                   fun h k -> Hashtbl.find h k
>     end
>
>  
>
>     BUT:
>    
>         let create : type a b . int -> (a b, a) t =
>                                         ^^^
>     Error: Unbound type constructor b
>
>     Is there some special syntax I'm missing or is it simply impossible to
>     declare such a container in the abstract?
>
> I think you need higher kinded types, not GADTs. Haskell has them, for example you can write code that only depends on
> the type class of "b" (which is parameterized by "a"), and "b" has signature "* -> *" or something like that.

That type is indeed higher-kinded. OCaml does support higher-kinded
types but not in the core type system: you need to use functors.

For example,

>       let create : type a b . int -> (a b, a) t =
>                      fun x -> Hashtbl.create x

could be written as:

  module Create (B: sig type 'a t end) = struct
    let f x : ('a B.t, 'a) Hashtbl.t = Hashtbl.create x
  end;;

Note that this is not actually the correct type for writing an
associative container. See Jeremy's post for details.

Regards,

Leo

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] GADTs and associative container
  2013-07-09 20:52 ` Lukasz Stafiniak
  2013-07-10  9:52   ` Leo White
@ 2013-07-10 10:16   ` Alain Frisch
  1 sibling, 0 replies; 7+ messages in thread
From: Alain Frisch @ 2013-07-10 10:16 UTC (permalink / raw)
  To: Lukasz Stafiniak, Goswin von Brederlow; +Cc: Caml

Hi,

This thread might of interest to you:
https://sympa.inria.fr/sympa/arc/caml-list/2013-02/msg00037.html


Alain


On 07/09/2013 10:52 PM, Lukasz Stafiniak wrote:
> On Tue, Jul 9, 2013 at 10:43 PM, Goswin von Brederlow <goswin-v-b@web.de
> <mailto:goswin-v-b@web.de>> wrote:
>
>     Hi,
>
>     I'm wondering if one can have an ascociative container, like a Hashtbl.t
>     with dependent types (GADTs as the key, value depending on the key).
>     Something like this:
>
>     module H = struct
>        type ('a, 'b) t = ('a, 'b) Hashtbl.t
>        let create : type a b . int -> (a b, a) t =
>                       fun x -> Hashtbl.create x
>        let add : type a b . (a b, a) t -> a b -> a -> unit =
>                    fun h k v -> Hashtbl.add h k v
>        let find : type a b . (a b, a) t -> a b -> a =
>                    fun h k -> Hashtbl.find h k
>     end
>
>     BUT:
>
>          let create : type a b . int -> (a b, a) t =
>                                          ^^^
>     Error: Unbound type constructor b
>
>
>     Is there some special syntax I'm missing or is it simply impossible to
>     declare such a container in the abstract?
>
> I think you need higher kinded types, not GADTs. Haskell has them, for
> example you can write code that only depends on the type class of "b"
> (which is parameterized by "a"), and "b" has signature "* -> *" or
> something like that.


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] GADTs and associative container
  2013-07-10  2:22 ` Jeremy Yallop
@ 2013-07-11 14:17   ` Goswin von Brederlow
  2013-07-11 14:28     ` Jeremy Yallop
  0 siblings, 1 reply; 7+ messages in thread
From: Goswin von Brederlow @ 2013-07-11 14:17 UTC (permalink / raw)
  To: caml-list

On Wed, Jul 10, 2013 at 03:22:30AM +0100, Jeremy Yallop wrote:
> It is indeed possible to create associative containers where the types
> of the values depend on the types of the keys.  Let's see what can to
> be done to turn the standard hash table into such a container, using
> GADTs for keys.
> 
> We'll start with the interface.  The standard hash table interface
> (Hashtbl.S) looks like this, in part:
> 
>     module type S =
>     sig
>       type key
>       type 'a t
>       val create : int -> 'a t
>       val add : 'a t -> key -> 'a -> unit
>       val remove : 'a t -> key -> unit
>       val find : 'a t -> key -> 'a
>       val iter : (key -> 'a -> unit) -> 'a t -> unit
>     end

Ok, so you are going with the FUNCTOR interface. I understand that
allows for a higher order of type parameters. Right?

I think that's where I got stuck with trying to use the simpler
non-functor Hashtbl.
 
> The type of tables ('a t) is parameterized by the type of values,
> since each table holds a single type of value.  We're aiming instead
> to have value types depend on key types, so we'll move the type
> parameter into the key type.  Making this change mechanically
> throughout the interface gives us the following:
> 
>     module type GS =
>     sig
>       type 'a key
>       type t
>       val create : int -> t
>       val add : t -> 'a key -> 'a -> unit
>       val remove : t -> 'a key -> unit
>       val find : t -> 'a key -> 'a
>       val iter : < f: 'a. 'a key -> 'a -> unit > -> t -> unit
>     end

Can't GADT using functions be passed as arguments without stuffing
them into a class object?
 
> Actually, I've made one additional change, in the type of iter.  In
> the regular Hashtbl iter function we can get by with ML-style
> polymorphism, where all the type variables are implicitly quantified
> at the outermost point.  This constrains the function passed to iter
> to be monomorphic, which is fine, since regular Hashtbls only support
> a single value type.  In our revised interface, however, the function
> argument must be polymorphic, since it needs to handle *any* suitable
> pairing of keys and values.  The object type allows quantifier
> nesting, giving us the polymorphism we need.
> 
> The type of iter is a hint of things to come: putting things *into* a
> polymorphic hash table is a doddle, but there's a bit of a knack to
> getting them *out* again intact, as we'll see further down.
> 
> Next up is the definition of keys.  The standard Hashtbl.Make functor
> uses a definition of keys that bundles the key type together with
> equality and hashing operations, like this:
> 
>     module type HashedType =
>     sig
>        type t
>        val equal : t -> t -> bool
>        val hash : t -> int
>     end
> 
> Of course, it's no good having just any old definitions of equal and
> hash.  It's essential that equal l r implies hash l = hash r, for
> example, and there are additional fairly obvious constraints on equal.
> 
> Our analogue to HashedType, GHashedType, comes with some additional
> operations (and so places additional demands on the creator of hash
> tables).  The first part of the signature looks essentially the same
> as HashedType: we've added a parameter to the key type, but it's not
> used as yet, so we can replace it with the don't-care underscore.
> (Note that this means that our equality is heterogeneous, happy to
> accept keys of disparate types.)  The remainder of the signature deals
> with packing up key-value pairs into existential boxes, and attempting
> to get them out again; this will allow us to store different types of
> key in a single table in our implementation.
> 
>     module type GHashedType =
>     sig
>       type _ key
>       val equal : _ key -> _ key -> bool
>       val hash : _ key -> int
> 
>       type pair = Pair : 'a key * 'a -> pair
>       val unpack : 'a key -> pair -> 'a
>     end
> 
> As with HashedType there are requirements not captured in the types.
> In particular, we'd like unpack k (Pair (k', v)) = v whenever equal k
> k' is true.
> 
> Time for the implementation.  This is mostly straightforward: after
> some preliminaries (mostly about hiding the type parameter in keys by
> boxing them up appropriately), there are two functions (add and
> remove) that put keys and values in boxes and store them in a
> monomorphic table, and two functions (find and iter) that unbox keys
> and values to recover the parameterization.
> 
>     module GHashtbl (G : GHashedType) :
>       GS with type 'a key = 'a G.key =
>     struct
>       include G
>       type k = Key : 'a key -> k
>       module H = Hashtbl.Make(struct
>         type t = k
>         let hash (Key k) = hash k
>         let equal (Key l) (Key r) = equal l r
>       end)
> 
>       type t = pair H.t
> 
>       let create n = H.create n
> 
>       let add tbl k v = H.add tbl (Key k) (Pair (k, v))
> 
>       let remove tbl k = H.remove tbl (Key k)
> 
>       let find tbl key = unpack key (H.find tbl (Key key))
> 
>       let iter (f : <f: 'a. 'a key -> 'a -> unit>) tbl =
>         H.iter (fun _ (Pair (k, v)) -> f#f k v) tbl
>     end
> 
> As is often the case, the unboxing is the interesting part.  The find
> function reveals why we introduced the unpack operation for keys (and
> hence the pair type, which could otherwise have been hidden away in
> the body of GHashtbl), and shows the secondary purpose of keys as
> unboxers of values.  The iter function makes use of the polymorphism
> that we introduced in its signature earlier; when we unbox a pair we
> have no idea how to instantiate the type variable in the contents, so
> it's just as well we have a function to hand (f#f) that's polymorphic
> enough to handle any possible instantiation.
> 
> Time to try it out.  Here's a sample implementation of GHashedType
> that associates ints with int lists (which we'll use to store prime
> factors) and strings with bools (which we'll use to indicate
> capitalization).
> 
>     module KeyType (* : GHashedType *) =
>     struct
>       type _ key =
>         Int : int -> int list key
>       | String : string -> bool key
> 
>       let equal : type a b. a key -> b key -> bool =
>         fun l r -> match l, r with
>         | Int x, Int y -> x = y
>         | String x, String y -> x = y
>         | _ -> false
> 
>       let hash = Hashtbl.hash
> 
>       type pair = Pair : 'a key * 'a -> pair
> 
>       let rec unpack : type a. a key -> pair -> a =
>         fun k p -> match k, p with
>         | Int _, Pair (Int _, v) -> v
>         | String _, Pair (String _, v) -> v
>         | _ -> raise Not_found
>     end

This boxing and unboxing feels wrong. It's wastefull because you
create dummy objects for the Pair. The key part is a repetition of
what is already the key in the table. Only the value is actualy
new information. Secondly I don't want to have to write the unpacking
function for every type. 

As I see it you are replacing a dependent type with a runtime check
for type equality here. Idealy the type system should see that the
"Not_found" part in unpack isn't even be possible. The compiler should
output code that simply returns v without any checks.


So I still wonder if one couldn't build a (better) associative
container from scratch that supports GADTs to describe the type
dependency between the key and the value without that custom unpack
function. How much of the above is just there to make it work with the
standard polymorphic Hahstbl.t?

MfG
	Goswin

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Caml-list] GADTs and associative container
  2013-07-11 14:17   ` Goswin von Brederlow
@ 2013-07-11 14:28     ` Jeremy Yallop
  0 siblings, 0 replies; 7+ messages in thread
From: Jeremy Yallop @ 2013-07-11 14:28 UTC (permalink / raw)
  To: Caml List; +Cc: Goswin von Brederlow

On 11 July 2013 15:17, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Can't GADT using functions be passed as arguments without stuffing
> them into a class object?

Yes.  There are several ways to do so.  The code in the thread that
Alain mentioned uses what is probably the best approach in this case.

> This boxing and unboxing feels wrong. It's wastefull because you
> create dummy objects for the Pair. The key part is a repetition of
> what is already the key in the table. Only the value is actualy
> new information.
[...]
> So I still wonder if one couldn't build a (better) associative
> container from scratch that supports GADTs to describe the type
> dependency between the key and the value without that custom unpack
> function. How much of the above is just there to make it work with the
> standard polymorphic Hahstbl.t?

These issues are also discussed in the thread that Alain mentioned.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2013-07-11 14:28 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-09 20:43 [Caml-list] GADTs and associative container Goswin von Brederlow
2013-07-09 20:52 ` Lukasz Stafiniak
2013-07-10  9:52   ` Leo White
2013-07-10 10:16   ` Alain Frisch
2013-07-10  2:22 ` Jeremy Yallop
2013-07-11 14:17   ` Goswin von Brederlow
2013-07-11 14:28     ` Jeremy Yallop

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).