From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id 3EF5C81792 for ; Wed, 10 Jul 2013 04:22:31 +0200 (CEST) Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of yallop@gmail.com) identity=pra; client-ip=74.125.82.174; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="yallop@gmail.com"; x-sender="yallop@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of yallop@gmail.com designates 74.125.82.174 as permitted sender) identity=mailfrom; client-ip=74.125.82.174; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="yallop@gmail.com"; x-sender="yallop@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail2-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-we0-f174.google.com) identity=helo; client-ip=74.125.82.174; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="yallop@gmail.com"; x-sender="postmaster@mail-we0-f174.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApYBAOXE3FFKfVKuk2dsb2JhbABahAiDCL4AgQkIFg4BAQEBBwsLCRQEJIIkAQUjBBkBGx0BAwwGBQsPAiYCAiIBEQEFARwGiA8BAw+cdot/T4J/hCkKGScNWIdzAQUMgRqORQeCVoEeA5dUj2QWKYQ4Ow X-IPAS-Result: ApYBAOXE3FFKfVKuk2dsb2JhbABahAiDCL4AgQkIFg4BAQEBBwsLCRQEJIIkAQUjBBkBGx0BAwwGBQsPAiYCAiIBEQEFARwGiA8BAw+cdot/T4J/hCkKGScNWIdzAQUMgRqORQeCVoEeA5dUj2QWKYQ4Ow X-IronPort-AV: E=Sophos;i="4.87,1032,1363129200"; d="scan'208";a="25301685" Received: from mail-we0-f174.google.com ([74.125.82.174]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 10 Jul 2013 04:22:30 +0200 Received: by mail-we0-f174.google.com with SMTP id q58so5359895wes.19 for ; Tue, 09 Jul 2013 19:22:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=TssV9Qc7IO3lRiHbK3wyYwLEadw/+OuZLPynSCUBgj0=; b=l7o4/LzKrA5JRptapwpFFdw4R5x/eGBciQatnEADKEospRb6RzGFK2AC9TIV8XGbch 5blyV/P2mwUG5CSPiqNEfKQNJNir9AqwbfktG7bTDkUgbTtT53ltVNvWkCZy2teYG1Gm lhoyBqQmlfHunCqEEn38yLovNNKEjPIE+CHbkLM88MrmJ+QGVFwCMhrcS+o11fUez2ea heov/tv6p5PpUzKaX/gydczqmHTJNbtg+2uW3rVEgypsvgaiTU+c/oaNiqtvGeHZQhEO Pm3Oq/fJwSDq2+sSVRI8T4werM+gHXDdQxeaGk4oSNEkDh6bheb0gyg8Otwh22g9dFBH asbg== MIME-Version: 1.0 X-Received: by 10.194.87.100 with SMTP id w4mr16672954wjz.34.1373422950381; Tue, 09 Jul 2013 19:22:30 -0700 (PDT) Received: by 10.216.91.6 with HTTP; Tue, 9 Jul 2013 19:22:30 -0700 (PDT) In-Reply-To: <20130709204312.GA30194@frosties> References: <20130709204312.GA30194@frosties> Date: Wed, 10 Jul 2013 03:22:30 +0100 Message-ID: From: Jeremy Yallop To: Caml List Cc: Goswin von Brederlow Content-Type: text/plain; charset=UTF-8 Subject: Re: [Caml-list] GADTs and associative container It is indeed possible to create associative containers where the types of the values depend on the types of the keys. Let's see what can to be done to turn the standard hash table into such a container, using GADTs for keys. We'll start with the interface. The standard hash table interface (Hashtbl.S) looks like this, in part: module type S = sig type key type 'a t val create : int -> 'a t val add : 'a t -> key -> 'a -> unit val remove : 'a t -> key -> unit val find : 'a t -> key -> 'a val iter : (key -> 'a -> unit) -> 'a t -> unit end The type of tables ('a t) is parameterized by the type of values, since each table holds a single type of value. We're aiming instead to have value types depend on key types, so we'll move the type parameter into the key type. Making this change mechanically throughout the interface gives us the following: module type GS = sig type 'a key type t val create : int -> t val add : t -> 'a key -> 'a -> unit val remove : t -> 'a key -> unit val find : t -> 'a key -> 'a val iter : < f: 'a. 'a key -> 'a -> unit > -> t -> unit end Actually, I've made one additional change, in the type of iter. In the regular Hashtbl iter function we can get by with ML-style polymorphism, where all the type variables are implicitly quantified at the outermost point. This constrains the function passed to iter to be monomorphic, which is fine, since regular Hashtbls only support a single value type. In our revised interface, however, the function argument must be polymorphic, since it needs to handle *any* suitable pairing of keys and values. The object type allows quantifier nesting, giving us the polymorphism we need. The type of iter is a hint of things to come: putting things *into* a polymorphic hash table is a doddle, but there's a bit of a knack to getting them *out* again intact, as we'll see further down. Next up is the definition of keys. The standard Hashtbl.Make functor uses a definition of keys that bundles the key type together with equality and hashing operations, like this: module type HashedType = sig type t val equal : t -> t -> bool val hash : t -> int end Of course, it's no good having just any old definitions of equal and hash. It's essential that equal l r implies hash l = hash r, for example, and there are additional fairly obvious constraints on equal. Our analogue to HashedType, GHashedType, comes with some additional operations (and so places additional demands on the creator of hash tables). The first part of the signature looks essentially the same as HashedType: we've added a parameter to the key type, but it's not used as yet, so we can replace it with the don't-care underscore. (Note that this means that our equality is heterogeneous, happy to accept keys of disparate types.) The remainder of the signature deals with packing up key-value pairs into existential boxes, and attempting to get them out again; this will allow us to store different types of key in a single table in our implementation. module type GHashedType = sig type _ key val equal : _ key -> _ key -> bool val hash : _ key -> int type pair = Pair : 'a key * 'a -> pair val unpack : 'a key -> pair -> 'a end As with HashedType there are requirements not captured in the types. In particular, we'd like unpack k (Pair (k', v)) = v whenever equal k k' is true. Time for the implementation. This is mostly straightforward: after some preliminaries (mostly about hiding the type parameter in keys by boxing them up appropriately), there are two functions (add and remove) that put keys and values in boxes and store them in a monomorphic table, and two functions (find and iter) that unbox keys and values to recover the parameterization. module GHashtbl (G : GHashedType) : GS with type 'a key = 'a G.key = struct include G type k = Key : 'a key -> k module H = Hashtbl.Make(struct type t = k let hash (Key k) = hash k let equal (Key l) (Key r) = equal l r end) type t = pair H.t let create n = H.create n let add tbl k v = H.add tbl (Key k) (Pair (k, v)) let remove tbl k = H.remove tbl (Key k) let find tbl key = unpack key (H.find tbl (Key key)) let iter (f : 'a -> unit>) tbl = H.iter (fun _ (Pair (k, v)) -> f#f k v) tbl end As is often the case, the unboxing is the interesting part. The find function reveals why we introduced the unpack operation for keys (and hence the pair type, which could otherwise have been hidden away in the body of GHashtbl), and shows the secondary purpose of keys as unboxers of values. The iter function makes use of the polymorphism that we introduced in its signature earlier; when we unbox a pair we have no idea how to instantiate the type variable in the contents, so it's just as well we have a function to hand (f#f) that's polymorphic enough to handle any possible instantiation. Time to try it out. Here's a sample implementation of GHashedType that associates ints with int lists (which we'll use to store prime factors) and strings with bools (which we'll use to indicate capitalization). module KeyType (* : GHashedType *) = struct type _ key = Int : int -> int list key | String : string -> bool key let equal : type a b. a key -> b key -> bool = fun l r -> match l, r with | Int x, Int y -> x = y | String x, String y -> x = y | _ -> false let hash = Hashtbl.hash type pair = Pair : 'a key * 'a -> pair let rec unpack : type a. a key -> pair -> a = fun k p -> match k, p with | Int _, Pair (Int _, v) -> v | String _, Pair (String _, v) -> v | _ -> raise Not_found end Using KeyType we can instantiate the functor and set about creating heterogeneous hash tables: # module HT = GHashtbl(KeyType) ... # let ht = HT.create 10;; val ht : HT.t = # begin HT.add ht (Int 10) [2; 5]; HT.add ht (Int 12) [2; 2; 3]; HT.add ht (Int 2) [2]; end;; - : unit = () # begin HT.add ht (String "foo") false; HT.add ht (String "Foo") true; HT.add ht (String "bar") false; HT.add ht (String "Bar") true; end;; - : unit = () # HT.find ht (Int 10), HT.find ht (String "Foo"), HT.find ht (Int 12);; - : int list * bool * int list = ([2; 5], true, [2; 2; 3]) # let o = object method f : type a. a key -> a -> unit = fun k v -> match k with | Int i -> let s = String.concat "*" (List.map string_of_int v) in Printf.printf "%d = %s\n" i s | String s -> Printf.printf "%s is%s capitalized\n" s (if v then "" else " not") end;; val o : < f : 'a. 'a KeyType.key -> 'a -> unit > = # HT.iter o ht;; 2 = 2 12 = 2*2*3 10 = 2*5 foo is not capitalized bar is not capitalized Bar is capitalized Foo is capitalized - : unit = () Jeremy.