From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by sympa.inria.fr (Postfix) with ESMTPS id 4A62C7ED5C for ; Sat, 4 Aug 2012 11:21:06 +0200 (CEST) Received-SPF: None (mail4-smtp-sop.national.inria.fr: no sender authenticity information available from domain of lpw25@hermes.cam.ac.uk) identity=pra; client-ip=131.111.8.150; receiver=mail4-smtp-sop.national.inria.fr; envelope-from="lpw25@hermes.cam.ac.uk"; x-sender="lpw25@hermes.cam.ac.uk"; x-conformance=sidf_compatible Received-SPF: None (mail4-smtp-sop.national.inria.fr: no sender authenticity information available from domain of lpw25@hermes.cam.ac.uk) identity=mailfrom; client-ip=131.111.8.150; receiver=mail4-smtp-sop.national.inria.fr; envelope-from="lpw25@hermes.cam.ac.uk"; x-sender="lpw25@hermes.cam.ac.uk"; x-conformance=sidf_compatible Received-SPF: None (mail4-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@ppsw-50.csi.cam.ac.uk) identity=helo; client-ip=131.111.8.150; receiver=mail4-smtp-sop.national.inria.fr; envelope-from="lpw25@hermes.cam.ac.uk"; x-sender="postmaster@ppsw-50.csi.cam.ac.uk"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AukBAJ7oHFCDbwiWnGdsb2JhbABFuT8iAQEBAQEICwkJFCeCIQEFJxFBEAEKDjhXBi6HcrNiiQSLSIcEA5skjRE X-IronPort-AV: E=Sophos;i="4.77,711,1336341600"; d="scan'208";a="152427699" Received: from ppsw-50.csi.cam.ac.uk ([131.111.8.150]) by mail4-smtp-sop.national.inria.fr with ESMTP; 04 Aug 2012 11:21:05 +0200 X-Cam-AntiVirus: no malware found X-Cam-SpamDetails: not scanned X-Cam-ScannerInfo: http://www.cam.ac.uk/cs/email/scanner/ Received: from hermes-2.csi.cam.ac.uk ([131.111.8.54]:48405) by ppsw-50.csi.cam.ac.uk (smtp.hermes.cam.ac.uk [131.111.8.157]:25) with esmtpa (EXTERNAL:lpw25) id 1SxaXg-0002no-s1 (Exim 4.72) (return-path ); Sat, 04 Aug 2012 10:21:04 +0100 Received: from prayer by hermes-2.csi.cam.ac.uk (hermes.cam.ac.uk) with local (PRAYER:lpw25) id 1SxaXg-0007OG-Cb (Exim 4.67) (return-path ); Sat, 04 Aug 2012 10:21:04 +0100 Received: from [128.232.9.0] by webmail.hermes.cam.ac.uk with HTTP (Prayer-1.3.5); 04 Aug 2012 10:21:04 +0100 Date: 04 Aug 2012 10:21:04 +0100 From: Leo P White To: Reed Wilson Cc: caml-list@inria.fr Message-ID: In-Reply-To: References: X-Mailer: Prayer v1.3.5 Mime-Version: 1.0 Content-Type: text/plain; format=flowed; charset=ISO-8859-1 Sender: "L.P. White" Subject: Re: [Caml-list] creating GADTs Hi, The problem with your function let mpeg_of_int = function | 0 -> MPEG1 | _ -> MPEG2 ;; is that it cannot have type int -> 'a mpeg_t becuase that would mean that you could use it to create any values with any mpeg_t type (e.g. string mpeg_t). I can think of two possible solutions to your problem. The first is to use an existential type via another GADT, so that your simple example becomes: type mpeg1_t;; type mpeg2_t;; type _ mpeg_t = | MPEG1 : mpeg1_t mpeg_t | MPEG2 : mpeg2_t mpeg_t ;; type mpeg_ext_t = Mpeg_Ext: 'a mpeg_t -> mpeg_ext_t;; let mpeg_of_int = function | 0 -> Mpeg_Ext MPEG1 | _ -> Mpeg_Ext MPEG2 ;; The second is to use polymorphic variants instead of abstract types to maintain your invariants. So your full example would become: type mpeg_tag_t = [ `Mpeg1 | `Mpeg2 ] type _ mpeg_t = | MPEG1 : [< mpeg_tag_t >`Mpeg1] mpeg_t | MPEG2 : [< mpeg_tag_t > `Mpeg2] mpeg_t ;; type _ samplerate_t = | S48000 : [`Mpeg1] samplerate_t | S44100 : [`Mpeg1] samplerate_t | S24000 : [`Mpeg2] samplerate_t | S22050 : [`Mpeg2] samplerate_t ;; type channel_tag_t = [ `Mono | `Stereo ] type _ channel_t = | Channel_mono : [< channel_tag_t > `Mono] channel_t | Channel_stereo : [< channel_tag_t > `Stereo] channel_t ;; type (_,_) side_bits_t = | Bits_1_mono : (int * int) -> ([`Mpeg1], [`Mono]) side_bits_t | Bits_1_stereo : (int * int * int * int) -> ([`Mpeg1], [`Stereo]) side_bits_t | Bits_2_mono : int -> ([`Mpeg2], [`Mono]) side_bits_t | Bits_2_stereo : (int * int) -> ([`Mpeg2], [`Stereo]) side_bits_t ;; type ('id,'chan) frame_t = { header_id : 'id mpeg_t; header_crc : bool; header_samplerate : 'id samplerate_t; header_channel_mode : 'chan channel_t; side_bits : ('id,'chan) side_bits_t; };; This allows you to write your function as before: let mpeg_of_int = function | 0 -> MPEG1 | _ -> MPEG2;; and it still maintains the desired invariant, so that this is allowed: let good = { header_id = MPEG1; header_crc = true; header_samplerate = S48000; header_channel_mode = Channel_mono; side_bits = Bits_1_mono(1, 2); };; but this isn't: let bad = { header_id = MPEG1; header_crc = true; header_samplerate = S48000; header_channel_mode = Channel_stereo; side_bits = Bits_1_mono(1, 2); };; Error: This expression has type ([ `Mpeg1 ], [ `Mono ]) side_bits_t but an expression was expected of type ([ `Mpeg1 ], [< channel_tag_t > `Stereo ] as 'a) side_bits_t Type [ `Mono ] is not compatible with type 'a = [< `Mono | `Stereo > `Stereo ] The first variant type does not allow tag(s) `Stereo Note that this would still not allow you to write the function: let mpeg_of_int = function | 0 -> S48000 | _ -> S24000;; which suffers from the same problem as your original function. Regards, Leo On Aug 4 2012, Reed Wilson wrote: >Greetings, > >I think I just wrapped my head around the concept of GADTs, and I found a >situation where I think they'd help. The problem is that, although creating >the types seems to have worked out fine, actually making values which use >that type is proving... difficult. > >The problem is that, at one point in my program, I need a function that can >convert an int into my new type, but the type-checker doesn't seem to like >it. Here's the problem area in my program: > >type mpeg1_t;; >type mpeg2_t;; >type _ mpeg_t = >| MPEG1 : mpeg1_t mpeg_t >| MPEG2 : mpeg2_t mpeg_t >;; > >let mpeg_of_int = function >| 0 -> MPEG1 >| _ -> MPEG2 >;; > > >The error is obvious in mpeg_of_int: >Error: This expression has type mpeg2_t mpeg_t > but an expression was expected of type mpeg1_t mpeg_t > >I'm trying to make mpeg_of_int be the type "int -> 'a mpeg_t" (or something >similar), but nothing I do will stick. I don't really think this type is >ill-defined since the possible types of 'a are constrained by the >definition of mpeg_t. > >The main reason I want to use GADTs is that there are some fields in a >record I use which change format depending on other fields, but there's no >good way to use a variant due to the dependencies... (that's not a great >explanation, so here's a longer example:) > >I'm making an MP3 parser, and each MP3 frame has a different number of >fields depending on the MPEG ID (MPEG1/MPEG2) and the number of channels. >This is a simplified version of my GADT types: > >type mpeg1_t;; >type mpeg2_t;; >type _ mpeg_t = >| MPEG1 : mpeg1_t mpeg_t >| MPEG2 : mpeg2_t mpeg_t >;; > >type _ samplerate_t = >| S48000 : mpeg1_t samplerate_t >| S44100 : mpeg1_t samplerate_t >| S24000 : mpeg2_t samplerate_t >| S22050 : mpeg2_t samplerate_t >;; > >type channel_mono_t;; >type channel_stereo_t;; > >type _ channel_t = >| Channel_mono : channel_mono_t channel_t >| Channel_stereo : channel_stereo_t channel_t >;; > >type (_,_) side_bits_t = >| Bits_1_mono : (int * int) -> (mpeg1_t, channel_mono_t) side_bits_t >| Bits_1_stereo : (int * int * int * int) -> (mpeg1_t, channel_stereo_t) >side_bits_t >| Bits_2_mono : int -> (mpeg2_t, channel_mono_t) side_bits_t >| Bits_2_stereo : (int * int) -> (mpeg2_t, channel_stereo_t) side_bits_t >;; > >type ('id,'chan) frame_t = { >header_id : 'id mpeg_t; >header_crc : bool; >header_samplerate : 'id samplerate_t; >header_channel_mode : 'chan channel_t; >side_bits : ('id,'chan) side_bits_t; >};; > > >Without the use of GADTs, I can't think of a way to access both header_id >and header_channel_mode in frame_t without some big hierarchical structure. >The holdup seems to be that some parts of frame_t depend on 'id, some parts >on 'chan, and some parts on both. Here's the simplest thing I can think of >that will give the same information without GADTs: > >type mpeg1_mono_t = { >mpeg1_mono_side_bits : int * int; > >};; >type mpeg1_stereo_t = { >mpeg1_stereo_side_bits : int * int * int * int; >};; > >type mpeg1_channel_t = MPEG1_channel_mono of mpeg1_mono_t | >MPEG1_channel_stereo of mpeg1_stereo_t;; > >type mpeg1_samplerate_t = S48000 | S44100;; > >type mpeg1_t = { >mpeg1_channel_mode : mpeg1_channel_t; >mpeg1_samplerate : mpeg1_samplerate_t; >} > >type mpeg2_mono_t = { >mpeg2_mono_side_bits : int; >};; >type mpeg2_stereo_t = { >mpeg2_stereo_side_bits : int * int; >};; > >type mpeg2_channel_t = MPEG2_channel_mono of mpeg2_mono_t | >MPEG2_channel_stereo of mpeg2_stereo_t;; > >type mpeg2_samplerate_t = S48000 | S44100;; > >type mpeg2_t = { >mpeg2_channel_mode : mpeg2_channel_t; >mpeg2_samplerate : mpeg2_samplerate_t; >} > >type mpeg_t = MPEG1 of mpeg1_t | MPEG2 of mpeg2_t;; > >type frame_t = { >header_id : mpeg_t; >header_crc : bool; >};; > > >Without GADTs, if I wanted a function to return the number of channels, I'd >have to write something like this: >let frame_channels = function >| {header_id = MPEG1 {mpeg1_channel_mode = MPEG1_channel_mono _}} -> 1 >| {header_id = MPEG2 {mpeg2_channel_mode = MPEG2_channel_mono _}} -> 1 >| {header_id = MPEG1 {mpeg1_channel_mode = MPEG1_channel_stereo _}} -> 2 >| {header_id = MPEG2 {mpeg2_channel_mode = MPEG2_channel_stereo _}} -> 2 >;; > >However, since the number of channels doesn't actually depend on the MPEG >ID (MPEG1 or MPEG2), with GADTs I can shorten that to: >let frame_channels : type a b. (a,b) frame_t -> int = function >| {header_channel_mode = Channel_mono} -> 1 >| {header_channel_mode = Channel_stereo} -> 2 >;; > > >Now (finally!), the questions: >1. Does it seem like GADTs are the way to go here, or is there a cleaner >method? >2. Is there any way of getting mpeg_of_int past the type-checker? >3. If not, is that because I'm doing something unsound, or is it the GADT >implementation being too strict? > >Thanks for any insights! > >