From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by sympa.inria.fr (Postfix) with ESMTPS id 5DAAC7ED5C for ; Sat, 4 Aug 2012 09:56:30 +0200 (CEST) Received-SPF: None (mail1-smtp-roc.national.inria.fr: no sender authenticity information available from domain of cedilla@gmail.com) identity=pra; client-ip=209.85.160.54; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="cedilla@gmail.com"; x-sender="cedilla@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail1-smtp-roc.national.inria.fr: domain of cedilla@gmail.com designates 209.85.160.54 as permitted sender) identity=mailfrom; client-ip=209.85.160.54; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="cedilla@gmail.com"; x-sender="cedilla@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail1-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-pb0-f54.google.com) identity=helo; client-ip=209.85.160.54; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="cedilla@gmail.com"; x-sender="postmaster@mail-pb0-f54.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AmICAE3VHFDRVaA2k2dsb2JhbABFhXuzPAgiAQEBAQkJCwkUBCOCOQIPBBkBGx4DEgMNNwIkAREBBQFXh1sBAwybVYJhCQOLVE+CcYR3ChknDVeIcQEFDJEugRIDiE2MfI4vPoQe X-IronPort-AV: E=Sophos;i="4.77,710,1336341600"; d="scan'208";a="169061192" Received: from mail-pb0-f54.google.com ([209.85.160.54]) by mail1-smtp-roc.national.inria.fr with ESMTP/TLS/RC4-SHA; 04 Aug 2012 09:56:29 +0200 Received: by mail-pb0-f54.google.com with SMTP id ro2so3217863pbb.27 for ; Sat, 04 Aug 2012 00:56:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:date:message-id:subject:from:to:content-type; bh=F0vrg1kpqRkgHmyG9QFf35fn9x22RIo4MMEBX45I7yM=; b=iKfSsOHlcvpVEXK5rgMJVinWCqh3qf7DQKpUS/5j3SK4Au43oby0pLGjWfir/YZX51 /j6VpAzzPzZNOARn8gNj/lzzPUVyW9K4hRlls+xbYXZZMcLkwUT9AJbDGtG99ztk0/Wj gR4CRCPrU1lk3rb6XVjAzpcu4OvzzJtwAFY40ZS6Xfn/aZjwij2B4jaVqKW7tw8UGObS ZVPf7g85Wzi7Ca17Ffc629DU8kn68G+pmruUBnasVM14dikry//ZHrHUsVEagbgdLEe8 rydCen5Y14z2jIfpausgUC6JPNmUXCfEhrFsCuCazCmMSzjofcoW2l/ZdWKKvucQDgIb Kt0A== MIME-Version: 1.0 Received: by 10.68.200.98 with SMTP id jr2mr3335231pbc.81.1344066989004; Sat, 04 Aug 2012 00:56:29 -0700 (PDT) Received: by 10.142.173.15 with HTTP; Sat, 4 Aug 2012 00:56:28 -0700 (PDT) Date: Sat, 4 Aug 2012 00:56:28 -0700 Message-ID: From: Reed Wilson To: caml-list@inria.fr Content-Type: multipart/alternative; boundary=047d7b10ceeb6f191e04c66bfb2e X-Validation-by: cedilla@gmail.com Subject: [Caml-list] creating GADTs --047d7b10ceeb6f191e04c66bfb2e Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Greetings, I think I just wrapped my head around the concept of GADTs, and I found a situation where I think they'd help. The problem is that, although creating the types seems to have worked out fine, actually making values which use that type is proving... difficult. The problem is that, at one point in my program, I need a function that can convert an int into my new type, but the type-checker doesn't seem to like it. Here's the problem area in my program: type mpeg1_t;; type mpeg2_t;; type _ mpeg_t =3D | MPEG1 : mpeg1_t mpeg_t | MPEG2 : mpeg2_t mpeg_t ;; let mpeg_of_int =3D function | 0 -> MPEG1 | _ -> MPEG2 ;; The error is obvious in mpeg_of_int: Error: This expression has type mpeg2_t mpeg_t but an expression was expected of type mpeg1_t mpeg_t I'm trying to make mpeg_of_int be the type "int -> 'a mpeg_t" (or something similar), but nothing I do will stick. I don't really think this type is ill-defined since the possible types of 'a are constrained by the definition of mpeg_t. The main reason I want to use GADTs is that there are some fields in a record I use which change format depending on other fields, but there's no good way to use a variant due to the dependencies... (that's not a great explanation, so here's a longer example:) I'm making an MP3 parser, and each MP3 frame has a different number of fields depending on the MPEG ID (MPEG1/MPEG2) and the number of channels. This is a simplified version of my GADT types: type mpeg1_t;; type mpeg2_t;; type _ mpeg_t =3D | MPEG1 : mpeg1_t mpeg_t | MPEG2 : mpeg2_t mpeg_t ;; type _ samplerate_t =3D | S48000 : mpeg1_t samplerate_t | S44100 : mpeg1_t samplerate_t | S24000 : mpeg2_t samplerate_t | S22050 : mpeg2_t samplerate_t ;; type channel_mono_t;; type channel_stereo_t;; type _ channel_t =3D | Channel_mono : channel_mono_t channel_t | Channel_stereo : channel_stereo_t channel_t ;; type (_,_) side_bits_t =3D | Bits_1_mono : (int * int) -> (mpeg1_t, channel_mono_t) side_bits_t | Bits_1_stereo : (int * int * int * int) -> (mpeg1_t, channel_stereo_t) side_bits_t | Bits_2_mono : int -> (mpeg2_t, channel_mono_t) side_bits_t | Bits_2_stereo : (int * int) -> (mpeg2_t, channel_stereo_t) side_bits_t ;; type ('id,'chan) frame_t =3D { header_id : 'id mpeg_t; header_crc : bool; header_samplerate : 'id samplerate_t; header_channel_mode : 'chan channel_t; side_bits : ('id,'chan) side_bits_t; };; Without the use of GADTs, I can't think of a way to access both header_id and header_channel_mode in frame_t without some big hierarchical structure. The holdup seems to be that some parts of frame_t depend on 'id, some parts on 'chan, and some parts on both. Here's the simplest thing I can think of that will give the same information without GADTs: type mpeg1_mono_t =3D { mpeg1_mono_side_bits : int * int; };; type mpeg1_stereo_t =3D { mpeg1_stereo_side_bits : int * int * int * int; };; type mpeg1_channel_t =3D MPEG1_channel_mono of mpeg1_mono_t | MPEG1_channel_stereo of mpeg1_stereo_t;; type mpeg1_samplerate_t =3D S48000 | S44100;; type mpeg1_t =3D { mpeg1_channel_mode : mpeg1_channel_t; mpeg1_samplerate : mpeg1_samplerate_t; } type mpeg2_mono_t =3D { mpeg2_mono_side_bits : int; };; type mpeg2_stereo_t =3D { mpeg2_stereo_side_bits : int * int; };; type mpeg2_channel_t =3D MPEG2_channel_mono of mpeg2_mono_t | MPEG2_channel_stereo of mpeg2_stereo_t;; type mpeg2_samplerate_t =3D S48000 | S44100;; type mpeg2_t =3D { mpeg2_channel_mode : mpeg2_channel_t; mpeg2_samplerate : mpeg2_samplerate_t; } type mpeg_t =3D MPEG1 of mpeg1_t | MPEG2 of mpeg2_t;; type frame_t =3D { header_id : mpeg_t; header_crc : bool; };; Without GADTs, if I wanted a function to return the number of channels, I'd have to write something like this: let frame_channels =3D function | {header_id =3D MPEG1 {mpeg1_channel_mode =3D MPEG1_channel_mono _}} -> 1 | {header_id =3D MPEG2 {mpeg2_channel_mode =3D MPEG2_channel_mono _}} -> 1 | {header_id =3D MPEG1 {mpeg1_channel_mode =3D MPEG1_channel_stereo _}} -> 2 | {header_id =3D MPEG2 {mpeg2_channel_mode =3D MPEG2_channel_stereo _}} -> 2 ;; However, since the number of channels doesn't actually depend on the MPEG ID (MPEG1 or MPEG2), with GADTs I can shorten that to: let frame_channels : type a b. (a,b) frame_t -> int =3D function | {header_channel_mode =3D Channel_mono} -> 1 | {header_channel_mode =3D Channel_stereo} -> 2 ;; Now (finally!), the questions: 1. Does it seem like GADTs are the way to go here, or is there a cleaner method? 2. Is there any way of getting mpeg_of_int past the type-checker? 3. If not, is that because I'm doing something unsound, or is it the GADT implementation being too strict? Thanks for any insights! --=20 =C3=A7 --047d7b10ceeb6f191e04c66bfb2e Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Greetings,

I think I just wrapped my head around the con= cept of GADTs, and I found a situation where I think they'd help. The p= roblem is that,=C2=A0although creating the types seems to have worked out f= ine, actually making values which use that type is proving... difficult.

The problem is that, at one point in my program, I need= a function that can convert an int into my new type, but the type-checker = doesn't seem to like it. Here's the problem area in my program:

type mpeg1_t;;
type mpeg2_t;;
= type _ mpeg_t =3D
| MPEG1 : mpeg1_t mpeg_t
| MPEG2 : mp= eg2_t mpeg_t
;;

let mpeg_of_int =3D func= tion
| 0 -> MPEG1
| _ -> MPEG2
;;

<= /div>

The error is obvious in mpeg_of_int:
Error: This expression has type mpeg2_t mpeg_t
=C2=A0 =C2=A0 = =C2=A0 =C2=A0but an expression was expected of type mpeg1_t mpeg_t

I'm trying to make mpeg_of_int be the type "in= t -> 'a mpeg_t" (or something similar), but nothing I do will s= tick. I don't really think this type is ill-defined since the possible = types of 'a are constrained by the definition of mpeg_t.

The main reason I want to use GADTs is that there are s= ome fields in a record I use which change format depending on other fields,= but there's no good way to use a variant due to the dependencies... (t= hat's not a great explanation, so here's a longer example:)

I'm making an MP3 parser, and each MP3 frame has a = different number of fields depending on the MPEG ID (MPEG1/MPEG2) and the n= umber of channels. This is a simplified version of my GADT types:

type mpeg1_t;;
type mpeg2_t;;
= type _ mpeg_t =3D
| MPEG1 : mpeg1_t mpeg_t
| MPEG2 : mp= eg2_t mpeg_t
;;

type _ samplerate_t =3D<= /div>
| S48000 : mpeg1_t samplerate_t
| S44100 : mpeg1_t samplerat= e_t
| S24000 : mpeg2_t samplerate_t
| S22050 : mpeg2_t = samplerate_t
;;

type channel_mono_t;;
type channel_stereo_t;;

type _ channel_t =3D<= /div>
| Channel_mono : channel_mono_t channel_t
| Channel_ste= reo : channel_stereo_t channel_t
;;

type= (_,_) side_bits_t =3D
| Bits_1_mono : (int * int) -> (mpeg1_t, channel_mono_t) side_bits_= t
| Bits_1_stereo : (int * int * int * int) -> (mpeg1_t, chann= el_stereo_t) side_bits_t
| Bits_2_mono : int -> (mpeg2_t, chan= nel_mono_t) side_bits_t
| Bits_2_stereo : (int * int) -> (mpeg2_t, channel_stereo_t) side_b= its_t
;;

type ('id,'chan) frame_= t =3D {
= header_id : 'id mpeg_t;
heade= r_crc : bool;
header_samplerate : 'id samplerate_t;
header_channel_mode= : 'chan channel_t;
side_= bits : ('id,'chan) side_bits_t;
};;

<= /div>

Without the use of GADTs, I can't think of a w= ay to access both header_id and header_channel_mode in frame_t without some= big=C2=A0hierarchical=C2=A0structure. The holdup seems to be that some par= ts of frame_t depend on 'id, some parts on 'chan, and some parts on= both. Here's the simplest thing I can think of that will give the same= information without GADTs:

type mpeg1_mono_t =3D {
mpeg1_mono_side_bits : int= * int;

};;
type mpeg1_stereo_t =3D {
mpeg1_ster= eo_side_bits : int * int * int * int;
};;

type mpeg1_channel_t =3D MPEG1_channel_mono of mpeg1_mono_t | MPEG1_chann= el_stereo of mpeg1_stereo_t;;

type mpeg1_samplerate_t =3D S48000 | S44100;;

type mpeg1_t =3D {
mpeg1_channel_mode : mpeg1_channel_t;
mpeg1_samp= lerate : mpeg1_samplerate_t;
}

type mpeg= 2_mono_t =3D {
mpeg2_mono_side_bits : int;
};;
type mpeg2_stereo_t =3D {
mpeg2_stereo_side_bits : int *= int;
};;

type mpeg2_channel_t =3D MPEG2= _channel_mono of mpeg2_mono_t | MPEG2_channel_stereo of mpeg2_stereo_t;;

type mpeg2_samplerate_t =3D S48000 | S44100;;

type mpeg2_t =3D {
mpeg2_channel_mode : mpeg2_channel_t;
mpeg2_samp= lerate : mpeg2_samplerate_t;
}

type mpeg= _t =3D MPEG1 of mpeg1_t | MPEG2 of mpeg2_t;;

type = frame_t =3D {
heade= r_id : mpeg_t;
header_crc : bool;
};;

<= div>
Without GADTs, if I wanted a function to return the number o= f channels, I'd have to write something like this:
let f= rame_channels =3D function
| {header_id =3D MPEG1 {mpeg1_channel_= mode =3D MPEG1_channel_mono _}} -> 1
| {header_id =3D MPEG2 {mpeg2_channel_mode =3D MPEG2_channel_mono _}} = -> 1
| {header_id =3D MPEG1 {mpeg1_channel_mode =3D MPEG1_chan= nel_stereo _}} -> 2
| {header_id =3D MPEG2 {mpeg2_channel_mode= =3D MPEG2_channel_stereo _}} -> 2
;;

However, since the number of channel= s doesn't actually depend on the MPEG ID (MPEG1 or MPEG2), with GADTs I= can shorten that to:
let frame_channels : type a b. (a,b) f= rame_t -> int =3D function
| {header_channel_mode =3D Channel_mono} -> 1
| {header_c= hannel_mode =3D Channel_stereo} -> 2
;;


Now (finally!), the questions:
1. Does it= seem like GADTs are the way to go here, or is there a cleaner method?
2. Is there any way of getting=C2=A0mpeg_of_int past the type-checker?=
3. If not, is that because I'm doing something unsound, or i= s it the GADT implementation being too strict?

Tha= nks for any insights!

--=C2=A0
=C3=A7
--047d7b10ceeb6f191e04c66bfb2e--