caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] creating GADTs
@ 2012-08-04  7:56 Reed Wilson
  2012-08-04  9:21 ` Leo P White
  0 siblings, 1 reply; 8+ messages in thread
From: Reed Wilson @ 2012-08-04  7:56 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 4795 bytes --]

Greetings,

I think I just wrapped my head around the concept of GADTs, and I found a
situation where I think they'd help. The problem is that, although creating
the types seems to have worked out fine, actually making values which use
that type is proving... difficult.

The problem is that, at one point in my program, I need a function that can
convert an int into my new type, but the type-checker doesn't seem to like
it. Here's the problem area in my program:

type mpeg1_t;;
type mpeg2_t;;
type _ mpeg_t =
| MPEG1 : mpeg1_t mpeg_t
| MPEG2 : mpeg2_t mpeg_t
;;

let mpeg_of_int = function
| 0 -> MPEG1
| _ -> MPEG2
;;


The error is obvious in mpeg_of_int:
Error: This expression has type mpeg2_t mpeg_t
       but an expression was expected of type mpeg1_t mpeg_t

I'm trying to make mpeg_of_int be the type "int -> 'a mpeg_t" (or something
similar), but nothing I do will stick. I don't really think this type is
ill-defined since the possible types of 'a are constrained by the
definition of mpeg_t.

The main reason I want to use GADTs is that there are some fields in a
record I use which change format depending on other fields, but there's no
good way to use a variant due to the dependencies... (that's not a great
explanation, so here's a longer example:)

I'm making an MP3 parser, and each MP3 frame has a different number of
fields depending on the MPEG ID (MPEG1/MPEG2) and the number of channels.
This is a simplified version of my GADT types:

type mpeg1_t;;
type mpeg2_t;;
type _ mpeg_t =
| MPEG1 : mpeg1_t mpeg_t
| MPEG2 : mpeg2_t mpeg_t
;;

type _ samplerate_t =
| S48000 : mpeg1_t samplerate_t
| S44100 : mpeg1_t samplerate_t
| S24000 : mpeg2_t samplerate_t
| S22050 : mpeg2_t samplerate_t
;;

type channel_mono_t;;
type channel_stereo_t;;

type _ channel_t =
| Channel_mono : channel_mono_t channel_t
| Channel_stereo : channel_stereo_t channel_t
;;

type (_,_) side_bits_t =
| Bits_1_mono : (int * int) -> (mpeg1_t, channel_mono_t) side_bits_t
| Bits_1_stereo : (int * int * int * int) -> (mpeg1_t, channel_stereo_t)
side_bits_t
| Bits_2_mono : int -> (mpeg2_t, channel_mono_t) side_bits_t
| Bits_2_stereo : (int * int) -> (mpeg2_t, channel_stereo_t) side_bits_t
;;

type ('id,'chan) frame_t = {
header_id : 'id mpeg_t;
header_crc : bool;
header_samplerate : 'id samplerate_t;
header_channel_mode : 'chan channel_t;
side_bits : ('id,'chan) side_bits_t;
};;


Without the use of GADTs, I can't think of a way to access both header_id
and header_channel_mode in frame_t without some big hierarchical structure.
The holdup seems to be that some parts of frame_t depend on 'id, some parts
on 'chan, and some parts on both. Here's the simplest thing I can think of
that will give the same information without GADTs:

type mpeg1_mono_t = {
mpeg1_mono_side_bits : int * int;

};;
type mpeg1_stereo_t = {
mpeg1_stereo_side_bits : int * int * int * int;
};;

type mpeg1_channel_t = MPEG1_channel_mono of mpeg1_mono_t |
MPEG1_channel_stereo of mpeg1_stereo_t;;

type mpeg1_samplerate_t = S48000 | S44100;;

type mpeg1_t = {
mpeg1_channel_mode : mpeg1_channel_t;
mpeg1_samplerate : mpeg1_samplerate_t;
}

type mpeg2_mono_t = {
mpeg2_mono_side_bits : int;
};;
type mpeg2_stereo_t = {
mpeg2_stereo_side_bits : int * int;
};;

type mpeg2_channel_t = MPEG2_channel_mono of mpeg2_mono_t |
MPEG2_channel_stereo of mpeg2_stereo_t;;

type mpeg2_samplerate_t = S48000 | S44100;;

type mpeg2_t = {
mpeg2_channel_mode : mpeg2_channel_t;
mpeg2_samplerate : mpeg2_samplerate_t;
}

type mpeg_t = MPEG1 of mpeg1_t | MPEG2 of mpeg2_t;;

type frame_t = {
header_id : mpeg_t;
header_crc : bool;
};;


Without GADTs, if I wanted a function to return the number of channels, I'd
have to write something like this:
let frame_channels = function
| {header_id = MPEG1 {mpeg1_channel_mode = MPEG1_channel_mono _}} -> 1
| {header_id = MPEG2 {mpeg2_channel_mode = MPEG2_channel_mono _}} -> 1
| {header_id = MPEG1 {mpeg1_channel_mode = MPEG1_channel_stereo _}} -> 2
| {header_id = MPEG2 {mpeg2_channel_mode = MPEG2_channel_stereo _}} -> 2
;;

However, since the number of channels doesn't actually depend on the MPEG
ID (MPEG1 or MPEG2), with GADTs I can shorten that to:
let frame_channels : type a b. (a,b) frame_t -> int = function
| {header_channel_mode = Channel_mono} -> 1
| {header_channel_mode = Channel_stereo} -> 2
;;


Now (finally!), the questions:
1. Does it seem like GADTs are the way to go here, or is there a cleaner
method?
2. Is there any way of getting mpeg_of_int past the type-checker?
3. If not, is that because I'm doing something unsound, or is it the GADT
implementation being too strict?

Thanks for any insights!

-- 
ç

[-- Attachment #2: Type: text/html, Size: 7299 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Caml-list] creating GADTs
  2012-08-04  7:56 [Caml-list] creating GADTs Reed Wilson
@ 2012-08-04  9:21 ` Leo P White
       [not found]   ` <CALLFq5SWTGDuS5-Z7aN_UOqtTJ0sMvghWQTGqXUshUURiOoWcg@mail.gmail.com>
  0 siblings, 1 reply; 8+ messages in thread
From: Leo P White @ 2012-08-04  9:21 UTC (permalink / raw)
  To: Reed Wilson; +Cc: caml-list

Hi,

The problem with your function

 let mpeg_of_int = function
 | 0 -> MPEG1
 | _ -> MPEG2
 ;;

is that it cannot have type int -> 'a mpeg_t becuase that would mean that 
you could use it to create any values with any mpeg_t type (e.g. string 
mpeg_t).

I can think of two possible solutions to your problem.

The first is to use an existential type via another GADT, so that your 
simple example becomes:

  type mpeg1_t;;
  type mpeg2_t;;
  type _ mpeg_t =
  | MPEG1 : mpeg1_t mpeg_t
  | MPEG2 : mpeg2_t mpeg_t
  ;;

  type mpeg_ext_t = Mpeg_Ext: 'a mpeg_t -> mpeg_ext_t;;

  let mpeg_of_int = function
  | 0 -> Mpeg_Ext MPEG1
  | _ -> Mpeg_Ext MPEG2
  ;;

The second is to use polymorphic variants instead of abstract types to 
maintain your invariants. So your full example would become:

  type mpeg_tag_t = [ `Mpeg1 | `Mpeg2 ]

  type _ mpeg_t =
  | MPEG1 : [< mpeg_tag_t >`Mpeg1] mpeg_t
  | MPEG2 : [< mpeg_tag_t > `Mpeg2] mpeg_t
  ;;

  type _ samplerate_t =
  | S48000 : [`Mpeg1] samplerate_t
  | S44100 : [`Mpeg1] samplerate_t
  | S24000 : [`Mpeg2] samplerate_t
  | S22050 : [`Mpeg2] samplerate_t
  ;;

  type channel_tag_t = [ `Mono | `Stereo ]

  type _ channel_t =
  | Channel_mono : [< channel_tag_t > `Mono] channel_t
  | Channel_stereo : [< channel_tag_t > `Stereo] channel_t
  ;;

  type (_,_) side_bits_t =
  | Bits_1_mono : (int * int) -> ([`Mpeg1], [`Mono]) side_bits_t
  | Bits_1_stereo : (int * int * int * int) 
                       -> ([`Mpeg1], [`Stereo]) side_bits_t
  | Bits_2_mono : int -> ([`Mpeg2], [`Mono]) side_bits_t
  | Bits_2_stereo : (int * int) -> ([`Mpeg2], [`Stereo]) side_bits_t
  ;;

  type ('id,'chan) frame_t = {
    header_id : 'id mpeg_t;
    header_crc : bool;
    header_samplerate : 'id samplerate_t;
    header_channel_mode : 'chan channel_t;
    side_bits : ('id,'chan) side_bits_t;
  };;

This allows you to write your function as before:

  let mpeg_of_int = function
  | 0 -> MPEG1
  | _ -> MPEG2;;

and it still maintains the desired invariant, so that this is allowed:

  let good = 
  { header_id = MPEG1;
    header_crc = true;
    header_samplerate = S48000;
    header_channel_mode = Channel_mono;
    side_bits = Bits_1_mono(1, 2); };;

but this isn't:

let bad = 
  { header_id = MPEG1;
    header_crc = true;
    header_samplerate = S48000;
    header_channel_mode = Channel_stereo;
    side_bits = Bits_1_mono(1, 2); };;

  Error: This expression has type ([ `Mpeg1 ], [ `Mono ]) side_bits_t
       but an expression was expected of type
         ([ `Mpeg1 ], [< channel_tag_t > `Stereo ] as 'a) side_bits_t
       Type [ `Mono ] is not compatible with type
         'a = [< `Mono | `Stereo > `Stereo ] 
       The first variant type does not allow tag(s) `Stereo

Note that this would still not allow you to write the function:

  let mpeg_of_int = function
  | 0 -> S48000
  | _ -> S24000;;

which suffers from the same problem as your original function.

Regards,

Leo




On Aug 4 2012, Reed Wilson wrote:

>Greetings,
>
>I think I just wrapped my head around the concept of GADTs, and I found a
>situation where I think they'd help. The problem is that, although creating
>the types seems to have worked out fine, actually making values which use
>that type is proving... difficult.
>
>The problem is that, at one point in my program, I need a function that can
>convert an int into my new type, but the type-checker doesn't seem to like
>it. Here's the problem area in my program:
>
>type mpeg1_t;;
>type mpeg2_t;;
>type _ mpeg_t =
>| MPEG1 : mpeg1_t mpeg_t
>| MPEG2 : mpeg2_t mpeg_t
>;;
>
>let mpeg_of_int = function
>| 0 -> MPEG1
>| _ -> MPEG2
>;;
>
>
>The error is obvious in mpeg_of_int:
>Error: This expression has type mpeg2_t mpeg_t
>       but an expression was expected of type mpeg1_t mpeg_t
>
>I'm trying to make mpeg_of_int be the type "int -> 'a mpeg_t" (or something
>similar), but nothing I do will stick. I don't really think this type is
>ill-defined since the possible types of 'a are constrained by the
>definition of mpeg_t.
>
>The main reason I want to use GADTs is that there are some fields in a
>record I use which change format depending on other fields, but there's no
>good way to use a variant due to the dependencies... (that's not a great
>explanation, so here's a longer example:)
>
>I'm making an MP3 parser, and each MP3 frame has a different number of
>fields depending on the MPEG ID (MPEG1/MPEG2) and the number of channels.
>This is a simplified version of my GADT types:
>
>type mpeg1_t;;
>type mpeg2_t;;
>type _ mpeg_t =
>| MPEG1 : mpeg1_t mpeg_t
>| MPEG2 : mpeg2_t mpeg_t
>;;
>
>type _ samplerate_t =
>| S48000 : mpeg1_t samplerate_t
>| S44100 : mpeg1_t samplerate_t
>| S24000 : mpeg2_t samplerate_t
>| S22050 : mpeg2_t samplerate_t
>;;
>
>type channel_mono_t;;
>type channel_stereo_t;;
>
>type _ channel_t =
>| Channel_mono : channel_mono_t channel_t
>| Channel_stereo : channel_stereo_t channel_t
>;;
>
>type (_,_) side_bits_t =
>| Bits_1_mono : (int * int) -> (mpeg1_t, channel_mono_t) side_bits_t
>| Bits_1_stereo : (int * int * int * int) -> (mpeg1_t, channel_stereo_t)
>side_bits_t
>| Bits_2_mono : int -> (mpeg2_t, channel_mono_t) side_bits_t
>| Bits_2_stereo : (int * int) -> (mpeg2_t, channel_stereo_t) side_bits_t
>;;
>
>type ('id,'chan) frame_t = {
>header_id : 'id mpeg_t;
>header_crc : bool;
>header_samplerate : 'id samplerate_t;
>header_channel_mode : 'chan channel_t;
>side_bits : ('id,'chan) side_bits_t;
>};;
>
>
>Without the use of GADTs, I can't think of a way to access both header_id
>and header_channel_mode in frame_t without some big hierarchical structure.
>The holdup seems to be that some parts of frame_t depend on 'id, some parts
>on 'chan, and some parts on both. Here's the simplest thing I can think of
>that will give the same information without GADTs:
>
>type mpeg1_mono_t = {
>mpeg1_mono_side_bits : int * int;
>
>};;
>type mpeg1_stereo_t = {
>mpeg1_stereo_side_bits : int * int * int * int;
>};;
>
>type mpeg1_channel_t = MPEG1_channel_mono of mpeg1_mono_t |
>MPEG1_channel_stereo of mpeg1_stereo_t;;
>
>type mpeg1_samplerate_t = S48000 | S44100;;
>
>type mpeg1_t = {
>mpeg1_channel_mode : mpeg1_channel_t;
>mpeg1_samplerate : mpeg1_samplerate_t;
>}
>
>type mpeg2_mono_t = {
>mpeg2_mono_side_bits : int;
>};;
>type mpeg2_stereo_t = {
>mpeg2_stereo_side_bits : int * int;
>};;
>
>type mpeg2_channel_t = MPEG2_channel_mono of mpeg2_mono_t |
>MPEG2_channel_stereo of mpeg2_stereo_t;;
>
>type mpeg2_samplerate_t = S48000 | S44100;;
>
>type mpeg2_t = {
>mpeg2_channel_mode : mpeg2_channel_t;
>mpeg2_samplerate : mpeg2_samplerate_t;
>}
>
>type mpeg_t = MPEG1 of mpeg1_t | MPEG2 of mpeg2_t;;
>
>type frame_t = {
>header_id : mpeg_t;
>header_crc : bool;
>};;
>
>
>Without GADTs, if I wanted a function to return the number of channels, I'd
>have to write something like this:
>let frame_channels = function
>| {header_id = MPEG1 {mpeg1_channel_mode = MPEG1_channel_mono _}} -> 1
>| {header_id = MPEG2 {mpeg2_channel_mode = MPEG2_channel_mono _}} -> 1
>| {header_id = MPEG1 {mpeg1_channel_mode = MPEG1_channel_stereo _}} -> 2
>| {header_id = MPEG2 {mpeg2_channel_mode = MPEG2_channel_stereo _}} -> 2
>;;
>
>However, since the number of channels doesn't actually depend on the MPEG
>ID (MPEG1 or MPEG2), with GADTs I can shorten that to:
>let frame_channels : type a b. (a,b) frame_t -> int = function
>| {header_channel_mode = Channel_mono} -> 1
>| {header_channel_mode = Channel_stereo} -> 2
>;;
>
>
>Now (finally!), the questions:
>1. Does it seem like GADTs are the way to go here, or is there a cleaner
>method?
>2. Is there any way of getting mpeg_of_int past the type-checker?
>3. If not, is that because I'm doing something unsound, or is it the GADT
>implementation being too strict?
>
>Thanks for any insights!
>
>


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Caml-list] creating GADTs
       [not found]   ` <CALLFq5SWTGDuS5-Z7aN_UOqtTJ0sMvghWQTGqXUshUURiOoWcg@mail.gmail.com>
@ 2012-08-04 22:52     ` Reed Wilson
  2012-08-04 23:31       ` Gabriel Scherer
  2012-08-04 23:45       ` Gabriel Scherer
  0 siblings, 2 replies; 8+ messages in thread
From: Reed Wilson @ 2012-08-04 22:52 UTC (permalink / raw)
  To: caml-list

Thanks for the response!

On Sat, Aug 4, 2012 at 2:21 AM, Leo P White <lpw25@cam.ac.uk> wrote:
>
> Hi,
>
> The problem with your function
>
>
> let mpeg_of_int = function
> | 0 -> MPEG1
> | _ -> MPEG2
> ;;
>
> is that it cannot have type int -> 'a mpeg_t becuase that would mean that you could use it to create any values with any mpeg_t type (e.g. string mpeg_t).


I thought that looked like a funny type for a function, but shouldn't
it be clear from the definition of mpeg_t that the function can't
return string mpeg_t? I thought that's one of the things GADTs were
supposed to solve.

>
>
> I can think of two possible solutions to your problem.
>
> The first is to use an existential type via another GADT, so that your simple example becomes:
>
>
>  type mpeg1_t;;
>  type mpeg2_t;;
>  type _ mpeg_t =
>  | MPEG1 : mpeg1_t mpeg_t
>  | MPEG2 : mpeg2_t mpeg_t
>  ;;
>
>  type mpeg_ext_t = Mpeg_Ext: 'a mpeg_t -> mpeg_ext_t;;
>
>  let mpeg_of_int = function
>  | 0 -> Mpeg_Ext MPEG1
>  | _ -> Mpeg_Ext MPEG2
>  ;;


This does work, but unfortunately it doesn't seem to work for my
purpose, since I still can't get mpeg1_t or mpeg2_t "out" of it. (I
apologize for my continued use of nonstandard terminology - I've never
used any of these advanced typing features before)

Basically, in the end I'd build a function that creates a ('id,'chan)
frame_t, with 'id and 'chan depending on the bits stored in the MP3's
header. So something like this:

let frame_of_id_and_chan : type id chan. (id mpeg_t * chan channel_t)
-> (id,chan) frame_t = function
| (MPEG1, Channel_mono) -> {
header_id = MPEG1;
header_crc = true;
header_samplerate = S48000;
header_channel_mode = Channel_mono;
side_bits = Bits_1_mono (1,1);
}
| (MPEG1, Channel_stereo) -> {
header_id = MPEG1;
header_crc = true;
header_samplerate = S48000;
header_channel_mode = Channel_stereo;
side_bits = Bits_1_stereo (1,1,1,1);
}
| (MPEG2, Channel_mono) -> {
header_id = MPEG2;
header_crc = true;
header_samplerate = S24000;
header_channel_mode = Channel_mono;
side_bits = Bits_2_mono (1);
}
| (MPEG2, Channel_stereo) -> {
header_id = MPEG2;
header_crc = true;
header_samplerate = S24000;
header_channel_mode = Channel_stereo;
side_bits = Bits_2_stereo (1,1);
}
;;


Then I would create the frame based on the bits in the file:

let id_int = int_of_raw_bits ... in
let channel_mode_int = int_of_raw_bits ... in
let frame = frame_of_id_and_chan (mpeg_of_int id_int, channel_of_int
channel_mode_int) in
...


If hiding the mpeg_t type inside mpeg_ext_t, I wouldn't be able to use
it to create a frame of the right type.

>
>
> The second is to use polymorphic variants instead of abstract types to maintain your invariants. So your full example would become:
>
>  type mpeg_tag_t = [ `Mpeg1 | `Mpeg2 ]
>
>  type _ mpeg_t =
>  | MPEG1 : [< mpeg_tag_t >`Mpeg1] mpeg_t
>  | MPEG2 : [< mpeg_tag_t > `Mpeg2] mpeg_t
>  ;;
>
>  type _ samplerate_t =
>  | S48000 : [`Mpeg1] samplerate_t
>  | S44100 : [`Mpeg1] samplerate_t
>  | S24000 : [`Mpeg2] samplerate_t
>  | S22050 : [`Mpeg2] samplerate_t
>  ;;
>
>  type channel_tag_t = [ `Mono | `Stereo ]
>
>  type _ channel_t =
>  | Channel_mono : [< channel_tag_t > `Mono] channel_t
>  | Channel_stereo : [< channel_tag_t > `Stereo] channel_t
>  ;;
>
>  type (_,_) side_bits_t =
>  | Bits_1_mono : (int * int) -> ([`Mpeg1], [`Mono]) side_bits_t
>  | Bits_1_stereo : (int * int * int * int)                       -> ([`Mpeg1], [`Stereo]) side_bits_t
>  | Bits_2_mono : int -> ([`Mpeg2], [`Mono]) side_bits_t
>  | Bits_2_stereo : (int * int) -> ([`Mpeg2], [`Stereo]) side_bits_t
>
>  ;;
>
>  type ('id,'chan) frame_t = {
>    header_id : 'id mpeg_t;
>    header_crc : bool;
>    header_samplerate : 'id samplerate_t;
>    header_channel_mode : 'chan channel_t;
>    side_bits : ('id,'chan) side_bits_t;
>  };;
>
> This allows you to write your function as before:
>
>
>  let mpeg_of_int = function
>  | 0 -> MPEG1
>  | _ -> MPEG2;;
>
> and it still maintains the desired invariant, so that this is allowed:
>
>  let good =  { header_id = MPEG1;
>    header_crc = true;
>    header_samplerate = S48000;
>    header_channel_mode = Channel_mono;
>    side_bits = Bits_1_mono(1, 2); };;
>
> but this isn't:
>
> let bad =  { header_id = MPEG1;
>    header_crc = true;
>    header_samplerate = S48000;
>    header_channel_mode = Channel_stereo;
>    side_bits = Bits_1_mono(1, 2); };;
>
>  Error: This expression has type ([ `Mpeg1 ], [ `Mono ]) side_bits_t
>
>       but an expression was expected of type
>         ([ `Mpeg1 ], [< channel_tag_t > `Stereo ] as 'a) side_bits_t
>       Type [ `Mono ] is not compatible with type
>         'a = [< `Mono | `Stereo > `Stereo ]       The first variant type does not allow tag(s) `Stereo
>
> Note that this would still not allow you to write the function:
>
>  let mpeg_of_int = function
>  | 0 -> S48000
>  | _ -> S24000;;
>
> which suffers from the same problem as your original function.


That helps a lot! I think I got it working with polymorphic variants.
I basically added < mpeg_tag_t > to all the samplerate definitions,
which seems to let everything work. The result is this:

type mpeg_tag_t = [ `Mpeg1 | `Mpeg2 ]

type _ mpeg_t =
| MPEG1 : [< mpeg_tag_t > `Mpeg1] mpeg_t
| MPEG2 : [< mpeg_tag_t > `Mpeg2] mpeg_t
;;

type _ samplerate_t =
| S48000 : [< mpeg_tag_t > `Mpeg1] samplerate_t
| S44100 : [< mpeg_tag_t > `Mpeg1] samplerate_t
| S24000 : [< mpeg_tag_t > `Mpeg2] samplerate_t
| S22050 : [< mpeg_tag_t > `Mpeg2] samplerate_t

;;

type channel_tag_t = [ `Mono | `Stereo ]

type _ channel_t =
| Channel_mono : [< channel_tag_t > `Mono] channel_t
| Channel_stereo : [< channel_tag_t > `Stereo] channel_t
;;

type (_,_) side_bits_t =
| Bits_1_mono : (int * int) -> ([`Mpeg1], [`Mono]) side_bits_t
| Bits_1_stereo : (int * int * int * int) -> ([`Mpeg1], [`Stereo]) side_bits_t
| Bits_2_mono : int -> ([`Mpeg2], [`Mono]) side_bits_t
| Bits_2_stereo : (int * int) -> ([`Mpeg2], [`Stereo]) side_bits_t
;;

type ('id,'chan) frame_t = {
mutable header_id : 'id mpeg_t;
mutable header_crc : bool;
mutable header_samplerate : 'id samplerate_t;
mutable header_channel_mode : 'chan channel_t;
mutable side_bits : ('id,'chan) side_bits_t;
};;


let mpeg_of_int = function
| 0 -> MPEG1
| 1 -> MPEG2
| _ -> failwith "Bad"
;;

let samplerate_of_int = function
| 0 -> S48000
| 1 -> S44100
| 2 -> S24000
| 3 -> S22050
| _ -> failwith "Bad"
;;

let channel_mode_of_int = function
| 0 -> Channel_mono
| 1 -> Channel_stereo
| _ -> failwith "Bad"
;;

All the types seem to work out, with one fairly contrived exception...
If I write a few test functions:

 let frame_of_sr_id_chan : type id chan. id samplerate_t -> id mpeg_t
* chan channel_t -> (id,chan) frame_t = fun sr -> function
| (MPEG1, Channel_mono) -> {header_id = MPEG1; header_crc = true;
header_samplerate = sr; header_channel_mode = Channel_mono; side_bits
= Bits_1_mono (1,1)}
| (MPEG1, Channel_stereo) -> {header_id = MPEG1; header_crc = true;
header_samplerate = sr; header_channel_mode = Channel_stereo;
side_bits = Bits_1_stereo (1,1,1,1)}
| (MPEG2, Channel_mono) -> {header_id = MPEG2; header_crc = true;
header_samplerate = sr; header_channel_mode = Channel_mono; side_bits
= Bits_2_mono (1)}
| (MPEG2, Channel_stereo) -> {header_id = MPEG2; header_crc = true;
header_samplerate = sr; header_channel_mode = Channel_stereo;
side_bits = Bits_2_stereo (1,1)}
;;

let print_samplerate : type id chan. (id,chan) frame_t -> unit = function
| {header_id = MPEG1; header_samplerate = S48000} -> Printf.printf "M1_S48000\n"
| {header_id = MPEG1; header_samplerate = S44100} -> Printf.printf "M1_S44100\n"
| {header_id = MPEG2; header_samplerate = S24000} -> Printf.printf "M2_S24000\n"
| {header_id = MPEG2; header_samplerate = S22050} -> Printf.printf "M2_S22050\n"
;;


I was happy to notice that this did not complain about being
incomplete, correctly seeing that the samplerates can only be used
with a matching header_id;

However, if I then I make an invalid frame using the *_of_int functions:

let test_frame = frame_of_sr_id_chan (samplerate_of_int 3)
(mpeg_of_int 0, channel_mode_of_int 0);;
print_samplerate test_frame;;


This accepts the input (even though samplerate_of_int 3 is `Mpeg2, and
mpeg_of_int 0 is `Mpeg1). However, this will print M1_S44100, which is
definitely not what I specified (samplerate_of_int 3 is S22050)! There
must be something I don't get with this typing. Any ideas what?

In the end, though, I will not actually use a function like
frame_of_sr_id_chan, so using polymorphic variants should do exactly
what I want.

Thanks again,
Reed

--
ç

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Caml-list] creating GADTs
  2012-08-04 22:52     ` Reed Wilson
@ 2012-08-04 23:31       ` Gabriel Scherer
  2012-08-04 23:45       ` Gabriel Scherer
  1 sibling, 0 replies; 8+ messages in thread
From: Gabriel Scherer @ 2012-08-04 23:31 UTC (permalink / raw)
  To: Reed Wilson; +Cc: caml-list

>> is that it cannot have type int -> 'a mpeg_t becuase that would mean that you could use it to create any values with any mpeg_t type (e.g. string mpeg_t).
>
>
> I thought that looked like a funny type for a function, but shouldn't
> it be clear from the definition of mpeg_t that the function can't
> return string mpeg_t? I thought that's one of the things GADTs were
> supposed to solve.

This is not how typing works (and this specific part of Leo's
otherwise excellent answer was a bit misleading). A type (int -> 'a mpeg_t)
means that the return type has to be *polymorphic* in 'a. Any
value returned by this function must have type (t mpeg_t) for *any*
type t. MPEG1 is not polymorphic in the type parameter, it does not
have type ('a mpeg_t).

Think of a function of type (int -> 'a list) for example; the only
possible return value is the empty list, because it is polymoprhic: []
has type (int list) and (bool list) and ... This is the way type
polymorphism works in ML, and GADTs do not change that.

What you want to say is not that, given an integer, you return
a (t mpeg_t) for *any* t (that would be the polymorphic type 'a mpeg_t),
but that you return an (t mpeg_t) for *some* unknown type
t. This is not an universal quantification, but an existential one,
and it's why using an existential type (encoded as a GADT in
Leo's proposal) is the natural solution to your problem.

(The polymorphic variant solution is not all that different, except
that instead of saying "some unknown t", you can say the slightly more
precise "some t that has at most the constructors `MPEG1 and
`MPEG2`". My personal opinion is that GADTs are complex enough as they
are, and that throwing polymorphic variants in the mix is maybe going
a bit too far, so I'd rather concentrate on the existential solution.)

> Then I would create the frame based on the bits in the file:
>
> let id_int = int_of_raw_bits ... in
> let channel_mode_int = int_of_raw_bits ... in
> let frame = frame_of_id_and_chan (mpeg_of_int id_int, channel_of_int
> channel_mode_int) in
> ...
>
>
> If hiding the mpeg_t type inside mpeg_ext_t, I wouldn't be able to use
> it to create a frame of the right type.

From an *unknown* (hidden in an existential mpeg_ext_t type) mpeg tag,
you can't get a statically known frame tag. But you can produce an
frame of unknown (existential) type, playing the same trick again
(I haven't type-checked my code so it may be full of typos):

  type channel_ext_t = Chan_ext : 'chan channel_t -> channel_ext_t

  type frame_ext_t = Frame_ext : ('id, 'chan) frame_t -> 'chan frame_ext_t

  let id_int = int_of_raw_bits ... in
  let channel_mode_int = int_of_raw_bits ... in
  let frame =
    match mpeg_of_int id_int, channel_of_int channel_mode_int with
      | Mpeg_ext mpeg, Chan_ext chan ->
          Frame_ext (frame_of_id_and_chan mpeg chan)

The idea is that GADTs help with your problem because they allow you
to reason *statically* about the possible combinations (while your
previous solution used dynamic checking in the form of pattern
matching on the tags, and therefore was less convenient as you had to
do checks all over the place).

A rich type such as
  type id chan. id mpeg_t -> chan channel_t -> (id,chan) frame_t
let you say "if I statically know the id and the chan, I can build
a well-typed (id,chan) frame". So the functions that manipulate your
data statically reason about it, and are nice to write.

But then, in the end, the tags/chans are not statically known, they
are decided at runtime. The existential types
(or polymorphic variants) are there to tell the type-checker "sorry,
you can't know what will be used", they are a boundary between what's
written using static knowledge (the auxiliary functions) and what's
known only at runtime (the actual input/output of your script). Your
function "frame" has to unbox the existential Mpeg_ext and Chan_ext
(to get them, locally, in a form manipulable by the
auxiliary functions), then rebox the resulting frame to denote that
it's only known at runtime.

This boxing/unboxing in/out of existential types is a bit painful
(especially since OCaml doesn't have "first-class" structural
existentials, you have to build them explicitly through type
definitions or first-class modules), but hopefully it only happen at
the IO boundaries of your application, that is not too
often. Existential types are a burden you inflicted upon yourself by
choosing more explicit static types: now you sometimes have to
explicitly say "I don't know", while before you never knew.



On Sun, Aug 5, 2012 at 12:52 AM, Reed Wilson <cedilla@gmail.com> wrote:
> Thanks for the response!
>
> On Sat, Aug 4, 2012 at 2:21 AM, Leo P White <lpw25@cam.ac.uk> wrote:
>>
>> Hi,
>>
>> The problem with your function
>>
>>
>> let mpeg_of_int = function
>> | 0 -> MPEG1
>> | _ -> MPEG2
>> ;;
>>
>> is that it cannot have type int -> 'a mpeg_t becuase that would mean that you could use it to create any values with any mpeg_t type (e.g. string mpeg_t).
>
>
> I thought that looked like a funny type for a function, but shouldn't
> it be clear from the definition of mpeg_t that the function can't
> return string mpeg_t? I thought that's one of the things GADTs were
> supposed to solve.
>
>>
>>
>> I can think of two possible solutions to your problem.
>>
>> The first is to use an existential type via another GADT, so that your simple example becomes:
>>
>>
>>  type mpeg1_t;;
>>  type mpeg2_t;;
>>  type _ mpeg_t =
>>  | MPEG1 : mpeg1_t mpeg_t
>>  | MPEG2 : mpeg2_t mpeg_t
>>  ;;
>>
>>  type mpeg_ext_t = Mpeg_Ext: 'a mpeg_t -> mpeg_ext_t;;
>>
>>  let mpeg_of_int = function
>>  | 0 -> Mpeg_Ext MPEG1
>>  | _ -> Mpeg_Ext MPEG2
>>  ;;
>
>
> This does work, but unfortunately it doesn't seem to work for my
> purpose, since I still can't get mpeg1_t or mpeg2_t "out" of it. (I
> apologize for my continued use of nonstandard terminology - I've never
> used any of these advanced typing features before)
>
> Basically, in the end I'd build a function that creates a ('id,'chan)
> frame_t, with 'id and 'chan depending on the bits stored in the MP3's
> header. So something like this:
>
> let frame_of_id_and_chan : type id chan. (id mpeg_t * chan channel_t)
> -> (id,chan) frame_t = function
> | (MPEG1, Channel_mono) -> {
> header_id = MPEG1;
> header_crc = true;
> header_samplerate = S48000;
> header_channel_mode = Channel_mono;
> side_bits = Bits_1_mono (1,1);
> }
> | (MPEG1, Channel_stereo) -> {
> header_id = MPEG1;
> header_crc = true;
> header_samplerate = S48000;
> header_channel_mode = Channel_stereo;
> side_bits = Bits_1_stereo (1,1,1,1);
> }
> | (MPEG2, Channel_mono) -> {
> header_id = MPEG2;
> header_crc = true;
> header_samplerate = S24000;
> header_channel_mode = Channel_mono;
> side_bits = Bits_2_mono (1);
> }
> | (MPEG2, Channel_stereo) -> {
> header_id = MPEG2;
> header_crc = true;
> header_samplerate = S24000;
> header_channel_mode = Channel_stereo;
> side_bits = Bits_2_stereo (1,1);
> }
> ;;
>
>
> Then I would create the frame based on the bits in the file:
>
> let id_int = int_of_raw_bits ... in
> let channel_mode_int = int_of_raw_bits ... in
> let frame = frame_of_id_and_chan (mpeg_of_int id_int, channel_of_int
> channel_mode_int) in
> ...
>
>
> If hiding the mpeg_t type inside mpeg_ext_t, I wouldn't be able to use
> it to create a frame of the right type.
>
>>
>>
>> The second is to use polymorphic variants instead of abstract types to maintain your invariants. So your full example would become:
>>
>>  type mpeg_tag_t = [ `Mpeg1 | `Mpeg2 ]
>>
>>  type _ mpeg_t =
>>  | MPEG1 : [< mpeg_tag_t >`Mpeg1] mpeg_t
>>  | MPEG2 : [< mpeg_tag_t > `Mpeg2] mpeg_t
>>  ;;
>>
>>  type _ samplerate_t =
>>  | S48000 : [`Mpeg1] samplerate_t
>>  | S44100 : [`Mpeg1] samplerate_t
>>  | S24000 : [`Mpeg2] samplerate_t
>>  | S22050 : [`Mpeg2] samplerate_t
>>  ;;
>>
>>  type channel_tag_t = [ `Mono | `Stereo ]
>>
>>  type _ channel_t =
>>  | Channel_mono : [< channel_tag_t > `Mono] channel_t
>>  | Channel_stereo : [< channel_tag_t > `Stereo] channel_t
>>  ;;
>>
>>  type (_,_) side_bits_t =
>>  | Bits_1_mono : (int * int) -> ([`Mpeg1], [`Mono]) side_bits_t
>>  | Bits_1_stereo : (int * int * int * int)                       -> ([`Mpeg1], [`Stereo]) side_bits_t
>>  | Bits_2_mono : int -> ([`Mpeg2], [`Mono]) side_bits_t
>>  | Bits_2_stereo : (int * int) -> ([`Mpeg2], [`Stereo]) side_bits_t
>>
>>  ;;
>>
>>  type ('id,'chan) frame_t = {
>>    header_id : 'id mpeg_t;
>>    header_crc : bool;
>>    header_samplerate : 'id samplerate_t;
>>    header_channel_mode : 'chan channel_t;
>>    side_bits : ('id,'chan) side_bits_t;
>>  };;
>>
>> This allows you to write your function as before:
>>
>>
>>  let mpeg_of_int = function
>>  | 0 -> MPEG1
>>  | _ -> MPEG2;;
>>
>> and it still maintains the desired invariant, so that this is allowed:
>>
>>  let good =  { header_id = MPEG1;
>>    header_crc = true;
>>    header_samplerate = S48000;
>>    header_channel_mode = Channel_mono;
>>    side_bits = Bits_1_mono(1, 2); };;
>>
>> but this isn't:
>>
>> let bad =  { header_id = MPEG1;
>>    header_crc = true;
>>    header_samplerate = S48000;
>>    header_channel_mode = Channel_stereo;
>>    side_bits = Bits_1_mono(1, 2); };;
>>
>>  Error: This expression has type ([ `Mpeg1 ], [ `Mono ]) side_bits_t
>>
>>       but an expression was expected of type
>>         ([ `Mpeg1 ], [< channel_tag_t > `Stereo ] as 'a) side_bits_t
>>       Type [ `Mono ] is not compatible with type
>>         'a = [< `Mono | `Stereo > `Stereo ]       The first variant type does not allow tag(s) `Stereo
>>
>> Note that this would still not allow you to write the function:
>>
>>  let mpeg_of_int = function
>>  | 0 -> S48000
>>  | _ -> S24000;;
>>
>> which suffers from the same problem as your original function.
>
>
> That helps a lot! I think I got it working with polymorphic variants.
> I basically added < mpeg_tag_t > to all the samplerate definitions,
> which seems to let everything work. The result is this:
>
> type mpeg_tag_t = [ `Mpeg1 | `Mpeg2 ]
>
> type _ mpeg_t =
> | MPEG1 : [< mpeg_tag_t > `Mpeg1] mpeg_t
> | MPEG2 : [< mpeg_tag_t > `Mpeg2] mpeg_t
> ;;
>
> type _ samplerate_t =
> | S48000 : [< mpeg_tag_t > `Mpeg1] samplerate_t
> | S44100 : [< mpeg_tag_t > `Mpeg1] samplerate_t
> | S24000 : [< mpeg_tag_t > `Mpeg2] samplerate_t
> | S22050 : [< mpeg_tag_t > `Mpeg2] samplerate_t
>
> ;;
>
> type channel_tag_t = [ `Mono | `Stereo ]
>
> type _ channel_t =
> | Channel_mono : [< channel_tag_t > `Mono] channel_t
> | Channel_stereo : [< channel_tag_t > `Stereo] channel_t
> ;;
>
> type (_,_) side_bits_t =
> | Bits_1_mono : (int * int) -> ([`Mpeg1], [`Mono]) side_bits_t
> | Bits_1_stereo : (int * int * int * int) -> ([`Mpeg1], [`Stereo]) side_bits_t
> | Bits_2_mono : int -> ([`Mpeg2], [`Mono]) side_bits_t
> | Bits_2_stereo : (int * int) -> ([`Mpeg2], [`Stereo]) side_bits_t
> ;;
>
> type ('id,'chan) frame_t = {
> mutable header_id : 'id mpeg_t;
> mutable header_crc : bool;
> mutable header_samplerate : 'id samplerate_t;
> mutable header_channel_mode : 'chan channel_t;
> mutable side_bits : ('id,'chan) side_bits_t;
> };;
>
>
> let mpeg_of_int = function
> | 0 -> MPEG1
> | 1 -> MPEG2
> | _ -> failwith "Bad"
> ;;
>
> let samplerate_of_int = function
> | 0 -> S48000
> | 1 -> S44100
> | 2 -> S24000
> | 3 -> S22050
> | _ -> failwith "Bad"
> ;;
>
> let channel_mode_of_int = function
> | 0 -> Channel_mono
> | 1 -> Channel_stereo
> | _ -> failwith "Bad"
> ;;
>
> All the types seem to work out, with one fairly contrived exception...
> If I write a few test functions:
>
>  let frame_of_sr_id_chan : type id chan. id samplerate_t -> id mpeg_t
> * chan channel_t -> (id,chan) frame_t = fun sr -> function
> | (MPEG1, Channel_mono) -> {header_id = MPEG1; header_crc = true;
> header_samplerate = sr; header_channel_mode = Channel_mono; side_bits
> = Bits_1_mono (1,1)}
> | (MPEG1, Channel_stereo) -> {header_id = MPEG1; header_crc = true;
> header_samplerate = sr; header_channel_mode = Channel_stereo;
> side_bits = Bits_1_stereo (1,1,1,1)}
> | (MPEG2, Channel_mono) -> {header_id = MPEG2; header_crc = true;
> header_samplerate = sr; header_channel_mode = Channel_mono; side_bits
> = Bits_2_mono (1)}
> | (MPEG2, Channel_stereo) -> {header_id = MPEG2; header_crc = true;
> header_samplerate = sr; header_channel_mode = Channel_stereo;
> side_bits = Bits_2_stereo (1,1)}
> ;;
>
> let print_samplerate : type id chan. (id,chan) frame_t -> unit = function
> | {header_id = MPEG1; header_samplerate = S48000} -> Printf.printf "M1_S48000\n"
> | {header_id = MPEG1; header_samplerate = S44100} -> Printf.printf "M1_S44100\n"
> | {header_id = MPEG2; header_samplerate = S24000} -> Printf.printf "M2_S24000\n"
> | {header_id = MPEG2; header_samplerate = S22050} -> Printf.printf "M2_S22050\n"
> ;;
>
>
> I was happy to notice that this did not complain about being
> incomplete, correctly seeing that the samplerates can only be used
> with a matching header_id;
>
> However, if I then I make an invalid frame using the *_of_int functions:
>
> let test_frame = frame_of_sr_id_chan (samplerate_of_int 3)
> (mpeg_of_int 0, channel_mode_of_int 0);;
> print_samplerate test_frame;;
>
>
> This accepts the input (even though samplerate_of_int 3 is `Mpeg2, and
> mpeg_of_int 0 is `Mpeg1). However, this will print M1_S44100, which is
> definitely not what I specified (samplerate_of_int 3 is S22050)! There
> must be something I don't get with this typing. Any ideas what?
>
> In the end, though, I will not actually use a function like
> frame_of_sr_id_chan, so using polymorphic variants should do exactly
> what I want.
>
> Thanks again,
> Reed
>
> --
> ç
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa-roc.inria.fr/wws/info/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Caml-list] creating GADTs
  2012-08-04 22:52     ` Reed Wilson
  2012-08-04 23:31       ` Gabriel Scherer
@ 2012-08-04 23:45       ` Gabriel Scherer
  2012-08-05  2:47         ` Reed Wilson
  1 sibling, 1 reply; 8+ messages in thread
From: Gabriel Scherer @ 2012-08-04 23:45 UTC (permalink / raw)
  To: Reed Wilson; +Cc: caml-list

> I was happy to notice that this did not complain about being
> incomplete, correctly seeing that the samplerates can only be used
> with a matching header_id;
>
> However, if I then I make an invalid frame using the *_of_int functions:
>
> let test_frame = frame_of_sr_id_chan (samplerate_of_int 3)
> (mpeg_of_int 0, channel_mode_of_int 0);;
> print_samplerate test_frame;;

Here is what I observe on my machine:

# let print_samplerate : type id chan. (id,chan) frame_t -> unit = function
[...]
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
{header_id=MPEG2; header_samplerate=S48000}

# let test_frame = frame_of_sr_id_chan (samplerate_of_int 3)
(mpeg_of_int 0, channel_mode_of_int 0);;
  val test_frame : (mpeg_tag_t, channel_tag_t) frame_t =
  {header_id = MPEG1; header_crc = true; header_samplerate = S22050;
   header_channel_mode = Channel_mono; side_bits = Bits_1_mono <abstr>}

# print_samplerate test_frame;;
Exception: Match_failure ("//toplevel//", 43, -733).

Note that the type of your *_of_int function is not precise enough for
what you want to do:

  val mpeg_of_int : int -> mpeg_tag_t mpeg_t = <fun>
  val samplerate_of_int : int -> mpeg_tag_t samplerate_t = <fun>
  val channel_mode_of_int : int -> channel_tag_t channel_t = <fun>

This means that the mpeg_id returned by mpeg_of_int will have the
imprecise type mpeg_tag_t (basically *either* a MPEG1 or a MPEG2),
instead of having the more precise type you want to guarantee matching
exhaustivity later. This is what Leo described as "still suffer of the
same problem as your original function".

On Sun, Aug 5, 2012 at 12:52 AM, Reed Wilson <cedilla@gmail.com> wrote:
> Thanks for the response!
>
> On Sat, Aug 4, 2012 at 2:21 AM, Leo P White <lpw25@cam.ac.uk> wrote:
>>
>> Hi,
>>
>> The problem with your function
>>
>>
>> let mpeg_of_int = function
>> | 0 -> MPEG1
>> | _ -> MPEG2
>> ;;
>>
>> is that it cannot have type int -> 'a mpeg_t becuase that would mean that you could use it to create any values with any mpeg_t type (e.g. string mpeg_t).
>
>
> I thought that looked like a funny type for a function, but shouldn't
> it be clear from the definition of mpeg_t that the function can't
> return string mpeg_t? I thought that's one of the things GADTs were
> supposed to solve.
>
>>
>>
>> I can think of two possible solutions to your problem.
>>
>> The first is to use an existential type via another GADT, so that your simple example becomes:
>>
>>
>>  type mpeg1_t;;
>>  type mpeg2_t;;
>>  type _ mpeg_t =
>>  | MPEG1 : mpeg1_t mpeg_t
>>  | MPEG2 : mpeg2_t mpeg_t
>>  ;;
>>
>>  type mpeg_ext_t = Mpeg_Ext: 'a mpeg_t -> mpeg_ext_t;;
>>
>>  let mpeg_of_int = function
>>  | 0 -> Mpeg_Ext MPEG1
>>  | _ -> Mpeg_Ext MPEG2
>>  ;;
>
>
> This does work, but unfortunately it doesn't seem to work for my
> purpose, since I still can't get mpeg1_t or mpeg2_t "out" of it. (I
> apologize for my continued use of nonstandard terminology - I've never
> used any of these advanced typing features before)
>
> Basically, in the end I'd build a function that creates a ('id,'chan)
> frame_t, with 'id and 'chan depending on the bits stored in the MP3's
> header. So something like this:
>
> let frame_of_id_and_chan : type id chan. (id mpeg_t * chan channel_t)
> -> (id,chan) frame_t = function
> | (MPEG1, Channel_mono) -> {
> header_id = MPEG1;
> header_crc = true;
> header_samplerate = S48000;
> header_channel_mode = Channel_mono;
> side_bits = Bits_1_mono (1,1);
> }
> | (MPEG1, Channel_stereo) -> {
> header_id = MPEG1;
> header_crc = true;
> header_samplerate = S48000;
> header_channel_mode = Channel_stereo;
> side_bits = Bits_1_stereo (1,1,1,1);
> }
> | (MPEG2, Channel_mono) -> {
> header_id = MPEG2;
> header_crc = true;
> header_samplerate = S24000;
> header_channel_mode = Channel_mono;
> side_bits = Bits_2_mono (1);
> }
> | (MPEG2, Channel_stereo) -> {
> header_id = MPEG2;
> header_crc = true;
> header_samplerate = S24000;
> header_channel_mode = Channel_stereo;
> side_bits = Bits_2_stereo (1,1);
> }
> ;;
>
>
> Then I would create the frame based on the bits in the file:
>
> let id_int = int_of_raw_bits ... in
> let channel_mode_int = int_of_raw_bits ... in
> let frame = frame_of_id_and_chan (mpeg_of_int id_int, channel_of_int
> channel_mode_int) in
> ...
>
>
> If hiding the mpeg_t type inside mpeg_ext_t, I wouldn't be able to use
> it to create a frame of the right type.
>
>>
>>
>> The second is to use polymorphic variants instead of abstract types to maintain your invariants. So your full example would become:
>>
>>  type mpeg_tag_t = [ `Mpeg1 | `Mpeg2 ]
>>
>>  type _ mpeg_t =
>>  | MPEG1 : [< mpeg_tag_t >`Mpeg1] mpeg_t
>>  | MPEG2 : [< mpeg_tag_t > `Mpeg2] mpeg_t
>>  ;;
>>
>>  type _ samplerate_t =
>>  | S48000 : [`Mpeg1] samplerate_t
>>  | S44100 : [`Mpeg1] samplerate_t
>>  | S24000 : [`Mpeg2] samplerate_t
>>  | S22050 : [`Mpeg2] samplerate_t
>>  ;;
>>
>>  type channel_tag_t = [ `Mono | `Stereo ]
>>
>>  type _ channel_t =
>>  | Channel_mono : [< channel_tag_t > `Mono] channel_t
>>  | Channel_stereo : [< channel_tag_t > `Stereo] channel_t
>>  ;;
>>
>>  type (_,_) side_bits_t =
>>  | Bits_1_mono : (int * int) -> ([`Mpeg1], [`Mono]) side_bits_t
>>  | Bits_1_stereo : (int * int * int * int)                       -> ([`Mpeg1], [`Stereo]) side_bits_t
>>  | Bits_2_mono : int -> ([`Mpeg2], [`Mono]) side_bits_t
>>  | Bits_2_stereo : (int * int) -> ([`Mpeg2], [`Stereo]) side_bits_t
>>
>>  ;;
>>
>>  type ('id,'chan) frame_t = {
>>    header_id : 'id mpeg_t;
>>    header_crc : bool;
>>    header_samplerate : 'id samplerate_t;
>>    header_channel_mode : 'chan channel_t;
>>    side_bits : ('id,'chan) side_bits_t;
>>  };;
>>
>> This allows you to write your function as before:
>>
>>
>>  let mpeg_of_int = function
>>  | 0 -> MPEG1
>>  | _ -> MPEG2;;
>>
>> and it still maintains the desired invariant, so that this is allowed:
>>
>>  let good =  { header_id = MPEG1;
>>    header_crc = true;
>>    header_samplerate = S48000;
>>    header_channel_mode = Channel_mono;
>>    side_bits = Bits_1_mono(1, 2); };;
>>
>> but this isn't:
>>
>> let bad =  { header_id = MPEG1;
>>    header_crc = true;
>>    header_samplerate = S48000;
>>    header_channel_mode = Channel_stereo;
>>    side_bits = Bits_1_mono(1, 2); };;
>>
>>  Error: This expression has type ([ `Mpeg1 ], [ `Mono ]) side_bits_t
>>
>>       but an expression was expected of type
>>         ([ `Mpeg1 ], [< channel_tag_t > `Stereo ] as 'a) side_bits_t
>>       Type [ `Mono ] is not compatible with type
>>         'a = [< `Mono | `Stereo > `Stereo ]       The first variant type does not allow tag(s) `Stereo
>>
>> Note that this would still not allow you to write the function:
>>
>>  let mpeg_of_int = function
>>  | 0 -> S48000
>>  | _ -> S24000;;
>>
>> which suffers from the same problem as your original function.
>
>
> That helps a lot! I think I got it working with polymorphic variants.
> I basically added < mpeg_tag_t > to all the samplerate definitions,
> which seems to let everything work. The result is this:
>
> type mpeg_tag_t = [ `Mpeg1 | `Mpeg2 ]
>
> type _ mpeg_t =
> | MPEG1 : [< mpeg_tag_t > `Mpeg1] mpeg_t
> | MPEG2 : [< mpeg_tag_t > `Mpeg2] mpeg_t
> ;;
>
> type _ samplerate_t =
> | S48000 : [< mpeg_tag_t > `Mpeg1] samplerate_t
> | S44100 : [< mpeg_tag_t > `Mpeg1] samplerate_t
> | S24000 : [< mpeg_tag_t > `Mpeg2] samplerate_t
> | S22050 : [< mpeg_tag_t > `Mpeg2] samplerate_t
>
> ;;
>
> type channel_tag_t = [ `Mono | `Stereo ]
>
> type _ channel_t =
> | Channel_mono : [< channel_tag_t > `Mono] channel_t
> | Channel_stereo : [< channel_tag_t > `Stereo] channel_t
> ;;
>
> type (_,_) side_bits_t =
> | Bits_1_mono : (int * int) -> ([`Mpeg1], [`Mono]) side_bits_t
> | Bits_1_stereo : (int * int * int * int) -> ([`Mpeg1], [`Stereo]) side_bits_t
> | Bits_2_mono : int -> ([`Mpeg2], [`Mono]) side_bits_t
> | Bits_2_stereo : (int * int) -> ([`Mpeg2], [`Stereo]) side_bits_t
> ;;
>
> type ('id,'chan) frame_t = {
> mutable header_id : 'id mpeg_t;
> mutable header_crc : bool;
> mutable header_samplerate : 'id samplerate_t;
> mutable header_channel_mode : 'chan channel_t;
> mutable side_bits : ('id,'chan) side_bits_t;
> };;
>
>
> let mpeg_of_int = function
> | 0 -> MPEG1
> | 1 -> MPEG2
> | _ -> failwith "Bad"
> ;;
>
> let samplerate_of_int = function
> | 0 -> S48000
> | 1 -> S44100
> | 2 -> S24000
> | 3 -> S22050
> | _ -> failwith "Bad"
> ;;
>
> let channel_mode_of_int = function
> | 0 -> Channel_mono
> | 1 -> Channel_stereo
> | _ -> failwith "Bad"
> ;;
>
> All the types seem to work out, with one fairly contrived exception...
> If I write a few test functions:
>
>  let frame_of_sr_id_chan : type id chan. id samplerate_t -> id mpeg_t
> * chan channel_t -> (id,chan) frame_t = fun sr -> function
> | (MPEG1, Channel_mono) -> {header_id = MPEG1; header_crc = true;
> header_samplerate = sr; header_channel_mode = Channel_mono; side_bits
> = Bits_1_mono (1,1)}
> | (MPEG1, Channel_stereo) -> {header_id = MPEG1; header_crc = true;
> header_samplerate = sr; header_channel_mode = Channel_stereo;
> side_bits = Bits_1_stereo (1,1,1,1)}
> | (MPEG2, Channel_mono) -> {header_id = MPEG2; header_crc = true;
> header_samplerate = sr; header_channel_mode = Channel_mono; side_bits
> = Bits_2_mono (1)}
> | (MPEG2, Channel_stereo) -> {header_id = MPEG2; header_crc = true;
> header_samplerate = sr; header_channel_mode = Channel_stereo;
> side_bits = Bits_2_stereo (1,1)}
> ;;
>
> let print_samplerate : type id chan. (id,chan) frame_t -> unit = function
> | {header_id = MPEG1; header_samplerate = S48000} -> Printf.printf "M1_S48000\n"
> | {header_id = MPEG1; header_samplerate = S44100} -> Printf.printf "M1_S44100\n"
> | {header_id = MPEG2; header_samplerate = S24000} -> Printf.printf "M2_S24000\n"
> | {header_id = MPEG2; header_samplerate = S22050} -> Printf.printf "M2_S22050\n"
> ;;
>
>
> I was happy to notice that this did not complain about being
> incomplete, correctly seeing that the samplerates can only be used
> with a matching header_id;
>
> However, if I then I make an invalid frame using the *_of_int functions:
>
> let test_frame = frame_of_sr_id_chan (samplerate_of_int 3)
> (mpeg_of_int 0, channel_mode_of_int 0);;
> print_samplerate test_frame;;
>
>
> This accepts the input (even though samplerate_of_int 3 is `Mpeg2, and
> mpeg_of_int 0 is `Mpeg1). However, this will print M1_S44100, which is
> definitely not what I specified (samplerate_of_int 3 is S22050)! There
> must be something I don't get with this typing. Any ideas what?
>
> In the end, though, I will not actually use a function like
> frame_of_sr_id_chan, so using polymorphic variants should do exactly
> what I want.
>
> Thanks again,
> Reed
>
> --
> ç
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa-roc.inria.fr/wws/info/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Caml-list] creating GADTs
  2012-08-04 23:45       ` Gabriel Scherer
@ 2012-08-05  2:47         ` Reed Wilson
  2012-08-05 23:06           ` Reed Wilson
  0 siblings, 1 reply; 8+ messages in thread
From: Reed Wilson @ 2012-08-05  2:47 UTC (permalink / raw)
  To: Gabriel Scherer; +Cc: caml-list


[-- Attachment #1.1: Type: text/plain, Size: 2689 bytes --]

Again, thanks for all the assistance.

The warning and Match_failure in your post are what I expected in my
program, but I didn't receive either. The print_samplerate function just
returns the wrong answer. I tested it on 4.00.0 Win 32- and 64-bit, Linux
64-bit, and 4.01.0+dev6_2012-07-30 Linux 64-bit.

Even in the toplevel it runs fine:

# let print_samplerate : type id chan. (id,chan) frame_t -> unit = function
| {header_id = MPEG1; header_samplerate = S48000} -> Printf.printf
"M1_S48000\n"
| {header_id = MPEG1; header_samplerate = S44100} -> Printf.printf
"M1_S44100\n"
| {header_id = MPEG2; header_samplerate = S24000} -> Printf.printf
"M2_S24000\n"
| {header_id = MPEG2; header_samplerate = S22050} -> Printf.printf
"M2_S22050\n"
;;
val print_samplerate : ('id, 'chan) frame_t -> unit = <fun>

# let test_frame = frame_of_sr_id_chan (samplerate_of_int 3) (mpeg_of_int
0, channel_mode_of_int 0);;
val test_frame : (mpeg_tag_t, channel_tag_t) frame_t =
  {header_id = MPEG1; header_crc = true; header_samplerate = S22050;
   header_channel_mode = Channel_mono; side_bits = Bits_1_mono <abstr>}

# print_samplerate test_frame;;
M1_S44100
- : unit = ()


The exact file I ran is attached (hopefully -- are attachments good on this
list?), and it prints "M1_S44100" with no errors or warnings on all my
computers, even though I think it should give an exception or otherwise
fail. Does it do something different on yours?

Thanks,
Reed

On Sat, Aug 4, 2012 at 4:45 PM, Gabriel Scherer
<gabriel.scherer@gmail.com>wrote:

> > I was happy to notice that this did not complain about being
> > incomplete, correctly seeing that the samplerates can only be used
> > with a matching header_id;
> >
> > However, if I then I make an invalid frame using the *_of_int functions:
> >
> > let test_frame = frame_of_sr_id_chan (samplerate_of_int 3)
> > (mpeg_of_int 0, channel_mode_of_int 0);;
> > print_samplerate test_frame;;
>
> Here is what I observe on my machine:
>
> # let print_samplerate : type id chan. (id,chan) frame_t -> unit = function
> [...]
> Warning 8: this pattern-matching is not exhaustive.
> Here is an example of a value that is not matched:
> {header_id=MPEG2; header_samplerate=S48000}
>
> # let test_frame = frame_of_sr_id_chan (samplerate_of_int 3)
> (mpeg_of_int 0, channel_mode_of_int 0);;
>   val test_frame : (mpeg_tag_t, channel_tag_t) frame_t =
>   {header_id = MPEG1; header_crc = true; header_samplerate = S22050;
>    header_channel_mode = Channel_mono; side_bits = Bits_1_mono <abstr>}
>
> # print_samplerate test_frame;;
> Exception: Match_failure ("//toplevel//", 43, -733).
>

-- 
ç

[-- Attachment #1.2: Type: text/html, Size: 3470 bytes --]

[-- Attachment #2: typetest2.ml --]
[-- Type: application/octet-stream, Size: 2708 bytes --]

type mpeg_tag_t = [ `Mpeg1 | `Mpeg2 ];;

type _ mpeg_t =
| MPEG1 : [< mpeg_tag_t > `Mpeg1] mpeg_t
| MPEG2 : [< mpeg_tag_t > `Mpeg2] mpeg_t
;;

type _ samplerate_t =
| S48000 : [< mpeg_tag_t > `Mpeg1] samplerate_t
| S44100 : [< mpeg_tag_t > `Mpeg1] samplerate_t
| S24000 : [< mpeg_tag_t > `Mpeg2] samplerate_t
| S22050 : [< mpeg_tag_t > `Mpeg2] samplerate_t
;;

type channel_tag_t = [ `Mono | `Stereo ];;

type _ channel_t =
| Channel_mono : [< channel_tag_t > `Mono] channel_t
| Channel_stereo : [< channel_tag_t > `Stereo] channel_t
;;

type (_,_) side_bits_t =
| Bits_1_mono : (int * int) -> ([`Mpeg1], [`Mono]) side_bits_t
| Bits_1_stereo : (int * int * int * int) -> ([`Mpeg1], [`Stereo]) side_bits_t
| Bits_2_mono : int -> ([`Mpeg2], [`Mono]) side_bits_t
| Bits_2_stereo : (int * int) -> ([`Mpeg2], [`Stereo]) side_bits_t
;;

type ('id,'chan) frame_t = {
header_id : 'id mpeg_t;
header_crc : bool;
header_samplerate : 'id samplerate_t;
header_channel_mode : 'chan channel_t;
side_bits : ('id,'chan) side_bits_t;
};;

let mpeg_of_int = function
| 0 -> MPEG1
| 1 -> MPEG2
| _ -> failwith "Bad"
;;

let samplerate_of_int = function
| 0 -> S48000
| 1 -> S44100
| 2 -> S24000
| 3 -> S22050
| _ -> failwith "Bad"
;;

let channel_mode_of_int = function
| 0 -> Channel_mono
| 1 -> Channel_stereo
| _ -> failwith "Bad"
;;





let frame_of_sr_id_chan : type id chan. id samplerate_t -> id mpeg_t * chan channel_t -> (id,chan) frame_t = fun sr -> function
| (MPEG1, Channel_mono) -> {header_id = MPEG1; header_crc = true; header_samplerate = sr; header_channel_mode = Channel_mono; side_bits = Bits_1_mono (1,1)}
| (MPEG1, Channel_stereo) -> {header_id = MPEG1; header_crc = true; header_samplerate = sr; header_channel_mode = Channel_stereo; side_bits = Bits_1_stereo (1,1,1,1)}
| (MPEG2, Channel_mono) -> {header_id = MPEG2; header_crc = true; header_samplerate = sr; header_channel_mode = Channel_mono; side_bits = Bits_2_mono (1)}
| (MPEG2, Channel_stereo) -> {header_id = MPEG2; header_crc = true; header_samplerate = sr; header_channel_mode = Channel_stereo; side_bits = Bits_2_stereo (1,1)}
;;
 
let print_samplerate : type id chan. (id,chan) frame_t -> unit = function
| {header_id = MPEG1; header_samplerate = S48000} -> Printf.printf "M1_S48000\n"
| {header_id = MPEG1; header_samplerate = S44100} -> Printf.printf "M1_S44100\n"
| {header_id = MPEG2; header_samplerate = S24000} -> Printf.printf "M2_S24000\n"
| {header_id = MPEG2; header_samplerate = S22050} -> Printf.printf "M2_S22050\n"
;;

let test_frame = frame_of_sr_id_chan (samplerate_of_int 3) (mpeg_of_int 0, channel_mode_of_int 0);;
print_samplerate test_frame;;


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Caml-list] creating GADTs
  2012-08-05  2:47         ` Reed Wilson
@ 2012-08-05 23:06           ` Reed Wilson
  2012-08-07  4:19             ` Jacques Garrigue
  0 siblings, 1 reply; 8+ messages in thread
From: Reed Wilson @ 2012-08-05 23:06 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 1208 bytes --]

I think I've gotten everything pretty much straightened out now, but there
is something I didn't expect with GADTs and pattern-matching. Using my
previous samplerate_t type:

type mpeg1_t
type mpeg2_t
type _ samplerate_t =
| S48000 : mpeg1_t samplerate_t
| S44100 : mpeg1_t samplerate_t
| S24000 : mpeg2_t samplerate_t
| S22050 : mpeg2_t samplerate_t


I can't seem to do something like this:

let samples_per_frame : type id. id samplerate_t -> int = function
| S48000 | S44100 -> 1152
| S24000 | S22050 -> 576


Instead I have to have each of the constructors use a separate branch:

let samples_per_frame : type id. id samplerate_t -> int = function
| S48000 -> 1152
| S44100 -> 1152
| S24000 -> 576
| S22050 -> 576


even though S48000 and S44100 are the exact same type! OCaml complains:
Error: This pattern matches values of type mpeg1_t samplerate_t
       but a pattern was expected which matches values of type
         id samplerate_t


I certainly don't know a lot about these new type features, but I know
there shouldn't be a limitation in what I'm doing with them here. Is there
any way to tell the typing system to let me combine the branches?

-- 
ç

[-- Attachment #2: Type: text/html, Size: 3235 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [Caml-list] creating GADTs
  2012-08-05 23:06           ` Reed Wilson
@ 2012-08-07  4:19             ` Jacques Garrigue
  0 siblings, 0 replies; 8+ messages in thread
From: Jacques Garrigue @ 2012-08-07  4:19 UTC (permalink / raw)
  To: Reed Wilson; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 1328 bytes --]

2012/08/06 1:11 "Reed Wilson" <cedilla@gmail.com>:
>
> I think I've gotten everything pretty much straightened out now, but
there is something I didn't expect with GADTs and pattern-matching. Using
my previous samplerate_t type:
>
> type mpeg1_t
> type mpeg2_t
> type _ samplerate_t =
> | S48000 : mpeg1_t samplerate_t
> | S44100 : mpeg1_t samplerate_t
> | S24000 : mpeg2_t samplerate_t
> | S22050 : mpeg2_t samplerate_t
>
>
> I can't seem to do something like this:
>
> let samples_per_frame : type id. id samplerate_t -> int = function
> | S48000 | S44100 -> 1152
> | S24000 | S22050 -> 576
>
>
> Instead I have to have each of the constructors use a separate branch:
>
> let samples_per_frame : type id. id samplerate_t -> int = function
> | S48000 -> 1152
> | S44100 -> 1152
> | S24000 -> 576
> | S22050 -> 576
>
>
> even though S48000 and S44100 are the exact same type! OCaml complains:
> Error: This pattern matches values of type mpeg1_t samplerate_t
>        but a pattern was expected which matches values of type
>          id samplerate_t

The error message is not very clear, but the real cause is that currently
it is prohibited to generate GADT constraints in or-patterns.
One would have to check that the generated constraints are identical, which
is not possible at this point.
Future work...

Jacques Garrigue

[-- Attachment #2: Type: text/html, Size: 1708 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-08-07  4:19 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-04  7:56 [Caml-list] creating GADTs Reed Wilson
2012-08-04  9:21 ` Leo P White
     [not found]   ` <CALLFq5SWTGDuS5-Z7aN_UOqtTJ0sMvghWQTGqXUshUURiOoWcg@mail.gmail.com>
2012-08-04 22:52     ` Reed Wilson
2012-08-04 23:31       ` Gabriel Scherer
2012-08-04 23:45       ` Gabriel Scherer
2012-08-05  2:47         ` Reed Wilson
2012-08-05 23:06           ` Reed Wilson
2012-08-07  4:19             ` Jacques Garrigue

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).