From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by sympa.inria.fr (Postfix) with ESMTPS id E5F1C7ED5C for ; Sun, 5 Aug 2012 01:32:32 +0200 (CEST) Received-SPF: None (mail4-smtp-sop.national.inria.fr: no sender authenticity information available from domain of gabriel.scherer@gmail.com) identity=pra; client-ip=209.85.214.182; receiver=mail4-smtp-sop.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="gabriel.scherer@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail4-smtp-sop.national.inria.fr: domain of gabriel.scherer@gmail.com designates 209.85.214.182 as permitted sender) identity=mailfrom; client-ip=209.85.214.182; receiver=mail4-smtp-sop.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="gabriel.scherer@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail4-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-ob0-f182.google.com) identity=helo; client-ip=209.85.214.182; receiver=mail4-smtp-sop.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="postmaster@mail-ob0-f182.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgwCALuvHVDRVda2kGdsb2JhbABFqCqRDAgiAQEBAQkJDQcUBCOCIAEBAQMBEgITGQEbEgsBAwELBgULAwoNISEBAREBBQEKEgYTCAoJB4dbAQMGBgucKAkDjCOCcYQvChknAwpXiHEBBQyKV2eHBAOTdoFTgRSJdYMmPoQA X-IronPort-AV: E=Sophos;i="4.77,714,1336341600"; d="scan'208";a="152448596" Received: from mail-ob0-f182.google.com ([209.85.214.182]) by mail4-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 05 Aug 2012 01:32:31 +0200 Received: by obbun3 with SMTP id un3so5611409obb.27 for ; Sat, 04 Aug 2012 16:32:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; bh=gpg7Qi8xFfwF/vbSKTMcYIThEDkiEBnttyNUw9++kZI=; b=DiK1QBbnBdnTfHQMqoz9CHMYdZE7lHklM/HMqNWcBm+YYiEuUaZYqA8aF5QgfHfjeq JjWdFL/hzqD5ZdxB6oj4J9jdBOJCzn4u6oUKPlmH7KFIzehB403cruEz9ujKDCrj8nuf HM/QmLr5Fs988dcj4erzZn86XNftChh3vXFcePevuzc2ydzVK6xyQHTfJAkf9vbwpVGQ M/BpdvaoGFMryjwXbDnuyM1JM5WQ72p+gR09DnPDCrx8Y0dSQf6mEqDwGzzylqbgYFPq JHEjG1KbKyVSfhSTpjQdQd6HTkMb7gi3pXH+gD883phNS/m+Kyn8uKJuWSNJHNgGLkHY U1rg== Received: by 10.50.213.1 with SMTP id no1mr1977831igc.71.1344123150053; Sat, 04 Aug 2012 16:32:30 -0700 (PDT) MIME-Version: 1.0 Received: by 10.50.94.39 with HTTP; Sat, 4 Aug 2012 16:31:48 -0700 (PDT) In-Reply-To: References: From: Gabriel Scherer Date: Sun, 5 Aug 2012 01:31:48 +0200 Message-ID: To: Reed Wilson Cc: caml-list@inria.fr Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Subject: Re: [Caml-list] creating GADTs >> is that it cannot have type int -> 'a mpeg_t becuase that would mean tha= t you could use it to create any values with any mpeg_t type (e.g. string m= peg_t). > > > I thought that looked like a funny type for a function, but shouldn't > it be clear from the definition of mpeg_t that the function can't > return string mpeg_t? I thought that's one of the things GADTs were > supposed to solve. This is not how typing works (and this specific part of Leo's otherwise excellent answer was a bit misleading). A type (int -> 'a mpeg_t) means that the return type has to be *polymorphic* in 'a. Any value returned by this function must have type (t mpeg_t) for *any* type t. MPEG1 is not polymorphic in the type parameter, it does not have type ('a mpeg_t). Think of a function of type (int -> 'a list) for example; the only possible return value is the empty list, because it is polymoprhic: [] has type (int list) and (bool list) and ... This is the way type polymorphism works in ML, and GADTs do not change that. What you want to say is not that, given an integer, you return a (t mpeg_t) for *any* t (that would be the polymorphic type 'a mpeg_t), but that you return an (t mpeg_t) for *some* unknown type t. This is not an universal quantification, but an existential one, and it's why using an existential type (encoded as a GADT in Leo's proposal) is the natural solution to your problem. (The polymorphic variant solution is not all that different, except that instead of saying "some unknown t", you can say the slightly more precise "some t that has at most the constructors `MPEG1 and `MPEG2`". My personal opinion is that GADTs are complex enough as they are, and that throwing polymorphic variants in the mix is maybe going a bit too far, so I'd rather concentrate on the existential solution.) > Then I would create the frame based on the bits in the file: > > let id_int =3D int_of_raw_bits ... in > let channel_mode_int =3D int_of_raw_bits ... in > let frame =3D frame_of_id_and_chan (mpeg_of_int id_int, channel_of_int > channel_mode_int) in > ... > > > If hiding the mpeg_t type inside mpeg_ext_t, I wouldn't be able to use > it to create a frame of the right type. =46rom an *unknown* (hidden in an existential mpeg_ext_t type) mpeg tag, you can't get a statically known frame tag. But you can produce an frame of unknown (existential) type, playing the same trick again (I haven't type-checked my code so it may be full of typos): type channel_ext_t =3D Chan_ext : 'chan channel_t -> channel_ext_t type frame_ext_t =3D Frame_ext : ('id, 'chan) frame_t -> 'chan frame_ext_t let id_int =3D int_of_raw_bits ... in let channel_mode_int =3D int_of_raw_bits ... in let frame =3D match mpeg_of_int id_int, channel_of_int channel_mode_int with | Mpeg_ext mpeg, Chan_ext chan -> Frame_ext (frame_of_id_and_chan mpeg chan) The idea is that GADTs help with your problem because they allow you to reason *statically* about the possible combinations (while your previous solution used dynamic checking in the form of pattern matching on the tags, and therefore was less convenient as you had to do checks all over the place). A rich type such as type id chan. id mpeg_t -> chan channel_t -> (id,chan) frame_t let you say "if I statically know the id and the chan, I can build a well-typed (id,chan) frame". So the functions that manipulate your data statically reason about it, and are nice to write. But then, in the end, the tags/chans are not statically known, they are decided at runtime. The existential types (or polymorphic variants) are there to tell the type-checker "sorry, you can't know what will be used", they are a boundary between what's written using static knowledge (the auxiliary functions) and what's known only at runtime (the actual input/output of your script). Your function "frame" has to unbox the existential Mpeg_ext and Chan_ext (to get them, locally, in a form manipulable by the auxiliary functions), then rebox the resulting frame to denote that it's only known at runtime. This boxing/unboxing in/out of existential types is a bit painful (especially since OCaml doesn't have "first-class" structural existentials, you have to build them explicitly through type definitions or first-class modules), but hopefully it only happen at the IO boundaries of your application, that is not too often. Existential types are a burden you inflicted upon yourself by choosing more explicit static types: now you sometimes have to explicitly say "I don't know", while before you never knew. On Sun, Aug 5, 2012 at 12:52 AM, Reed Wilson wrote: > Thanks for the response! > > On Sat, Aug 4, 2012 at 2:21 AM, Leo P White wrote: >> >> Hi, >> >> The problem with your function >> >> >> let mpeg_of_int =3D function >> | 0 -> MPEG1 >> | _ -> MPEG2 >> ;; >> >> is that it cannot have type int -> 'a mpeg_t becuase that would mean tha= t you could use it to create any values with any mpeg_t type (e.g. string m= peg_t). > > > I thought that looked like a funny type for a function, but shouldn't > it be clear from the definition of mpeg_t that the function can't > return string mpeg_t? I thought that's one of the things GADTs were > supposed to solve. > >> >> >> I can think of two possible solutions to your problem. >> >> The first is to use an existential type via another GADT, so that your s= imple example becomes: >> >> >> type mpeg1_t;; >> type mpeg2_t;; >> type _ mpeg_t =3D >> | MPEG1 : mpeg1_t mpeg_t >> | MPEG2 : mpeg2_t mpeg_t >> ;; >> >> type mpeg_ext_t =3D Mpeg_Ext: 'a mpeg_t -> mpeg_ext_t;; >> >> let mpeg_of_int =3D function >> | 0 -> Mpeg_Ext MPEG1 >> | _ -> Mpeg_Ext MPEG2 >> ;; > > > This does work, but unfortunately it doesn't seem to work for my > purpose, since I still can't get mpeg1_t or mpeg2_t "out" of it. (I > apologize for my continued use of nonstandard terminology - I've never > used any of these advanced typing features before) > > Basically, in the end I'd build a function that creates a ('id,'chan) > frame_t, with 'id and 'chan depending on the bits stored in the MP3's > header. So something like this: > > let frame_of_id_and_chan : type id chan. (id mpeg_t * chan channel_t) > -> (id,chan) frame_t =3D function > | (MPEG1, Channel_mono) -> { > header_id =3D MPEG1; > header_crc =3D true; > header_samplerate =3D S48000; > header_channel_mode =3D Channel_mono; > side_bits =3D Bits_1_mono (1,1); > } > | (MPEG1, Channel_stereo) -> { > header_id =3D MPEG1; > header_crc =3D true; > header_samplerate =3D S48000; > header_channel_mode =3D Channel_stereo; > side_bits =3D Bits_1_stereo (1,1,1,1); > } > | (MPEG2, Channel_mono) -> { > header_id =3D MPEG2; > header_crc =3D true; > header_samplerate =3D S24000; > header_channel_mode =3D Channel_mono; > side_bits =3D Bits_2_mono (1); > } > | (MPEG2, Channel_stereo) -> { > header_id =3D MPEG2; > header_crc =3D true; > header_samplerate =3D S24000; > header_channel_mode =3D Channel_stereo; > side_bits =3D Bits_2_stereo (1,1); > } > ;; > > > Then I would create the frame based on the bits in the file: > > let id_int =3D int_of_raw_bits ... in > let channel_mode_int =3D int_of_raw_bits ... in > let frame =3D frame_of_id_and_chan (mpeg_of_int id_int, channel_of_int > channel_mode_int) in > ... > > > If hiding the mpeg_t type inside mpeg_ext_t, I wouldn't be able to use > it to create a frame of the right type. > >> >> >> The second is to use polymorphic variants instead of abstract types to m= aintain your invariants. So your full example would become: >> >> type mpeg_tag_t =3D [ `Mpeg1 | `Mpeg2 ] >> >> type _ mpeg_t =3D >> | MPEG1 : [< mpeg_tag_t >`Mpeg1] mpeg_t >> | MPEG2 : [< mpeg_tag_t > `Mpeg2] mpeg_t >> ;; >> >> type _ samplerate_t =3D >> | S48000 : [`Mpeg1] samplerate_t >> | S44100 : [`Mpeg1] samplerate_t >> | S24000 : [`Mpeg2] samplerate_t >> | S22050 : [`Mpeg2] samplerate_t >> ;; >> >> type channel_tag_t =3D [ `Mono | `Stereo ] >> >> type _ channel_t =3D >> | Channel_mono : [< channel_tag_t > `Mono] channel_t >> | Channel_stereo : [< channel_tag_t > `Stereo] channel_t >> ;; >> >> type (_,_) side_bits_t =3D >> | Bits_1_mono : (int * int) -> ([`Mpeg1], [`Mono]) side_bits_t >> | Bits_1_stereo : (int * int * int * int) -> ([`M= peg1], [`Stereo]) side_bits_t >> | Bits_2_mono : int -> ([`Mpeg2], [`Mono]) side_bits_t >> | Bits_2_stereo : (int * int) -> ([`Mpeg2], [`Stereo]) side_bits_t >> >> ;; >> >> type ('id,'chan) frame_t =3D { >> header_id : 'id mpeg_t; >> header_crc : bool; >> header_samplerate : 'id samplerate_t; >> header_channel_mode : 'chan channel_t; >> side_bits : ('id,'chan) side_bits_t; >> };; >> >> This allows you to write your function as before: >> >> >> let mpeg_of_int =3D function >> | 0 -> MPEG1 >> | _ -> MPEG2;; >> >> and it still maintains the desired invariant, so that this is allowed: >> >> let good =3D { header_id =3D MPEG1; >> header_crc =3D true; >> header_samplerate =3D S48000; >> header_channel_mode =3D Channel_mono; >> side_bits =3D Bits_1_mono(1, 2); };; >> >> but this isn't: >> >> let bad =3D { header_id =3D MPEG1; >> header_crc =3D true; >> header_samplerate =3D S48000; >> header_channel_mode =3D Channel_stereo; >> side_bits =3D Bits_1_mono(1, 2); };; >> >> Error: This expression has type ([ `Mpeg1 ], [ `Mono ]) side_bits_t >> >> but an expression was expected of type >> ([ `Mpeg1 ], [< channel_tag_t > `Stereo ] as 'a) side_bits_t >> Type [ `Mono ] is not compatible with type >> 'a =3D [< `Mono | `Stereo > `Stereo ] The first variant ty= pe does not allow tag(s) `Stereo >> >> Note that this would still not allow you to write the function: >> >> let mpeg_of_int =3D function >> | 0 -> S48000 >> | _ -> S24000;; >> >> which suffers from the same problem as your original function. > > > That helps a lot! I think I got it working with polymorphic variants. > I basically added < mpeg_tag_t > to all the samplerate definitions, > which seems to let everything work. The result is this: > > type mpeg_tag_t =3D [ `Mpeg1 | `Mpeg2 ] > > type _ mpeg_t =3D > | MPEG1 : [< mpeg_tag_t > `Mpeg1] mpeg_t > | MPEG2 : [< mpeg_tag_t > `Mpeg2] mpeg_t > ;; > > type _ samplerate_t =3D > | S48000 : [< mpeg_tag_t > `Mpeg1] samplerate_t > | S44100 : [< mpeg_tag_t > `Mpeg1] samplerate_t > | S24000 : [< mpeg_tag_t > `Mpeg2] samplerate_t > | S22050 : [< mpeg_tag_t > `Mpeg2] samplerate_t > > ;; > > type channel_tag_t =3D [ `Mono | `Stereo ] > > type _ channel_t =3D > | Channel_mono : [< channel_tag_t > `Mono] channel_t > | Channel_stereo : [< channel_tag_t > `Stereo] channel_t > ;; > > type (_,_) side_bits_t =3D > | Bits_1_mono : (int * int) -> ([`Mpeg1], [`Mono]) side_bits_t > | Bits_1_stereo : (int * int * int * int) -> ([`Mpeg1], [`Stereo]) side_b= its_t > | Bits_2_mono : int -> ([`Mpeg2], [`Mono]) side_bits_t > | Bits_2_stereo : (int * int) -> ([`Mpeg2], [`Stereo]) side_bits_t > ;; > > type ('id,'chan) frame_t =3D { > mutable header_id : 'id mpeg_t; > mutable header_crc : bool; > mutable header_samplerate : 'id samplerate_t; > mutable header_channel_mode : 'chan channel_t; > mutable side_bits : ('id,'chan) side_bits_t; > };; > > > let mpeg_of_int =3D function > | 0 -> MPEG1 > | 1 -> MPEG2 > | _ -> failwith "Bad" > ;; > > let samplerate_of_int =3D function > | 0 -> S48000 > | 1 -> S44100 > | 2 -> S24000 > | 3 -> S22050 > | _ -> failwith "Bad" > ;; > > let channel_mode_of_int =3D function > | 0 -> Channel_mono > | 1 -> Channel_stereo > | _ -> failwith "Bad" > ;; > > All the types seem to work out, with one fairly contrived exception... > If I write a few test functions: > > let frame_of_sr_id_chan : type id chan. id samplerate_t -> id mpeg_t > * chan channel_t -> (id,chan) frame_t =3D fun sr -> function > | (MPEG1, Channel_mono) -> {header_id =3D MPEG1; header_crc =3D true; > header_samplerate =3D sr; header_channel_mode =3D Channel_mono; side_bits > =3D Bits_1_mono (1,1)} > | (MPEG1, Channel_stereo) -> {header_id =3D MPEG1; header_crc =3D true; > header_samplerate =3D sr; header_channel_mode =3D Channel_stereo; > side_bits =3D Bits_1_stereo (1,1,1,1)} > | (MPEG2, Channel_mono) -> {header_id =3D MPEG2; header_crc =3D true; > header_samplerate =3D sr; header_channel_mode =3D Channel_mono; side_bits > =3D Bits_2_mono (1)} > | (MPEG2, Channel_stereo) -> {header_id =3D MPEG2; header_crc =3D true; > header_samplerate =3D sr; header_channel_mode =3D Channel_stereo; > side_bits =3D Bits_2_stereo (1,1)} > ;; > > let print_samplerate : type id chan. (id,chan) frame_t -> unit =3D functi= on > | {header_id =3D MPEG1; header_samplerate =3D S48000} -> Printf.printf "M= 1_S48000\n" > | {header_id =3D MPEG1; header_samplerate =3D S44100} -> Printf.printf "M= 1_S44100\n" > | {header_id =3D MPEG2; header_samplerate =3D S24000} -> Printf.printf "M= 2_S24000\n" > | {header_id =3D MPEG2; header_samplerate =3D S22050} -> Printf.printf "M= 2_S22050\n" > ;; > > > I was happy to notice that this did not complain about being > incomplete, correctly seeing that the samplerates can only be used > with a matching header_id; > > However, if I then I make an invalid frame using the *_of_int functions: > > let test_frame =3D frame_of_sr_id_chan (samplerate_of_int 3) > (mpeg_of_int 0, channel_mode_of_int 0);; > print_samplerate test_frame;; > > > This accepts the input (even though samplerate_of_int 3 is `Mpeg2, and > mpeg_of_int 0 is `Mpeg1). However, this will print M1_S44100, which is > definitely not what I specified (samplerate_of_int 3 is S22050)! There > must be something I don't get with this typing. Any ideas what? > > In the end, though, I will not actually use a function like > frame_of_sr_id_chan, so using polymorphic variants should do exactly > what I want. > > Thanks again, > Reed > > -- > =E7 > > -- > Caml-list mailing list. Subscription management and archives: > https://sympa-roc.inria.fr/wws/info/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs >