From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by sympa.inria.fr (Postfix) with ESMTPS id 1062F7ED34 for ; Sun, 15 Jul 2012 19:44:13 +0200 (CEST) Received-SPF: None (mail4-smtp-sop.national.inria.fr: no sender authenticity information available from domain of gabriel.scherer@gmail.com) identity=pra; client-ip=209.85.213.54; receiver=mail4-smtp-sop.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="gabriel.scherer@gmail.com"; x-conformance=sidf_compatible Received-SPF: Pass (mail4-smtp-sop.national.inria.fr: domain of gabriel.scherer@gmail.com designates 209.85.213.54 as permitted sender) identity=mailfrom; client-ip=209.85.213.54; receiver=mail4-smtp-sop.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="gabriel.scherer@gmail.com"; x-conformance=sidf_compatible; x-record-type="v=spf1" Received-SPF: None (mail4-smtp-sop.national.inria.fr: no sender authenticity information available from domain of postmaster@mail-yw0-f54.google.com) identity=helo; client-ip=209.85.213.54; receiver=mail4-smtp-sop.national.inria.fr; envelope-from="gabriel.scherer@gmail.com"; x-sender="postmaster@mail-yw0-f54.google.com"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AjkCAFUAA1DRVdU2k2dsb2JhbABFpx+RCggiAQEBAQkJKAQjgiABAQEEEgITGQEbEgsBAwwGBQsNDSEiAREBBQEKEgYTEhCHWwEDDAucTwkDjCOCcYNwChknAwpXiHEBBQyKTmaGAgOIS4xwgRKJcoMlPoQe X-IronPort-AV: E=Sophos;i="4.77,588,1336341600"; d="scan'208";a="150710551" Received: from mail-yw0-f54.google.com ([209.85.213.54]) by mail4-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 15 Jul 2012 19:44:11 +0200 Received: by yhfs35 with SMTP id s35so6598161yhf.27 for ; Sun, 15 Jul 2012 10:44:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=Pg81EeRrJhHULPQEQw1kbhLiHPa2Pz9bVAR4ANxAmuQ=; b=kXfNbfsXJzN7UqsJ2Ae8dblH9p01SFdQ2vLX3MOGMCTGJvAS8EXRAZBddoUf7fsEFt Wtg5KfEvFjaRkd6IZqCabHFECcV5K8snjkK2OM7uyXmDV+xIXnia2ge7Nu0pqKnWHa7x rwKNta9B3KmjLoqaCFUrHswMEdMCdAByUzdgpsWYy4aXPO1AslDSTUrSid3eUGQjnWrG rWdTzef3qcyuY2F4g8eNZyt55imkG4KSu50i8yy3G2VG1UVlc8huGaeK1I/gB/TxnvlU t9AXbxMhvsdxKFIS1IKoKxtHSDbou9jk7L+GHE4vB4CPPhUT6wgl6zSuNBYtA/5gttYC romQ== Received: by 10.42.29.4 with SMTP id p4mr2239447icc.30.1342374250358; Sun, 15 Jul 2012 10:44:10 -0700 (PDT) MIME-Version: 1.0 Received: by 10.50.94.39 with HTTP; Sun, 15 Jul 2012 10:43:30 -0700 (PDT) In-Reply-To: <1342371289.7422.YahooMailNeo@web111511.mail.gq1.yahoo.com> References: <1342371289.7422.YahooMailNeo@web111511.mail.gq1.yahoo.com> From: Gabriel Scherer Date: Sun, 15 Jul 2012 10:43:30 -0700 Message-ID: To: Dario Teixeira Cc: "caml-list@inria.fr" Content-Type: text/plain; charset=ISO-8859-1 Subject: Re: [Caml-list] GADTs and parsers I suspect the problem may be one of variance: you are using polymorphic variants, but implicitly assuming that ([ `Nolink ] inline_t) is a subtype of ([ `Nolink | `Link ] inline_t), which means that "inline_t" is covariant in its first parameter. However, variance of GADTs is currently very strict, and your inline_t definition is in fact considered invariant (to check that, replace "type _ inline_t" by "type (+'a) inline_t" and watch the error), so this conversion does not pass. On the contrary, with a phantom type definition, the parameter is inferred to be irrelevant (any variance is accepted), so you can do whatever you want inside the abstraction boundary. Note that it may well be something else: with polymorphic variants it's often hard to distinguish subtyping from polymorphic instantiation (which will work just as well with GADTs), when you are manipulating types polymorphic in the row variable ([< ... ] and [> ...]). (For more information on GADTs and variance, see the previous thread on this on the mailing-list; the summary is that it's complicated, but I still hope we could get something finer in the future. It would probably not work with polymorphic variant instances, anyway.) On Sun, Jul 15, 2012 at 9:54 AM, Dario Teixeira wrote: > Hi, > > I'm revisiting an old problem with 4.00's newfangled GADTs. Suppose you have > four kinds of inline nodes, two of which (Text and Link) are leaves, while > the other two (Bold and Mref) are the parents of other nodes. Moreover, > you want to enforce the invariant that a "linkish" node (Link and Mref) > may not be the ancestor of another linkish node. One possible implementation: > > type _ inline_t = > | Text: string -> [> `Nonlink ] inline_t > | Bold: 'a inline_t list -> 'a inline_t > | Link: string -> [> `Link ] inline_t > | Mref: string * [< `Nonlink ] inline_t list -> [> `Link ] inline_t > > > Defining a simple transformation function (in this case one which uppercases > all text) is also straightforward, just as long as one includes the proper > type annotations: > > let uppercase seq = > let rec process: type a. a inline_t -> a inline_t = function > | Text txt -> Text (String.uppercase txt) > | Bold xs -> Bold (List.map process xs) > | Link lnk -> Link lnk > | Mref (lnk, xs) -> Mref (lnk, List.map process xs) > in List.map process seq > > > But suppose now that I got from a parser a ast_t value with an almost identical > structure to inline_t, with the exception that it does not intrinsically > satisfy the latter's invariant: (Note: for this toy example it may well be > easy to design the parser grammar such that the invariant is always preserved; > but suppose you're dealing with a "dumb" parser) > > type ast_t = > | Ast_Text of string > | Ast_Bold of ast_t list > | Ast_Link of string > | Ast_Mref of string * ast_t list > > > Below is one possible implementation of a function that converts the possibly > "broken" ast_t into an inline_t. Note how the processing is split into two > separate functions -- one which deals only with nonlinks, and other that > takes anything -- so we can be sure to satisfy the GADT constraints. > > let inlineseq_from_astseq seq = > let rec process_nonlink = function > | Ast_Text txt -> Text txt > | Ast_Bold xs -> Bold (List.map process_nonlink xs) > | _ -> assert false in > let rec process_any = function > | Ast_Text txt -> Text txt > | Ast_Bold xs -> Bold (List.map process_any xs) > | Ast_Link lnk -> Link lnk > | Ast_Mref (lnk, xs) -> Mref (lnk, List.map process_nonlink xs) > in List.map process_any seq > > > Now here's my problem: suppose I wanted to avoid the branch duplication > present in the above function. The code below seems to do the trick, > while at the same time ensuring that the result is always a valid inline_t. > However, the compiler has trouble seeing that the code is a sound way to > produce convert an ast_t into an inline_t, and rejects the code. Moreover, > it is not enough to simply add the type annotations for subfunction 'process', > as was done in 'uppercase'. > > let inlineseq_from_astseq seq = > let rec process allow_link ast = match (allow_link, ast) with > | (_, Ast_Text txt) -> Text txt > | (x, Ast_Bold xs) -> Bold (List.map (process x) xs) > | (true, Ast_Link lnk) -> Link lnk > | (false, Ast_Link _) -> assert false > | (true, Ast_Mref (lnk, xs)) -> Mref (lnk, List.map (process false) xs) > | (false, Ast_Mref _) -> assert false > in List.map (process true) seq > > > Can the single function approach be made to work? I'm having trouble figuring > out just exactly what sort of help the compiler may require to see the code > above as correct... (Assuming it is correct, of course...) > > Thanks in advance for your time! > Cheers, > Dario Teixeira > > > -- > Caml-list mailing list. Subscription management and archives: > https://sympa-roc.inria.fr/wws/info/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs >