From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by sympa.inria.fr (Postfix) with ESMTPS id 288597ED34 for ; Mon, 16 Jul 2012 00:41:10 +0200 (CEST) Received-SPF: None (mail1-smtp-roc.national.inria.fr: no sender authenticity information available from domain of garrigue@math.nagoya-u.ac.jp) identity=pra; client-ip=133.6.130.5; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="garrigue@math.nagoya-u.ac.jp"; x-sender="garrigue@math.nagoya-u.ac.jp"; x-conformance=sidf_compatible Received-SPF: None (mail1-smtp-roc.national.inria.fr: no sender authenticity information available from domain of garrigue@math.nagoya-u.ac.jp) identity=mailfrom; client-ip=133.6.130.5; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="garrigue@math.nagoya-u.ac.jp"; x-sender="garrigue@math.nagoya-u.ac.jp"; x-conformance=sidf_compatible Received-SPF: None (mail1-smtp-roc.national.inria.fr: no sender authenticity information available from domain of postmaster@mailhost.math.nagoya-u.ac.jp) identity=helo; client-ip=133.6.130.5; receiver=mail1-smtp-roc.national.inria.fr; envelope-from="garrigue@math.nagoya-u.ac.jp"; x-sender="postmaster@mailhost.math.nagoya-u.ac.jp"; x-conformance=sidf_compatible X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AiMFAGdGA1CFBoIF/2dsb2JhbABFpx+SGoIgAQEEAScTCQEqCwIDCwslDxJFEgYlh2wDBgUMp06ENAGEQQNUiH0HkGJgiEyMcoESjnSCbg X-IronPort-AV: E=Sophos;i="4.77,590,1336341600"; d="scan'208";a="166937578" Received: from rabbit.math.nagoya-u.ac.jp (HELO mailhost.math.nagoya-u.ac.jp) ([133.6.130.5]) by mail1-smtp-roc.national.inria.fr with ESMTP; 16 Jul 2012 00:41:07 +0200 Received: from mailhost.math.nagoya-u.ac.jp (localhost [127.0.0.1]) by mailhost.math.nagoya-u.ac.jp (Postfix) with ESMTP id 61CB56297; Mon, 16 Jul 2012 07:41:04 +0900 (JST) Received: from mailhost.math.nagoya-u.ac.jp (localhost [127.0.0.1]) by mailhost.math.nagoya-u.ac.jp (Postfix) with ESMTP id 3120340DF; Mon, 16 Jul 2012 07:41:04 +0900 (JST) DomainKey-Signature: h=Received:Subject:Mime-Version:Content-Type:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=; c=nofws; d=math.nagoya-u.ac.jp; q=; s=alpha Received: from tet.garrigue.jp (58x158x128x157.ap58.ftth.ucom.ne.jp [58.158.128.157]) by mailhost.math.nagoya-u.ac.jp (Postfix) with ESMTPSA id DAB8B40DD; Mon, 16 Jul 2012 07:41:03 +0900 (JST) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: Jacques Garrigue In-Reply-To: <1342371289.7422.YahooMailNeo@web111511.mail.gq1.yahoo.com> Date: Mon, 16 Jul 2012 07:41:04 +0900 Cc: "caml-list@inria.fr" Content-Transfer-Encoding: quoted-printable Message-Id: <068D1036-F986-4349-9EEA-B4D7453D5180@math.nagoya-u.ac.jp> References: <1342371289.7422.YahooMailNeo@web111511.mail.gq1.yahoo.com> To: Dario Teixeira X-Mailer: Apple Mail (2.1084) Subject: Re: [Caml-list] GADTs and parsers Actually Leo more or less gave you the answer: since your recursive process function needs to return either `Nonlink or any inline_t nodes, it has to be polymorphically recursive, and use a custom GADT to connect the allow_link parameter with the result type. Here is a more complete version of his code. I also changed a bit your types, because there is another problem using polymorphic variants with GADTs: when you get a GADT equation involving a refinable polymorphic variant, you are going to get get a local private type. If your type is unbounded, you cannot even use subtyping on it. Also [< `Nonlink] has exactly the same elements as [ `Nonlink], so it is better to use the latter. Jacques type inkind =3D [ `Link | `Nonlink ] type _ inline_t =3D | Text: string -> [< inkind > `Nonlink ] inline_t | Bold: 'a inline_t list -> 'a inline_t | Link: string -> [< inkind > `Link ] inline_t | Mref: string * [ `Nonlink ] inline_t list -> [< inkind > `Link ] inli= ne_t let uppercase seq =3D let rec process: type a. a inline_t -> a inline_t =3D function | Text txt -> Text (String.uppercase txt) | Bold xs -> Bold (List.map process xs) | Link lnk -> Link lnk | Mref (lnk, xs) -> Mref (lnk, List.map process xs) in List.map process seq type ast_t =3D | Ast_Text of string | Ast_Bold of ast_t list | Ast_Link of string | Ast_Mref of string * ast_t list let inlineseq_from_astseq seq =3D let rec process_nonlink =3D function | Ast_Text txt -> Text txt | Ast_Bold xs -> Bold (List.map process_nonlink xs) | _ -> assert false in let rec process_any =3D function | Ast_Text txt -> Text txt | Ast_Bold xs -> Bold (List.map process_any xs) | Ast_Link lnk -> Link lnk | Ast_Mref (lnk, xs) -> Mref (lnk, List.map process_nonlink xs) in List.map process_any seq type _ linkp =3D | Nonlink : [ `Nonlink ] linkp | Maylink : inkind linkp let inlineseq_from_astseq seq =3D let rec process : type a. a linkp -> ast_t -> a inline_t =3D fun allow_link ast -> match (allow_link, ast) with | (Maylink, Ast_Text txt) -> Text txt | (Nonlink, Ast_Text txt) -> Text txt | (x, Ast_Bold xs) -> Bold (List.map (process x) xs) | (Maylink, Ast_Link lnk) -> Link lnk | (Nonlink, Ast_Link _) -> assert false | (Maylink, Ast_Mref (lnk, xs)) -> Mref (lnk, List.map (process Nonlink) xs) | (Nonlink, Ast_Mref _) -> assert false in List.map (process Maylink) seq On 2012/07/16, at 1:54, Dario Teixeira wrote: > Hi, >=20 > I'm revisiting an old problem with 4.00's newfangled GADTs. Suppose you = have > four kinds of inline nodes, two of which (Text and Link) are leaves, while > the other two (Bold and Mref) are the parents of other nodes. Moreover, > you want to enforce the invariant that a "linkish" node (Link and Mref) > may not be the ancestor of another linkish node. One possible implementa= tion: >=20 > type _ inline_t =3D > | Text: string -> [> `Nonlink ] inline_t > | Bold: 'a inline_t list -> 'a inline_t > | Link: string -> [> `Link ] inline_t > | Mref: string * [< `Nonlink ] inline_t list -> [> `Link ] inline_t >=20 >=20 > Defining a simple transformation function (in this case one which upperca= ses > all text) is also straightforward, just as long as one includes the proper > type annotations: >=20 > let uppercase seq =3D > let rec process: type a. a inline_t -> a inline_t =3D function > | Text txt -> Text (String.uppercase txt) > | Bold xs -> Bold (List.map process xs) > | Link lnk -> Link lnk > | Mref (lnk, xs) -> Mref (lnk, List.map process xs) > in List.map process seq >=20 >=20 > But suppose now that I got from a parser a ast_t value with an almost ide= ntical > structure to inline_t, with the exception that it does not intrinsically > satisfy the latter's invariant: (Note: for this toy example it may well = be > easy to design the parser grammar such that the invariant is always prese= rved; > but suppose you're dealing with a "dumb" parser) >=20 > type ast_t =3D > | Ast_Text of string > | Ast_Bold of ast_t list > | Ast_Link of string > | Ast_Mref of string * ast_t list >=20 >=20 > Below is one possible implementation of a function that converts the poss= ibly > "broken" ast_t into an inline_t. Note how the processing is split into t= wo > separate functions -- one which deals only with nonlinks, and other that > takes anything -- so we can be sure to satisfy the GADT constraints. >=20 > let inlineseq_from_astseq seq =3D > let rec process_nonlink =3D function > | Ast_Text txt -> Text txt > | Ast_Bold xs -> Bold (List.map process_nonlink xs) > | _ -> assert false in > let rec process_any =3D function > | Ast_Text txt -> Text txt > | Ast_Bold xs -> Bold (List.map process_any xs) > | Ast_Link lnk -> Link lnk > | Ast_Mref (lnk, xs) -> Mref (lnk, List.map process_nonlink xs) > in List.map process_any seq >=20 >=20 > Now here's my problem: suppose I wanted to avoid the branch duplication > present in the above function. The code below seems to do the trick, > while at the same time ensuring that the result is always a valid inline_= t. > However, the compiler has trouble seeing that the code is a sound way to > produce convert an ast_t into an inline_t, and rejects the code. Moreove= r, > it is not enough to simply add the type annotations for subfunction 'proc= ess', > as was done in 'uppercase'. >=20 > let inlineseq_from_astseq seq =3D > let rec process allow_link ast =3D match (allow_link, ast) with > | (_, Ast_Text txt) -> Text txt > | (x, Ast_Bold xs) -> Bold (List.map (process x) xs) > | (true, Ast_Link lnk) -> Link lnk > | (false, Ast_Link _) -> assert false > | (true, Ast_Mref (lnk, xs)) -> Mref (lnk, List.map (process fals= e) xs) > | (false, Ast_Mref _) -> assert false > in List.map (process true) seq >=20 >=20 > Can the single function approach be made to work? I'm having trouble fig= uring > out just exactly what sort of help the compiler may require to see the co= de > above as correct... (Assuming it is correct, of course...) >=20 > Thanks in advance for your time! > Cheers, > Dario Teixeira >=20 >=20 > --=20 > Caml-list mailing list. Subscription management and archives: > https://sympa-roc.inria.fr/wws/info/caml-list > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs >=20