From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by yquem.inria.fr (Postfix) with ESMTP id 7B7C1BBAF for ; Mon, 23 Jun 2008 17:57:38 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Au8AADRkX0jOvjGvfGdsb2JhbACSbAEBCwUCBgcRA51v X-IronPort-AV: E=Sophos;i="4.27,690,1204498800"; d="scan'208";a="14300152" Received: from web54605.mail.re2.yahoo.com ([206.190.49.175]) by mail1-smtp-roc.national.inria.fr with SMTP; 23 Jun 2008 17:57:37 +0200 Received: (qmail 64808 invoked by uid 60001); 23 Jun 2008 15:57:36 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Received:X-Mailer:Date:From:Subject:To:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID; b=KjIBu4SAYeVHw2QLpOF8T6I84RGa+ptIC836eC1nXKGbWCZP0Yrm7QbBgUtWbHR7XXrP6guStK8fzeI1QLjDwUZ7Zpn+bvr73mJnI7TlbpGTrEQfPAnkYSKdQ8kcX1pqosLhJBCH+UM5xq4m8Vt4hSnDNcRRj7HaFVGpWqYf2TA=; Received: from [212.13.63.213] by web54605.mail.re2.yahoo.com via HTTP; Mon, 23 Jun 2008 08:57:36 PDT X-Mailer: YahooMailWebService/0.7.199 Date: Mon, 23 Jun 2008 08:57:36 -0700 (PDT) From: Dario Teixeira Subject: Defining and parsing subset To: caml-list@yquem.inria.fr MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Message-ID: <548342.64682.qm@web54605.mail.re2.yahoo.com> X-Spam: no; 0.00; subset:01 subset:01 submodules:01 parsers:01 parser:01 parser:01 node:01 node:01 sig:01 val:01 val:01 struct:01 endline:01 endline:01 rec:01 Hi, Suppose you have two different types of documents, one of which is a subset of the other. The types themselves are fairly complex (lots of submodules, etc), to the point that one cannot use the "include" directive to define the more complex type as an extension of the simpler type. Now, you want to avoid any code duplication, both in the definition of each type and in the code that provides the document parsers. The only solution I could find relies on phantom types to annotate each type component. Elements allowed for either type are annotated with `Basic= , while the ones that belong *only* to the more complex type are annotated wi= th `Complex. As for parsing, there is in fact only one parser, that produces values of type `Complex. The parser for `Basic values is faked by invoking the `Complex parser and then walking through the data structure, making sur= e it contains only those elements allowed for the `Basic subset. This seems to work fine; however, since I'm guessing this is a common prob= lem, I'm curious to know if there is another solution, perhaps even a canonical "camlish" way to solve the problem. Any thoughts? Kind regards, Dario Teixeira P.S. Some sample code (with dummy parser): (************************************************************************) (* Node module.=09=09=09=09=09=09=09=09*) (************************************************************************) module Node: sig =09type inline_t =3D node_t list =09 and node_t =3D =09=09private =09=09| Text of string=09(* Allowed for basic *) =09=09| Bold of inline_t=09(* Allowed for basic *) =09=09| Italic of inline_t=09(* Complex only *) =09type 'a t =3D private node_t =09val text: string -> [> `Basic] t =09val bold: 'a t list -> 'a t =09val italic: [< `Basic | `Complex] t list -> [> `Complex] t =09val print_complex: [< `Complex | `Basic] t -> unit =09val print_basic: [< `Basic] t -> unit =09exception Invalid_subset =09val parse_complex: string -> [> `Complex] t =09val parse_basic: string -> [> `Basic] t end =3D struct =09type inline_t =3D node_t list =09 and node_t =3D =09=09| Text of string =09=09| Bold of inline_t =09=09| Italic of inline_t =09type 'a t =3D node_t =09let text s =3D Text s =09let bold inline =3D Bold inline =09let italic inline =3D Italic inline =09let print_complex node =3D print_endline "complex" =09let print_basic node =3D print_endline "basic" =09exception Invalid_subset =09let parse_complex =3D function =09=09| "complex"=09-> italic [bold [text "ola"]] =09=09| _=09=09-> bold [text "ola"] =09let rec complex_to_basic =3D function =09=09| Text s=09-> text s =09=09| Bold inline=09-> bold (List.map complex_to_basic inline) =09=09| Italic inline=09-> raise Invalid_subset =09let parse_basic s =3D complex_to_basic (parse_complex s) end (************************************************************************) (* Top-level.=09=09=09=09=09=09=09=09*) (************************************************************************) open Node let () =3D =09let node =3D bold [text "ola"] =09in print_basic node;; let () =3D =09let node =3D italic [bold [text "ola"]] =09in print_complex node;; let () =3D =09let node =3D parse_complex "complex" =09in print_complex node;; let () =3D =09let node =3D parse_basic "complex" (* Raises exception *) =09in print_basic node;; =0A=0A=0A __________________________________________________________= =0ASent from Yahoo! Mail.=0AA Smarter Email http://uk.docs.yahoo.com/nowyou= can.html