caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Gabriel Scherer <gabriel.scherer@gmail.com>
To: Dario Teixeira <darioteixeira@yahoo.com>
Cc: "caml-list@inria.fr" <caml-list@inria.fr>
Subject: Re: [Caml-list] GADTs and parsers
Date: Sun, 15 Jul 2012 10:43:30 -0700	[thread overview]
Message-ID: <CAPFanBEJ6jf3qzDGdy6kt3Hy7yaYjAb6D+6U1woXMQXMipArUg@mail.gmail.com> (raw)
In-Reply-To: <1342371289.7422.YahooMailNeo@web111511.mail.gq1.yahoo.com>

I suspect the problem may be one of variance: you are using
polymorphic variants, but implicitly assuming that ([ `Nolink ]
inline_t) is a subtype of ([ `Nolink | `Link ] inline_t), which means
that "inline_t" is covariant in its first parameter. However, variance
of GADTs is currently very strict, and your inline_t definition is in
fact considered invariant (to check that, replace "type _ inline_t" by
"type (+'a) inline_t" and watch the error), so this conversion does
not pass. On the contrary, with a phantom type definition, the
parameter is inferred to be irrelevant (any variance is accepted), so
you can do whatever you want inside the abstraction boundary.

Note that it may well be something else: with polymorphic variants
it's often hard to distinguish subtyping from polymorphic
instantiation (which will work just as well with GADTs), when you are
manipulating types polymorphic in the row variable ([< ... ] and [>
...]).

(For more information on GADTs and variance, see the previous thread
on this on the mailing-list; the summary is that it's complicated, but
I still hope we could get something finer in the future. It would
probably not work with polymorphic variant instances, anyway.)

On Sun, Jul 15, 2012 at 9:54 AM, Dario Teixeira <darioteixeira@yahoo.com> wrote:
> Hi,
>
> I'm revisiting an old problem with 4.00's newfangled GADTs.  Suppose you have
> four kinds of inline nodes, two of which (Text and Link) are leaves, while
> the other two (Bold and Mref) are the parents of other nodes.  Moreover,
> you want to enforce the invariant that a "linkish" node (Link and Mref)
> may not be the ancestor of another linkish node.  One possible implementation:
>
> type _ inline_t =
>     | Text: string -> [> `Nonlink ] inline_t
>     | Bold: 'a inline_t list -> 'a inline_t
>     | Link: string -> [> `Link ] inline_t
>     | Mref: string * [< `Nonlink ] inline_t list -> [> `Link ] inline_t
>
>
> Defining a simple transformation function (in this case one which uppercases
> all text) is also straightforward, just as long as one includes the proper
> type annotations:
>
> let uppercase seq =
>     let rec process: type a. a inline_t -> a inline_t = function
>         | Text txt       -> Text (String.uppercase txt)
>         | Bold xs        -> Bold (List.map process xs)
>         | Link lnk       -> Link lnk
>         | Mref (lnk, xs) -> Mref (lnk, List.map process xs)
>     in List.map process seq
>
>
> But suppose now that I got from a parser a ast_t value with an almost identical
> structure to inline_t, with the exception that it does not intrinsically
> satisfy the latter's invariant:  (Note: for this toy example it may well be
> easy to design the parser grammar such that the invariant is always preserved;
> but suppose you're dealing with a "dumb" parser)
>
> type ast_t =
>     | Ast_Text of string
>     | Ast_Bold of ast_t list
>     | Ast_Link of string
>     | Ast_Mref of string * ast_t list
>
>
> Below is one possible implementation of a function that converts the possibly
> "broken" ast_t into an inline_t.  Note how the processing is split into two
> separate functions -- one which deals only with nonlinks, and other that
> takes anything -- so we can be sure to satisfy the GADT constraints.
>
> let inlineseq_from_astseq seq =
>     let rec process_nonlink = function
>         | Ast_Text txt  -> Text txt
>         | Ast_Bold xs   -> Bold (List.map process_nonlink xs)
>         | _             -> assert false in
>     let rec process_any = function
>         | Ast_Text txt       -> Text txt
>         | Ast_Bold xs        -> Bold (List.map process_any xs)
>         | Ast_Link lnk       -> Link lnk
>         | Ast_Mref (lnk, xs) -> Mref (lnk, List.map process_nonlink xs)
>     in List.map process_any seq
>
>
> Now here's my problem: suppose I wanted to avoid the branch duplication
> present in the above function.  The code below seems to do the trick,
> while at the same time ensuring that the result is always a valid inline_t.
> However, the compiler has trouble seeing that the code is a sound way to
> produce convert an ast_t into an inline_t, and rejects the code.  Moreover,
> it is not enough to simply add the type annotations for subfunction 'process',
> as was done in 'uppercase'.
>
> let inlineseq_from_astseq seq =
>     let rec process allow_link ast = match (allow_link, ast) with
>         | (_, Ast_Text txt)          -> Text txt
>         | (x, Ast_Bold xs)           -> Bold (List.map (process x) xs)
>         | (true, Ast_Link lnk)       -> Link lnk
>         | (false, Ast_Link _)        -> assert false
>         | (true, Ast_Mref (lnk, xs)) -> Mref (lnk, List.map (process false) xs)
>         | (false, Ast_Mref _)        -> assert false
>     in List.map (process true) seq
>
>
> Can the single function approach be made to work?  I'm having trouble figuring
> out just exactly what sort of help the compiler may require to see the code
> above as correct...  (Assuming it is correct, of course...)
>
> Thanks in advance for your time!
> Cheers,
> Dario Teixeira
>
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa-roc.inria.fr/wws/info/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>

  reply	other threads:[~2012-07-15 17:44 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-07-15 16:54 Dario Teixeira
2012-07-15 17:43 ` Gabriel Scherer [this message]
2012-07-15 18:58   ` Dario Teixeira
2012-07-15 20:37     ` Gabriel Scherer
2012-07-15 19:55 ` Leo P White
2012-07-16 14:45   ` Dario Teixeira
2012-07-15 22:41 ` Jacques Garrigue
2012-07-16  8:50   ` Leo P White
2012-07-16 10:06     ` Leo P White
2012-07-16 15:12   ` Dario Teixeira

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPFanBEJ6jf3qzDGdy6kt3Hy7yaYjAb6D+6U1woXMQXMipArUg@mail.gmail.com \
    --to=gabriel.scherer@gmail.com \
    --cc=caml-list@inria.fr \
    --cc=darioteixeira@yahoo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).