caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Troublesome nodes
@ 2008-07-11 20:39 Dario Teixeira
  2008-07-11 21:20 ` [Caml-list] " Jeremy Yallop
                   ` (2 more replies)
  0 siblings, 3 replies; 29+ messages in thread
From: Dario Teixeira @ 2008-07-11 20:39 UTC (permalink / raw)
  To: caml-list

Hi,

This problem was originally raised in a thread in the ocaml-beginners
list [1], but since polymorphic variants, covariant constraints, and
recursive knots were brought into the discussion, I reckon it deserves
the attention of some heavy weights.  Moreover, the problem is trickier
than first appearances suggest.

So, what's the situation?  I want to create a data structure holding
document nodes.  There are four different kinds of nodes, two of which
are terminals (Text and See), and two of which are defined recursively
(Bold and Mref).  Moreover, both See and Mref produce links, and there
is an additional constraint that a link node may *not* be the immediate
ancestor of another link node.  Using conventional union types, a node
could be modelled like this:

module Old_node =
struct
    type seq_t = super_node_t list
    and super_node_t =
        | Nonlink_node of nonlink_node_t
        | Link_node of link_node_t
    and nonlink_node_t =
        | Text of string
        | Bold of seq_t
    and link_node_t =
        | Mref of string * nonlink_node_t list
        | See of string
end


The problem with this representation is that it introduces an unwanted
scaffolding for nodes.  Moreover, it prevents the use of constructor
functions for nodes, since non-link nodes may be represented in the
tree in a context-dependent fashion: either directly such as Bold [...],
or as Nonlink_node (Bold [...]).  Note that preserving the link/nonlink
distinction in the structure is helpful for pattern matching purposes,
but the extra scaffolding is just a pain.

One alternative is to use polymorphic variants, and to take advantage
of the fact that new types can be built as the union of existing ones.
Ideally, one could do something like this:

type seq_t = super_node_t list
 and nonlink_node_t =
    [ `Text of string
    | `Bold of seq_t ]
 and link_node_t =
    [ Mref of string * nonlink_node_t list
    | See of string ]
 and super_node_t = [nonlink_node_t | link_node_t]


However, this fails with an error "The type constructor nonlink_node_t is
not yet completely defined".  Jon Harrop suggested untying the recursive
knot, but the solution has a few drawbacks of its own [2].

Another alternative is to flatten the structure altogether and to annotate
the constructor functions with phantom types to prevent the violation of
the no-parent constraint:

module Node:
sig
    type seq_t = node_t list
    and node_t =
        private
        | Text of string
        | Bold of seq_t
        | Mref of string * seq_t
        | See of string

    type +'a t

    val text: string -> [> `Nonlink] t
    val bold: 'a t list -> [> `Nonlink] t
    val mref: string -> [< `Nonlink] t list -> [> `Link] t
    val see: string -> [> `Link] t
end =
struct
    type seq_t = node_t list
    and node_t =
        | Text of string
        | Bold of seq_t
        | Mref of string * seq_t
        | See of string

    type +'a t = node_t

    let text txt = Text txt
    let bold inl = Bold inl
    let mref ref inl = Mref (ref, inl)
    let see ref = See ref
end


This works fine, but because the link/nonlink distinction is lost, making
even a simple Node_to_Node translator becomes a mess:

module Node_to_Node =
struct
    let rec convert_nonlink_node = function
        | Node.Text txt          -> Node.text txt
        | Node.Bold inl          -> Node.bold (List.map convert_super_node inl)
        | _                      -> failwith "oops"

    and convert_link_node = function
        | Node.Mref (ref, inl)   -> Node.mref ref (List.map convert_nonlink_node inl)
        | Node.See ref           -> Node.see ref
        | _                      -> failwith "oops"

    and convert_super_node node = match node with
        | Node.Text _
        | Node.Bold _            -> (convert_nonlink_node node :> [`Link | `Nonlink] Node.t)
        | Node.See _
        | Node.Mref _            -> convert_link_node node
end


So, I am looking for a solution that meets the following conditions:

 - It satisfies the "no link node shall be parent of another" constraint;
 - the structure should be pattern-matchable;
 - but nodes should be created via constructor functions.

Any ideas?

Thanks in advance and sorry for the long post!
Cheers,
Dario Teixeira

[1] http://tech.groups.yahoo.com/group/ocaml_beginners/message/9930
[2] http://tech.groups.yahoo.com/group/ocaml_beginners/message/9932




      __________________________________________________________
Not happy with your email address?.
Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-11 20:39 Troublesome nodes Dario Teixeira
@ 2008-07-11 21:20 ` Jeremy Yallop
  2008-07-12 12:37   ` Dario Teixeira
  2008-07-11 23:11 ` Zheng Li
  2008-07-13 14:32 ` [Caml-list] " Dario Teixeira
  2 siblings, 1 reply; 29+ messages in thread
From: Jeremy Yallop @ 2008-07-11 21:20 UTC (permalink / raw)
  To: Dario Teixeira; +Cc: caml-list

Dario Teixeira wrote:
> Ideally, one could do something like this:
> 
> type seq_t = super_node_t list
>  and nonlink_node_t =
>     [ `Text of string
>     | `Bold of seq_t ]
>  and link_node_t =
>     [ Mref of string * nonlink_node_t list
>     | See of string ]
>  and super_node_t = [nonlink_node_t | link_node_t]
> 
> 
> However, this fails with an error "The type constructor nonlink_node_t is
> not yet completely defined".  

Here's a slight variation of this scheme that might suit your needs. 
The idea is to move the recursion from the type level to the module 
level, sidestepping the restriction that you can't extend types in the 
same recursive group.

   module rec M :
   sig
     type seq_t = M.super_node_t list
     type nonlink_node_t =
         [ `Text of string
         | `Bold of seq_t ]
     type link_node_t =
         [ `Mref of string * nonlink_node_t list
         | `See of string ]
     type super_node_t =
         [ nonlink_node_t | link_node_t ]
   end = M

Jeremy.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: Troublesome nodes
  2008-07-11 20:39 Troublesome nodes Dario Teixeira
  2008-07-11 21:20 ` [Caml-list] " Jeremy Yallop
@ 2008-07-11 23:11 ` Zheng Li
  2008-07-13 14:32 ` [Caml-list] " Dario Teixeira
  2 siblings, 0 replies; 29+ messages in thread
From: Zheng Li @ 2008-07-11 23:11 UTC (permalink / raw)
  To: caml-list

Hi,

Dario Teixeira wrote:
> Ideally, one could do something like this:
> 
> type seq_t = super_node_t list
>  and nonlink_node_t =
>     [ `Text of string
>     | `Bold of seq_t ]
>  and link_node_t =
>     [ Mref of string * nonlink_node_t list
>     | See of string ]
>  and super_node_t = [nonlink_node_t | link_node_t]
> 
> 
> However, this fails with an error "The type constructor nonlink_node_t is
> not yet completely defined".  Jon Harrop suggested untying the recursive
> knot, but the solution has a few drawbacks of its own [2].

How about just define

type seq_t = super_node_t list
   and nonlink_node_t =
       [ `Text of string
       | `Bold of seq_t ]
   and link_node_t =
       [ `Mref of string * nonlink_node_t list
       | `See of string ]
   and super_node_t =
       [`Test of string |`Bold of seq_t | `Mref of string * 
nonlink_node_t list | `See of string]

or similar ... not sure whether this satisfies your requirements though.

--
Zheng


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-11 21:20 ` [Caml-list] " Jeremy Yallop
@ 2008-07-12 12:37   ` Dario Teixeira
  2008-07-12 13:25     ` Jacques Carette
  2008-07-12 16:44     ` Wolfgang Lux
  0 siblings, 2 replies; 29+ messages in thread
From: Dario Teixeira @ 2008-07-12 12:37 UTC (permalink / raw)
  To: Jeremy Yallop; +Cc: caml-list

Hi,

Thanks Jeremy, that's quite ingenious.  However, I've hit a problem in the
definition of the constructor functions (the full code + compiler error
is below).  Now, I know that nonlink_node_t is a subset of super_node_t,
and therefore any variant valid for nonlink_node_t is also acceptable for
super_node_t.  But how do I tell this to the compiler?  (Note that having
users of the module manually casting types with :> is something I would
rather avoid).

Thanks again,
Dario


module rec Node:
sig
	type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
	type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
	type super_node_t = [ nonlink_node_t | link_node_t ]

	val text: string -> nonlink_node_t
	val bold: super_node_t list -> nonlink_node_t
	val see: string -> link_node_t
	val mref: string -> nonlink_node_t list -> link_node_t
end =
struct
	type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
	type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
	type super_node_t = [ nonlink_node_t | link_node_t ]

	let text txt = `Text txt
	let bold seq = `Bold seq
	let see ref = `See ref
	let mref ref seq = `Mref (ref, seq)
end

open Node
let foo = text "foo"
let bar = bold [text "bar"]


Error: This expression has type Node.nonlink_node_t
       but is here used with type Node.super_node_t
       The first variant type does not allow tag(s) `Mref, `See




      __________________________________________________________
Not happy with your email address?.
Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-12 12:37   ` Dario Teixeira
@ 2008-07-12 13:25     ` Jacques Carette
  2008-07-12 16:44     ` Wolfgang Lux
  1 sibling, 0 replies; 29+ messages in thread
From: Jacques Carette @ 2008-07-12 13:25 UTC (permalink / raw)
  To: Dario Teixeira; +Cc: caml-list

The ingenious use of a recursive module signature seems to be the source 
of the problem here.  If one defines 'bar' inside the module Node, it 
works fine [but is clearly useless].

What surprises me is that the code (below) also gives the exact same 
error.  In other contexts, I have managed to get such annotations (see 
the definition of the bold function) to 'work', but not here.  I am 
quite puzzled by this, so if you do get an answer to your question 
offl-ist, I would appreciate if you could forward it on.

Jacques

PS: because I am actually using metaocaml for other projects, I am using 
ocaml 3.09 for my tests, so it is possible that something is different 
in 3.10.

module Node :
sig
    type nonlink_node_t
    type link_node_t
    type super_node_t

    val text: string -> nonlink_node_t
    val bold: super_node_t list -> nonlink_node_t
    val see: string -> link_node_t
    val mref: string -> nonlink_node_t list -> link_node_t
end =
struct
    type 'a p1 = [ `Text of string | `Bold of 'a list ]
    type 'a p2 = [ `See of string | `Mref of string * 'a p1 list ]
    type 'a p3 = [ 'a p1 | 'a p2 ]

    type super_node_t = super_node_t p3
    type nonlink_node_t = super_node_t p1
    type link_node_t = super_node_t p2

    let text txt = `Text txt
    let bold seq = `Bold (seq :> super_node_t list)
    let see ref = `See ref
    let mref ref seq = `Mref (ref, seq)
end;;

Dario Teixeira wrote:
> Hi,
>
> Thanks Jeremy, that's quite ingenious.  However, I've hit a problem in the
> definition of the constructor functions (the full code + compiler error
> is below).  Now, I know that nonlink_node_t is a subset of super_node_t,
> and therefore any variant valid for nonlink_node_t is also acceptable for
> super_node_t.  But how do I tell this to the compiler?  (Note that having
> users of the module manually casting types with :> is something I would
> rather avoid).
>
> Thanks again,
> Dario
>
>
> module rec Node:
> sig
> 	type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
> 	type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
> 	type super_node_t = [ nonlink_node_t | link_node_t ]
>
> 	val text: string -> nonlink_node_t
> 	val bold: super_node_t list -> nonlink_node_t
> 	val see: string -> link_node_t
> 	val mref: string -> nonlink_node_t list -> link_node_t
> end =
> struct
> 	type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
> 	type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
> 	type super_node_t = [ nonlink_node_t | link_node_t ]
>
> 	let text txt = `Text txt
> 	let bold seq = `Bold seq
> 	let see ref = `See ref
> 	let mref ref seq = `Mref (ref, seq)
> end
>
> open Node
> let foo = text "foo"
> let bar = bold [text "bar"]
>
>
> Error: This expression has type Node.nonlink_node_t
>        but is here used with type Node.super_node_t
>        The first variant type does not allow tag(s) `Mref, `See
>
>
>
>
>       __________________________________________________________
> Not happy with your email address?.
> Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html
>
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>   


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-12 12:37   ` Dario Teixeira
  2008-07-12 13:25     ` Jacques Carette
@ 2008-07-12 16:44     ` Wolfgang Lux
  2008-07-12 18:21       ` Dario Teixeira
  2008-07-12 18:58       ` Jacques Carette
  1 sibling, 2 replies; 29+ messages in thread
From: Wolfgang Lux @ 2008-07-12 16:44 UTC (permalink / raw)
  To: Dario Teixeira; +Cc: Jeremy Yallop, caml-list

Dario Teixeira wrote:

> module rec Node:
> sig
> 	type nonlink_node_t = [ `Text of string | `Bold of  
> Node.super_node_t list ]
> 	type link_node_t = [ `See of string | `Mref of string *  
> nonlink_node_t list ]
> 	type super_node_t = [ nonlink_node_t | link_node_t ]
>
> 	val text: string -> nonlink_node_t
> 	val bold: super_node_t list -> nonlink_node_t
> 	val see: string -> link_node_t
> 	val mref: string -> nonlink_node_t list -> link_node_t
> end =
> struct
> 	type nonlink_node_t = [ `Text of string | `Bold of  
> Node.super_node_t list ]
> 	type link_node_t = [ `See of string | `Mref of string *  
> nonlink_node_t list ]
> 	type super_node_t = [ nonlink_node_t | link_node_t ]
>
> 	let text txt = `Text txt
> 	let bold seq = `Bold seq
> 	let see ref = `See ref
> 	let mref ref seq = `Mref (ref, seq)
> end
>
> open Node
> let foo = text "foo"
> let bar = bold [text "bar"]
>
>
> Error: This expression has type Node.nonlink_node_t
>        but is here used with type Node.super_node_t
>        The first variant type does not allow tag(s) `Mref, `See

This problem is apparently easy to solve. Just don't expect an exact  
super_node_t list argument for the bold function. Instead use
   val bold : [< super_node_t] list -> nonlink_node_t
in the signature and
   let bold seq = `Bold (seq :> super_node_t list)
in the body of the Node module's definition.

Wolfgang


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-12 16:44     ` Wolfgang Lux
@ 2008-07-12 18:21       ` Dario Teixeira
  2008-07-12 18:27         ` Jeremy Yallop
  2008-07-12 18:58       ` Jacques Carette
  1 sibling, 1 reply; 29+ messages in thread
From: Dario Teixeira @ 2008-07-12 18:21 UTC (permalink / raw)
  To: Wolfgang Lux; +Cc: Jeremy Yallop, caml-list

Hi,

Thank you all for your help!  I think we're almost there: with Wolfgang's
suggestion, the Node module complies with the original constraints, as the
code below shows:  (the compiler complains -- as it should! -- when on the
last line I try to nest two link nodes directly)

module rec Node:
sig
    type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]

    val text: string -> nonlink_node_t
    val bold: [< super_node_t] list -> nonlink_node_t
    val see: string -> link_node_t
    val mref: string -> nonlink_node_t list -> link_node_t
end =
struct
    type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]

    let text txt = `Text txt
    let bold seq = `Bold (seq :> super_node_t list)
    let see ref = `See ref
    let mref ref seq = `Mref (ref, seq)
end

open Node

let foo1 = text "foo"                   (* valid *)
let foo2 = bold [text "foo"]            (* valid *)
let foo3 = mref "ref" [text "foo"]      (* valid *)
let foo4 = mref "ref" [see "ref"]       (* invalid *)


Now, the icing on the cake would be the possibility to build a Node_to_Node
set of functions without code duplication.  Ideally, I would like to do
something as follows:

module Node_to_Node =
struct
    open Node

    let rec convert_nonlink_node = function
        | `Text txt         -> Node.text txt
        | `Bold seq         -> Node.bold (List.map convert_super_node seq)

    and convert_link_node = function
        | `See ref          -> Node.see ref
        | `Mref (ref, seq)  -> Node.mref ref (List.map convert_nonlink_node seq)

    and convert_super_node node = match node with
        | #nonlink_node_t   -> (convert_nonlink_node node :> super_node_t)
        | #link_node_t      -> (convert_link_node node :> super_node_t)
end


But this fails on the last line:

Error: This expression has type
         [< `Bold of 'a list & Node.super_node_t list | `Text of string ]
         as 'a
       but is here used with type
         [< `Mref of string * 'a list | `See of string ]
       These two variant types have no intersection


Do you see any way around it?

Thanks again for all your time and attention!
Cheers,
Dario



      __________________________________________________________
Not happy with your email address?.
Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-12 18:21       ` Dario Teixeira
@ 2008-07-12 18:27         ` Jeremy Yallop
  0 siblings, 0 replies; 29+ messages in thread
From: Jeremy Yallop @ 2008-07-12 18:27 UTC (permalink / raw)
  To: Dario Teixeira; +Cc: Wolfgang Lux, caml-list

Dario Teixeira wrote:
>     and convert_super_node node = match node with
>         | #nonlink_node_t   -> (convert_nonlink_node node :> super_node_t)
>         | #link_node_t      -> (convert_link_node node :> super_node_t)

You need to rebind the matched value within the pattern in order to 
refine the type on the rhs:

and convert_super_node node = match node with
| #nonlink_node_t as node -> (convert_nonlink_node node :> super_node_t)
| #link_node_t    as node -> (convert_link_node node :> super_node_t)

Jeremy.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-12 16:44     ` Wolfgang Lux
  2008-07-12 18:21       ` Dario Teixeira
@ 2008-07-12 18:58       ` Jacques Carette
  1 sibling, 0 replies; 29+ messages in thread
From: Jacques Carette @ 2008-07-12 18:58 UTC (permalink / raw)
  To: Wolfgang Lux; +Cc: Dario Teixeira, caml-list

Wolfgang Lux wrote:
> This problem is apparently easy to solve. Just don't expect an exact 
> super_node_t list argument for the bold function. Instead use
>   val bold : [< super_node_t] list -> nonlink_node_t
> in the signature and
>   let bold seq = `Bold (seq :> super_node_t list)
> in the body of the Node module's definition.
Indeed, works like a charm.  You don't even need the coercion in the 
body of 'bold' for it to work.

This will hopefully allow me to remove some coercions from my own code, 
and use subtype annotations in the types.

Jacques


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-11 20:39 Troublesome nodes Dario Teixeira
  2008-07-11 21:20 ` [Caml-list] " Jeremy Yallop
  2008-07-11 23:11 ` Zheng Li
@ 2008-07-13 14:32 ` Dario Teixeira
  2008-07-13 17:39   ` Dario Teixeira
  2 siblings, 1 reply; 29+ messages in thread
From: Dario Teixeira @ 2008-07-13 14:32 UTC (permalink / raw)
  To: caml-list

Hi,

For the sake of future reference, I'm going to summarise the solution to
the original problem, arrived at thanks to the contribution of various
people in this list (your help was very much appreciated!).  This might
came in handy in the future if you run into a similar problem.

So, we have a document composed of a sequence of nodes.  There are four
different kinds of nodes, two of which are terminals (Text and See)
and two of which are defined recursively (Bold and Mref).  Both See and
Mref produce links, and we want to impose a constraint that no link node
shall be the immediate ancestor of another link node.  An additional
constraint is that nodes must be created via constructor functions.
Finally, the structure should be pattern-matchable, and the distinction
between link/nonlink nodes should be preserved so that Node-to-Node
functions do not require code duplication and/or ugly hacks.

Using conventional variants, the structure can be represented as follows.
Alas, this solution is not compatible with constructor functions and
requires unwanted scaffolding:

module Ast =
struct
    type super_node_t =
        | Nonlink_node of nonlink_node_t
        | Link_node of link_node_t
    and nonlink_node_t =
        | Text of string
        | Bold of super_node_t list
    and link_node_t =
        | Mref of string * nonlink_node_t list
        | See of string
end


Below is the solution that was finally obtained.  Nodes are represented
using polymorphic variants, and the module itself is recursive to get
around the "type not fully defined" problem:

module rec Node:
sig
    type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]

    val text: string -> nonlink_node_t
    val bold: [< super_node_t] list -> nonlink_node_t
    val see: string -> link_node_t
    val mref: string -> nonlink_node_t list -> link_node_t
end =
struct
    type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]

    let text txt = `Text txt
    let bold seq = `Bold (seq :> super_node_t list)
    let see ref = `See ref
    let mref ref seq = `Mref (ref, seq)
end


If you try it out, you will see that this solution enforces the basic
node constraints.  The values foo1-foo4 are valid, whereas the compiler
won't accept foo5, because a link node is the parent of another:

open Node
let foo1 = text "foo"
let foo2 = bold [text "foo"]
let foo3 = mref "ref" [foo1; foo2]
let foo4 = mref "ref" [bold [see "ref"]]
let foo5 = mref "ref" [see "ref"]


Now, suppose you need to create an Ast-to-Node sets of functions (a likely
scenario if you are building a parser for documents).  The solution is
fairly straightforward, but note the need to use the cast operator :>
to promote link/nonlink nodes to supernodes:

module Ast_to_Node =
struct
    let rec convert_nonlink_node = function
        | Ast.Text txt          -> Node.text txt
        | Ast.Bold seq          -> Node.bold (List.map convert_super_node seq)

    and convert_link_node = function
        | Ast.Mref (ref, seq)   -> Node.mref ref (List.map convert_nonlink_node seq)
        | Ast.See ref           -> Node.see ref

    and convert_super_node = function
        | Ast.Nonlink_node node -> (convert_nonlink_node node :> Node.super_node_t)
        | Ast.Link_node node    -> (convert_link_node node :> Node.super_node_t)
end


Finally, another common situation is to build functions to process nodes.
The example below is that of a node-to-node "identity" module.  Again, it
is fairly straightforward, but note the use of the operator # and the need
to cast link/nonlink nodes to supernodes:

module Node_to_Node =
struct
    open Node

    let rec convert_nonlink_node = function
        | `Text txt              -> text txt
        | `Bold seq              -> bold (List.map convert_super_node seq)

    and convert_link_node = function
        | `See ref               -> see ref
        | `Mref (ref, seq)       -> mref ref (List.map convert_nonlink_node seq)

    and convert_super_node = function
        | #nonlink_node_t as node -> (convert_nonlink_node node :> super_node_t)
        | #link_node_t as node    -> (convert_link_node node :> super_node_t)
end


Best regards,
Dario Teixeira



      __________________________________________________________
Not happy with your email address?.
Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-13 14:32 ` [Caml-list] " Dario Teixeira
@ 2008-07-13 17:39   ` Dario Teixeira
  2008-07-13 21:10     ` Jon Harrop
  0 siblings, 1 reply; 29+ messages in thread
From: Dario Teixeira @ 2008-07-13 17:39 UTC (permalink / raw)
  To: caml-list

Hi again,

Sorry, but in the meantime I came across two problems with the supposedly
ultimate solution I just posted.  I have a correction for one, but not
for the other.

The following statements trigger the first problem:

let foo1 = text "foo"
let foo2 = see "ref"
let foo3 = bold [foo1; foo2]

Error: This expression has type Node.link_node_t but is here used with type
         Node.nonlink_node_t
       These two variant types have no intersection

The solution that immediately comes to mind is to make the return types
for the constructor functions open:  (I can see no disadvantage with
this solution; please tell me if you find any)


module rec Node:
sig
    type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]

    val text: string -> [> nonlink_node_t]
    val bold: [< super_node_t] list -> [> nonlink_node_t]
    val see: string -> [> link_node_t]
    val mref: string -> nonlink_node_t list -> [> link_node_t]
end =
struct
    type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]

    let text txt = `Text txt
    let bold seq = `Bold (seq :> super_node_t list)
    let see ref = `See ref
    let mref ref seq = `Mref (ref, seq)
end


The second problem, while not a show-stopper, may open a hole for misuse of
the module, so I would rather get it fixed.  Basically, while the module
provides constructor functions to build nodes, nothing prevents the user
from bypassing them and constructing nodes manually.  The obvious solution
of declaring the types "private" results in an "This fixed type has no row
variable" error.  Any way around it?

Thanks once more for your time,
Dario Teixeira



      __________________________________________________________
Not happy with your email address?.
Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-13 17:39   ` Dario Teixeira
@ 2008-07-13 21:10     ` Jon Harrop
  2008-07-14 15:11       ` Dario Teixeira
  0 siblings, 1 reply; 29+ messages in thread
From: Jon Harrop @ 2008-07-13 21:10 UTC (permalink / raw)
  To: caml-list

On Sunday 13 July 2008 18:39:07 Dario Teixeira wrote:
> Hi again,
>
> Sorry, but in the meantime I came across two problems with the supposedly
> ultimate solution I just posted.  I have a correction for one, but not
> for the other.
>
> The following statements trigger the first problem:
>
> let foo1 = text "foo"
> let foo2 = see "ref"
> let foo3 = bold [foo1; foo2]
>
> Error: This expression has type Node.link_node_t but is here used with type
>          Node.nonlink_node_t
>        These two variant types have no intersection
>
> The solution that immediately comes to mind is to make the return types
> for the constructor functions open:  (I can see no disadvantage with
> this solution; please tell me if you find any)

I believe that is the correct solution. Sorry I didn't reach it sooner myself!

> module rec Node:
> sig
>     type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t
> list ] type link_node_t = [ `See of string | `Mref of string *
> nonlink_node_t list ] type super_node_t = [ nonlink_node_t | link_node_t ]
>
>     val text: string -> [> nonlink_node_t]
>     val bold: [< super_node_t] list -> [> nonlink_node_t]
>     val see: string -> [> link_node_t]
>     val mref: string -> nonlink_node_t list -> [> link_node_t]
> end =
> struct
>     type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t
> list ] type link_node_t = [ `See of string | `Mref of string *
> nonlink_node_t list ] type super_node_t = [ nonlink_node_t | link_node_t ]
>
>     let text txt = `Text txt
>     let bold seq = `Bold (seq :> super_node_t list)
>     let see ref = `See ref
>     let mref ref seq = `Mref (ref, seq)
> end
>
>
> The second problem, while not a show-stopper, may open a hole for misuse of
> the module, so I would rather get it fixed.  Basically, while the module
> provides constructor functions to build nodes, nothing prevents the user
> from bypassing them and constructing nodes manually.  The obvious solution
> of declaring the types "private" results in an "This fixed type has no row
> variable" error.  Any way around it?

I was going to suggest boxing every node in an ordinary variant type with a 
single private constructor:

  type 'a t = private Node of 'a constraint 'a = [< Node.super_node_t ]

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-13 21:10     ` Jon Harrop
@ 2008-07-14 15:11       ` Dario Teixeira
  2008-07-14 18:52         ` Dario Teixeira
  0 siblings, 1 reply; 29+ messages in thread
From: Dario Teixeira @ 2008-07-14 15:11 UTC (permalink / raw)
  To: caml-list, Jon Harrop

Hi,

> I was going to suggest boxing every node in an ordinary variant type with a
> single private constructor:
> 
>   type 'a t = private Node of 'a constraint 'a = [< Node.super_node_t ]

Thanks for the suggestion, Jon.  Boxing every node is actually very
welcome anyway, because later I would like to classify nodes into two
different classes, "Basic" and "Complex" (these are orthogonal to the
existing link/nonlink distinction), and the only way I see of doing this
is via some phantom type trickery (more on that in a later message).

Anyway, going back to your boxing suggestion, how can it actually be
applied to the signature of the Node module?  As you can see below, I
have define type 'a t (without the "Node" constructor -- is it really
necessary?), but what is the syntax for saying that the constructors
text, bold, see, and mref, return a kind of 'a t without breaking the
existing restrictions?

module rec Node:
sig
        type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
        type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
        type super_node_t = [ nonlink_node_t | link_node_t ]
        type 'a t = private 'a constraint 'a = [< super_node_t ]

        val text: string -> [> nonlink_node_t]
        val bold: [< super_node_t] list -> [> nonlink_node_t]
        val see: string -> [> link_node_t]
        val mref: string -> nonlink_node_t list -> [> link_node_t]
end =


Sorry for the stream of questions.  As you may have noticed, this
thread has gone deep into uncharted territory as far as the Ocaml
documentation goes.  In fact, I have the feeling the only existing
documentation for these subjects lies within the collective wisdom of
the people on this list...

Best regards,
Dario Teixeira



      __________________________________________________________
Not happy with your email address?.
Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-14 15:11       ` Dario Teixeira
@ 2008-07-14 18:52         ` Dario Teixeira
  2008-07-14 19:37           ` Jeremy Yallop
  0 siblings, 1 reply; 29+ messages in thread
From: Dario Teixeira @ 2008-07-14 18:52 UTC (permalink / raw)
  To: caml-list; +Cc: Jon Harrop

Hi,

After some trial and error in the top-level, I think I got the hang of the
'constraint' declaration (is there other documentation for it besides the
couple of lines in the Reference Manual?).  Note that besides the already
discussed link/nonlink distinction between nodes, there is now also an
orthogonal distinction between basic/complex nodes (all nodes except for
Mref are basic; any document that uses Mref thus becomes "contaminated"
and cannot use basic-only functions; the inverse, however, is not true).
I took advantage of Jon's suggestion of boxing all return types into a
single Node.t to implement this latter distinction using phantom types
(is there even an alternative technique?).

Now, the code below seems correct, and running "ocamlc -i node.ml" produces
the type signatures you would expect, and without error messages.  However,
running "ocamlc -c node.ml" fails with a "The type of this expression contains
type variables that cannot be generalized" error on the declaration of foo3.
Moreover, if you comment out the top-level declarations and try to compile
only the module declaration, you will also get different results whether you
run "ocamlc -i" or "ocamlc -c".  The former displays the module's signature
as expected, but the latter produces this error message:

 Type declarations do not match:
   type ('a, 'b) t = ('a, 'b) Node.t constraint 'a = [< super_node_t ]
 is not included in
   type ('a, 'b) t = private 'a constraint 'a = [< super_node_t ]


I find this behaviour a bit odd.  What is going on?

Best regards,
Dario Teixeira


module rec Node:
sig
    type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]
    type ('a, 'b) t = private 'a constraint 'a = [< super_node_t ]

    val text: string -> ([> nonlink_node_t], [> `Basic]) t
    val bold: ([< super_node_t], 'a) t list -> ([> nonlink_node_t], 'a) t
    val see: string -> ([> link_node_t], [> `Basic]) t
    val mref: string -> (nonlink_node_t, 'a) t list -> ([> link_node_t], [> `Complex]) t

    val make_basic: ([< super_node_t], [< `Basic]) t -> (super_node_t, [`Basic]) t
    val make_complex: ([< super_node_t], [< `Basic | `Complex]) t -> (super_node_t, [`Complex]) t
end =
struct
    type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]
    type ('a, 'b) t = 'a constraint 'a = [< super_node_t ]

    let text txt = `Text txt
    let bold seq = `Bold (seq :> super_node_t list)
    let see ref = `See ref
    let mref ref seq = `Mref (ref, seq)

    let make_basic n = (n :> (super_node_t, [`Basic]) t)
    let make_complex n = (n :> (super_node_t, [`Complex]) t)
end


open Node

let foo1 = text "ola"
let foo2 = bold [text "ola"]
let foo3 = mref "ref" [foo1; foo2]

let basic1 = make_basic foo1
let basic2 = make_basic foo2

let complex1 = make_complex foo1
let complex2 = make_complex foo2
let complex3 = make_complex foo3



      __________________________________________________________
Not happy with your email address?.
Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-14 18:52         ` Dario Teixeira
@ 2008-07-14 19:37           ` Jeremy Yallop
  2008-07-16 21:22             ` Dario Teixeira
  2008-07-17  0:43             ` Jacques Garrigue
  0 siblings, 2 replies; 29+ messages in thread
From: Jeremy Yallop @ 2008-07-14 19:37 UTC (permalink / raw)
  To: Dario Teixeira; +Cc: caml-list

Dario Teixeira wrote:
>     type ('a, 'b) t = private 'a constraint 'a = [< super_node_t ]

I don't think this is quite what you want yet, although it's getting
close!

The first problem is that phantom types must be implemented in terms
of abstract (or at least generative) types.  A simple example to
illustrate the problem: the types

     int t

and

     float t

denote the same type given the alias declaration

     type 'a t = unit

but different types given the abstract type declaration

     type 'a t

The clause "= private 'a" in your declaration above indicates that `t'
denotes an alias, not an abstract type.

The second problem is indicated by the error message you reported:

    "The type of this expression contains type variables that cannot be
     generalized"

Weak (ungeneralised) type variables are not allowed at module-level
bindings.  You'll see a similar error message if you try to compile a
module containing the following binding, for example

    let x = ref None

The solution to this problem is usually to change the form of the
offending expression, for example by eta-expanding a term denoting a
function.  In your case, though, the solution is to change the
declaration of the abstract type `t' to indicate that the type
parameters do not occur negatively in the right-hand side.  This takes
advantage of a novel feature of OCaml -- the "relaxed value
restriction" -- which uses a subtype-based analysis rather than a
simple syntactic check to determine whether bindings can be made
polymorphic.

The final type declaration, then, is

     type (+'a, +'b) t
           constraint 'a = [< super_node_t ]

in the signature and

     type (+'a, +'b) t = 'a
           constraint 'a = [< super_node_t ]

in the implementation.

Jeremy.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-14 19:37           ` Jeremy Yallop
@ 2008-07-16 21:22             ` Dario Teixeira
  2008-07-17  0:43             ` Jacques Garrigue
  1 sibling, 0 replies; 29+ messages in thread
From: Dario Teixeira @ 2008-07-16 21:22 UTC (permalink / raw)
  To: Jeremy Yallop; +Cc: caml-list

Hi,

> The first problem is that phantom types must be implemented in terms
> of abstract (or at least generative) types.  A simple example to
> illustrate the problem: the types
> 
>      int t
> and
>      float t
> 
> denote the same type given the alias declaration
> 
>      type 'a t = unit
> 
> but different types given the abstract type declaration
> 
>      type 'a t
> 
> The clause "= private 'a" in your declaration above indicates that `t'
> denotes an alias, not an abstract type.

What mislead me was a simple experiment I made where declaring 't' private
had the same effect as making it abstract for purposes of creating
module values (thus forcing you to constructor functions), while at
the same time keeping the 't' open for purposes of pattern-matching.
Suppose you have a simple module that prevents the user from mixing the
proverbial apples and oranges:

module Fruit:
sig
        type 'a t

        val apples: int -> [`Apples] t
        val oranges: int -> [`Oranges] t
        val (+): 'a t -> 'a t -> 'a t
end =
struct
        type 'a t = int

        let apples n = n
        let oranges n = n
        let (+) = (+)
end

open Fruit
let foo = apples 10
let bar = oranges 20
let sum = foo + bar  (* oops! *)


Now, if you un-abstract 't' by declaring "type 'a t = int" in the sig,
then the rogue unification makes the compiler accept the erroneous
top-level statements.  However, if instead you declare it as "type
'a t = private int", then rogue unification is prevented, but the type
is not completely hidden.  You can thus have your cake and eat it too.
(Please correct me if my interpretation of 'private' is off).


> Weak (ungeneralised) type variables are not allowed at module-level
> bindings.  You'll see a similar error message if you try to compile a
> module containing the following binding, for example
> 
>     let x = ref None
> 
> The solution to this problem is usually to change the form of the
> offending expression, for example by eta-expanding a term denoting a
> function.  In your case, though, the solution is to change the
> declaration of the abstract type `t' to indicate that the type
> parameters do not occur negatively in the right-hand side.  This takes
> advantage of a novel feature of OCaml -- the "relaxed value
> restriction" -- which uses a subtype-based analysis rather than a
> simple syntactic check to determine whether bindings can be made
> polymorphic.

Thanks for the explanation.  The solution I adopted was to use the
'+' covariance modifier (is that its name?) *only* for parameter 'a,
because parameter 'b should always be "terminated" via a make_basic or
make_complex function:  (there is no equivalent "termination" function
for parameter 'a because I want to allow documents to be composed from
other documents adlib).

type (+'a, 'b) t constraint 'a = [< super_node_t ]
val make_basic: ('a, [< `Basic]) t -> ('a, [`Basic]) t
val make_complex: ('a, [< `Basic | `Complex]) t -> ('a, [`Complex]) t

Also, I found it is alright to use ungeneralised type variables in
intermediate expressions, as long as the final expression generalises it:
(this is actually analogous to what happens when you declare "let x = ref
None"; it is fine as long as sooner or later you make the type concrete)

let doc =
	let n1 = bold [text "hello"] in		(* 'b is weak *)
	let n2 = mref "ref" [n1] in		(* 'b is still weak *)
	make_complex n2				(* 'b is now concrete *)

Cheers,
Dario Teixeira



      __________________________________________________________
Not happy with your email address?.
Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-14 19:37           ` Jeremy Yallop
  2008-07-16 21:22             ` Dario Teixeira
@ 2008-07-17  0:43             ` Jacques Garrigue
  2008-07-17 10:59               ` Jeremy Yallop
  2008-07-17 16:12               ` Dario Teixeira
  1 sibling, 2 replies; 29+ messages in thread
From: Jacques Garrigue @ 2008-07-17  0:43 UTC (permalink / raw)
  To: jeremy.yallop; +Cc: darioteixeira, caml-list

From: Jeremy Yallop <jeremy.yallop@ed.ac.uk>
> Dario Teixeira wrote:
> >     type ('a, 'b) t = private 'a constraint 'a = [< super_node_t ]
> 
> I don't think this is quite what you want yet, although it's getting
> close!
> 
> The first problem is that phantom types must be implemented in terms
> of abstract (or at least generative) types.

This is actually the other way round: abstract and private types allow
phantom types, but abbreviations and normal datatypes (generative
ones) don't.
So the above code really defines a phantom type.

Jacques Garrigue


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-17  0:43             ` Jacques Garrigue
@ 2008-07-17 10:59               ` Jeremy Yallop
  2008-07-18  2:34                 ` Jacques Garrigue
  2008-07-17 16:12               ` Dario Teixeira
  1 sibling, 1 reply; 29+ messages in thread
From: Jeremy Yallop @ 2008-07-17 10:59 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: darioteixeira, caml-list

Jacques Garrigue wrote:
> From: Jeremy Yallop <jeremy.yallop@ed.ac.uk>
>> Dario Teixeira wrote:
>>>     type ('a, 'b) t = private 'a constraint 'a = [< super_node_t ]
>> I don't think this is quite what you want yet, although it's getting
>> close!
>>
>> The first problem is that phantom types must be implemented in terms
>> of abstract (or at least generative) types.
> 
> This is actually the other way round: abstract and private types allow
> phantom types, but abbreviations and normal datatypes (generative
> ones) don't.

Thanks for the correction!  I haven't yet properly internalized private 
rows.  It seems that private rows allow phantom types because there is 
abstraction (and hence generativity) involved.

However, it seems to me that normal (generative) datatypes also allow 
phantom types.  If `t' is a unary generative datatype constructor  then 
`int t' and `unit t' are not unifiable, even if the type parameter does 
not occur on the rhs of the definition of `t'.  For example:

    type 'a t = T
    let f ((_ : int t) : unit t) = ()  (* Wrong. *)

> So the above code really defines a phantom type.

Yes.

On a related note, does Dario's declaration above become ambiguous with 
the introduction of private type abbreviations?  The following program
passes typechecking in 3.10.2, but not in 3.11+dev12, perhaps because 
`t' is interpreted as a private type abbreviation rather than as a 
private row type.

type super_node_t = [`S]

module M :
sig
   type ('a, 'b) t = private 'a constraint 'a = [< super_node_t ]
end =
struct
   type super_node_t = [`S]
   type ('a, 'b) t = 'a constraint 'a = [< super_node_t ]
end

let f x = (x : [<super_node_t] :> ([< super_node_t], _) M.t)

Jeremy.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-17  0:43             ` Jacques Garrigue
  2008-07-17 10:59               ` Jeremy Yallop
@ 2008-07-17 16:12               ` Dario Teixeira
  2008-07-18  2:27                 ` Jacques Garrigue
  1 sibling, 1 reply; 29+ messages in thread
From: Dario Teixeira @ 2008-07-17 16:12 UTC (permalink / raw)
  To: jeremy.yallop, Jacques Garrigue; +Cc: caml-list

Hi,

Thanks for the clarification, Jacques.  So I guess my initial interpretation
of 'private' was correct.  But is 'private' also applicable when a type
is declared using a constraint?  In my Node module, for example, type 't'
is declared abstract in the signature:

type (+'a, 'b) t constraint 'a = [< super_node_t ]

In the implementation, the type is declared as follows:

type (+'a, 'b) t = 'a constraint 'a = [< super_node_t ]

Is it possible in this case to make signature equal to the implementation
except for a 'private' declaration?  (Being able to pattern-match on values
of type 't' would be very handy, that is why I would prefer to use 'private'
instead of making the type fully abstract).

Note: I am running Ocaml 3.11+dev12.  Jeremy just sent a message where
he reports that the compiler behaviour in this matter changed between
3.10 and 3.11.

Thank you for your time,
Dario Teixeira



      __________________________________________________________
Not happy with your email address?.
Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-17 16:12               ` Dario Teixeira
@ 2008-07-18  2:27                 ` Jacques Garrigue
  2008-07-18 13:09                   ` Dario Teixeira
  0 siblings, 1 reply; 29+ messages in thread
From: Jacques Garrigue @ 2008-07-18  2:27 UTC (permalink / raw)
  To: darioteixeira; +Cc: caml-list

From: Dario Teixeira <darioteixeira@yahoo.com>

> Thanks for the clarification, Jacques.  So I guess my initial interpretation
> of 'private' was correct.  But is 'private' also applicable when a type
> is declared using a constraint?  In my Node module, for example, type 't'
> is declared abstract in the signature:
> 
> type (+'a, 'b) t constraint 'a = [< super_node_t ]
> 
> In the implementation, the type is declared as follows:
> 
> type (+'a, 'b) t = 'a constraint 'a = [< super_node_t ]
> 
> Is it possible in this case to make signature equal to the implementation
> except for a 'private' declaration?  (Being able to pattern-match on values
> of type 't' would be very handy, that is why I would prefer to use 'private'
> instead of making the type fully abstract).
> 
> Note: I am running Ocaml 3.11+dev12.  Jeremy just sent a message where
> he reports that the compiler behaviour in this matter changed between
> 3.10 and 3.11.

This is indeed possible in 3.11, due to the addition of private
abbreviations. Note however that the behaviour of private
abbreviations is not as transparent as private types or rows: the only
practical difference between an abstract type and a private
abbreviation is that you can explicitely coerce from the private
abbreviation to its definition. You cannot use pattern matching
directly without coercion.

So if you write

  type (+'a, 'b) t = private 'a constraint 'a = [< super_node_t ]

then you must write (x : (_,_) t :> [> ]) if you want to use pattern
matching on x.

Another side-effect of the addition of private abbreviations is that
now the distinction between private rows and private abbreviations is
syntactic. In particular

  type (+'a, 'b) t = private 'a constraint 'a = [< super_node_t ]

defined a private row in 3.10, but it is a private abbreviation in 3.11
(with of course a different semantics).
If you need the private row semantics, you should now write it as

  type (+a, 'b) t = private [< super_node_t] as 'a

which is more intuitive anyway.

Jacques Garrigue


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-17 10:59               ` Jeremy Yallop
@ 2008-07-18  2:34                 ` Jacques Garrigue
  2008-07-18  9:47                   ` Jeremy Yallop
  0 siblings, 1 reply; 29+ messages in thread
From: Jacques Garrigue @ 2008-07-18  2:34 UTC (permalink / raw)
  To: jeremy.yallop; +Cc: caml-list

From: Jeremy Yallop <jeremy.yallop@ed.ac.uk>

> Jacques Garrigue wrote:
> > From: Jeremy Yallop <jeremy.yallop@ed.ac.uk>
> >> Dario Teixeira wrote:
> >>>     type ('a, 'b) t = private 'a constraint 'a = [< super_node_t ]
> >> I don't think this is quite what you want yet, although it's getting
> >> close!
> >>
> >> The first problem is that phantom types must be implemented in terms
> >> of abstract (or at least generative) types.
> > 
> > This is actually the other way round: abstract and private types allow
> > phantom types, but abbreviations and normal datatypes (generative
> > ones) don't.
> 
> Thanks for the correction!  I haven't yet properly internalized private 
> rows.  It seems that private rows allow phantom types because there is 
> abstraction (and hence generativity) involved.
> 
> However, it seems to me that normal (generative) datatypes also allow 
> phantom types.  If `t' is a unary generative datatype constructor  then 
> `int t' and `unit t' are not unifiable, even if the type parameter does 
> not occur on the rhs of the definition of `t'.  For example:
> 
>     type 'a t = T
>     let f ((_ : int t) : unit t) = ()  (* Wrong. *)

However

      let f x = (x : int t :> unit t)  (* Ok *)

For datatypes, variance is inferred automatically, and if a variable
does not occur in the type you are allowed to change it arbitrarily
through subtyping.

> On a related note, does Dario's declaration above become ambiguous with 
> the introduction of private type abbreviations?  The following program
> passes typechecking in 3.10.2, but not in 3.11+dev12, perhaps because 
> `t' is interpreted as a private type abbreviation rather than as a 
> private row type.

I detailed the change in my answer to his mail. In 3.11 the
distinction is syntactic: only explicit variant or object types give
rise to private rows.
By the way, I forgot to mention that aliases (as keyword) of those
were not allowed until now, so I have to fix it to make the syntax for
private rows work. (There are useful examples using this syntax.)

  type ('a, 'b) t = private [< super_node_t ] as 'a

Cheers,

Jacques Garrigue


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-18  2:34                 ` Jacques Garrigue
@ 2008-07-18  9:47                   ` Jeremy Yallop
  2008-07-18 13:02                     ` Jacques Garrigue
  0 siblings, 1 reply; 29+ messages in thread
From: Jeremy Yallop @ 2008-07-18  9:47 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: caml-list

Jacques Garrigue wrote:
> From: Jeremy Yallop <jeremy.yallop@ed.ac.uk>
>> However, it seems to me that normal (generative) datatypes also allow 
>> phantom types.  If `t' is a unary generative datatype constructor  then 
>> `int t' and `unit t' are not unifiable, even if the type parameter does 
>> not occur on the rhs of the definition of `t'.  For example:
>>
>>     type 'a t = T
>>     let f ((_ : int t) : unit t) = ()  (* Wrong. *)
> 
> However
> 
>       let f x = (x : int t :> unit t)  (* Ok *)
> 
> For datatypes, variance is inferred automatically, and if a variable
> does not occur in the type you are allowed to change it arbitrarily
> through subtyping.

Thanks for the explanation.  I agree that this means that datatypes are 
not useful for phantom types.

I don't yet understand exactly how private abbreviations are supposed to 
work.  The currently implemented behaviour doesn't accomplish what I 
understand to be the intention.  For example, given

    module Nat :
    sig
      type t = private int
      val z : t
      val s : t -> t
    end =
    struct
      type t = int
      let z = 0
      let s = (+) 1
    end

we can write

    # (((Nat.z : Nat.t :> int) - 1) :> Nat.t);;
    - : Nat.t = -1

Similarly, I'm surprised by the behaviour of coercions at parameterised 
types.  Given a private type

      type 'a t = private 'a

should we be able to coerce a value of type 'a t to type 'a?

Jeremy.


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-18  9:47                   ` Jeremy Yallop
@ 2008-07-18 13:02                     ` Jacques Garrigue
  2008-07-18 13:55                       ` Jacques Garrigue
  0 siblings, 1 reply; 29+ messages in thread
From: Jacques Garrigue @ 2008-07-18 13:02 UTC (permalink / raw)
  To: jeremy.yallop; +Cc: caml-list

From: Jeremy Yallop <jeremy.yallop@ed.ac.uk>

> I don't yet understand exactly how private abbreviations are supposed to 
> work.  The currently implemented behaviour doesn't accomplish what I 
> understand to be the intention.  For example, given
> 
>     module Nat :
>     sig
>       type t = private int
>       val z : t
>       val s : t -> t
>     end =
>     struct
>       type t = int
>       let z = 0
>       let s = (+) 1
>     end
> 
> we can write
> 
>     # (((Nat.z : Nat.t :> int) - 1) :> Nat.t);;
>     - : Nat.t = -1

It looks like you've found a serious bug in the current
implementation. Actually you don't even need a coercion:

  # (((Nat.z : Nat.t :> int) - 1) : Nat.t);;
  - : Nat.t = -1

This should certainly not be accepted, as you can see by writing

# (((Nat.z : Nat.t :> int) - 1) : int :> Nat.t);;  (* fails *)

I think I know the reason, which is probably related to the way
abbreviation expansions are cached, but this needs more investigating.

Jacques Garrigue


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-18  2:27                 ` Jacques Garrigue
@ 2008-07-18 13:09                   ` Dario Teixeira
  2008-07-18 17:36                     ` Dario Teixeira
  0 siblings, 1 reply; 29+ messages in thread
From: Dario Teixeira @ 2008-07-18 13:09 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: caml-list

Hi,

Thanks once more for your help, Jacques.  I had no idea this problem would
prove to be such a challenge!


> So if you write
>
>  type (+'a, 'b) t = private 'a constraint 'a = [< super_node_t ]
>
> then you must write (x : (_,_) t :> [> ]) if you want to use pattern
> matching on x.


The 'private' declaration you suggested does work on some simpler examples,
but the 3.11 compiler won't accept it for the Node module.  This is the code:

module rec Node:
sig
        type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
        type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
        type super_node_t = [ nonlink_node_t | link_node_t ]
        type (+'a, 'b) t = private 'a constraint 'a = [< super_node_t ]

        val text: string -> ([> nonlink_node_t], [> `Basic]) t
        val bold: ([< super_node_t], 'a) t list -> ([> nonlink_node_t], 'a) t
        val see: string -> ([> link_node_t], [> `Basic]) t
        val mref: string -> (nonlink_node_t, 'a) t list -> ([> link_node_t], [> `Complex]) t

        val make_basic: ('a, [< `Basic]) t -> ('a, [`Basic]) t
        val make_complex: ('a, [< `Basic | `Complex]) t -> ('a, [`Complex]) t
end =
struct
        type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
        type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
        type super_node_t = [ nonlink_node_t | link_node_t ]
        type (+'a, 'b) t = 'a constraint 'a = [< super_node_t ]

        let text txt = `Text txt
        let bold seq = `Bold (seq :> super_node_t list)
        let see ref = `See ref
        let mref ref seq = `Mref (ref, seq)

        let make_basic n = n
        let make_complex n = n
end

And this is the error:

 Type declarations do not match:
   type (+'a, 'b) t = ('a, 'b) Node.t constraint 'a = [< super_node_t ]
 is not included in
   type (+'a, 'b) t = private 'a constraint 'a = [< super_node_t ]


> By the way, I forgot to mention that aliases (as keyword) of those
> were not allowed until now, so I have to fix it to make the syntax for
> private rows work. (There are useful examples using this syntax.)
>
>  type ('a, 'b) t = private [< super_node_t ] as 'a

With 3.11+dev12, the above declaration produces a "Unbound type parameter .."
error.  When you say you have to fix it, do you mean this will actually be
a valid declaration in 3.11 final?  And will this solution allow for easier
pattern-matching of Node values, without the need for coercion?

Speaking of coercion, the exact syntax for it in combination with '#'
patterns elludes me.  Take for example a simple, "identity", Node-to-Node
module listed below.  How can coercion be applied?


module Node_to_Node =
struct
    open Node

    let rec convert_nonlink_node = function
        | `Text txt                     -> text txt
        | `Bold inl                     -> bold (List.map convert_super_node inl)

    and convert_link_node = function
        | `Mref (ref, inl)              -> mref ref (List.map convert_nonlink_node inl)
        | `See ref                      -> see ref

    and convert_super_node = function
        | #nonlink_node_t as node       -> (convert_nonlink_node node : ('a, 'b) t :> (super_node_t, 'b) t)
        | #link_node_t as node          -> convert_link_node node
end


Best regards,
Dario Teixeira



      __________________________________________________________
Not happy with your email address?.
Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-18 13:02                     ` Jacques Garrigue
@ 2008-07-18 13:55                       ` Jacques Garrigue
  2008-07-19  2:15                         ` Jacques Garrigue
  0 siblings, 1 reply; 29+ messages in thread
From: Jacques Garrigue @ 2008-07-18 13:55 UTC (permalink / raw)
  To: jeremy.yallop; +Cc: caml-list

> It looks like you've found a serious bug in the current
> implementation. Actually you don't even need a coercion:
> 
>   # (((Nat.z : Nat.t :> int) - 1) : Nat.t);;
>   - : Nat.t = -1
> 
> This should certainly not be accepted, as you can see by writing
> 
> # (((Nat.z : Nat.t :> int) - 1) : int :> Nat.t);;  (* fails *)
> 
> I think I know the reason, which is probably related to the way
> abbreviation expansions are cached, but this needs more investigating.

This is now fixed. I just hope this does not break anything...
It should be ok, but please report any new non-termination problem
immediately :-)

Cheers,

Jacques Garrigue


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-18 13:09                   ` Dario Teixeira
@ 2008-07-18 17:36                     ` Dario Teixeira
  2008-07-19  2:23                       ` Jacques Garrigue
  0 siblings, 1 reply; 29+ messages in thread
From: Dario Teixeira @ 2008-07-18 17:36 UTC (permalink / raw)
  To: caml-list

Hi,

I would like to add one extra, odd, tidbit.  On a simple example, 'private'
will only work if a dummy constructor is added to the type.  Thus, this will
not work:

module rec Node:
sig
        (* ... *)
        type (+'a, 'b) t = private 'a constraint 'a = [< super_node_t]
        (* ... *)
end =
struct
        (* ... *)
        type (+'a, 'b) t = 'a constraint 'a = [< super_node_t]
        (* ... *)
end =


But this will:

module rec Node:
sig
        (* ... *)
        type (+'a, 'b) t = private Dummy of 'a constraint 'a = [< super_node_t]
        (* ... *)
end =
struct
        (* ... *)
        type (+'a, 'b) t = Dummy of 'a constraint 'a = [< super_node_t]
        (* ... *)
end =


Is this behaviour correct?  (I'm running 3.11+dev12)

Cheers,
Dario Teixeira



      __________________________________________________________
Not happy with your email address?.
Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-18 13:55                       ` Jacques Garrigue
@ 2008-07-19  2:15                         ` Jacques Garrigue
  0 siblings, 0 replies; 29+ messages in thread
From: Jacques Garrigue @ 2008-07-19  2:15 UTC (permalink / raw)
  To: jeremy.yallop; +Cc: caml-list

From: Jacques Garrigue <garrigue@math.nagoya-u.ac.jp>

> This is now fixed. I just hope this does not break anything...
> It should be ok, but please report any new non-termination problem
> immediately :-)

OK, my first solution was not correct, and caused infinite loops with

  type t = <a:u; b:int> and u = private t
  let f x = (x : t :> <a:'a; b:int> as 'a)

This is now fixed by distinguishing public and private expansions in
the expansion cache. This time I see no side-effect.

Jacques Garrigue


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-18 17:36                     ` Dario Teixeira
@ 2008-07-19  2:23                       ` Jacques Garrigue
  2008-07-19  8:43                         ` Dario Teixeira
  0 siblings, 1 reply; 29+ messages in thread
From: Jacques Garrigue @ 2008-07-19  2:23 UTC (permalink / raw)
  To: darioteixeira; +Cc: caml-list

From: Dario Teixeira <darioteixeira@yahoo.com>

> I would like to add one extra, odd, tidbit.  On a simple example, 'private'
> will only work if a dummy constructor is added to the type.  Thus, this will
> not work:
> 
> module rec Node:
> sig
>         (* ... *)
>         type (+'a, 'b) t = private 'a constraint 'a = [< super_node_t]
>         (* ... *)
> end =
> struct
>         (* ... *)
>         type (+'a, 'b) t = 'a constraint 'a = [< super_node_t]
>         (* ... *)
> end

I don't understand your counter-example. In typechecks without any
problem, even in older versions. May you did cut it down too much?

> But this will:
> 
> module rec Node:
> sig
>         (* ... *)
>         type (+'a, 'b) t = private Dummy of 'a constraint 'a = [< super_node_t]
>         (* ... *)
> end =
> struct
>         (* ... *)
>         type (+'a, 'b) t = Dummy of 'a constraint 'a = [< super_node_t]
>         (* ... *)
> end =

You Dummy creates a datatype, so this is a completely different story.
Recursion is freely allowed in datatype definitions, while there are
restrictions for abbreviations.

> Is this behaviour correct?  (I'm running 3.11+dev12)

I cannot tell you, since I couldn't reproduce your problem.

By the way, I somehow get the feeling that your solution is
over-engineered. You shouldn't need all that many constraints.
I you could find a way to use private rows rather than private
abbreviations, your code might be much simpler.
But I'm sorry I couldn't follow the whole discussion on your problem.
I you could summarize it shortly, with the basic code (without phantom
types) and the invariants you are trying to enforce, I might try to
look into it.

Jacques Garrigue


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Caml-list] Troublesome nodes
  2008-07-19  2:23                       ` Jacques Garrigue
@ 2008-07-19  8:43                         ` Dario Teixeira
  0 siblings, 0 replies; 29+ messages in thread
From: Dario Teixeira @ 2008-07-19  8:43 UTC (permalink / raw)
  To: Jacques Garrigue; +Cc: caml-list

Hi,

> I don't understand your counter-example. In typechecks without any
> problem, even in older versions. May you did cut it down too much?

Perhaps... ;-)  Here is the full code, producing an error in version
"3.11+dev12 Private_abbrevs+natdynlink (2008-02-29)", compiled via GODI:

module rec Node:
sig
    type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]
    type (+'a, 'b) t = private 'a constraint 'a = [< super_node_t ]

    val text: string -> ([> nonlink_node_t], [> `Basic]) t
    val bold: ([< super_node_t], 'a) t list -> ([> nonlink_node_t], 'a) t
    val see: string -> ([> link_node_t], [> `Basic]) t
    val mref: string -> (nonlink_node_t, 'a) t list -> ([> link_node_t], [> `Complex]) t

    val make_basic: ('a, [< `Basic]) t list -> ('a, [`Basic]) t list
    val make_complex: ('a, [< `Basic | `Complex]) t list -> ('a, [`Complex]) t list
end =
struct
    type nonlink_node_t = [ `Text of string | `Bold of Node.super_node_t list ]
    type link_node_t = [ `See of string | `Mref of string * nonlink_node_t list ]
    type super_node_t = [ nonlink_node_t | link_node_t ]
    type (+'a, 'b) t = 'a constraint 'a = [< super_node_t ]

    let text txt = `Text txt
    let bold seq = `Bold (seq :> (super_node_t, 'a) t list)
    let see ref = `See ref
    let mref ref seq = `Mref (ref, seq)

    let make_basic n = n
    let make_complex n = n
end


> By the way, I somehow get the feeling that your solution is
> over-engineered. You shouldn't need all that many constraints.
> I you could find a way to use private rows rather than private
> abbreviations, your code might be much simpler.
> But I'm sorry I couldn't follow the whole discussion on your problem.
> I you could summarize it shortly, with the basic code (without phantom
> types) and the invariants you are trying to enforce, I might try to
> look into it.

Thanks Jacques, I would really appreciate it.  I'm stuck at this point, so
any help is welcome.  Moreover, this is in fact a neat problem, so any Ocaml
hacker should be able to enjoy it.  Here follows a complete description of
the problem, unprejudiced by the current solution.

I have two types of documents, "basic" and "complex".  Both are composed
of a list of nodes.  The difference between them is that complex documents
may use a superset of the kinds of nodes available for basic documents.
Namely, while basic documents may only use "Text", "Bold", and "See" nodes,
complex documents may also use "Mref" nodes in addition to those three.

In total, there are therefore only four different kinds of nodes.  Two of
those, "Text" and "See" are terminal nodes; the other two, "Bold" and "Mref"
are defined via recursion.  There is one additional restriction: "See" and
"Mref" produce links, and links may not be the immediate ancestors of other
links (note that since "See" is a terminal node and cannot therefore be a
parent, in practice this rule is identical to saying that "Mref" may not
the parent of "See").

Disregarding the basic/complex distinction, nodes could be defined as follows:

type super_node_t =
    | Nonlink_node of nonlink_node_t
    | Link_node of link_node_t
and nonlink_node_t =
    | Text of string
    | Bold of super_node_t list
and link_node_t =
    | Mref of string * nonlink_node_t list
    | See of string


The table below summarises the properties of each node:

Node:	Allowed for:	Definition:	Produces link?
------------------------------------------------------
Text	basic/complex	terminal	no
Bold	basic/complex	rec		no
See	basic/complex	terminal	yes
Mref	complex		rec		yes


There is one final, important, requirement.  I want to be able to
pattern-match on nodes from the outside of the module.  To build a
Node-to-Node function, for example.

To conclude, I'll leave you with some sample documents.  Some are valid,
but others are not.  The compiler should catch the latter.

(* valid *)
let doc1 = make_basic
		[
		text "foo";
		bold [text "bar"];
		see "ref"
		]


(* valid *)
let doc2 = make_complex
		[
		text "foo";
		bold [text "bar"];
		see "ref"
		]


(* valid *)
let doc3 = make_complex
		[
		text "foo";
		mref "ref" [text "bar"]
		]


(* Invalid because a link node is the parent of another *)
let doc4 = make_complex
		[
		mref "ref" [see "foo"]
		]


(* Invalid because basic documents do not allow Mref nodes *)
let doc5 = make_basic
		[
		text "foo";
		mref "ref" [text "bar"]
		]


Best regards,
Dario Teixeira



      __________________________________________________________
Not happy with your email address?.
Get the one you really want - millions of new email addresses available now at Yahoo! http://uk.docs.yahoo.com/ymail/new.html


^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2008-07-19  8:43 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-07-11 20:39 Troublesome nodes Dario Teixeira
2008-07-11 21:20 ` [Caml-list] " Jeremy Yallop
2008-07-12 12:37   ` Dario Teixeira
2008-07-12 13:25     ` Jacques Carette
2008-07-12 16:44     ` Wolfgang Lux
2008-07-12 18:21       ` Dario Teixeira
2008-07-12 18:27         ` Jeremy Yallop
2008-07-12 18:58       ` Jacques Carette
2008-07-11 23:11 ` Zheng Li
2008-07-13 14:32 ` [Caml-list] " Dario Teixeira
2008-07-13 17:39   ` Dario Teixeira
2008-07-13 21:10     ` Jon Harrop
2008-07-14 15:11       ` Dario Teixeira
2008-07-14 18:52         ` Dario Teixeira
2008-07-14 19:37           ` Jeremy Yallop
2008-07-16 21:22             ` Dario Teixeira
2008-07-17  0:43             ` Jacques Garrigue
2008-07-17 10:59               ` Jeremy Yallop
2008-07-18  2:34                 ` Jacques Garrigue
2008-07-18  9:47                   ` Jeremy Yallop
2008-07-18 13:02                     ` Jacques Garrigue
2008-07-18 13:55                       ` Jacques Garrigue
2008-07-19  2:15                         ` Jacques Garrigue
2008-07-17 16:12               ` Dario Teixeira
2008-07-18  2:27                 ` Jacques Garrigue
2008-07-18 13:09                   ` Dario Teixeira
2008-07-18 17:36                     ` Dario Teixeira
2008-07-19  2:23                       ` Jacques Garrigue
2008-07-19  8:43                         ` Dario Teixeira

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).