caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Re: new library modules (sets and maps)
@ 1993-05-05 15:40 Didier Remy
  1993-05-06  0:37 ` Xavier Leroy
  0 siblings, 1 reply; 3+ messages in thread
From: Didier Remy @ 1993-05-05 15:40 UTC (permalink / raw)
  To: xavier; +Cc: caml-list


Of  course, the typechecker cannot tell  whether you are using two different
orderings on the same map. If you pack the ordering with the primitives in a
record, you could also associate a stamp to the package (each timem you pass
a different  ordering) which  you be  used to  mark all sets created by this
package.   Then  you could  dynamically  check  whether  two  sets sets  are
originated from the same package.  Except that you cannot generate stamps in
a purely applicative style!

Using an object oriented style, you could consider the functions on  sets as
methods, and pack them with each set (instead of the stamp).  The primitives
then take just call the  right methods from the set they  operate  on, which
prevent from  using  two  differnt ordering on the same set.  In fact it  is
enough to pack only the ordering "method".

The interface would still be:

        type 'a t;;

        type 'a ordering == 'a -> 'a -> int;;

        type ('a, 'b) set_operations =
          { empty: 'a t;
            is_empty: 'a t -> bool;
            mem: 'a -> 'a t -> bool;
            add: 'a -> 'a t -> 'a t;
            iter: ('a -> 'b) -> 'a t -> unit;
            (* and so on *) };;

        value make: 'a ordering -> ('a, 'b) set_operations;;

but the implementation of 'a t is now

         type 'a t = {ordering : 'a -> 'a -> 'int;
                      val : 'a old_t};;

and  the  implementation of the functions would changed accordingly  to take
the ordering from their argument rather than from their closure.  Of course,
there is an ambiguity for functions working on  several sets.  The functions
could be written to work with  two orderings, one for each set, but this may
complicate their implementation  or slower their execution, for a  situation
that should not arise in practice...

    Didier.




^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: new library modules (sets and maps)
  1993-05-05 15:40 new library modules (sets and maps) Didier Remy
@ 1993-05-06  0:37 ` Xavier Leroy
  1993-05-06  8:39   ` Benjamin Pierce
  0 siblings, 1 reply; 3+ messages in thread
From: Xavier Leroy @ 1993-05-06  0:37 UTC (permalink / raw)
  To: Didier Remy; +Cc: caml-list

Didier Remy writes:

> Of  course, the typechecker cannot tell  whether you are using two different
> orderings on the same map.

Yes, it can. With a decent type abstraction mechanism, as in Quest or SML.
Come on, let's be honest and tell the truth. This is the one example
where I really miss a more powerful module system than Caml Light's.

> Then  you could  dynamically  check  whether  two  sets sets  are
> originated from the same package.  Except that you cannot generate stamps in
> a purely applicative style!

We're programming in ML. Who cares about purely applicative style?
Anyway, dynamic checks are evil, generally speaking, and in the
particular case of sets and maps, speed is absolutely critical.
Otherwise, we could just as well stick to the naive implementation of
sets as lists without duplicates.

> Using an object oriented style, you could consider the functions on  sets as
> methods, and pack them with each set (instead of the stamp).

Please, no object-oriented nonsense. This is a matter of type
abstraction. I strongly doubt objects will help clarify anything here.

More to the point, the solution to which you finally arrive (storing
the ordering function in each set, and give it as argument to
set__empty), which is also the solution proposed by Damien Doligez,
solves the problem for functions that take only one set argument, but
fail on functions such as "union", which take two. In general you
can't decide whether the two orderings are the same, though a == test
would probably work fine in practice. Also, storing the ordering at
the top of the set makes all set operations slightly more expensive
(all set constructions involve an extra "cons"). Storing the ordering
in the closures of the operations is cheaper.

> Of course,
> there is an ambiguity for functions working on  several sets.  The functions
> could be written to work with  two orderings, one for each set

I take this situation to be a serious programming error. Programmers
who dare to order one type in two different ways, then mix sets
inconsistently deserve to burn in hell. Besides, there is no easy way
to handle this situation with balanced tree implementations, you
basically have to sort and rebuild entirely one of the trees. 

- Xavier




^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: new library modules (sets and maps)
  1993-05-06  0:37 ` Xavier Leroy
@ 1993-05-06  8:39   ` Benjamin Pierce
  0 siblings, 0 replies; 3+ messages in thread
From: Benjamin Pierce @ 1993-05-06  8:39 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: Didier Remy, caml-list

Didier and Xavier write:
> > Using an object oriented style, you could consider the functions on  sets as
> > methods, and pack them with each set (instead of the stamp).
> 
> Please, no object-oriented nonsense. This is a matter of type
> abstraction. I strongly doubt objects will help clarify anything here.
> 
> More to the point, the solution to which you finally arrive (storing
> the ordering function in each set, and give it as argument to
> set__empty), which is also the solution proposed by Damien Doligez,
> solves the problem for functions that take only one set argument, but
> fail on functions such as "union", which take two. In general you
> can't decide whether the two orderings are the same ....

While I agree with Xavier's other arguments about why an
object-oriented approach is probably to be avoided here, it's worth
remarking that a sufficiently powerful type system for objects can
also handle cases like set union.

Let Object be a type constructor that maps a description of an object
interface (a vector of method types) to a description of the set of
objects with this interface.  For example

    SetI = Fun(Rep) { addElement: Rep->Int->Rep,
                      member: Rep->Int->Bool }
               
describes the interface of simple set-of-integer objects.  (The
reasons for the lambda-abstraction over the type variable Rep are
technical, but they don't matter here; what's important is just the
intuition that each of the methods implements a certain kind of
transformation on a set-object's internal state: addElelement
transforms an internal state and an integer into a new internal state
and member maps an internal state and an integer into a boolean.)
Then 

    Set = Object(SetI)

is the type of integer-set objects.  

Now, a particular implementation of some set-objects with this type
may be based on a binary tree representation and may have --
internally -- a particular comparison function, but there is no way to
make use of these facts to implement functions like union, because
these aspects of the objects are not accessible through the public
interface.  So let's make them accessible:

    SetI' = Fun(Rep) { addElement: Rep->Int->Rep,
                       member: Rep->Int->Bool,
                       compare: Int->Int->Bool,
                       asTree: Rep->IntTree }

Now if

    Set' = Object(SetI')

then we can easily implement functions like union on elements of Set'.
But there are two problems:
  1) The asTree function is not somehing we want in the interface of
     our sets, since we want to be free to reimplement them using some
     different representation.
  2) A function that takes two sets are arguments cannot be sure that
     they were created with the same comparison function.

Fortunately, we can solve both of these problems at the same time
using a simple trick.

Informally, 

    SetI' < SetI
    
(the interface of Set' objects is richer than that of Set objects) and

    Object(SetI') < Object(SetI).  

[This is also true formally in a certain system, but that doesn't
concern us here].  

Now, for each comparison function, we want to build a type of sets
based on that comparison function, where the only operations that show
in the interface of these sets are addElement and member.  In other
words, we want a function

    buildSetPackage : (Int->Int->Bool) -> SetPackage

where SetPackage is an existential type -- an implementation of an
abstract data type:

    SetPackage = Some(SetI' < SetI) {
                   emptySet : Object(SetI'),
                   union : Object(SetI') -> Object(SetI') -> Object(SetI')
                   }

>From the fact that the hidden representation type SetI' is declared to
be smaller than SetI (this is a "partially abstract type," in Cardelli
and Wegner's terminology), it follows that we can send the messages
addElement and member to instances of Object(SetI') just as if they
were instances of Object(SetI).  But the union operation must be
applied to two arguments with the same hidden SetI', and the only way
to obtain a new object with this hidden type is to use the emptySet
provided by this package, so the implementation of union is justified
in assuming that the comparison functions of its two arguments are
identical.  (Indeed, the comparison function need not be stored with
the sets any more: it can be stored in the package itself, which
brings us essentially full-circle.)


Well, there it is.  The punchline, in the end, is something like this: 

  1) for things like set union and comparison, what you want is a
     powerful system of abstract data types -- a higher-order module
     system -- as Xavier suggested, and

  2) such a module system interacts smoothly with a view of sets as
     objects.  (In fact, you can even "inherit" things like union
     functions, but that's a topic for another discussion!)

Cheers,

        Benjamin

P.S.  Details of this construction are in a forthcoming tech report by
David N. Turner and myself.  If anybody's interested I can send you a
postscript file. 





^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~1993-05-06  8:55 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1993-05-05 15:40 new library modules (sets and maps) Didier Remy
1993-05-06  0:37 ` Xavier Leroy
1993-05-06  8:39   ` Benjamin Pierce

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).