Re: new library modules

caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed

* Re: new library modules
@ 1993-05-05  9:53 Damien Doligez
  0 siblings, 0 replies; 7+ messages in thread
From: Damien Doligez @ 1993-05-05  9:53 UTC (permalink / raw)
  To: caml-list

Xavier Leroy writes :

>I guess you mean (in the case of "set"):
[...]
>        type 'a ordering == 'a -> 'a -> int;;
>
>        type ('a, 'b) set_operations =
>          { empty: 'a t;
>            is_empty: 'a t -> bool;
>            mem: 'a -> 'a t -> bool;
>            add: 'a -> 'a t -> 'a t;
>            iter: ('a -> 'b) -> 'a t -> unit;
>            (* and so on *) };;
>
>        value make: 'a ordering -> ('a, 'b) set_operations;;

This is not what I meant.  What I meant was :

  type 'a t;;
  type 'a ordering == 'a -> 'a -> int;;

  value make : 'a ordering -> 'a t
  and   is_empty : 'a t -> bool
  and so on ...

and in the implementation :

  type 'a t = { my_order: 'a ordering; data: (* internal representation *) };;

This has the advantage that you cannot use the wrong ordering when working
with a given set, and that you do not have to use a type annotation when
creating the set (with the "make" function).

Damien Doligez




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: new library modules
@ 1993-05-07 10:29 cousinea
  0 siblings, 0 replies; 7+ messages in thread
From: cousinea @ 1993-05-07 10:29 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: caml-list

As I said, I have also implemented balanced trees and used them
to manipulate what Xavier calls sets and maps .
It is certainly less efficient that Xavier's implementation
since it is a purely functional one
but I have nonetheless used them quite extensively,  building
balanced trees of size over 2^20.

I am not sure my choices have been the right ones
but here is a short description.
My main functions over balanced trees are:

empty: 'a t
add: ('a*'a -> comparison) -> option -> 'a -> 'a t -> 'a t
exists: ('a -> comparison) -> 'a t  -> bool
get: ('a -> comparison) -> 'a t  -> 'a
get_all: ('a -> comparison) -> 'a t  -> 'a list
remove: ('a -> comparison) -> 'a t  -> 'a t
remove_all: ('a -> comparison) -> 'a t  -> 'a t
................

Values of type ('a*'a -> comparison) are supposed to be
total preorders and it is user's responsibality to use
them consistently.
type option = Take_Old | Take_New | Take_Both
 is a type that specifies what should be done
when adding a value to a tree that contains an equivalent one.

Note that functions "get" and "exists" use an argument 

of type ('a -> comparison)
which is similar to giving both an argument "ord" of type
('a*'a -> comparison) and an argument "x" of type 'a as Xavier
does for function "mem". This choice takes into account
the fact that, when using a preorder and not an order,
it is sometimes useful to search for objects that have
a more specific property that just identity or equivalence
to a given object.
This goes against Damien's suggestion to encapsulate the preorder
in the type which I approve for sets and maps  (by the  way,
the proposal by Laufer and Odersky should give a solution
for that).  

Xavier's maps are easily obtained from my balanced trees:
type ('a,'b) map == ('a,'b) t;;
let add_map ord m x y= add ord Take_New (x,y) m;;
let find_map ord m x = snd (get (fun y -> ord x (fst y)) m);;
........

Note that here, the order on type 'a  induces a preorder on type
('a * 'b) which is used for the search.  This is another argument
for the importance of preorders.

  ---Guy

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: new library modules
@ 1993-05-07 10:23 cousinea
  0 siblings, 0 replies; 7+ messages in thread
From: cousinea @ 1993-05-07 10:23 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: caml-list

This is an answer to Xavier which asked me to give
examples of preorders on stored objects.

I have recently writen programs for solving puzzles.  By "puzzle"
I mean one-player games in which you start with a given
configuration and try to reach  a configuration that has a given
property.
A puzzle is therefore defined by three values:

start : 'a
possible_moves:  'a -> 'a list
accept:  'a -> bool

Searching for a solution amounts to exploring the connex component
of "start" in the graph defined  by function "possible_moves" and
it usually requires to memorize the configurations in order to avoid
redundant search.
In many puzzles there is a natural equivalence between configurations
(for instance symetries or permutations of equivalent objects). 

Therefore,  it is natural  (and often necessary for efficiency reasons)
to save configurations up to that equivalence. 

Of course, in order to save and retrieve the configurations
efficiently, it is necessary to find an strict order relation
compatible with this equivalence and I must admit that, very often,
this is obtained by first defining a canonical representation for
objects of an equivalent class and then using a total order on the
canonical forms.
However, even in that case, it may be more informative to save the
"real" configurations that have been found and not their canonical
form in order to end with a configuration data base that reflects
the partial spannning tree that has been built to reach a solution.

Therefore, I appreciate having the possibility to use a preorder
rather than an order in the archive mechanism.

  --Guy 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: new library modules
@ 1993-05-04  9:14 cousinea
  1993-05-04  9:38 ` Vale'rie Me'nissier-Morain
  1993-05-05  2:25 ` Xavier Leroy
  0 siblings, 2 replies; 7+ messages in thread
From: cousinea @ 1993-05-04  9:14 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: caml-list

The sets that you  propose are not only totally ordered sets
but rather totally pre-ordered  and I think that it is indeed important
for  many applications.  It is very often the case that you have
to deal with a set of complex objects  a part which is used for
the pre-order relation.
In that case, you will probably suggest to use a "map" but  "maps"
do not allow for multiple bindings.

 I have myself written a library for balanced trees and used it for
 several applications. I have found that, when dealing with equivalent
 elements, it is sometimes useful
     -  to allow equivalent elements to occur simultaneously in the tree.
    -  to  allow the user to precise, when using function "add" ,  the treatment
       he/she wants to be performed on equivalent elements (keeping both,
       keeping the old one, replacing the old one by the new one)

I am conscious that this goes beyond "set" operations but you get
these possibilities almost  for free when you have balanced trees
based on a pre-order relation.
Why not let us have access directly to your balanced trees package?

Also:
Rather than using type "int" for comparison, it would be clearer
to use an explicit type  comparison= Smaller | Equiv | Greater.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* new library modules
  1993-05-04  9:14 cousinea
@ 1993-05-04  9:38 ` Vale'rie Me'nissier-Morain
  1993-05-05  2:25 ` Xavier Leroy
  1 sibling, 0 replies; 7+ messages in thread
From: Vale'rie Me'nissier-Morain @ 1993-05-04  9:38 UTC (permalink / raw)
  To: cousinea; +Cc: caml-list

Just a remark about your last point: 

> Rather than using type "int" for comparison, it would be clearer
> to use an explicit type comparison = Smaller | Equiv | Greater.

While implementing arithmetic I have some practice of comparison
and sign functions and at the beginning I think like you that a 3
values boolean type will be a good feature, more precise than type
"int" with bad values to treat, but for several operations we need
arithmetic operations on these objects, and it is easier to use type
"int" than do some boolean operations on these values. So practically
it is not so clear that we always want to use such a type.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: new library modules
  1993-05-04  9:14 cousinea
  1993-05-04  9:38 ` Vale'rie Me'nissier-Morain
@ 1993-05-05  2:25 ` Xavier Leroy
  1 sibling, 0 replies; 7+ messages in thread
From: Xavier Leroy @ 1993-05-05  2:25 UTC (permalink / raw)
  To: caml-list

Guy Cousineau writes:
> The sets that you  propose are not only totally ordered sets
> but rather totally pre-ordered  and I think that it is indeed important
> for  many applications.  It is very often the case that you have
> to deal with a set of complex objects  a part which is used for
> the pre-order relation.

This sounds intriguing. Can you give an example or two? So far, the
sets I've used were either over base types (integers, strings), or
over complex objects represented as records with an integer "stamp"
field to identify them in a unique way, in which case two objects with
the same stamps were by definition the same object.

> Why not let us have access directly to your balanced trees package?

Well, actually, I don't have a general balanced tree package, the modules
"set" and "map" have their own balanced tree type. This makes for a
more compact representation of maps, but is probably a poor design.

What would the interface of a general balanced tree package look like?
Any suggestions?

> Rather than using type "int" for comparison, it would be clearer
> to use an explicit type  comparison= Smaller | Equiv | Greater.

I agree that using "int" is much less clear. The reasons I can give
are: 1- I don't know where to define the type "comparison" in the
standard library; 2- we get efficient comparison functions for
integers (prefix -) and for strings (compare_string, which is a
primitive). Both reasons are pretty lame, and I can be convinced
otherwise.

While I'm sending a message to the mailing list, here are some more
feedback I got:

Damien Doligez writes:
> Pour les sets et les maps, pourquoi est-ce que tu ne mets pas l'ordre et
> l'ensemble dans un record (ou un type abstrait quelconque), ce qui
> eliminerait le bug du mec qui passe la mauvaise fonction en argument ?

I guess you mean (in the case of "set"):

        type 'a t;;

        type 'a ordering == 'a -> 'a -> int;;

        type ('a, 'b) set_operations =
          { empty: 'a t;
            is_empty: 'a t -> bool;
            mem: 'a -> 'a t -> bool;
            add: 'a -> 'a t -> 'a t;
            iter: ('a -> 'b) -> 'a t -> unit;
            (* and so on *) };;

        value make: 'a ordering -> ('a, 'b) set_operations;;

And to work with integer sets, one would do:

        let intset = set__make (fun i j -> i-j);;

        ... intset.empty ... intset.mem ... intset.add ...

This is more elegant than parameterizing each operation by the
ordering, but still not foolproof: if we do

        let intset2 = set__make (fun i j -> j-i);;
then
        intset2.mem 2 (intset.add 1 (intset.add 2 intset.empty));;

is well-typed, though incorrect. We need structures and functors to do
this properly; ordinary functions and records are not enough...

Any opinion?

Christophe Raffali writes:
> Is it impossible to use an internal order invisible for the user to represent
> the type "'a set" ? perhaps in another library "set" (with no
> specified order)? This order could be constructed like the equality ?

Yes, it's easy to turn the generic equality primitive into a generic
ordering primitive. The problem is: just as with equality, you won't
get the relation you really want. For instance, sets will be compared by
structure rather than by contents, so you won't be able to do sets of
sets. We Bourbaki fans all want to do that. This does not work either
with my stamped objects: generic ordering will insist on comparing all
fields of the object, possibly looping, while all is needed is to
compare the stamp fields.

We can also settle for this solution, claiming that this behavior is
no worse than, e.g., list__assoc or hashtbl__find, and that the
problem will be solved when we figure out how to attach nonstructural
equality/comparison functions to specific types.

- Xavier Leroy

^ permalink raw reply	[flat|nested] 7+ messages in thread

* new library modules
@ 1993-05-03 18:45 Xavier Leroy
  0 siblings, 0 replies; 7+ messages in thread
From: Xavier Leroy @ 1993-05-03 18:45 UTC (permalink / raw)
  To: caml-list

Hello, mailing list,

I'm considering to add the following modules to the standard library
in a future release of Caml Light:

        genlex          generic lexical analyzer
        set             applicative sets over totally ordered types
        map             applicative maps over totally ordered types

The interfaces are given at the end of this message. I'm interested in
any comments you might have on these modules, such as missing
operations, poor names, or vague documentation. 

Also, it's the right time to call for extensions or corrections in the
library (``Gee, I really miss a frobnicate_list function''), even
though what can go in the standard library is severely constrainted
both for ideological reasons (I believe no combinator should be part
of the standard library, including function composition) and practical
reasons (we're about to hit various size limitations in the PC 8086
version).

- Xavier Leroy

---------------------------- Module genlex ----------------------------

(* A generic lexical analyzer *)

(* This module implements a simple ``standard'' lexical analyzer, presented
   as a function from character streams to token streams. It implements
   roughly the lexical conventions of Caml, but is parameterized by the
   set of keywords of your language. *)

#open "stream";;

type token =
    Kwd of string
  | Ident of string
  | Int of int
  | Float of float
  | String of string
  | Char of char;;
        (* The type of tokens. The lexical classes are: [Int] and [Float]
           for integer and floating-point numbers; [String] for
           string literals, enclosed in double quotes; [Char] for
           character literals, enclosed in backquotes; [Ident] for
           identifiers (either sequences of letters, digits, underscores
           and quotes, or sequences of ``operator characters'' such as
           [+], [*], etc); and [Kwd] for keywords (either identifiers or
           single ``special characters'' such as [(], [}], etc). *)
           
value make_lexer: string list -> (char stream -> token stream);;
        (* Construct the lexer function. The first argument is the list of
           keywords. An identifier [s] is returned as [Kwd s] if [s]
           belongs to this list, and as [Ident s] otherwise.
           A special character [s] is returned as [Kwd s] if [s]
           belongs to this list, and cause a lexical error (exception
           [Parse_error]) otherwise. Blanks and newlines are skipped.
           Comments delimited by [(*] and [*)] are skipped as well,
           and can be nested.

           Example: a lexer suitable for a desk calculator is obtained by
-          [  let lexer = make_lexer ["+";"-";"*";"/";"let";"="; "("; ")"]
           ]
           The associated parser would be a function from [token stream]
           to, for instance, [int], and would have rules such as:
           [  let parse_expr = function
                  [< 'Int n >] -> n
                | [< parse_expr n1; (parse_end n1) n2 >] -> n2
                | [< 'Kwd "("; parse_expr n; 'Kwd ")" >] -> n
              and parse_end n1 = function
                  [< 'Kwd "+"; parse_expr n2 >] -> n1+n2
                | ...
           ] *)

---------------------------- Module set ----------------------------

(* Sets over ordered types *)

(* This module implements the set data structure, given a total ordering
   function over the set elements. All operations over sets
   are purely applicative (no side-effects).
   The implementation uses balanced binary trees, and is therefore
   reasonably efficient: membership takes time logarithmic in the size
   of the set, for instance. *)

type 'a t;;
        (* The type of sets containing elements of type ['a]. *)

type 'a ordering == 'a -> 'a -> int;;
        (* Most operations over sets take as argument a
           total ordering function over the set elements.
           This is a two-argument function [f] such that
           [f e1 e2] is zero if the elements [e1] and [e2] are equal,
           [f e1 e2] is strictly negative if [e1] is smaller than [e2],
           and [f e1 e2] is strictly positive if [e1] is greater than [e2].

           For instance, a suitable ordering function for type [int]
           is [fun i j -> i-j]. For type [string], you could use
           [compare_string].

           The user is responsible for always using the same ordering
           function to operate on a given set. (This is what we mean
           by ``the set [s] is ordered according to the ordering [ord]''
           in the following.) Otherwise, the functions
           below could produce incorrect results. *)

value empty: 'a t
        (* The empty set. *)
  and is_empty: 'a t -> bool
        (* Test whether a set is empty or not. *)
  and mem: 'a ordering -> 'a -> 'a t -> bool
        (* [mem ord x s] tests whether [x] is equal to an element
           of [s], in the sense that [ord x y = 0] for some element [y]
           of [s]. *)
  and add: 'a ordering -> 'a -> 'a t -> 'a t
        (* [add ord x s] returns a set containing all elements of [s],
           plus [x]. If [x] was already in [s], [s] is returned unchanged.
           [s] and the returned set are ordered according to [ord]. *)
  and remove: 'a ordering -> 'a -> 'a t -> 'a t
        (* [add ord x s] returns a set containing all elements of [s],
           except [x]. If [x] was not in [s], [s] is returned unchanged.
           [s] and the returned set are ordered according to [ord]. *)
  and union: 'a ordering -> 'a t -> 'a t -> 'a t
  and inter: 'a ordering -> 'a t -> 'a t -> 'a t
  and diff: 'a ordering -> 'a t -> 'a t -> 'a t
        (* Union, intersection and set difference. The two set arguments
           and the returned set are ordered according to
           the first argument. *)
  and equal: 'a ordering -> 'a t -> 'a t -> bool
        (* [equal ord s1 s2] tests whether the sets [s1] and [s2] are
           equal, that is, contain equal elements according to [ord]. *) 
  and compare: 'a ordering -> 'a t -> 'a t -> int
        (* Total ordering between sets. Can be used as the ordering function
           for doing sets of sets. *)
  and elements: 'a t -> 'a list
        (* Return the list of all elements of the given set.
           The elements appear in the list in some non-specified order. *)
  and iter: ('a -> 'b) -> 'a t -> unit
        (* [iter f s] applies [f] in turn to all elements of [s], and
           discards the results. The elements of [s] are presented to [f]
           in a non-specified order. *)
  and fold: ('a -> 'b -> 'b) -> 'a t -> 'b -> 'b
        (* [fold f s a] computes [(f xN ... (f x2 (f x1 a))...)],
           where [x1 ... xN] are the elements of [s].
           The order in which elements of [s] are presented to [f] is
           not specified. *)
;;

---------------------------- Module map ----------------------------

(* Association tables over ordered types *)

(* This module implements applicative association tables, also known as
   finite maps, given a total ordering function over the keys.
   All operations over maps are purely applicative (no side-effects).
   The implementation uses balanced binary trees, and therefore searching
   and insertion take time logarithmic in the size of the map. *)

type ('a, 'b) t;;
        (* The type of maps from type ['a] to type ['b]. *)


type 'a ordering == 'a -> 'a -> int;;
        (* Most operations over maps take as argument a
           total ordering function over the keys of the map.
           This is a two-argument function [f] such that
           [f k1 k2] is zero if the keys [k1] and [k2] are equal,
           [f k1 k2] is strictly negative if [k1] is smaller than [k2],
           and [f k1 k2] is strictly positive if [k1] is greater than [k2].

           For instance, a suitable ordering function for type [int]
           is [fun i j -> i-j]. For type [string], you could use
           [compare_string].

           The user is responsible for always using the same ordering
           function to operate on a given map. (This is what we mean
           by ``the map [m] is ordered according to the ordering [ord]''
           in the following.) Otherwise, the functions
           below could produce incorrect results. *)

value empty: ('a, 'b) t
        (* The empty map. *)
  and add: 'a ordering -> ('a, 'b) t -> 'a -> 'b -> ('a, 'b) t
        (* [add ord m x y] returns a map containing the same bindings as
           [m], plus a binding of [x] to [y]. Previous bindings for [x] 
           in [m] are not removed, but simply hidden: they reappear
           after performing [remove ord map x].
           (This is the semantics of association lists.)
           [m] and the returned map are ordered according to [ord]. *)
  and find: 'a ordering -> ('a, 'b) t -> 'a -> 'b
        (* [find ord m x] returns the current binding of [x] in [m],
           or raises [Not_found] if no such binding exists.
           [m] is ordered according to [ord]. *)
  and remove: 'a ordering -> ('a, 'b) t -> 'a -> ('a, 'b) t
        (* [remove ord m x] returns a map containing the same bindings
           as [m], minus the current binding for [x]. The previous
           binding for [x] is restored if it exists. [m] is returned
           unchanged if [x] is not bound in [m].
           [m] and the returned map are ordered according to [ord]. *)
  and iter: ('a -> 'b -> 'c) -> ('a, 'b) t -> unit
        (* [iter f m] applies [f] to all bindings in map [m],
	   discarding all the results.
           [f] receives the key as first argument, and the associated value
           as second argument. The order in which the bindings are passed to
           [f] is unspecified.
           A given binding will be presented to [f] only once. *)
  and fold: ('a -> 'b -> 'c -> 'c) -> ('a, 'b) t -> 'c -> 'c
        (* [fold f m a] computes [(f kN dN ... (f k2 d2 (f k1 d1 a))...)],
           where [(k1,d1) ... (kN, dN)] are the bindings of map [m].
           The order in which bindings are presented to [f] is not
           specified. *)
;;




^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~1993-05-07 13:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1993-05-05  9:53 new library modules Damien Doligez
  -- strict thread matches above, loose matches on Subject: below --
1993-05-07 10:29 cousinea
1993-05-07 10:23 cousinea
1993-05-04  9:14 cousinea
1993-05-04  9:38 ` Vale'rie Me'nissier-Morain
1993-05-05  2:25 ` Xavier Leroy
1993-05-03 18:45 Xavier Leroy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).