caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* RE: [Caml-list] Efficient and canonical set representation?
@ 2003-11-07 15:27 Fred Smith
  2003-11-07 15:44 ` Samuel Lacas
  0 siblings, 1 reply; 17+ messages in thread
From: Fred Smith @ 2003-11-07 15:27 UTC (permalink / raw)
  To: Harrison, John R, erayo, caml-list


I guess what you're looking for are sorted arrays:
  1) O(log n) lookup and insertion via binary search
  2) O(n) union and intersection are simple
  3) Equal sets are represented by structurally equivalent objects.

-Fred

> -----Original Message-----
> From: owner-caml-list@pauillac.inria.fr 
> [mailto:owner-caml-list@pauillac.inria.fr] On Behalf Of 
> Harrison, John R
> Sent: Friday, November 07, 2003 9:16 AM
> To: erayo@cs.bilkent.edu.tr; caml-list@inria.fr
> Cc: Harrison, John R
> Subject: RE: [Caml-list] Efficient and canonical set representation?
> 
> 
> | You basically want O(1) for set equality, I suppose.
> 
> Actually, no --- perhaps I should have made clearer what I 
> *really* want. The efficiency of comparison wasn't my 
> motivation, but rather elegance and aesthetics. And I meant 
> "canonical" with respect to ordinary structural equality, not 
> necessarily pointer equality, so the problem is potentially a 
> bit easier than you might have thought.
> 
> I want to be able to treat an abstract type in a truly 
> abstract way, and not worry about special-purpose equality 
> relations on certain types. Otherwise it's an ugly mess 
> dealing with complicated nestings like sets of pairs of lists of sets.
> 
> Now, I think the right solution, conceptually speaking, is to 
> allow user-defined equality on abstract types. But as far as 
> I know this cannot be done in OCaml, and I've never met much 
> enthusiasm for the idea among the CAML or SML experts.
> 
> So a poor second best is to define abstract types in a canonical way, 
> which was the starting-point of my question.
> 
> After your remarks and Brian's, I'm starting to wonder if it 
> is possible at all to do what I want. Maybe I should be 
> looking for an impossibility proof instead...
> 
> John.
> 
> -------------------
> To unsubscribe, mail caml-list-request@inria.fr Archives: 
http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs
FAQ: http://caml.inria.fr/FAQ/ Beginner's list:
http://groups.yahoo.com/group/ocaml_beginners

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Caml-list] Efficient and canonical set representation?
  2003-11-07 15:27 [Caml-list] Efficient and canonical set representation? Fred Smith
@ 2003-11-07 15:44 ` Samuel Lacas
  2003-11-08 16:50   ` Eray Ozkural
  0 siblings, 1 reply; 17+ messages in thread
From: Samuel Lacas @ 2003-11-07 15:44 UTC (permalink / raw)
  To: caml-list

Fred Smith a écrit 2.2K le Fri, Nov 07, 2003 at 10:27:25AM -0500:
# 
# I guess what you're looking for are sorted arrays:
#   1) O(log n) lookup and insertion via binary search
#   2) O(n) union and intersection are simple
#   3) Equal sets are represented by structurally equivalent objects.
# 
# -Fred

Hmm, except that, if I'm not wrong, it was required the structure to
hold any kind of object. Sorted arrays require the elements to be
sortable. Using the hash of the objects may be an answer ?

sL

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Caml-list] Efficient and canonical set representation?
  2003-11-07 15:44 ` Samuel Lacas
@ 2003-11-08 16:50   ` Eray Ozkural
  0 siblings, 0 replies; 17+ messages in thread
From: Eray Ozkural @ 2003-11-08 16:50 UTC (permalink / raw)
  To: Samuel Lacas, caml-list

On Friday 07 November 2003 17:44, Samuel Lacas wrote:
> Hmm, except that, if I'm not wrong, it was required the structure to
> hold any kind of object. Sorted arrays require the elements to be
> sortable. Using the hash of the objects may be an answer ?

You can give a number to each member object I guess in a lot of cases. But of 
course, in general a set doesn't mean "set of sortable objects".

Regards,


-- 
Eray Ozkural (exa) <erayo@cs.bilkent.edu.tr>
Comp. Sci. Dept., Bilkent University, Ankara  KDE Project: http://www.kde.org
www: http://www.cs.bilkent.edu.tr/~erayo  Malfunction: http://mp3.com/ariza
GPG public key fingerprint: 360C 852F 88B0 A745 F31B  EA0F 7C07 AE16 874D 539C

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [Caml-list] Efficient and canonical set representation?
@ 2003-11-12 17:18 Harrison, John R
  0 siblings, 0 replies; 17+ messages in thread
From: Harrison, John R @ 2003-11-12 17:18 UTC (permalink / raw)
  To: Diego Olivier Fernandez Pons; +Cc: caml-list, Harrison, John R

| It is not a coincidence. I remember having read somewhere that it
| could be proven that if a balanced tree (based on comparisons) did not
| have enough different representations of a same set, then the
| insertion could not be done in logarithmic time.

Interesting. I was beginning to suspect that might be the case. Does
anyone have an exact reference for such a result?

John.

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [Caml-list] Efficient and canonical set representation?
  2003-11-12  0:20 Harrison, John R
  2003-11-12  2:04 ` Brian Hurt
@ 2003-11-12 16:16 ` Diego Olivier Fernandez Pons
  1 sibling, 0 replies; 17+ messages in thread
From: Diego Olivier Fernandez Pons @ 2003-11-12 16:16 UTC (permalink / raw)
  To: Harrison, John R; +Cc: caml-list

    Bonjour,

> Is it just a coincidence that the numerous varieties of balanced
> tree (AVL, 2-3-4, red-black, ...) all seem to be non-canonical? Or
> is it essential to their efficiency? (Perhaps this is a question for
> another forum.)

It is not a coincidence. I remember having read somewhere that it
could be proven that if a balanced tree (based on comparisons) did not
have enough different representations of a same set, then the
insertion could not be done in logarithmic time.


        Diego Olivier

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [Caml-list] Efficient and canonical set representation?
  2003-11-12  3:34 Harrison, John R
@ 2003-11-12  7:50 ` Brian Hurt
  0 siblings, 0 replies; 17+ messages in thread
From: Brian Hurt @ 2003-11-12  7:50 UTC (permalink / raw)
  To: Harrison, John R; +Cc: Ocaml Mailing List

On Tue, 11 Nov 2003, Harrison, John R wrote:

> | I've been batting around ideas for ways to do balanced trees so that no 
> | matter what order you add things, you always get the same tree.  But even 
> | assuming you could do this, doing a structural compare is still O(N).  So 
> | you might as well let the trees be different.
> 
> Right, but see my second message --- I'm only interested in canonicity
> up to structural equality and I'm happy with O(N) comparison. So it's
> just the "no matter what order you add things you get the same tree"
> property that I care about. But it's not yet obvious to me whether I
> can even achieve that much.
> 

It feels like that can be done, at the cost of an occassional O(N) 
"massive rebalancing".  Well, it certainly can be done with an O(N) 
insert/delete.  I'll think about it a bit.


-- 
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
                                - Gene Spafford 
Brian

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [Caml-list] Efficient and canonical set representation?
@ 2003-11-12  3:34 Harrison, John R
  2003-11-12  7:50 ` Brian Hurt
  0 siblings, 1 reply; 17+ messages in thread
From: Harrison, John R @ 2003-11-12  3:34 UTC (permalink / raw)
  To: Brian Hurt; +Cc: Ocaml Mailing List, Harrison, John R

| I've been batting around ideas for ways to do balanced trees so that no 
| matter what order you add things, you always get the same tree.  But even 
| assuming you could do this, doing a structural compare is still O(N).  So 
| you might as well let the trees be different.

Right, but see my second message --- I'm only interested in canonicity
up to structural equality and I'm happy with O(N) comparison. So it's
just the "no matter what order you add things you get the same tree"
property that I care about. But it's not yet obvious to me whether I
can even achieve that much.

John.

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [Caml-list] Efficient and canonical set representation?
  2003-11-12  0:20 Harrison, John R
@ 2003-11-12  2:04 ` Brian Hurt
  2003-11-12 16:16 ` Diego Olivier Fernandez Pons
  1 sibling, 0 replies; 17+ messages in thread
From: Brian Hurt @ 2003-11-12  2:04 UTC (permalink / raw)
  To: Harrison, John R; +Cc: Diego Olivier Fernandez Pons, Ocaml Mailing List

On Tue, 11 Nov 2003, Harrison, John R wrote:

> That seems to be the best suggestion so far. I guess it would work well
> in practice. But theoretically it still doesn't give O(log n) lookup
> and insertion without the kinds of assumptions you noted about the
> distribution of elements w.r.t. the hash function. And relying on
> polymorphic hashing seems a bit of a hack.
> 
> So I still can't help wondering if there's an elegant solution with the
> desired worst-case behaviour, preferably relying only on pairwise
> comparison. Is it just a coincidence that the numerous varieties of
> balanced tree (AVL, 2-3-4, red-black, ...) all seem to be non-canonical?
> Or is it essential to their efficiency? (Perhaps this is a question for
> another forum.)

I don't think so.

I've been batting around ideas for ways to do balanced trees so that no 
matter what order you add things, you always get the same tree.  But even 
assuming you could do this, doing a structural compare is still O(N).  So 
you might as well let the trees be different.  Note that Patricia trees, 
as I understand them, don't save you here either.  Mathematically, two 
sets A and B are equal if every element in set A is in set B and vice 
versa.

Think about it for a moment.  Assume we have a tree strucutre:
type 'a node_t = Node of 'a * 'a node_t * 'a node_t | Empty

Now, assume the code magically keeps the trees balanced exactly the same 
way.  How would you do the comparison?

let rec equals a b =
    match a, b with
        Empty, Empty -> true
        | Node(a_data, a_left, a_right), Node(b_data, b_left, b_right) ->
            (a_data == b_data) && (equals a_left b_left) &&
            (equals a_right b_right)
        | _ -> false
;;

This is an O(N) algorithm still.

The only way I can think of to make pointer equivelence meaningfull is to 
keep some structure of all the structures currently in use in the 
background.  Then you have to search this structure on every insertion or 
deletion to see if the new set is equal (using the old O(N) comparison) to 
an already existing set.  This structure could be a tree as well, but this 
still makes insertion and deletion O(N log M) (where M is the number of 
structures currently in use).  Instead of just O(log N).  Much worse.

There are ways you can make comparison faster.  For example, keep the 
number of elements handy in the structure (O(1) length operation) and just 
compare the lengths before doing anything else.  If you can hash the 
objects, you can keep a hash of the entire structure, being the sum of the 
hashes of the individual elements (updating the hash is then O(1) on 
insert or delete).  If the hashs don't match, the structures are 
gaurenteed to be different.  And, obviously, if pointer comparison is 
equal, the structures are equal.  Note that you always have cases where 
you have to do an O(N) comparison.

I think you're SOL.

-- 
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
                                - Gene Spafford 
Brian




-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [Caml-list] Efficient and canonical set representation?
@ 2003-11-12  0:20 Harrison, John R
  2003-11-12  2:04 ` Brian Hurt
  2003-11-12 16:16 ` Diego Olivier Fernandez Pons
  0 siblings, 2 replies; 17+ messages in thread
From: Harrison, John R @ 2003-11-12  0:20 UTC (permalink / raw)
  To: Diego Olivier Fernandez Pons; +Cc: caml-list, Harrison, John R

That seems to be the best suggestion so far. I guess it would work well
in practice. But theoretically it still doesn't give O(log n) lookup
and insertion without the kinds of assumptions you noted about the
distribution of elements w.r.t. the hash function. And relying on
polymorphic hashing seems a bit of a hack.

So I still can't help wondering if there's an elegant solution with the
desired worst-case behaviour, preferably relying only on pairwise
comparison. Is it just a coincidence that the numerous varieties of
balanced tree (AVL, 2-3-4, red-black, ...) all seem to be non-canonical?
Or is it essential to their efficiency? (Perhaps this is a question for
another forum.)

John.

-----Original Message-----
From: owner-caml-list@pauillac.inria.fr
[mailto:owner-caml-list@pauillac.inria.fr]On Behalf Of Diego Olivier
Fernandez Pons
Sent: Monday, November 10, 2003 5:25 AM
To: Harrison, John R
Cc: caml-list@inria.fr
Subject: RE: [Caml-list] Efficient and canonical set representation?


    Bonjour,

> After your remarks and Brian's, I'm starting to wonder if it is
> possible at all to do what I want. Maybe I should be looking for an
> impossibility proof instead...

Patricia sets seem to be what you are looking for.
 (1). Efficient usual operations (lookup, insertion, union)
 (2). Structural equality

Their only problem is that they cannot handle polymorphic orderable
types but only integers...

Hash the data, use this key to insert it in a patricia map and solve
the collisions by chaining in an ordered list (with the polymorphic
[compare] function). (1) and (2) still hold under usual hypothesis on
the rate of collisions.

A few changes to JCF's implementation should be enough.

        Diego Olivier



-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [Caml-list] Efficient and canonical set representation?
  2003-11-10 13:24 ` Diego Olivier Fernandez Pons
@ 2003-11-10 19:28   ` Julien Signoles
  0 siblings, 0 replies; 17+ messages in thread
From: Julien Signoles @ 2003-11-10 19:28 UTC (permalink / raw)
  To: caml-list

On Mon, 10 Nov 2003, Diego Olivier Fernandez Pons wrote:

> Patricia sets seem to be what you are looking for.
>  (1). Efficient usual operations (lookup, insertion, union)
>  (2). Structural equality
>
> Their only problem is that they cannot handle polymorphic orderable
> types but only integers...
>
> Hash the data, use this key to insert it in a patricia map and solve
> the collisions by chaining in an ordered list (with the polymorphic
> [compare] function). (1) and (2) still hold under usual hypothesis on
> the rate of collisions.
>
> A few changes to JCF's implementation should be enough.

I think JCF's Hmap module is what you want.
A hmap is a map over hash-consed values implemented as Patricia Trees.
See http://www.lri.fr/~filliatr/software.en.html for more details.

Julien Signoles
-- 
mailto:Julien.Signoles@lri.fr ; http://www.lri.fr/~signoles
"In theory, practice and theory are the same,
but in practice they are different" (Larry McVoy)

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [Caml-list] Efficient and canonical set representation?
  2003-11-07 17:27 Fred Smith
@ 2003-11-10 13:24 ` Diego Olivier Fernandez Pons
  2003-11-10 19:28   ` Julien Signoles
  0 siblings, 1 reply; 17+ messages in thread
From: Diego Olivier Fernandez Pons @ 2003-11-10 13:24 UTC (permalink / raw)
  To: johnh; +Cc: caml-list

    Bonjour,

> After your remarks and Brian's, I'm starting to wonder if it is
> possible at all to do what I want. Maybe I should be looking for an
> impossibility proof instead...

Patricia sets seem to be what you are looking for.
 (1). Efficient usual operations (lookup, insertion, union)
 (2). Structural equality

Their only problem is that they cannot handle polymorphic orderable
types but only integers...

Hash the data, use this key to insert it in a patricia map and solve
the collisions by chaining in an ordered list (with the polymorphic
[compare] function). (1) and (2) still hold under usual hypothesis on
the rate of collisions.

A few changes to JCF's implementation should be enough.

        Diego Olivier



-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [Caml-list] Efficient and canonical set representation?
@ 2003-11-07 17:27 Fred Smith
  2003-11-10 13:24 ` Diego Olivier Fernandez Pons
  0 siblings, 1 reply; 17+ messages in thread
From: Fred Smith @ 2003-11-07 17:27 UTC (permalink / raw)
  To: Samuel Lacas, caml-list


Ocaml has a structural comparison function (<=) : 'a -> 'a -> bool.  

But the bigger problem with this approach (as Jean-Baptiste Rouquier) kindly pointed out to me is that insertion takes O(n), not O(log(n)) time. Duh.

-Fred


> -----Original Message-----
> From: owner-caml-list@pauillac.inria.fr 
> [mailto:owner-caml-list@pauillac.inria.fr] On Behalf Of Samuel Lacas
> Sent: Friday, November 07, 2003 10:44 AM
> To: caml-list@inria.fr
> Subject: Re: [Caml-list] Efficient and canonical set representation?
> 
> 
> Fred Smith a écrit 2.2K le Fri, Nov 07, 2003 at 10:27:25AM -0500: # 
> # I guess what you're looking for are sorted arrays:
> #   1) O(log n) lookup and insertion via binary search
> #   2) O(n) union and intersection are simple
> #   3) Equal sets are represented by structurally equivalent objects.
> # 
> # -Fred
> 
> Hmm, except that, if I'm not wrong, it was required the 
> structure to hold any kind of object. Sorted arrays require 
> the elements to be sortable. Using the hash of the objects 
> may be an answer ?
> 
> sL
> 
> -------------------
> To unsubscribe, mail caml-list-request@inria.fr Archives: 
http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* RE: [Caml-list] Efficient and canonical set representation?
@ 2003-11-07 14:15 Harrison, John R
  0 siblings, 0 replies; 17+ messages in thread
From: Harrison, John R @ 2003-11-07 14:15 UTC (permalink / raw)
  To: erayo, caml-list; +Cc: Harrison, John R

| You basically want O(1) for set equality, I suppose.

Actually, no --- perhaps I should have made clearer what I *really* want.
The efficiency of comparison wasn't my motivation, but rather elegance
and aesthetics. And I meant "canonical" with respect to ordinary
structural equality, not necessarily pointer equality, so the problem is
potentially a bit easier than you might have thought.

I want to be able to treat an abstract type in a truly abstract way,
and not worry about special-purpose equality relations on certain types.
Otherwise it's an ugly mess dealing with complicated nestings like sets
of pairs of lists of sets.

Now, I think the right solution, conceptually speaking, is to allow
user-defined equality on abstract types. But as far as I know this cannot
be done in OCaml, and I've never met much enthusiasm for the idea among
the CAML or SML experts.

So a poor second best is to define abstract types in a canonical way, 
which was the starting-point of my question.

After your remarks and Brian's, I'm starting to wonder if it is possible
at all to do what I want. Maybe I should be looking for an impossibility
proof instead...

John.

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Caml-list] Efficient and canonical set representation?
  2003-11-06 16:41 Harrison, John R
  2003-11-06 17:04 ` Brian Hurt
  2003-11-07  3:43 ` Eray Ozkural
@ 2003-11-07  3:52 ` Eray Ozkural
  2 siblings, 0 replies; 17+ messages in thread
From: Eray Ozkural @ 2003-11-07  3:52 UTC (permalink / raw)
  To: Harrison, John R, caml-list

Hello John,

I forgot to mention: this is exactly the same question that I asked aeons ago 
for representing hypergraphs on haskell list. (For those who didn't remember: 
a hypergraph is basically a collection of sets whose elements are drawn from 
a finite set X, it is the generalization of a graph in which each edge 
connects multiple vertices rather than 2)

Of course there is a way to represent hypergraphs efficiently, but it's not a 
functional way. In the optimal and general purpose iterative method, you use 
the equivalent of adjacency list representation extended to hypergraphs: 
basically an array of pins (for each net) and an array of nets (for each 
pin).

 Since you are asking this question, you probably already know this. I am 
pretty sure a good ocaml implementation doesn't exist, but the idea is the 
same as in C and you can code it ;)

If speed isn't a paramount concern over abstraction at the present you can 
hack the Hypergraph module that uses edison which I had posted to haskell 
list. If you can't find it or can't get it to work, let me know and structure 
monster will try to help :)

Regards,

On Thursday 06 November 2003 18:41, Harrison, John R wrote:
> Does anyone know a representation of finite sets over an orderable
> polymorphic type that's (1) efficient and (2) canonical? Even better would
> be a CAML or OCaml implementation. More precisely I'm looking for:
>
>   1. Log-time lookup and insertion, and linear-time union, intersection
> etc.
>
>   2. Equal sets are represented by the same object.
>
> For example, ordered lists satisfy (2) but only part of (1), while all the
> variants of balanced trees I can remember that satisfy (1) --- AVL trees
> etc. --- fail (2).
>
> Thanks,
>
> John.
>
> -------------------
> To unsubscribe, mail caml-list-request@inria.fr Archives:
> http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ:
> http://caml.inria.fr/FAQ/ Beginner's list:
> http://groups.yahoo.com/group/ocaml_beginners

-- 
Eray Ozkural (exa) <erayo@cs.bilkent.edu.tr>
Comp. Sci. Dept., Bilkent University, Ankara  KDE Project: http://www.kde.org
www: http://www.cs.bilkent.edu.tr/~erayo  Malfunction: http://mp3.com/ariza
GPG public key fingerprint: 360C 852F 88B0 A745 F31B  EA0F 7C07 AE16 874D 539C

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Caml-list] Efficient and canonical set representation?
  2003-11-06 16:41 Harrison, John R
  2003-11-06 17:04 ` Brian Hurt
@ 2003-11-07  3:43 ` Eray Ozkural
  2003-11-07  3:52 ` Eray Ozkural
  2 siblings, 0 replies; 17+ messages in thread
From: Eray Ozkural @ 2003-11-07  3:43 UTC (permalink / raw)
  To: Harrison, John R, caml-list

On Thursday 06 November 2003 18:41, Harrison, John R wrote:
> Does anyone know a representation of finite sets over an orderable
> polymorphic type that's (1) efficient and (2) canonical? Even better would
> be a CAML or OCaml implementation. More precisely I'm looking for:
>
>   1. Log-time lookup and insertion, and linear-time union, intersection
> etc.
>
>   2. Equal sets are represented by the same object.
>
> For example, ordered lists satisfy (2) but only part of (1), while all the
> variants of balanced trees I can remember that satisfy (1) --- AVL trees
> etc. --- fail (2).

It will be pretty hard to get 2. Not unless you are using disjoint sets :) You 
basically want O(1) for set equality, I suppose. Now, I think you wish to 
insert a new set in amortized O(lgn) time like in a disjoint set 
implementation. 

You can still use edison, isn't it good enough for you?

I'm saying this, because I don't think there is a straightforward functional  
way. I have on my mind 2-universal hash functions, but here I am facing bugs 
of my own. I'm not in the structure monster mode right now it seems :) 

Cheers,

-- 
Eray Ozkural (exa) <erayo@cs.bilkent.edu.tr>
Comp. Sci. Dept., Bilkent University, Ankara  KDE Project: http://www.kde.org
www: http://www.cs.bilkent.edu.tr/~erayo  Malfunction: http://mp3.com/ariza
GPG public key fingerprint: 360C 852F 88B0 A745 F31B  EA0F 7C07 AE16 874D 539C

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Caml-list] Efficient and canonical set representation?
  2003-11-06 16:41 Harrison, John R
@ 2003-11-06 17:04 ` Brian Hurt
  2003-11-07  3:43 ` Eray Ozkural
  2003-11-07  3:52 ` Eray Ozkural
  2 siblings, 0 replies; 17+ messages in thread
From: Brian Hurt @ 2003-11-06 17:04 UTC (permalink / raw)
  To: Harrison, John R; +Cc: caml-list

On Thu, 6 Nov 2003, Harrison, John R wrote:

> Does anyone know a representation of finite sets over an orderable polymorphic type
> that's (1) efficient and (2) canonical? Even better would be a CAML or OCaml
> implementation. More precisely I'm looking for:
> 
>   1. Log-time lookup and insertion, and linear-time union, intersection etc.
> 
>   2. Equal sets are represented by the same object.

Two is the tricky one to implement.  Imagine a case where I have set A 
with it's elements, and set B with all the elements less one of set A, but 
inserted in a different order.  B is a different object than A (the two 
sets are not equal).  Now you add that one last element from A, you want 
the insert routine to return A.  This means that the insert routine has to 
know that A exists, and has to compare the new B to A to determine that it 
should return A and not B.  It can be done but it's not trivial.

Games with structure definitions don't help, because Ocaml will happily
allocate different structures with the same data (this is why 1. == 1. is
false).  With a balanced tree structure you can implement the naive
equality comparison in linear time (the sequence i/2^i converges, allowing
you enumerate the elements in linear time).  If you need faster (average) 
compares, there are a number of short cuts you can do.  For example, you 
can keep the number of elements currently in the set handy, and if the 
number of elements don't match, obviously the sets won't be equal.  
Fancier, you can also keep a hash of all elements in the set- the hashs 
aren't equal, you can gaurentee the sets aren't equal.  Be carefull with 
defining your hash function so the order elements were added isn't 
important.

Brian


-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Caml-list] Efficient and canonical set representation?
@ 2003-11-06 16:41 Harrison, John R
  2003-11-06 17:04 ` Brian Hurt
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Harrison, John R @ 2003-11-06 16:41 UTC (permalink / raw)
  To: caml-list; +Cc: Harrison, John R

Does anyone know a representation of finite sets over an orderable polymorphic type
that's (1) efficient and (2) canonical? Even better would be a CAML or OCaml
implementation. More precisely I'm looking for:

  1. Log-time lookup and insertion, and linear-time union, intersection etc.

  2. Equal sets are represented by the same object.

For example, ordered lists satisfy (2) but only part of (1), while all the variants
of balanced trees I can remember that satisfy (1) --- AVL trees etc. --- fail (2).

Thanks,

John.

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2003-11-12 17:18 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-11-07 15:27 [Caml-list] Efficient and canonical set representation? Fred Smith
2003-11-07 15:44 ` Samuel Lacas
2003-11-08 16:50   ` Eray Ozkural
  -- strict thread matches above, loose matches on Subject: below --
2003-11-12 17:18 Harrison, John R
2003-11-12  3:34 Harrison, John R
2003-11-12  7:50 ` Brian Hurt
2003-11-12  0:20 Harrison, John R
2003-11-12  2:04 ` Brian Hurt
2003-11-12 16:16 ` Diego Olivier Fernandez Pons
2003-11-07 17:27 Fred Smith
2003-11-10 13:24 ` Diego Olivier Fernandez Pons
2003-11-10 19:28   ` Julien Signoles
2003-11-07 14:15 Harrison, John R
2003-11-06 16:41 Harrison, John R
2003-11-06 17:04 ` Brian Hurt
2003-11-07  3:43 ` Eray Ozkural
2003-11-07  3:52 ` Eray Ozkural

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).