caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* efficient binary relations?
@ 2006-03-31 11:56 Christian Lindig
  2006-03-31 12:21 ` [Caml-list] " skaller
  2006-03-31 12:50 ` Sebastian Egner
  0 siblings, 2 replies; 5+ messages in thread
From: Christian Lindig @ 2006-03-31 11:56 UTC (permalink / raw)
  To: Caml List

I am looking for an efficient representation of binary relations in 
OCaml. I have used bitvectors in the past but would like to use a more 
high-level and less imperative data structure.

The most important operation is the following. For a binary relation R 
over \X x \Y compute for a set X the set X' = { y | (x,y) in  R for all 
x in X}. In other words, X' is the set of y that are common to all x in 
X. Likewise, Y' must be computed. This operation requires to compute 
the intersection of sets and was the main reason I chose bitvectors. If 
you know about a suitable data structure I would glad to hear about it.

-- Christian

--
http://www.st.cs.uni-sb.de/~lindig/


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Caml-list] efficient binary relations?
  2006-03-31 11:56 efficient binary relations? Christian Lindig
@ 2006-03-31 12:21 ` skaller
  2006-03-31 12:42   ` Christian Lindig
  2006-03-31 12:50 ` Sebastian Egner
  1 sibling, 1 reply; 5+ messages in thread
From: skaller @ 2006-03-31 12:21 UTC (permalink / raw)
  To: Christian Lindig; +Cc: Caml List

On Fri, 2006-03-31 at 13:56 +0200, Christian Lindig wrote:
> I am looking for an efficient representation of binary relations in 
> OCaml. I have used bitvectors in the past but would like to use a more 
> high-level and less imperative data structure.

You have said nothing about the type of elements and their 
distribution.. 


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Caml-list] efficient binary relations?
  2006-03-31 12:21 ` [Caml-list] " skaller
@ 2006-03-31 12:42   ` Christian Lindig
  0 siblings, 0 replies; 5+ messages in thread
From: Christian Lindig @ 2006-03-31 12:42 UTC (permalink / raw)
  To: skaller; +Cc: Caml List


On Mar 31, 2006, at 2:21 PM, skaller wrote:

> On Fri, 2006-03-31 at 13:56 +0200, Christian Lindig wrote:
>> I am looking for an efficient representation of binary relations in
>> OCaml. I have used bitvectors in the past but would like to use a more
>> high-level and less imperative data structure.
>
> You have said nothing about the type of elements and their
> distribution..

The typical case will be strings or other small tokens as elements but 
I'd like to keep this as a (functor) parameter or have a polymorphic 
data structure. Relations are typically sparsely populated but can be 
large (1000x1000). Sorry, but I don't have hard numbers.

-- Christian


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Caml-list] efficient binary relations?
  2006-03-31 11:56 efficient binary relations? Christian Lindig
  2006-03-31 12:21 ` [Caml-list] " skaller
@ 2006-03-31 12:50 ` Sebastian Egner
  2006-03-31 13:42   ` Christian Lindig
  1 sibling, 1 reply; 5+ messages in thread
From: Sebastian Egner @ 2006-03-31 12:50 UTC (permalink / raw)
  To: Christian Lindig; +Cc: Caml List

[-- Attachment #1: Type: text/plain, Size: 1670 bytes --]

Hello Christian,

Please could you clarify the circumstances a little bit?

1. Are you looking for a data structure that you set
up for a fixed R once and then query many times for 
different X? Or are you looking for a dynamic data
structure in which you keep changing R? Or are you
looking for a 'functional data structure' where the
older versions of R are preserved? Or for a functional
data structure where R is fixed, but the queries X
are constructed incrementally?

2. Is R sparse, i.e. is |R| << |\X|*|\Y|? 

If not, bitvectors might not be so bad after all.

If yes, you might want to look for a data structure
storing the sets {x}' = {y : (x, y) in R} in such a
way that they can be intersected efficiently. If R
is fixed, {x}' can be represented as a sorted array
of the elements. These arrays can be intersected
quickly (see Knuth/TACP) but the asymptotically optimal
algorithms are rather tricky (if I recall correctly,
Knuth doesn't give them directly but only cites them).

A straight-forward algorithm is taking the shorter array
and looking up the elements in the longer one by binary
search. This nicely generalizes to your case of computing
X' for a given X: Get all {x}' for x in X and sort them
into increasing |{x}'|; this takes O(|X| log |X|).
Then for all y in {x}', x such that |{x}'| is minimal,
look up y in {u}' by binary search for all u in X\{x}
or until it is not found anymore; this takes 
O(Sum[log |{u}'| : u in X\{x}]) per candidate y.
Simplifying the estimation (aehem...) you end up with
O( min { |{x}'| : x in X } * |X| log max { |{x}'| : x in X } ),
which is at least independent on |\Y|, if that is your
main concern.

Sebastian.

[-- Attachment #2: Type: text/html, Size: 2827 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [Caml-list] efficient binary relations?
  2006-03-31 12:50 ` Sebastian Egner
@ 2006-03-31 13:42   ` Christian Lindig
  0 siblings, 0 replies; 5+ messages in thread
From: Christian Lindig @ 2006-03-31 13:42 UTC (permalink / raw)
  To: Sebastian Egner; +Cc: Caml List


On Mar 31, 2006, at 2:50 PM, Sebastian Egner wrote:

> Please could you clarify the circumstances a little bit?
>
> 1. Are you looking for a data structure that you set
> up for a fixed R once and then query many times for
> different X? Or are you looking for a dynamic data
> structure in which you keep changing R? Or are you
> looking for a 'functional data structure' where the
> older versions of R are preserved? Or for a functional
> data structure where R is fixed, but the queries X
> are constructed incrementally?
>
> 2. Is R sparse, i.e. is |R| << |\X|*|\Y|?

First, thanks for a detailed suggestion! R is sparse, constructed once, 
and queried often. As I mentioned, this is in the context of concept 
analysis. The ' operation computes the maximal blocks (or concepts) in 
a cross table; typically their number grows cubic with the size |R| of 
the relation - hence the importance of the ' operation. (In the worst 
case, when the matrix is densely populated, there may be exponentially 
many blocks.)

I generally prefer applicative data structures (without side effects) 
but understand that this is not always possible.

-- Christian


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-03-31 13:42 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-03-31 11:56 efficient binary relations? Christian Lindig
2006-03-31 12:21 ` [Caml-list] " skaller
2006-03-31 12:42   ` Christian Lindig
2006-03-31 12:50 ` Sebastian Egner
2006-03-31 13:42   ` Christian Lindig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).