caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* Re: [Caml-list] efficient binary relations?
       [not found] <28871572.1143807766574.JavaMail.www@wwinf1529>
@ 2006-03-31 13:24 ` Christian Lindig
  0 siblings, 0 replies; 6+ messages in thread
From: Christian Lindig @ 2006-03-31 13:24 UTC (permalink / raw)
  To: Caml List


On Mar 31, 2006, at 2:22 PM, yoann padioleau wrote:
> If you can represent the y by integers, and that the binary relation
> have good property, such as the set of y related to a x (the extension 
> of x in concept analysis ?) often follows each other,  then you can 
> represent sets  by  list of intervals.

Indeed, this is related to Concept Analysis. My old implementation is 
in C

	http://www.st.cs.uni-sb.de/~lindig/src/concepts.html

and I'd like to have a better structured implementation in Caml. I 
don't care for the last bits of performance.

-- C


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] efficient binary relations?
  2006-03-31 12:50 ` Sebastian Egner
@ 2006-03-31 13:42   ` Christian Lindig
  0 siblings, 0 replies; 6+ messages in thread
From: Christian Lindig @ 2006-03-31 13:42 UTC (permalink / raw)
  To: Sebastian Egner; +Cc: Caml List


On Mar 31, 2006, at 2:50 PM, Sebastian Egner wrote:

> Please could you clarify the circumstances a little bit?
>
> 1. Are you looking for a data structure that you set
> up for a fixed R once and then query many times for
> different X? Or are you looking for a dynamic data
> structure in which you keep changing R? Or are you
> looking for a 'functional data structure' where the
> older versions of R are preserved? Or for a functional
> data structure where R is fixed, but the queries X
> are constructed incrementally?
>
> 2. Is R sparse, i.e. is |R| << |\X|*|\Y|?

First, thanks for a detailed suggestion! R is sparse, constructed once, 
and queried often. As I mentioned, this is in the context of concept 
analysis. The ' operation computes the maximal blocks (or concepts) in 
a cross table; typically their number grows cubic with the size |R| of 
the relation - hence the importance of the ' operation. (In the worst 
case, when the matrix is densely populated, there may be exponentially 
many blocks.)

I generally prefer applicative data structures (without side effects) 
but understand that this is not always possible.

-- Christian


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] efficient binary relations?
  2006-03-31 11:56 Christian Lindig
  2006-03-31 12:21 ` [Caml-list] " skaller
@ 2006-03-31 12:50 ` Sebastian Egner
  2006-03-31 13:42   ` Christian Lindig
  1 sibling, 1 reply; 6+ messages in thread
From: Sebastian Egner @ 2006-03-31 12:50 UTC (permalink / raw)
  To: Christian Lindig; +Cc: Caml List

[-- Attachment #1: Type: text/plain, Size: 1670 bytes --]

Hello Christian,

Please could you clarify the circumstances a little bit?

1. Are you looking for a data structure that you set
up for a fixed R once and then query many times for 
different X? Or are you looking for a dynamic data
structure in which you keep changing R? Or are you
looking for a 'functional data structure' where the
older versions of R are preserved? Or for a functional
data structure where R is fixed, but the queries X
are constructed incrementally?

2. Is R sparse, i.e. is |R| << |\X|*|\Y|? 

If not, bitvectors might not be so bad after all.

If yes, you might want to look for a data structure
storing the sets {x}' = {y : (x, y) in R} in such a
way that they can be intersected efficiently. If R
is fixed, {x}' can be represented as a sorted array
of the elements. These arrays can be intersected
quickly (see Knuth/TACP) but the asymptotically optimal
algorithms are rather tricky (if I recall correctly,
Knuth doesn't give them directly but only cites them).

A straight-forward algorithm is taking the shorter array
and looking up the elements in the longer one by binary
search. This nicely generalizes to your case of computing
X' for a given X: Get all {x}' for x in X and sort them
into increasing |{x}'|; this takes O(|X| log |X|).
Then for all y in {x}', x such that |{x}'| is minimal,
look up y in {u}' by binary search for all u in X\{x}
or until it is not found anymore; this takes 
O(Sum[log |{u}'| : u in X\{x}]) per candidate y.
Simplifying the estimation (aehem...) you end up with
O( min { |{x}'| : x in X } * |X| log max { |{x}'| : x in X } ),
which is at least independent on |\Y|, if that is your
main concern.

Sebastian.

[-- Attachment #2: Type: text/html, Size: 2827 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] efficient binary relations?
  2006-03-31 12:21 ` [Caml-list] " skaller
@ 2006-03-31 12:42   ` Christian Lindig
  0 siblings, 0 replies; 6+ messages in thread
From: Christian Lindig @ 2006-03-31 12:42 UTC (permalink / raw)
  To: skaller; +Cc: Caml List


On Mar 31, 2006, at 2:21 PM, skaller wrote:

> On Fri, 2006-03-31 at 13:56 +0200, Christian Lindig wrote:
>> I am looking for an efficient representation of binary relations in
>> OCaml. I have used bitvectors in the past but would like to use a more
>> high-level and less imperative data structure.
>
> You have said nothing about the type of elements and their
> distribution..

The typical case will be strings or other small tokens as elements but 
I'd like to keep this as a (functor) parameter or have a polymorphic 
data structure. Relations are typically sparsely populated but can be 
large (1000x1000). Sorry, but I don't have hard numbers.

-- Christian


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] efficient binary relations?
@ 2006-03-31 12:39 yoann padioleau
  0 siblings, 0 replies; 6+ messages in thread
From: yoann padioleau @ 2006-03-31 12:39 UTC (permalink / raw)
  To: caml-list




 
> I am looking for an efficient representation of binary relations in 
> OCaml. I have used bitvectors in the past but would like to use a more 
> high-level and less imperative data structure.
> 
> The most important operation is the following. For a binary relation R 
> over \X x \Y compute for a set X the set X' = { y | (x,y) in  R for all 
> x in X}. In other words, X' is the set of y that are common to all x in 
> X. Likewise, Y' must be computed. This operation requires to compute 
> the intersection of sets and was the main reason I chose bitvectors. If 
> you know about a suitable data structure I would glad to hear about it.
 
If you can represent the y by integers, and that the binary relation 
have good property, such as the set of y related to a x (the extension of x in concept analysis ?)
often follows each other,  then you can represent sets  by  list of intervals.
For instance the set of integers from 200 to 400 and 500 to 550  by [(200,400);(500;550)].
Computing intersection with such sets can be quite fast.
You can also represent such sets by tree of intervals (there is a pearl on this in journal of functionnal
programming if I remember).

Then, when you want to compute the intersection of an important number of set, maybe you have
to re-order the intersection operations, for instance starting with the sets which have the less elements.

I think that in constraint programming they have such problems.



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [Caml-list] efficient binary relations?
  2006-03-31 11:56 Christian Lindig
@ 2006-03-31 12:21 ` skaller
  2006-03-31 12:42   ` Christian Lindig
  2006-03-31 12:50 ` Sebastian Egner
  1 sibling, 1 reply; 6+ messages in thread
From: skaller @ 2006-03-31 12:21 UTC (permalink / raw)
  To: Christian Lindig; +Cc: Caml List

On Fri, 2006-03-31 at 13:56 +0200, Christian Lindig wrote:
> I am looking for an efficient representation of binary relations in 
> OCaml. I have used bitvectors in the past but would like to use a more 
> high-level and less imperative data structure.

You have said nothing about the type of elements and their 
distribution.. 


-- 
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2006-03-31 13:42 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <28871572.1143807766574.JavaMail.www@wwinf1529>
2006-03-31 13:24 ` [Caml-list] efficient binary relations? Christian Lindig
2006-03-31 12:39 yoann padioleau
  -- strict thread matches above, loose matches on Subject: below --
2006-03-31 11:56 Christian Lindig
2006-03-31 12:21 ` [Caml-list] " skaller
2006-03-31 12:42   ` Christian Lindig
2006-03-31 12:50 ` Sebastian Egner
2006-03-31 13:42   ` Christian Lindig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).