caml-list - the Caml user's mailing list
* ocamlgraph predecessors
@ 2009-08-08  5:24 Benjamin Ylvisaker
  2009-08-08 13:35 ` [Caml-list] " Edgar Friendly
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Benjamin Ylvisaker @ 2009-08-08  5:24 UTC
  To: caml-list

I have been using ocamlgraph for a while, and have been generally  
happy with it.  I experienced some poor performance with moderately  
large graphs (10-100k vertices) recently, which led me to look through  
the source code a little.  It seems that doing anything with the  
predecessors of a vertex, even just getting a list of them, requires  
scanning through all the vertices in a graph.  This seems a little  
crazy to me.  Am I missing something?  Is there some kind of workaround
that gives reasonable performance for predecessor operations
(i.e. not O(|V|))?

Thanks,
Ben



* Re: [Caml-list] ocamlgraph predecessors
  2009-08-08  5:24 ocamlgraph predecessors Benjamin Ylvisaker
@ 2009-08-08 13:35 ` Edgar Friendly
  2009-08-08 20:16   ` Benjamin Ylvisaker
  2009-08-25 14:22 ` Julien Signoles
  2009-08-26  6:54 ` Jean-Christophe Filliâtre
  2 siblings, 1 reply; 8+ messages in thread
From: Edgar Friendly @ 2009-08-08 13:35 UTC
  To: Benjamin Ylvisaker; +Cc: caml-list

Benjamin Ylvisaker wrote:
> I have been using ocamlgraph for a while, and have been generally happy
> with it.  I experienced some poor performance with moderately large
> graphs (10-100k vertices) recently, which led me to look through the
> source code a little.  It seems that doing anything with the
> predecessors of a vertex, even just getting a list of them, requires
> scanning through all the vertices in a graph.  This seems a little crazy
> to me.  Am I missing something?  Is there some kind of work-around that
> gives reasonable performance for predecessor operations (i.e. not O(|V|)).
> 
> Thanks,
> Ben
> 

What you're asking is similar to the problem of finding the predecessor
of an arbitrary node in a singly-linked-list.  You have no option but to
scan the whole list to find its predecessor.  If you had a
doubly-linked-list, predecessor lookups would work easily, but that's a
different data structure, with much more overhead.

When you talk about "predecessors", I assume you're using a directed
graph, and want to know which nodes have edges to a given node.  If your
graph is static, you could build lookup tables, pregenerating this
information and caching it.  Even with a dynamic graph, maintaining
lookup tables on this info shouldn't be too hard.

Does that help?
E
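
The lookup-table idea for a static graph can be sketched as follows. This is
only an illustration using the OCaml standard library, not ocamlgraph code:
the names build_pred_table and preds are hypothetical, and vertices are
assumed to be plain integers given as a list of (source, destination) edges.

```ocaml
(* Hypothetical helper, not part of ocamlgraph: build a table mapping each
   vertex to the list of vertices that have an edge pointing to it. *)
let build_pred_table (edges : (int * int) list) : (int, int list) Hashtbl.t =
  let table = Hashtbl.create 16 in
  List.iter
    (fun (src, dst) ->
      let existing = Option.value (Hashtbl.find_opt table dst) ~default:[] in
      Hashtbl.replace table dst (src :: existing))
    edges;
  table

(* Predecessors of [v]: a single hash lookup instead of a scan of the graph. *)
let preds table v = Option.value (Hashtbl.find_opt table v) ~default:[]

let () =
  let table = build_pred_table [ (1, 3); (2, 3); (3, 4) ] in
  assert (List.sort compare (preds table 3) = [ 1; 2 ]);
  assert (preds table 1 = [])
```

Building the table costs one pass over the edges; it has to be rebuilt or
updated whenever the graph changes.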



* Re: [Caml-list] ocamlgraph predecessors
  2009-08-08 13:35 ` [Caml-list] " Edgar Friendly
@ 2009-08-08 20:16   ` Benjamin Ylvisaker
  2009-08-09 14:56     ` Edgar Friendly
  0 siblings, 1 reply; 8+ messages in thread
From: Benjamin Ylvisaker @ 2009-08-08 20:16 UTC
  To: Edgar Friendly; +Cc: caml-list


On Aug 8, 2009, at 6:35 AM, Edgar Friendly wrote:

> What you're asking is similar to the problem of finding the  
> predecessor
> of an arbitrary node in a singly-linked-list.  You have no option  
> but to
> scan the whole list to find its predecessor.  If you had a
> doubly-linked-list, predecessor lookups would work easily, but  
> that's a
> different data structure, with much more overhead.
>
> When you talk about "predecessors", I assume you're using a directed
> graph, and want to know which nodes have edges to a given node.  if  
> your
> graph is static, you could build lookup tables, pregenerating this
> information and caching it.  Even with a dynamic graph, maintaining
> lookup tables on this info shouldn't be too hard.
>
> Does that help?
> E

The list analogy seems like a little bit of a stretch to me.  I  
understand the point, but I think most programmers would expect  
predecessor and successor operations in a generic mutable directed  
graph library to be symmetric in every way, including performance.

I'm thinking about making a thin wrapper around ocamlgraph that makes  
"internal" edges in both directions with tags to distinguish them  
whenever the user creates an "external" edge.  All the wrapper graph  
traversal functions would only use ocamlgraph's successor functions,  
and then use the tags to distinguish which edges are really supposed  
to point in which direction.  It's a bit of a hassle, and will roughly  
double the amount of storage required for edges, but I need  
predecessor access in my application, and the O(|V|) performance is  
really painful for big graphs.

Ben



* Re: [Caml-list] ocamlgraph predecessors
  2009-08-08 20:16   ` Benjamin Ylvisaker
@ 2009-08-09 14:56     ` Edgar Friendly
  2009-08-25 14:22       ` Julien Signoles
  0 siblings, 1 reply; 8+ messages in thread
From: Edgar Friendly @ 2009-08-09 14:56 UTC
  To: Benjamin Ylvisaker; +Cc: caml-list

Benjamin Ylvisaker wrote:
> The list analogy seems like a little bit of a stretch to me.  I
> understand the point, but I think most programmers would expect
> predecessor and successor operations in a generic mutable directed graph
> library to be symmetric in every way, including performance.
> 
If you'd like to call this a weakness in the current implementation, you
may.

> I'm thinking about making a thin wrapper around ocamlgraph that makes
> "internal" edges in both directions with tags to distinguish them
> whenever the user creates an "external" edge.  All the wrapper graph
> traversal functions would only use ocamlgraph's successor functions, and
> then use the tags to distinguish which edges are really supposed to
> point in which direction.  It's a bit of a hassle, and will roughly
> double the amount of storage required for edges, but I need predecessor
> access in my application, and the O(|V|) performance is really painful
> for big graphs.
> 
> Ben
> 
This is another solution to the slow predecessor performance, and will
have different performance characteristics than predecessor
lookup-tables.  Note that the lookup table solution is isomorphic to
building a second graph with all the arrows reversed, and using the
efficient successor operations on it.  Maybe this'll be easier than
keeping a merged graph.  Maybe not.

E
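
For a dynamic graph, the "second graph with all the arrows reversed" can be
kept in step with the forward one on every update. The sketch below assumes
integer vertices and uses only the standard library; the digraph record and
its functions are hypothetical names, not ocamlgraph's API.

```ocaml
module IntSet = Set.Make (Int)

(* Hypothetical structure for illustration: a forward adjacency table plus a
   mirrored one with every arrow reversed. *)
type digraph = {
  succs : (int, IntSet.t) Hashtbl.t; (* v -> vertices v points to *)
  preds : (int, IntSet.t) Hashtbl.t; (* v -> vertices pointing to v *)
}

let create () = { succs = Hashtbl.create 16; preds = Hashtbl.create 16 }

let neighbours table v =
  Option.value (Hashtbl.find_opt table v) ~default:IntSet.empty

(* Each update touches both tables so they stay mirror images. *)
let add_edge g src dst =
  Hashtbl.replace g.succs src (IntSet.add dst (neighbours g.succs src));
  Hashtbl.replace g.preds dst (IntSet.add src (neighbours g.preds dst))

let remove_edge g src dst =
  Hashtbl.replace g.succs src (IntSet.remove dst (neighbours g.succs src));
  Hashtbl.replace g.preds dst (IntSet.remove src (neighbours g.preds dst))

let () =
  let g = create () in
  add_edge g 1 3;
  add_edge g 2 3;
  remove_edge g 1 3;
  assert (IntSet.elements (neighbours g.preds 3) = [ 2 ]);
  assert (IntSet.is_empty (neighbours g.succs 1))
```

The cost is roughly twice the edge storage, the same trade-off as the merged
graph with tagged edges, but the two directions never need to be told apart.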



* Re: [Caml-list] ocamlgraph predecessors
  2009-08-08  5:24 ocamlgraph predecessors Benjamin Ylvisaker
  2009-08-08 13:35 ` [Caml-list] " Edgar Friendly
@ 2009-08-25 14:22 ` Julien Signoles
  2009-08-26 14:35   ` Benjamin Ylvisaker
  2009-08-26  6:54 ` Jean-Christophe Filliâtre
  2 siblings, 1 reply; 8+ messages in thread
From: Julien Signoles @ 2009-08-25 14:22 UTC
  To: Benjamin Ylvisaker; +Cc: caml-list

Benjamin Ylvisaker wrote:
> I have been using ocamlgraph for a while, and have been generally happy 
> with it.  I experienced some poor performance with moderately large 
> graphs (10-100k vertices) recently, which led me to look through the 
> source code a little.  It seems that doing anything with the 
> predecessors of a vertex, even just getting a list of them, requires 
> scanning through all the vertices in a graph.  This seems a little crazy 
> to me.  Am I missing something?  Is there some kind of work-around that 
> gives reasonable performance for predecessor operations (i.e. not O(|V|)).

Actually, looking at the current implementation, accessing predecessors
is worse than O(|V|): it is O(max(|V|, |E|)).

If you use concrete (imperative directional) graphs, the simplest
workaround is to use Imperative.Digraph.ConcreteBidirectional, as
suggested by Kevin Cheung. It uses more memory (at worst double) than
standard concrete directional graphs. But accessing predecessors is
O(1) amortized instead of O(max(|V|, |E|)), and removing a vertex is
O(D*ln(D)), where D is the maximal degree of the graph, instead of
O(|V|*ln(|V|)).

If you don't use this functor, other work-arounds have been suggested in 
other posts.

By the way contributing to ocamlgraph by adding 
Imperative.Digraph.AbstractBidirectional (for instance) is still 
possible and welcome :o).

--
Julien Signoles
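
A minimal usage sketch of the functor mentioned above, assuming the
ocamlgraph library is installed and linked (it exposes its modules under
Graph) and that vertices are plain integers. The module names V and G are
local choices made for this example.

```ocaml
(* Requires the ocamlgraph library, e.g. built with
   ocamlfind ocamlopt -package ocamlgraph -linkpkg example.ml *)

(* Vertex type: the functor expects a comparable, hashable type. *)
module V = struct
  type t = int
  let compare (a : int) (b : int) = Stdlib.compare a b
  let hash (x : int) = Hashtbl.hash x
  let equal (a : int) (b : int) = a = b
end

module G = Graph.Imperative.Digraph.ConcreteBidirectional (V)

(* add_edge also adds the endpoints if they are not yet in the graph. *)
let g = G.create ()

let () =
  G.add_edge g 1 3;
  G.add_edge g 2 3;
  G.add_edge g 3 4

(* pred no longer needs a scan over the whole graph. *)
let () =
  assert (List.sort compare (G.pred g 3) = [ 1; 2 ]);
  assert (G.succ g 3 = [ 4 ])
```

Edges in this structure carry no labels, so it only fits applications that
do not need labeled edges.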



* Re: [Caml-list] ocamlgraph predecessors
  2009-08-09 14:56     ` Edgar Friendly
@ 2009-08-25 14:22       ` Julien Signoles
  0 siblings, 0 replies; 8+ messages in thread
From: Julien Signoles @ 2009-08-25 14:22 UTC
  To: Edgar Friendly; +Cc: caml list

Edgar Friendly wrote:
> This is another solution to the slow predecessor performance, and will
> have different performance characteristics than predecessor
> lookup-tables.  Note that the lookup table solution is isomorphic to
> building a second graph with all the arrows reversed, and using the
> efficient successor operations on it.  Maybe this'll be easier than
> keeping a merged graph.  Maybe not.

This solution should be easy to implement with ocamlgraph because this
operation already exists in ocamlgraph: it is the "mirror" function.
See http://ocamlgraph.lri.fr/doc/Oper.Make.html

--
Julien Signoles



* Re: [Caml-list] ocamlgraph predecessors
  2009-08-08  5:24 ocamlgraph predecessors Benjamin Ylvisaker
  2009-08-08 13:35 ` [Caml-list] " Edgar Friendly
  2009-08-25 14:22 ` Julien Signoles
@ 2009-08-26  6:54 ` Jean-Christophe Filliâtre
  2 siblings, 0 replies; 8+ messages in thread
From: Jean-Christophe Filliâtre @ 2009-08-26  6:54 UTC
  To: Benjamin Ylvisaker; +Cc: caml-list

Hi,

Benjamin Ylvisaker wrote:
> I have been using ocamlgraph for a while, and have been generally happy
> with it.  I experienced some poor performance with moderately large
> graphs (10-100k vertices) recently, which led me to look through the
> source code a little.  It seems that doing anything with the
> predecessors of a vertex, even just getting a list of them, requires
> scanning through all the vertices in a graph.  This seems a little crazy
> to me.  Am I missing something?  Is there some kind of work-around that
> gives reasonable performance for predecessor operations (i.e. not O(|V|)).

Not providing predecessors in constant time was a deliberate choice in
Ocamlgraph. (A graph is basically a map which binds any vertex to the
set of its successors, and that's it.)

If you need efficient access to the predecessors, you have several
workarounds:

- implement your own graph data structure; after all, ocamlgraph was
designed to clearly separate data structures and algorithms, so that you
will still be able to use graph algorithms on your own graphs.

- use the graph data structure Imperative.Digraph.ConcreteBidirectional,
which is the only graph data structure in Ocamlgraph providing
predecessors in constant time; it is actually the contribution of a user
(Ted Kremenek) who experienced the same need as yourself.

- memoize the results of the predecessors function (either in a modified
version of the data structure or externally if your algorithm allows it).

Hope this helps,
-- 
Jean-Christophe
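
To make the trade-off concrete, here is a simplified model of the
representation described above (a map from each vertex to its successor
set), together with the memoization workaround. It is an assumption-laden
sketch on integer vertices with the standard library, not ocamlgraph's
actual code; pred_scan and memo_pred are hypothetical names.

```ocaml
module IntMap = Map.Make (Int)
module IntSet = Set.Make (Int)

(* Simplified model: each vertex maps to the set of its successors. *)
type graph = IntSet.t IntMap.t

(* Without extra bookkeeping, predecessors need a scan over every vertex. *)
let pred_scan (g : graph) v =
  IntMap.fold
    (fun u succs acc -> if IntSet.mem v succs then u :: acc else acc)
    g []
  |> List.rev

(* Memoized variant. The cache belongs to this particular graph value; a
   graph with edges added or removed needs a fresh cache. *)
let memo_pred (g : graph) =
  let cache = Hashtbl.create 16 in
  fun v ->
    match Hashtbl.find_opt cache v with
    | Some ps -> ps
    | None ->
        let ps = pred_scan g v in
        Hashtbl.add cache v ps;
        ps

let () =
  let g =
    IntMap.empty
    |> IntMap.add 1 (IntSet.of_list [ 3 ])
    |> IntMap.add 2 (IntSet.of_list [ 3 ])
    |> IntMap.add 3 IntSet.empty
  in
  let pred = memo_pred g in
  assert (pred 3 = [ 1; 2 ]);
  assert (pred 3 = [ 1; 2 ]); (* second call is served from the cache *)
  assert (pred 1 = [])
```

The first query for a vertex still pays for the full scan; only repeated
queries on an unchanged graph become cheap.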



* Re: [Caml-list] ocamlgraph predecessors
  2009-08-25 14:22 ` Julien Signoles
@ 2009-08-26 14:35   ` Benjamin Ylvisaker
  0 siblings, 0 replies; 8+ messages in thread
From: Benjamin Ylvisaker @ 2009-08-26 14:35 UTC
  To: Julien Signoles; +Cc: caml-list


On Aug 25, 2009, at 10:22 AM, Julien Signoles wrote:

> Benjamin Ylvisaker wrote:
>> I have been using ocamlgraph for a while, and have been generally  
>> happy with it.  I experienced some poor performance with moderately  
>> large graphs (10-100k vertices) recently, which led me to look  
>> through the source code a little.  It seems that doing anything  
>> with the predecessors of a vertex, even just getting a list of  
>> them, requires scanning through all the vertices in a graph.  This  
>> seems a little crazy to me.  Am I missing something?  Is there some  
>> kind of work-around that gives reasonable performance for  
>> predecessor operations (i.e. not O(|V|)).
>
> Actually, looking at the current implementation, accessing
> predecessors is worse than O(|V|): it is O(max(|V|, |E|)).
>
> If you use concrete (imperative directional) graphs, the simplest
> workaround is to use Imperative.Digraph.ConcreteBidirectional, as
> suggested by Kevin Cheung. It uses more memory (at worst double)
> than standard concrete directional graphs. But accessing
> predecessors is O(1) amortized instead of O(max(|V|, |E|)), and
> removing a vertex is O(D*ln(D)), where D is the maximal degree of
> the graph, instead of O(|V|*ln(|V|)).
>
> If you don't use this functor, other work-arounds have been  
> suggested in other posts.
>
> By the way contributing to ocamlgraph by adding  
> Imperative.Digraph.AbstractBidirectional (for instance) is still  
> possible and welcome :o).

Thanks for your suggestions.  I had not noticed the
ConcreteBidirectional module, but it looks like it wouldn't be a
drop-in replacement for me, because it's unlabeled and I need labels.

If anyone is curious, here is the wrapper logic that I ended up adding:

When the user wants an edge from v1 to v2 with label X, two "internal"
edges get created: one from v1 to v2 with label EdgeForward (X) and
one from v2 to v1 with label EdgeBackward (X).  These two edges are
considered equivalent by the wrapper code.  Additionally, to make edge
removal faster, I added a table that maps an edge to its "pair".  All
edge-related wrapper functions can take either of the pair of wrapper
edges, and will use the source or destination vertex of the edge,
depending on the Forward/Backward label.  The edge removal wrapper
function gets the mate of the passed-in edge by looking it up in the
table, and removes them both.
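
The tagged-edge scheme just described can be modelled roughly as below. This
is a sketch on a plain successor table, not the actual wrapper and not
ocamlgraph code: all names are hypothetical, vertices are assumed to be
integers, and the edge-pair table is left out (the mate of an edge is
recomputed from its endpoints and label instead).

```ocaml
(* Hypothetical names throughout; this models the scheme on a plain
   successor table rather than on ocamlgraph itself. *)
type 'a tag = EdgeForward of 'a | EdgeBackward of 'a

(* Successor-only store: vertex -> list of (tagged label, other endpoint). *)
type 'a store = (int, ('a tag * int) list) Hashtbl.t

let out_edges (s : 'a store) v =
  Option.value (Hashtbl.find_opt s v) ~default:[]

(* One external edge v1 -> v2 becomes two internal edges. *)
let add_edge (s : 'a store) v1 label v2 =
  Hashtbl.replace s v1 ((EdgeForward label, v2) :: out_edges s v1);
  Hashtbl.replace s v2 ((EdgeBackward label, v1) :: out_edges s v2)

(* Removing an external edge removes both internal halves. *)
let remove_edge (s : 'a store) v1 label v2 =
  let drop e = List.filter (fun e' -> e' <> e) in
  Hashtbl.replace s v1 (drop (EdgeForward label, v2) (out_edges s v1));
  Hashtbl.replace s v2 (drop (EdgeBackward label, v1) (out_edges s v2))

(* Both queries only walk the internal successor list of [v]. *)
let successors s v =
  List.filter_map
    (function EdgeForward _, w -> Some w | EdgeBackward _, _ -> None)
    (out_edges s v)

let predecessors s v =
  List.filter_map
    (function EdgeBackward _, w -> Some w | EdgeForward _, _ -> None)
    (out_edges s v)
```

For example, after add_edge s 1 "x" 2, predecessors s 2 yields [1] and
successors s 1 yields [2], each by looking only at the edges stored at the
queried vertex.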

To make vertex removal faster, I added a Boolean "removed" field to  
vertex labels that is set to false on vertex creation.  When the  
wrapper vertex removal function is called, it removes all the incident  
edges and then just sets the vertex's removed flag to true.  Vertex  
scanning functions clearly need to check the flag to determine whether  
an "internal" vertex should actually be considered part of the graph  
or not.

If there is a lot of vertex creation and removal in an application,  
clearly a lot of "garbage" vertices will end up polluting the graph.   
When the amount of garbage gets too large, a new copy of a graph can  
be constructed with only the "real" vertices copied over.  This is not  
a totally transparent operation, because if there are any tables keyed  
on vertices or edges, external to the graph itself, they'll get  
confused by the copying.
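
The "removed" flag and the occasional copy can be sketched like this. Again
a hypothetical model with standard-library types only: the vertex and graph
records are assumptions for illustration, and removal of incident edges is
omitted.

```ocaml
(* Hypothetical vertex record: [removed] is the tombstone flag. *)
type vertex = { id : int; mutable removed : bool }

type graph = { mutable vertices : vertex list }

let add_vertex g id =
  let v = { id; removed = false } in
  g.vertices <- v :: g.vertices;
  v

(* Constant-time logical removal; the record stays in the store as garbage.
   (Removing the incident edges is omitted in this sketch.) *)
let remove_vertex v = v.removed <- true

(* Every scan has to skip tombstoned vertices. *)
let live_vertices g = List.filter (fun v -> not v.removed) g.vertices

let garbage_count g =
  List.length (List.filter (fun v -> v.removed) g.vertices)

(* Compaction: build a fresh graph with fresh records for live vertices.
   External tables keyed on the old records would no longer match. *)
let compact g =
  {
    vertices =
      List.map (fun v -> { id = v.id; removed = false }) (live_vertices g);
  }
```

A caller might trigger compact once garbage_count exceeds some fraction of
the vertex count; the right threshold depends on the application.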

This set of wrapper logic clearly adds a non-trivial amount of memory
overhead:  1) The edge and vertex labels are a little bit bigger.  2)
There are two edges for every externally visible edge.  3) The
edge-pair table adds another O(|E|) chunk of memory.  4) An
application-dependent number of garbage vertices will be floating
around.  I'm pretty sure all the wrapper operations are asymptotically
as fast as they reasonably can be, though.  If an implementation like
this were done in the library itself (notice the strategic use of the
passive voice) instead of as a wrapper, I'm pretty sure vertex removal
could be handled more cleanly.

Ben



Thread overview: 8+ messages
2009-08-08  5:24 ocamlgraph predecessors Benjamin Ylvisaker
2009-08-08 13:35 ` [Caml-list] " Edgar Friendly
2009-08-08 20:16   ` Benjamin Ylvisaker
2009-08-09 14:56     ` Edgar Friendly
2009-08-25 14:22       ` Julien Signoles
2009-08-25 14:22 ` Julien Signoles
2009-08-26 14:35   ` Benjamin Ylvisaker
2009-08-26  6:54 ` Jean-Christophe Filliâtre
