From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ben8@cs.washington.edu>
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105])
	by yquem.inria.fr (Postfix) with ESMTP id 6D130BBAF
	for <caml-list@yquem.inria.fr>; Mon, 10 Aug 2009 19:46:14 +0200 (CEST)
Received: from out1.smtp.messagingengine.com ([66.111.4.25])
  by mail4-smtp-sop.national.inria.fr with ESMTP; 10 Aug 2009 19:46:13 +0200
Received: from compute1.internal (compute1.internal [10.202.2.41])
	by gateway1.messagingengine.com (Postfix) with ESMTP id 0BC46131F0;
	Mon, 10 Aug 2009 13:46:13 -0400 (EDT)
Received: from heartbeat2.messagingengine.com ([10.202.2.161])
  by compute1.internal (MEProxy); Mon, 10 Aug 2009 13:46:13 -0400
X-Sasl-enc: JuAGDT1JzO0wUBTUyUI6tguyjeBbKmi1e2TJ3PjtVMfZ 1249926372
Received: from eplet-2.dyn.cs.washington.edu (eplet-2.dyn.cs.washington.edu [128.208.3.12])
	by mail.messagingengine.com (Postfix) with ESMTPSA id 808445915;
	Mon, 10 Aug 2009 13:46:12 -0400 (EDT)
From: Benjamin Ylvisaker <ben8@cs.washington.edu>
To: kcheung@math.carleton.ca
In-Reply-To: <60096.70.26.45.246.1249906775.squirrel@pegasus.carleton.ca>
Subject: Re: [Caml-list] ocamlgraph predecessors
X-Priority: 3 (Normal)
References: <20090809183246.GB25629@happyleptic.org> <20090810.105950.42868244.garrigue@math.nagoya-u.ac.jp> <60096.70.26.45.246.1249906775.squirrel@pegasus.carleton.ca>
Message-Id: <50C01D21-861F-478A-B5AC-ED4E11F19D34@cs.washington.edu>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v936)
Date: Mon, 10 Aug 2009 10:46:11 -0700
Cc: caml-list@yquem.inria.fr
X-Mailer: Apple Mail (2.936)

In further performance debugging, I discovered that vertex removal is  
also an O(|V|) operation.  This makes sense given the fact that  
accessing predecessors is O(|V|), because the predecessor links have  
to be removed in order to properly remove a vertex.  Stepping back,  
however, I still think it's a bad default for a generic mutable graph  
library.
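
To make the cost concrete, here is a minimal sketch of a digraph that
stores only successor sets.  It is not ocamlgraph's actual code; the
type and the function names are assumptions made up for illustration,
and vertices are plain ints.

```ocaml
module IntSet = Set.Make (Int)

(* Hypothetical successor-only digraph: vertex -> set of successors. *)
type graph = (int, IntSet.t) Hashtbl.t

let add_vertex (g : graph) v =
  if not (Hashtbl.mem g v) then Hashtbl.replace g v IntSet.empty

let add_edge (g : graph) u v =
  add_vertex g u;
  add_vertex g v;
  Hashtbl.replace g u (IntSet.add v (Hashtbl.find g u))

(* With no back links, the predecessors of v can only be found by
   visiting every vertex and testing its successor set. *)
let pred (g : graph) v =
  Hashtbl.fold
    (fun u succs acc -> if IntSet.mem v succs then u :: acc else acc)
    g []

(* Removal visits every vertex too, because v has to be scrubbed from
   every successor set that might contain it. *)
let remove_vertex (g : graph) v =
  Hashtbl.remove g v;
  Hashtbl.filter_map_inplace (fun _ succs -> Some (IntSet.remove v succs)) g
```

Both pred and remove_vertex walk the whole vertex table, which is the
O(|V|) behaviour described above.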

The hack I came up with for this one is even less elegant than the  
one for the predecessors issue.  I added a set of removed vertices to my wrapper  
graph type, and changed the wrapper vertex removal function to remove  
all the relevant edges and put the vertex into the removed set.   
Vertex member/iter/fold operations need to check the removed set to  
determine whether a vertex is actually in the graph or not.   
Periodically, when the set of removed vertices has gotten large, I  
make a new graph and copy in just the "real" vertices from the old  
graph.
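
A rough sketch of that wrapper, under stated assumptions: the succ
table below is only a stand-in for the library graph (where real
vertex removal is the expensive call being avoided), pred holds the
back links from the predecessors hack, and all names are hypothetical.
When to call compact is left to the caller.

```ocaml
module IntSet = Set.Make (Int)

type t = {
  mutable succ : (int, IntSet.t) Hashtbl.t;  (* stand-in for the library graph *)
  mutable pred : (int, IntSet.t) Hashtbl.t;  (* back links kept by the wrapper *)
  mutable removed : IntSet.t;                (* vertices marked as removed *)
}

let create () =
  { succ = Hashtbl.create 16; pred = Hashtbl.create 16; removed = IntSet.empty }

let neighbours tbl v =
  match Hashtbl.find_opt tbl v with Some s -> s | None -> IntSet.empty

(* Membership has to consult the removed set. *)
let mem_vertex g v = Hashtbl.mem g.succ v && not (IntSet.mem v g.removed)

let add_vertex g v =
  g.removed <- IntSet.remove v g.removed;
  if not (Hashtbl.mem g.succ v) then begin
    Hashtbl.replace g.succ v IntSet.empty;
    Hashtbl.replace g.pred v IntSet.empty
  end

let add_edge g u v =
  add_vertex g u;
  add_vertex g v;
  Hashtbl.replace g.succ u (IntSet.add v (neighbours g.succ u));
  Hashtbl.replace g.pred v (IntSet.add u (neighbours g.pred v))

(* Drop the incident edges, then mark the vertex instead of deleting it. *)
let remove_vertex g v =
  if mem_vertex g v then begin
    IntSet.iter
      (fun w -> Hashtbl.replace g.pred w (IntSet.remove v (neighbours g.pred w)))
      (neighbours g.succ v);
    IntSet.iter
      (fun u -> Hashtbl.replace g.succ u (IntSet.remove v (neighbours g.succ u)))
      (neighbours g.pred v);
    Hashtbl.replace g.succ v IntSet.empty;
    Hashtbl.replace g.pred v IntSet.empty;
    g.removed <- IntSet.add v g.removed
  end

(* Iteration skips vertices that are only waiting to be purged. *)
let iter_vertex f g =
  Hashtbl.iter (fun v _ -> if not (IntSet.mem v g.removed) then f v) g.succ

(* Periodic rebuild: copy just the "real" vertices into fresh tables. *)
let compact g =
  let live tbl =
    let fresh = Hashtbl.create 16 in
    Hashtbl.iter
      (fun v s -> if not (IntSet.mem v g.removed) then Hashtbl.replace fresh v s)
      tbl;
    fresh
  in
  g.succ <- live g.succ;
  g.pred <- live g.pred;
  g.removed <- IntSet.empty
```

The cost of remove_vertex here is proportional to the degree of the
vertex; the O(|V|) work is paid only in compact.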

If the library were to include a version that maintains back links, it  
would incur the space overhead and (small in my opinion) time overhead  
of my predecessors wrapper hack.  It could dispense with the vertex  
removal hack, though, because the back links could be deleted directly.
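
For comparison, a minimal sketch of what a back-link-maintaining
variant might look like.  This is an assumption about the general
shape, not the code of ConcreteBidirectional or any other ocamlgraph
module, and every name in it is hypothetical.

```ocaml
module IntSet = Set.Make (Int)

(* Hypothetical bidirectional digraph: every edge u -> v is recorded
   both in the successor set of u and in the predecessor set of v. *)
type t = {
  succ : (int, IntSet.t) Hashtbl.t;
  pred : (int, IntSet.t) Hashtbl.t;
}

let create () = { succ = Hashtbl.create 16; pred = Hashtbl.create 16 }

let get tbl v = Option.value (Hashtbl.find_opt tbl v) ~default:IntSet.empty

let update tbl v f = Hashtbl.replace tbl v (f (get tbl v))

let add_vertex g v =
  if not (Hashtbl.mem g.succ v) then begin
    Hashtbl.replace g.succ v IntSet.empty;
    Hashtbl.replace g.pred v IntSet.empty
  end

let add_edge g u v =
  add_vertex g u;
  add_vertex g v;
  update g.succ u (IntSet.add v);  (* forward link *)
  update g.pred v (IntSet.add u)   (* back link: the extra space and time *)

(* Predecessors are a direct table lookup, with no scan over all vertices. *)
let predecessors g v = IntSet.elements (get g.pred v)

(* The back links say exactly which successor sets mention v, so the
   vertex can be deleted outright, with no removed set needed. *)
let remove_vertex g v =
  IntSet.iter (fun w -> update g.pred w (IntSet.remove v)) (get g.succ v);
  IntSet.iter (fun u -> update g.succ u (IntSet.remove v)) (get g.pred v);
  Hashtbl.remove g.succ v;
  Hashtbl.remove g.pred v
```

Every edge insertion and deletion touches two sets instead of one,
which is the overhead discussed in this thread.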

Ben


On Aug 10, 2009, at 5:19 AM, kcheung@math.carleton.ca wrote:

> Perhaps something like what is in ConcreteBidirectional could
> be implemented for general Digraph so predecessors
> can be accessed in O(1)?  If I am not mistaken, that
> will double the storage and running time of most of
> the operations.  This implementation could be added
> as an additional variant without modifying existing
> code.
>
> Kevin Cheung.
>
>> From: rixed@happyleptic.org
>>>> What you're asking is similar to the problem of finding the
>>>> predecessor of an arbitrary node in a singly-linked-list.  You have
>>>> no option but to scan the whole list to find its predecessor.  If
>>>> you had a doubly-linked-list, predecessor lookups would work easily,
>>>> but that's a different data structure, with much more overhead.
>>>
>>> Much more overhead, really?
>>> So it is for performance reasons that all functional languages
>>> promote singly-linked lists, while for instance in Linux every list
>>> is implemented with a doubly linked list for purely ideological
>>> reasons?
>>>
>>> :-)
>>
>> Yes indeed, much more overhead. But the source is not the fact you
>> have to maintain backlinks, but their impact on the GC.
>> With a GC, any modification on existing values has a cost, since you
>> have to keep track of them independently of the value itself.
>> Since Linux has no GC, using doubly linked lists has only a very
>> limited cost, mostly related to the extra space needed.
>> By the way, BSD uses lots of singly-linked lists, probably because it
>> comes from a time when there was not so much memory.
>>
>> Jacques Garrigue
>>