From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id 307D2BC88 for ; Wed, 9 Feb 2005 10:43:37 +0100 (CET) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j199hafY016805 for ; Wed, 9 Feb 2005 10:43:36 +0100 Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id KAA26538 for ; Wed, 9 Feb 2005 10:43:36 +0100 (MET) Received: from rproxy.gmail.com (rproxy.gmail.com [64.233.170.199]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j199hZA7016802 for ; Wed, 9 Feb 2005 10:43:35 +0100 Received: by rproxy.gmail.com with SMTP id i8so1219700rne for ; Wed, 09 Feb 2005 01:43:34 -0800 (PST) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=beta; d=gmail.com; h=received:message-id:date:from:reply-to:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:references; b=R49TpX8C6WvlLVdtaAJ6cngM5PxShcs5xo/qfrd9cul2hIxhl5Vrk0onXUvDoxtsf/au4JxySgeAuVGnXbqRLXBBItZ7tQRzUqpdsT7K/BL5mNi6RjUrzTnTbvyfav7YqgxrIncBmXQolfvFlmNEbkf1aXyiwmn4c5BYH/9HzYk= Received: by 10.38.101.30 with SMTP id y30mr404215rnb; Wed, 09 Feb 2005 01:43:34 -0800 (PST) Received: by 10.38.65.58 with HTTP; Wed, 9 Feb 2005 01:43:34 -0800 (PST) Message-ID: <7f8e92aa050209014322b561e@mail.gmail.com> Date: Wed, 9 Feb 2005 11:43:34 +0200 From: Radu Grigore Reply-To: Radu Grigore To: skaller@users.sourceforge.net Subject: Re: [Caml-list] Re: paralell assignment problem Cc: Stefan Monnier , caml-list In-Reply-To: <7f8e92aa05020810335ab052e0@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit References: <1107832025.13571.260.camel@pelican.wigram> <87y8dzi0ns.fsf-monnier+gmane.comp.lang.caml.inria@gnu.org> <1107882489.5022.175.camel@pelican.wigram> <7f8e92aa05020810335ab052e0@mail.gmail.com> X-Miltered: at nez-perce with ID 4209DB48.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at nez-perce with ID 4209DB47.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 wrote:01 sourceforge:01 wrote:01 computed:01 computed:01 topological:01 rhs:01 sub-optimal:01 nodes:01 nodes:01 formalize:01 nada:98 constants:01 node:01 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr X-Spam-Status: No, score=0.0 required=5.0 tests=RCVD_BY_IP autolearn=disabled version=3.0.2 X-Spam-Level: On Tue, 8 Feb 2005 20:33:52 +0200, Radu Grigore wrote: > On Tue, 08 Feb 2005 09:12:32 -0800 (PST), skaller > wrote: > > > However it isn't clear (to me) that just picking an arbitrary > > assignment to split into two using a temporary > > actually minimises the number of temporaries. > > I'm mildly convinced that it works. OK, I took 1h to think about this problem more carefully. Now I'm almost convinced that it is NP :) I'll use the example given in the previous mail: a = b+c b = c c = a+b We can read the first equation as "a must be computed before b and c". So we can construct a graph whose edges mean "source should be computed before target". In this case it is: a -> b a -> c b -> c c -> a c -> b If this graph has no cycles the problem is simple: just do a topological sort and the problem is solved in O(V+E). If the graph has cycles then we need to break them by introducing temporaries. Without looking at the form of RHS expressions, that is without considering computing an expression partially then we are left with two uses for temporaries: 1. hold the original value of a variable 2. hold the result of an expression and write it later to the destination Point (1) requires one extra assignment, while point (2) requires two extra assignments. The effect is however the same: one of the variable's value can be computed as soon as desired. Therefore we will only use (1). [NOTE that your approach used (2) and hence is clearly sub-optimal]. The effect of (1) (and (2)) on the constructed graph is that all in-edges of a variable are removed. So we have reduced the problem to this: "Given a graph mark as few nodes as possible such that the after removing in-edges of marked nodes we obtain a tree". Now, that looks a lot like SET-COVER. In the example above we have these cycles: 1: abc 2: bc 3: ac The set is {1, 2, 3} and the subsets from which we have to choose are a:{1, 3}, b:{2}, c:{1, 2, 3}. The minimum cover is clearly c:{1, 2, 3} and hence the solution is to create a temporary for the original value of c. After removing all in-edges of c and topologically sorting the graph we get: c, a, b. So this is the order in which we should do the other assignments. Yes -- we reached the correct solution. Now, it looks like that starting from a SET-COVER problem one can build a graph with the right relationship between nodes and cycles and then an assignment problem. I can't give a (semi-)formal argument but I have tried on a few instances and the process of construction seems to be fairly algorithmic (although hard to formalize -- at least in an email -- since I used pictures). This is why I say your problem looks NP. An acceptable algorithm might be this one: 1. detect strongly connected components (optional) 2. within each SCC find all cycles and group them in subsets according to the nodes they have in common 3. within each SCC solve SET-COVER (for cycle sets given by common nodes) 4. sort topologically the graph after removing the in-edges of nodes marked at step 3 Step 2 is clearly exponential in the number of variables (consider a complete graph). For step 3 you probably want to use an approximation algorithm. It looks like there are ones with good complexity (but I do not know them so I can't tell anything about the constants). In any case, have a look at: http://www.nada.kth.se/~viggo/wwwcompendium/node146.html If you use the lg C algorithm (C = cycles) then the overall complexity is something like O(2^V), where V is the number of variables. -- regards, radu http://rgrig.idilis.ro/