From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: * X-Spam-Status: No, score=1.6 required=5.0 tests=AWL,SPF_SOFTFAIL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by yquem.inria.fr (Postfix) with ESMTP id AE694BB84 for ; Sat, 17 Jan 2009 16:48:05 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AscBAH+McUnVhjEYkWdsb2JhbACUBAEBAQEJCwoHEQO3Z4Vz X-IronPort-AV: E=Sophos;i="4.37,281,1231110000"; d="pl'?scan'208";a="21630867" Received: from ihsmtp02voda.lis.interhost.com (HELO ihsmtp02cons.lis.interhost.com) ([213.134.49.24]) by mail3-smtp-sop.national.inria.fr with ESMTP; 17 Jan 2009 16:48:05 +0100 Received: from [192.168.1.64] ([77.54.249.136]) by ihsmtp02cons.lis.interhost.com with Microsoft SMTPSVC(6.0.3790.3959); Sat, 17 Jan 2009 15:45:25 +0000 Message-ID: <4971FDA9.5050704@inescporto.pt> Date: Sat, 17 Jan 2009 15:47:53 +0000 From: Hugo Ferreira User-Agent: Thunderbird 2.0.0.19 (X11/20090105) MIME-Version: 1.0 To: Andrej Bauer Cc: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Optimizing symbolic processing code References: <4970488C.9080104@inescporto.pt> <200901161341.53632.jon@ffconsultancy.com> <49709693.50201@inescporto.pt> <4970B398.5010100@inescporto.pt> <7d8707de0901161109y2de73536oc7a454f4f0e1ad91@mail.gmail.com> <7d8707de0901161248t39960316v46c6fa64e8001531@mail.gmail.com> <4971A4CD.9070708@inescporto.pt> <7d8707de0901170339m91341a4ob9482b1280833a3@mail.gmail.com> In-Reply-To: <7d8707de0901170339m91341a4ob9482b1280833a3@mail.gmail.com> Content-Type: multipart/mixed; boundary="------------040506060704090209070501" X-OriginalArrivalTime: 17 Jan 2009 15:45:25.0080 (UTC) FILETIME=[9D0B8D80:01C978BA] X-Spam: no; 0.00; andrej:01 andrej:01 unification:01 arrays:01 arrays:01 unification:01 low-level:01 ocaml:01 recursion:01 endline:01 endline:01 pervasives:01 stdout:01 lowercase:01 beginner's:01 X-Attachments: cset="UTF-8" type="application/x-perl" name="prolog.pl" name="prolog.pl" This is a multi-part message in MIME format. --------------040506060704090209070501 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Hello Andrej, Andrej Bauer wrote: > Dear Hugo, > > you have not noticed miniprolog before because it was not there until yesterday. > I see. > Almost any optimization will cause my interpreter to go much faster. Not quite. It depends a lot on the problem to solve and the order in which the knowledge base is declared. More on this later. > I think the most reasonable one to do would be to avoid explicit > subtitutions during unification Not quite. You still need to deal with creating and using new variables somewhere, somehow. > (since I already keep track of the > environment), but then I get into the union-find business. > > I would not be too sure that lists are much slower than arrays when > the lists and arrays are both short. The reason why I used arrays is because the union-find data structure uses arrays, specifically persistent arrays implemented using standard arrays. > You might be yet another victim > of premature optimization. > Maybe. But I haven't really done any optimization save for using "standard" data structures and techniques in resolution. > How does the performance of your prolog compare to my miniprolog? ~/Desktop/prolog_test/miniprolog$ time ./miniprolog.native -n prolog.pl A = p B = r C = o D = l E = o F = g real 12m14.648s user 12m8.606s sys 0m0.168s versus my ~/workspace/planner$ time ./itest_1.p.native real 19m12.786s user 19m6.728s sys 0m0.196s Note that I changed your code slightly to terminate after the first solution. See changes below. > Judging from the care you took with your interpreter, yours should be > much faster. This depends on the problem. I have attached the Prolog code for you to see and experiment with. You will notice that the order of the assertions are done so as to make resolution very difficult. Notice however that: 1) Care was taken so that (nearly) all assertions you compare with are the same one (letter/1) so indexing won't help. 2) These predicates also use a single variable so copying during renaming is cheap. 3) Lastly unification only binds with one variable so the unification algorithm will also not determine the efficiency of the application. What does determine the speed of execution is basically how quickly one can scan the available (candidate) predicates to form the resolvent, update the stack and quickly backtrack to the next candidate. Your code in this respect is "optimal" because it uses a simple lists for both scanning and keeping track of the variables. > It should be easy to run the same program on both and see > what happens. > The results are, I think, consistent. Consider: a) We have more or less the same order of magnitude in execution times. b) Note that _my_ code should be a little _slower_ given that none of its "optimizations" are useful and only make running times larger. However the difference of 7 minutes is I believe too large. c) Given the very simple mechanisms required to solve this problem, I figure only low-level stuff will make your "optimal" application faster for the test problem (I haven't tried but you can also consider using single letter predicate names to measure the weight that string comparisons have, should speed up things). Which brings me to the initial question. Why such a big gap between SWI Prolog for example and our Ocaml code? ~/workspace/planner$ time swipl -f prolog.pl -g "win(A, B, C, D, E, F), halt." /home/hugof/workspace/planner/docs/prolog.pl compiled 0.00 sec, 8,560 bytes real 0m30.278s user 0m30.222s sys 0m0.012s Stack use, function calls, tail recursion, currying, garbage collection and many other issues could be to blame. My question is what can it be, how may I diagnose it, what can I do to speed things up. Finally, for those that may think I may be simply giving to much importance to this, consider that I need to execute resolution hundreds of thousands of times for the ML algorithm. Once again, any suggestions are welcome. Regards, Hugo F. > Best regards, > > Andrej > (** [display_solution ch env] displays the solution of a goal encoded by [env]. It then gives the user the option to search for other solutions, as described by the list of choice points [ch], or to abort the current proof search. *) let rec display_solution ch env = match string_of_env env, ch with | "Yes", _ -> print_endline "Yes" | answer, [] -> print_endline answer | answer, ch -> begin print_string answer ; Pervasives.exit 1 <-------- Changes !! (* print_string (answer ^ "\nmore? (y/n) [y] ") ; flush stdout ; match String.lowercase (read_line ()) with | "y" | "yes" | "" -> continue_search ch | _ -> raise NoSolution *) end > _______________________________________________ > Caml-list mailing list. Subscription management: > http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list > Archives: http://caml.inria.fr > Beginner's list: http://groups.yahoo.com/group/ocaml_beginners > Bug reports: http://caml.inria.fr/bin/caml-bugs > --------------040506060704090209070501 Content-Type: application/x-perl; name="prolog.pl" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="prolog.pl" letter(a). letter(b). letter(c). letter(d). letter(e). letter(f). letter(g). letter(h). letter(i). letter(j). letter(k). letter(l). letter(m). letter(n). letter(o). letter(p). letter(q). letter(r). letter(s). letter(t). letter(u). letter(v). letter(w). letter(x). letter(y). letter(z). winner(p,r,o,l,o,g). win(A,B,C,D,E,F) :- letter(A), letter(B), letter(C), letter(D), letter(E), letter(F), winner(A,B,C,D,E,F). loose(A,B,C,D,E,F) :- winner(A,B,C,D,E,F), letter(A), letter(B), letter(C), letter(D), letter(E), letter(F). ?- win(A,B,C,D,E,F). --------------040506060704090209070501--