Re: [Caml-list] Optimizing symbolic processing code

From: Hugo Ferreira <hmf@inescporto.pt>
To: Andrej Bauer <andrej.bauer@andrej.com>
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Optimizing symbolic processing code
Date: Sat, 17 Jan 2009 16:08:45 +0000	[thread overview]
Message-ID: <4972028D.7050802@inescporto.pt> (raw)
In-Reply-To: <4971FDA9.5050704@inescporto.pt>

[-- Attachment #1: Type: text/plain, Size: 6305 bytes --]

Hello,

Just to rule out the issue with string comparison, I did a test
by changing the predicate names names. I include the Prolog
code for you. The results are:

~/Desktop/prolog_test/miniprolog$ time ./miniprolog.native -n prolog_opt.pl
A = p
B = r
C = o
D = l
E = o
F = g
real    11m27.547s
user    11m19.870s
sys     0m0.204s

So we may add to the conclusions that the use of symbol tables in this
problem also doesn't have much weight because we only shaved off a
"measly" minute.

Regards,
Hugo F.

Hugo Ferreira wrote:
> Hello Andrej,
> 
> Andrej Bauer wrote:
>> Dear Hugo,
>>
>> you have not noticed miniprolog before because it was not there until 
>> yesterday.
>>
> 
> I see.
> 
>> Almost any optimization will cause my interpreter to go much faster. 
> 
> Not quite. It depends a lot on the problem to solve and the order
> in which the knowledge base is declared. More on this later.
> 
>> I think the most reasonable one to do would be to avoid explicit
>> subtitutions during unification 
> 
> Not quite. You still need to deal with creating and using new
> variables somewhere, somehow.
> 
>> (since I already keep track of the
>> environment), but then I get into the union-find business.
>>
>> I would not be too sure that lists are much slower than arrays when
>> the lists and arrays are both short. 
> 
> The reason why I used arrays is because the union-find data
> structure uses arrays, specifically persistent arrays implemented
> using standard arrays.
> 
>> You might be yet another victim
>> of premature optimization.
>>
> 
> Maybe. But I haven't really done any optimization save for using
> "standard" data structures and techniques in resolution.
> 
>> How does the performance of your prolog compare to my miniprolog?
> 
> ~/Desktop/prolog_test/miniprolog$ time ./miniprolog.native -n prolog.pl
> A = p
> B = r
> C = o
> D = l
> E = o
> F = g
> real    12m14.648s
> user    12m8.606s
> sys     0m0.168s
> 
> versus my
> 
> ~/workspace/planner$ time ./itest_1.p.native
> real    19m12.786s
> user    19m6.728s
> sys    0m0.196s
> 
> Note that I changed your code slightly to terminate after the
> first solution. See changes below.
> 
>> Judging from the care you took with your interpreter, yours should be
>> much faster. 
> 
> This depends on the problem. I have attached the Prolog code for
> you to see and experiment with. You will notice that the order
> of the assertions are done so as to make resolution very difficult.
> 
> Notice however that:
> 
> 1) Care was taken so that (nearly) all assertions you compare with are
>    the same one (letter/1) so indexing won't help.
> 
> 2) These predicates also use a single variable so copying during
>    renaming is cheap.
> 
> 3) Lastly unification only binds with one variable so the unification
>    algorithm will also not determine the efficiency of the application.
> 
> What does determine the speed of execution is basically how quickly one
> can scan the available (candidate) predicates to form the resolvent,
> update the stack and quickly backtrack to the next candidate. Your code
> in this respect is "optimal" because it uses a simple lists for both
> scanning and keeping track of the variables.
> 
>> It should be easy to run the same program on both and see
>> what happens.
>>
> 
> The results are, I think, consistent. Consider:
> 
> a) We have more or less the same order of magnitude in execution times.
> 
> b) Note that _my_ code should be a little _slower_ given that none of
>    its "optimizations" are useful and only make running times larger.
>    However the difference of 7 minutes is I believe too large.
> 
> c) Given the very simple mechanisms required to solve this problem, I
>    figure only low-level stuff will make your "optimal" application
>    faster for the test problem (I haven't tried but you can also
>    consider using single letter predicate names to measure the weight
>    that string comparisons have, should speed up things).
> 
> 
> Which brings me to the initial question. Why such a big gap between SWI 
> Prolog for example and our Ocaml code?
> 
> ~/workspace/planner$ time swipl -f prolog.pl -g "win(A, B, C, D, E, F),
> halt."
> /home/hugof/workspace/planner/docs/prolog.pl compiled 0.00 sec, 8,560 bytes
> 
> real    0m30.278s
> user    0m30.222s
> sys     0m0.012s
> 
> Stack use, function calls, tail recursion, currying, garbage collection
> and many other issues could be to blame. My question is what can it be,
> how may I diagnose it, what can I do to speed things up.
> 
> Finally, for those that may think I may be simply giving to much
> importance to this, consider that I need to execute resolution
> hundreds of thousands of times for the ML algorithm.
> 
> Once again, any suggestions are welcome.
> 
> Regards,
> Hugo F.
> 
> 
>> Best regards,
>>
>> Andrej
>>
> 
> 
> (** [display_solution ch env] displays the solution of a goal encoded
>     by [env]. It then gives the user the option to search for other
>     solutions, as described by the list of choice points [ch], or to abort
>     the current proof search. *)
> let rec display_solution ch env =
>   match string_of_env env, ch with
>     | "Yes", _ -> print_endline "Yes"
>     | answer, [] -> print_endline answer
>     | answer, ch -> begin
>         print_string answer ; Pervasives.exit 1  <-------- Changes !!
>         (*
>     print_string (answer ^ "\nmore? (y/n) [y] ") ;
>     flush stdout ;
>     match String.lowercase (read_line ()) with
>       | "y" | "yes" | "" -> continue_search ch
>       | _ -> raise NoSolution
>          *)
>       end
> 
> 
>> _______________________________________________
>> Caml-list mailing list. Subscription management:
>> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
>> Archives: http://caml.inria.fr
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>>
> 
> 
> ------------------------------------------------------------------------
> 
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs

[-- Attachment #2: prolog_opt.pl --]
[-- Type: application/x-perl, Size: 506 bytes --]