Re: [Caml-list] Optimizing symbolic processing code

From: Hugo Ferreira <hmf@inescporto.pt>
To: Andrej Bauer <andrej.bauer@andrej.com>
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Optimizing symbolic processing code
Date: Sat, 17 Jan 2009 09:28:45 +0000	[thread overview]
Message-ID: <4971A4CD.9070708@inescporto.pt> (raw)
In-Reply-To: <7d8707de0901161248t39960316v46c6fa64e8001531@mail.gmail.com>

Andrej,

First and foremost thanks for taking the time to answer.

Andrej Bauer wrote:
> After being so bad spirited in my last message, I decided to make it
> up by doing something positive. I have added to the PL Zoo a mini
> prolog interpreter, see http://andrej.com/plzoo/ . 

Interesting. I had visited this page for a quick look and never
realized (or don't remember) seeing any Prolog interpreter.

> It is very slow and

Maybe be slow, but it is very clear and concise.

> I am sure a decent implementation would speed it up by an order of
> magnitude (at least a 100 fold). 

Indeed this may be possible.

> I wonder how your implementation
> compares to mine.
> 

I will go through your code in the order it is available:

1) First thing I noticed is that you have very few type of terms.
    In my case I add parsing and unification of integers, strings,
    lists, negation (not), etc. I also added some extras because
    I use the same parser to manipulate first order logic
    sentences used  by a AI planner.

2) Your parsing doesn't construct a symbol table. So this means
    that all comparisons are string based. In my case all
    comparisons are integer based i.e: everything from a predicate
    symbol to a constant is an integer.

3) Your variable bank is based on a list. This means that any
    look-up requires linear time. I use arrays for this. This
    has effects on unification.

4) Related to the above I use a Union-Find implementation
    (See http://www.lri.fr/~filliatr/ for the implementation
     I use) to bind variables. I have also experimented with
    another data structure for this, but this implementation
    is simple and fast although mutable.

5) Your unification algorithm looks like a standard (at-worst)
    quadric order unifier, which is not too bad. However you
    use your linear list of substitutions (3+4). What is more
    your occurs check is done on every variable - term binding.
    I on the other hand, use a near-linear algorithm using fast
    union-find and do a occurs check only at the end for only
    those terms bound to variable and again using the U-F data
    structure.

6) Ok, this one is the one that seems to be killing my application.
    You use a very simple database, its basically a list of
    assertions. Any look-up is linear in respect to the number of
    assertions, which means that resolution of a goal is exponential
    in respect to the number of assertions and number of goals. I
    use a discriminant trie whose search is linear in respect to
    the number of goal (as opposed to the assertions).

7) Before unification you take care of performing variable
    renaming. This has to be repeated every time you use a
    predicate. I use clauses represented in a canonical form.
    Renaming for me is simply a matter of bumping a counter of
    variables in the variable bank and attaching this offset to
    the terms in question. I also "reuse" variables because of
    the way the variable bank is implemented.

8) Your algorithm is a very clean implementation of SLDNF.
    You keep a stack of sorts and allow one to continue search
    for the next goal. One of my implementations did this however
    keeping track of the position in a trie resulted in complicated
    code. I now use simple folds over the data structure. A
    function is invoked whenever a solution is found. If only one
    solution is required then a quick exit is performed via an
    exception. In your case your implementation is simpler because
    it uses only lists.

Ok, in respect to your first response:

 > Judging from what your responses, the most probable explanation for
 > inefficiency is that you implemented your prolog interpreter badly.

Possibly yes, but from what I have explained above you can see I have
taken pains to have a decent intepreter.

 > It would help a lot if you just showed us your code.

True. But the problem is I did not know what was killing performance,
hence the request for more information on how to diagnose the problem.
Only then would I analyse further and ask for help with more details.

I have compiled and executed the code with profiling. "grpof" shows
that about 8-10 % of the time is spent on folding over the trie (I
use map folds and finds, why are map folds taking so long?). In other
words it is not an issue on unification or the resolution function but
the search in the trie. I also find calls to caml currying functions.
This seems to point me to [1] for solutions.

I am going to make some additional experiments in order to diagnose
the problem further.

One simple question: is Ocaml matching fast enough that I need
not worry with:

a) the number of variants in a type
b) comparisons of the sort:

and bind_all_var_to_any f h ps t delay acc =
   (* k a b -> b *)
   let scan k e acc =
     match e with
     | Leaf _ -> acc
     | Node(Pred.Rel _,ps',vs',jps',jvs') ->
             (* jump over ptedicate *)
             fold_all f jps' jvs' t delay acc

     | Node(_,ps',vs',jps',jvs') ->
             fold_all f ps' vs' t delay acc
    in
    Node_map.fold (scan) ps acc

Once again, appreciate any comments on the above.

Regards,
Hugo F.

[1] http://ocaml.janestreet.com/?q=node/30

> Best regards,
> 
> Andrej
> 
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>