From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@sympa.inria.fr Delivered-To: caml-list@sympa.inria.fr Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by sympa.inria.fr (Postfix) with ESMTPS id 2D1A87EEAF for ; Tue, 29 Jan 2013 10:24:10 +0100 (CET) Received-SPF: Neutral (mail2-smtp-roc.national.inria.fr: domain of simon.cruanes.2007@m4x.org does not assert whether or not 129.104.30.34 is permitted sender) identity=pra; client-ip=129.104.30.34; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="SRS0=5uAI=LW=m4x.org=simon.cruanes.2007@bounces.m4x.org"; x-sender="simon.cruanes.2007@m4x.org"; x-conformance=sidf_compatible; x-record-type="spf2.0" Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of SRS0=5uAI=LW=m4x.org=simon.cruanes.2007@bounces.m4x.org designates 129.104.30.34 as permitted sender) identity=mailfrom; client-ip=129.104.30.34; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="SRS0=5uAI=LW=m4x.org=simon.cruanes.2007@bounces.m4x.org"; x-sender="SRS0=5uAI=LW=m4x.org=simon.cruanes.2007@bounces.m4x.org"; x-conformance=sidf_compatible; x-record-type="spf2.0" Received-SPF: Pass (mail2-smtp-roc.national.inria.fr: domain of postmaster@mx1.polytechnique.org designates 129.104.30.34 as permitted sender) identity=helo; client-ip=129.104.30.34; receiver=mail2-smtp-roc.national.inria.fr; envelope-from="SRS0=5uAI=LW=m4x.org=simon.cruanes.2007@bounces.m4x.org"; x-sender="postmaster@mx1.polytechnique.org"; x-conformance=sidf_compatible; x-record-type="v=spf1" X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AjcCABaUB1GBaB4igWdsb2JhbABEq3mSXhYOAQEWJieCHgEBBSdHBBcLGAkEEg8JAwIBAgEzEhMGAgEBDgmHZAMPDLA8hkEDiV+RBQOXKZIk X-IPAS-Result: AjcCABaUB1GBaB4igWdsb2JhbABEq3mSXhYOAQEWJieCHgEBBSdHBBcLGAkEEg8JAwIBAgEzEhMGAgEBDgmHZAMPDLA8hkEDiV+RBQOXKZIk X-IronPort-AV: E=Sophos;i="4.84,559,1355094000"; d="scan'208";a="414774" Received: from mx1.polytechnique.org ([129.104.30.34]) by mail2-smtp-roc.national.inria.fr with ESMTP/TLS/ADH-AES256-SHA; 29 Jan 2013 10:24:09 +0100 Received: from emmental.inria.fr (emmental.inria.fr [128.93.0.14]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ssl.polytechnique.org (Postfix) with ESMTPSA id C95B3140C50C8 for ; Tue, 29 Jan 2013 10:24:08 +0100 (CET) Message-ID: <5107952B.2000503@m4x.org> Date: Tue, 29 Jan 2013 10:23:55 +0100 From: Simon Cruanes User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: caml-list@inria.fr References: <51065CFC.8010508@m4x.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-AV-Checked: ClamAV using ClamSMTP at svoboda.polytechnique.org (Tue Jan 29 10:24:08 2013 +0100 (CET)) X-Spam-Flag: No, tests=bogofilter, spamicity=0.000767, queueID=E080C140C50D2 X-Org-Mail: simon.cruanes.2007@polytechnique.org Subject: Re: [Caml-list] [ANN] Datalog-0.1 So, I benchmarked it against the Lua datalog engine (http://www.ccs.neu.edu/home/ramsdell/tools/datalog/) on two kinds of problems (that can already be found in the tests/ directory, with scripts to generate problems of any size): - graph: a transitive reflexive relation on a graph, 'reachable()', an a relation 'edge()' that describes a graph of size n. The graph is a big cycle (ie E_i -> E_{i+1}, and E_n -> E_1). The transitive closure is therefore a clique of size n. The query is 'reachable(X,Y)?', which returns n^2 facts. - induction (bad name): two mutually recursive predicates, p(i) :- q(i), p(i-1) and q(i) :- p(i-1), q(i-1) for i from 1 to n, and initial facts p(0) and q(0). The query is 'P(X)?', and returns n facts. The respective command lines used to benchmark are as follow ('datalog_cli' is in OCaml, 'datalog' is the lua engine): $ time datalog_cli tests/graph500.pl -pattern 'reachable(X,Y)' > /dev/null $ time echo 'reachable(X,Y)?' | datalog tests/graph500.pl -i > /dev/null $ time datalog_cli tests/induction100.pl -pattern 'p(X)' > /dev/null $ time echo 'p(X)?' | datalog tests/induction100.pl -i > /dev/null I could not find an easy way to measure the computation time, without printing the solution (which is also heavy); however, in both cases the full solution is printed. So, here are the respective times (total time) on a Intel i5 3.2GHz, with 4GB of RAM: problem datalog_cli datalog -------------------------------------------- graph200.pl 0.289s 0,798s graph500.pl 3.005s 5.092 graph1000.pl 19.151s 21.138s graph1500.pl 52.509s 51.589s graph2000.pl 151.04s OOM (swapped, became too slow) induction200.pl 0.024s 0.138s induction500.pl 0.021s 0.750s induction1000.pl 0.066s 2.997s induction1500.pl 0.075s 6.911s induction2000.pl 0.134s 12.856s So, the OCaml library looks competitive with the Lua SLD-resolution engine. I believe it performs better on the 'induction' examples because of the memorization of intermediate results. On the other hand, it always compute the whole fixpoint, whereas SLD-resolution can focus on what is needed for the query. Hope that helps! :) Cheers, -- Simon Cruanes On 01/28/2013 07:06 PM, yoann padioleau wrote: > Hi, > > How does it compare in terms of performance with other datalog engines? > > On Mon, Jan 28, 2013 at 3:11 AM, Simon Cruanes > wrote: >> Hello, >> >> I'm pleased to announce the first (alpha) release of Datalog, a small >> fixpoint engine for the Datalog fragment of logic >> (http://en.wikipedia.org/wiki/Datalog) written in OCaml. The library is >> designed to be used to compute fixpoints of rules, incrementally. It is >> available under the BSD license at https://github.com/c-cube/datalog . >> >> Both a command-line fixpoint computation tool and a functorial library >> are provided. Input files have a prolog-like syntax, but without >> function symbols. The command line tool works like this (the example >> computes the transitive closure of a graph, and then cliques within this >> transitive closure): >> >> $ cat tests/clique10.pl >> reachable(X,Y) :- edge(X,Y). >> reachable(X,Y) :- edge(X,Z), reachable(Z,Y). >> same_clique(X,Y) :- reachable(X,Y), reachable(Y,X). >> edge(0, 1). % edge from 0 to 1 >> edge(1, 2). >> edge(2, 3). >> edge(3, 4). >> edge(4, 5). >> edge(5, 0). >> edge(5, 6). >> edge(6, 7). >> edge(7, 8). >> edge(8, 9). >> edge(9, 10). >> edge(10, 7). >> >> $ ./datalog_cli.native tests/clique10.pl -pattern 'same_clique(1,X)' >> % start datalog >> % parse file tests/clique10.pl >> % process 15 rules >> % computing fixpoint... >> % done. >> % facts matching pattern same_clique(1, X1): >> same_clique(1, 0). >> same_clique(1, 1). >> same_clique(1, 3). >> same_clique(1, 2). >> same_clique(1, 5). >> same_clique(1, 4). >> % max_heap_size: 126976; minor_collections: 0; major collections: 0 >> >> The library provides a functor that allows one to create rules (Datalog >> clauses), and add them to a structure that computes their fixpoint >> incrementally (i.e., the fixpoint is updated after each rule is >> inserted). Callbacks can be associated to symbols, to be called whenever >> a fact is derived. >> >> Feedback, comments, or bug reports are very welcome! >> >> Cheers, >> >> -- >> Simon Cruanes >> >> >> -- >> Caml-list mailing list. Subscription management and archives: >> https://sympa.inria.fr/sympa/arc/caml-list >> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners >> Bug reports: http://caml.inria.fr/bin/caml-bugs >