caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] [ANN] Datalog-0.1
@ 2013-01-28 11:11 Simon Cruanes
  2013-01-28 18:06 ` yoann padioleau
  0 siblings, 1 reply; 4+ messages in thread
From: Simon Cruanes @ 2013-01-28 11:11 UTC (permalink / raw)
  To: OCaml List

Hello,

I'm pleased to announce the first (alpha) release of Datalog, a small
fixpoint engine for the Datalog fragment of logic
(http://en.wikipedia.org/wiki/Datalog) written in OCaml. The library is
designed to be used to compute fixpoints of rules, incrementally. It is
available under the BSD license at https://github.com/c-cube/datalog .

Both a command-line fixpoint computation tool and a functorial library
are provided. Input files have a prolog-like syntax, but without
function symbols. The command line tool works like this (the example
computes the transitive closure of a graph, and then cliques within this
transitive closure):

$ cat tests/clique10.pl
reachable(X,Y) :- edge(X,Y).
reachable(X,Y) :- edge(X,Z), reachable(Z,Y).
same_clique(X,Y) :- reachable(X,Y), reachable(Y,X).
edge(0, 1).  % edge from 0 to 1
edge(1, 2).
edge(2, 3).
edge(3, 4).
edge(4, 5).
edge(5, 0).
edge(5, 6).
edge(6, 7).
edge(7, 8).
edge(8, 9).
edge(9, 10).
edge(10, 7).

$ ./datalog_cli.native tests/clique10.pl -pattern 'same_clique(1,X)'
% start datalog
% parse file tests/clique10.pl
% process 15 rules
% computing fixpoint...
% done.
% facts matching pattern same_clique(1, X1):
same_clique(1, 0).
same_clique(1, 1).
same_clique(1, 3).
same_clique(1, 2).
same_clique(1, 5).
same_clique(1, 4).
% max_heap_size: 126976; minor_collections: 0; major collections: 0

The library provides a functor that allows one to create rules (Datalog
clauses), and add them to a structure that computes their fixpoint
incrementally (i.e., the fixpoint is updated after each rule is
inserted). Callbacks can be associated to symbols, to be called whenever
a fact is derived.

Feedback, comments, or bug reports are very welcome!

Cheers,

-- 
Simon Cruanes


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] [ANN] Datalog-0.1
  2013-01-28 11:11 [Caml-list] [ANN] Datalog-0.1 Simon Cruanes
@ 2013-01-28 18:06 ` yoann padioleau
  2013-01-28 18:19   ` Simon Cruanes
  2013-01-29  9:23   ` Simon Cruanes
  0 siblings, 2 replies; 4+ messages in thread
From: yoann padioleau @ 2013-01-28 18:06 UTC (permalink / raw)
  To: Simon Cruanes; +Cc: OCaml List

Hi,

How does it compare in terms of performance with other datalog engines?

On Mon, Jan 28, 2013 at 3:11 AM, Simon Cruanes
<simon.cruanes.2007@m4x.org> wrote:
> Hello,
>
> I'm pleased to announce the first (alpha) release of Datalog, a small
> fixpoint engine for the Datalog fragment of logic
> (http://en.wikipedia.org/wiki/Datalog) written in OCaml. The library is
> designed to be used to compute fixpoints of rules, incrementally. It is
> available under the BSD license at https://github.com/c-cube/datalog .
>
> Both a command-line fixpoint computation tool and a functorial library
> are provided. Input files have a prolog-like syntax, but without
> function symbols. The command line tool works like this (the example
> computes the transitive closure of a graph, and then cliques within this
> transitive closure):
>
> $ cat tests/clique10.pl
> reachable(X,Y) :- edge(X,Y).
> reachable(X,Y) :- edge(X,Z), reachable(Z,Y).
> same_clique(X,Y) :- reachable(X,Y), reachable(Y,X).
> edge(0, 1).  % edge from 0 to 1
> edge(1, 2).
> edge(2, 3).
> edge(3, 4).
> edge(4, 5).
> edge(5, 0).
> edge(5, 6).
> edge(6, 7).
> edge(7, 8).
> edge(8, 9).
> edge(9, 10).
> edge(10, 7).
>
> $ ./datalog_cli.native tests/clique10.pl -pattern 'same_clique(1,X)'
> % start datalog
> % parse file tests/clique10.pl
> % process 15 rules
> % computing fixpoint...
> % done.
> % facts matching pattern same_clique(1, X1):
> same_clique(1, 0).
> same_clique(1, 1).
> same_clique(1, 3).
> same_clique(1, 2).
> same_clique(1, 5).
> same_clique(1, 4).
> % max_heap_size: 126976; minor_collections: 0; major collections: 0
>
> The library provides a functor that allows one to create rules (Datalog
> clauses), and add them to a structure that computes their fixpoint
> incrementally (i.e., the fixpoint is updated after each rule is
> inserted). Callbacks can be associated to symbols, to be called whenever
> a fact is derived.
>
> Feedback, comments, or bug reports are very welcome!
>
> Cheers,
>
> --
> Simon Cruanes
>
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] [ANN] Datalog-0.1
  2013-01-28 18:06 ` yoann padioleau
@ 2013-01-28 18:19   ` Simon Cruanes
  2013-01-29  9:23   ` Simon Cruanes
  1 sibling, 0 replies; 4+ messages in thread
From: Simon Cruanes @ 2013-01-28 18:19 UTC (permalink / raw)
  To: caml-list

Hi,

I did not compare its performance with other engines. First, I don't
know where to find problem files for benchmarking. Then, the algorithm
used here is different from the usual two techniques that are bottom-up
(semi)-naive fixpoint computation (with magic sets, etc.), and top-down
SLD resolution (akin to prolog). The goal of my library is to enable
incremental computation of the fixpoint (for instance, by attaching
callbacks to symbols, you can detect new consequences of a fact just
after adding this fact).

Also, Datalog was originally designed as a database language (it's
basically like simple relational algebra, but with recursive queries),
but here everything is in memory.

However, it would be interesting to compare it with some in-memory
engines (there is a simple one in Lua, top-down, if I remember correctly).

Regards,

--
Simon Cruanes

On 01/28/2013 07:06 PM, yoann padioleau wrote:
> Hi,
> 
> How does it compare in terms of performance with other datalog engines?
> 
> On Mon, Jan 28, 2013 at 3:11 AM, Simon Cruanes
> <simon.cruanes.2007@m4x.org> wrote:
>> Hello,
>>
>> I'm pleased to announce the first (alpha) release of Datalog, a small
>> fixpoint engine for the Datalog fragment of logic
>> (http://en.wikipedia.org/wiki/Datalog) written in OCaml. The library is
>> designed to be used to compute fixpoints of rules, incrementally. It is
>> available under the BSD license at https://github.com/c-cube/datalog .
>>
>> Both a command-line fixpoint computation tool and a functorial library
>> are provided. Input files have a prolog-like syntax, but without
>> function symbols. The command line tool works like this (the example
>> computes the transitive closure of a graph, and then cliques within this
>> transitive closure):
>>
>> $ cat tests/clique10.pl
>> reachable(X,Y) :- edge(X,Y).
>> reachable(X,Y) :- edge(X,Z), reachable(Z,Y).
>> same_clique(X,Y) :- reachable(X,Y), reachable(Y,X).
>> edge(0, 1).  % edge from 0 to 1
>> edge(1, 2).
>> edge(2, 3).
>> edge(3, 4).
>> edge(4, 5).
>> edge(5, 0).
>> edge(5, 6).
>> edge(6, 7).
>> edge(7, 8).
>> edge(8, 9).
>> edge(9, 10).
>> edge(10, 7).
>>
>> $ ./datalog_cli.native tests/clique10.pl -pattern 'same_clique(1,X)'
>> % start datalog
>> % parse file tests/clique10.pl
>> % process 15 rules
>> % computing fixpoint...
>> % done.
>> % facts matching pattern same_clique(1, X1):
>> same_clique(1, 0).
>> same_clique(1, 1).
>> same_clique(1, 3).
>> same_clique(1, 2).
>> same_clique(1, 5).
>> same_clique(1, 4).
>> % max_heap_size: 126976; minor_collections: 0; major collections: 0
>>
>> The library provides a functor that allows one to create rules (Datalog
>> clauses), and add them to a structure that computes their fixpoint
>> incrementally (i.e., the fixpoint is updated after each rule is
>> inserted). Callbacks can be associated to symbols, to be called whenever
>> a fact is derived.
>>
>> Feedback, comments, or bug reports are very welcome!
>>
>> Cheers,
>>
>> --
>> Simon Cruanes
>>
>>
>> --
>> Caml-list mailing list.  Subscription management and archives:
>> https://sympa.inria.fr/sympa/arc/caml-list
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] [ANN] Datalog-0.1
  2013-01-28 18:06 ` yoann padioleau
  2013-01-28 18:19   ` Simon Cruanes
@ 2013-01-29  9:23   ` Simon Cruanes
  1 sibling, 0 replies; 4+ messages in thread
From: Simon Cruanes @ 2013-01-29  9:23 UTC (permalink / raw)
  To: caml-list

So, I benchmarked it against the Lua datalog engine
(http://www.ccs.neu.edu/home/ramsdell/tools/datalog/) on two kinds of
problems (that can already be found in the tests/ directory, with
scripts to generate problems of any size):

- graph: a transitive reflexive relation on a graph, 'reachable()', an a
relation 'edge()' that describes a graph of size n. The graph is a big
cycle (ie E_i -> E_{i+1}, and E_n -> E_1). The transitive closure is
therefore a clique of size n. The query is 'reachable(X,Y)?', which
returns n^2 facts.

- induction (bad name): two mutually recursive predicates, p(i) :- q(i),
p(i-1) and q(i) :- p(i-1), q(i-1) for i from 1 to n, and initial facts
p(0) and q(0). The query is 'P(X)?', and returns n facts.

The respective command lines used to benchmark are as follow
('datalog_cli' is in OCaml, 'datalog' is the lua engine):

$ time datalog_cli tests/graph500.pl -pattern 'reachable(X,Y)' > /dev/null
$ time echo 'reachable(X,Y)?' | datalog tests/graph500.pl -i > /dev/null

$ time datalog_cli tests/induction100.pl -pattern 'p(X)' > /dev/null
$ time echo 'p(X)?' | datalog tests/induction100.pl -i > /dev/null

I could not find an easy way to measure the computation time, without
printing the solution (which is also heavy); however, in both cases the
full solution is printed.

So, here are the respective times (total time) on a Intel i5 3.2GHz,
with 4GB of RAM:

problem          datalog_cli  datalog
--------------------------------------------
graph200.pl      0.289s       0,798s
graph500.pl      3.005s       5.092
graph1000.pl     19.151s      21.138s
graph1500.pl     52.509s      51.589s
graph2000.pl     151.04s      OOM (swapped, became too slow)
induction200.pl  0.024s       0.138s
induction500.pl  0.021s       0.750s
induction1000.pl 0.066s       2.997s
induction1500.pl 0.075s       6.911s
induction2000.pl 0.134s       12.856s

So, the OCaml library looks competitive with the Lua SLD-resolution
engine. I believe it performs better on the 'induction' examples because
of the memorization of intermediate results. On the other hand, it
always compute the whole fixpoint, whereas SLD-resolution can focus on
what is needed for the query.

Hope that helps! :)

Cheers,

--
Simon Cruanes

On 01/28/2013 07:06 PM, yoann padioleau wrote:
> Hi,
> 
> How does it compare in terms of performance with other datalog engines?
> 
> On Mon, Jan 28, 2013 at 3:11 AM, Simon Cruanes
> <simon.cruanes.2007@m4x.org> wrote:
>> Hello,
>>
>> I'm pleased to announce the first (alpha) release of Datalog, a small
>> fixpoint engine for the Datalog fragment of logic
>> (http://en.wikipedia.org/wiki/Datalog) written in OCaml. The library is
>> designed to be used to compute fixpoints of rules, incrementally. It is
>> available under the BSD license at https://github.com/c-cube/datalog .
>>
>> Both a command-line fixpoint computation tool and a functorial library
>> are provided. Input files have a prolog-like syntax, but without
>> function symbols. The command line tool works like this (the example
>> computes the transitive closure of a graph, and then cliques within this
>> transitive closure):
>>
>> $ cat tests/clique10.pl
>> reachable(X,Y) :- edge(X,Y).
>> reachable(X,Y) :- edge(X,Z), reachable(Z,Y).
>> same_clique(X,Y) :- reachable(X,Y), reachable(Y,X).
>> edge(0, 1).  % edge from 0 to 1
>> edge(1, 2).
>> edge(2, 3).
>> edge(3, 4).
>> edge(4, 5).
>> edge(5, 0).
>> edge(5, 6).
>> edge(6, 7).
>> edge(7, 8).
>> edge(8, 9).
>> edge(9, 10).
>> edge(10, 7).
>>
>> $ ./datalog_cli.native tests/clique10.pl -pattern 'same_clique(1,X)'
>> % start datalog
>> % parse file tests/clique10.pl
>> % process 15 rules
>> % computing fixpoint...
>> % done.
>> % facts matching pattern same_clique(1, X1):
>> same_clique(1, 0).
>> same_clique(1, 1).
>> same_clique(1, 3).
>> same_clique(1, 2).
>> same_clique(1, 5).
>> same_clique(1, 4).
>> % max_heap_size: 126976; minor_collections: 0; major collections: 0
>>
>> The library provides a functor that allows one to create rules (Datalog
>> clauses), and add them to a structure that computes their fixpoint
>> incrementally (i.e., the fixpoint is updated after each rule is
>> inserted). Callbacks can be associated to symbols, to be called whenever
>> a fact is derived.
>>
>> Feedback, comments, or bug reports are very welcome!
>>
>> Cheers,
>>
>> --
>> Simon Cruanes
>>
>>
>> --
>> Caml-list mailing list.  Subscription management and archives:
>> https://sympa.inria.fr/sympa/arc/caml-list
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
> 


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2013-01-29  9:24 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-01-28 11:11 [Caml-list] [ANN] Datalog-0.1 Simon Cruanes
2013-01-28 18:06 ` yoann padioleau
2013-01-28 18:19   ` Simon Cruanes
2013-01-29  9:23   ` Simon Cruanes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).