Date: Tue, 19 Aug 2014 10:34:59 +0200
To: caml-list@inria.fr
Subject: [Caml-list] using React for -- reactions

hi,

i am looking for performance advice in the context of an event-driven simulation using react.ml. in a nutshell: can a React-based simulation scheme be made about as fast as an imperative scheme that mutates arrays in place? is there a run-time cost for large dependency graphs?

specifically: i am exploring a Gillespie-type simulation of a network of chemical reactions using React. here is how it looks. there are N species and R possible reactions between them. each chemical species has an associated molecule-number signal. each reaction has an associated event. the event triggers updates of the number signals of the reaction's input and output species; this is implemented using E.select and S.accum. the waiting time to the next event of each of the R reactions is a (random) function of the molecule numbers of its inputs; after the molecule numbers are updated, the waiting times of all affected reactions are recomputed using E.merge. the waiting times for future events are managed in a separate heap. to initiate the next update step, the earliest scheduled waiting time is popped from the heap and the associated reaction event is triggered. (a self-contained toy sketch of this wiring is included below.)

reactions are relatively sparse, i.e. the number of reactions is R = O(N), where N could range from 1e1 to 1e4 or so. so in each update step only a small number (<10) of dependent molecule numbers and waiting times need to be updated, even for large systems. in other words, the React-managed dependency graph is wide (N) but not tall (relatively short dependency chains, see above), with some lateral connections. (side note: it would be helpful to be able to dump the dependency graph somehow for debugging purposes.)

when profiling, i find that the runtime for a fixed number of steps indeed does not increase much as N increases. this is good. however, most of the time seems to be spent in the H.up and H.down functions inside React, not in the random number generation. this is not ideal, since it suggests that dependency handling dominates the cost. (is that reading correct?)

for small N, the simulation runs about 10x slower than an alternative version which, instead of using React, keeps the molecule numbers in lists and updates them functionally by hand in each step. the per-step cost of that list version scales linearly with N, which disqualifies it; but i think a 'classical' version that stores the molecule numbers in an array would fix that. the reason i would still prefer a React-based scheme is that it would allow the network of reactions to be changed dynamically at runtime, enabling flexible, extensible rule-based simulation.
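to make this concrete, here is a stripped-down toy version of the wiring (not my actual code): two reactions r1 and r2 acting on a single species A, with waiting_time standing in for the exponential Gillespie draw. E.select merges the count updates, S.accum folds them into the count signal, and E.merge gathers the recomputed waiting times that would go back onto the heap:

  let waiting_time n =
    (* toy propensity: exponential waiting time with rate ~ molecule count *)
    if n = 0 then infinity else -. log (Random.float 1.0) /. float n

  let () =
    Random.self_init ();
    (* one primitive event per reaction; an occurrence means "fired once" *)
    let fire_r1, send_r1 = React.E.create () in
    let fire_r2, send_r2 = React.E.create () in
    (* a firing becomes a delta function on a molecule count *)
    let delta d e = React.E.map (fun () n -> n + d) e in
    (* species A is consumed by r1 and produced by r2: E.select merges
       the two update streams, S.accum folds them into the count signal *)
    let count_a =
      React.S.accum
        (React.E.select [ delta (-1) fire_r1; delta 1 fire_r2 ]) 100
    in
    (* whenever the count changes, recompute the waiting times of the
       reactions that depend on it; E.merge gathers them into one event *)
    let new_waits =
      React.E.merge (fun acc w -> w :: acc) []
        [ React.E.map (fun n -> ("r1", waiting_time n))
            (React.S.changes count_a) ]
    in
    (* in the real code these updates are pushed onto the waiting-time
       heap; here we just print them. keep a reference to the event,
       otherwise React may garbage-collect the subtree. *)
    let keep =
      React.E.map
        (List.iter (fun (r, t) -> Printf.printf "%s: wait %.3f\n" r t))
        new_waits
    in
    send_r1 (); send_r2 (); send_r1 ();
    ignore keep;
    Printf.printf "A = %d\n" (React.S.value count_a)  (* prints A = 99 *)

the real code builds one such event per reaction and one accumulated signal per species, over the whole network.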
bottom line: can a React-based simulation possibly compete with manual in-place updates of arrays, let's say within a factor of 2? if so, what do i need to pay attention to?
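for reference, the kind of 'classical' in-place baseline i would compare against looks roughly like this (toy sketch, with stoichiometry kept as (species index, delta) lists):

  (* molecule numbers in a plain array, mutated in place each step *)
  let apply_reaction counts stoich =
    List.iter (fun (s, d) -> counts.(s) <- counts.(s) + d) stoich

  let () =
    let counts = [| 100; 50 |] in                 (* N = 2 species *)
    apply_reaction counts [ (0, -1); (1, 1) ];    (* reaction: A -> B *)
    Printf.printf "A = %d  B = %d\n" counts.(0) counts.(1)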