caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* [Caml-list] using React for -- reactions
@ 2014-08-19  8:34 becker.nils
  2014-08-19 12:27 ` Daniel Bünzli
  0 siblings, 1 reply; 4+ messages in thread
From: becker.nils @ 2014-08-19  8:34 UTC (permalink / raw)
  To: caml-list

hi,

i am looking for performance advice in the context of an event driven
simulation using react.ml. in a nutshell: can a React-based simulation
scheme be made similarly fast as an imperative scheme which mutates
arrays in-place? is there a run-time cost for large dependency graphs?

specifically:

i am exploring doing a Gillespie-type simulation for a network of
chemical reactions using React. here is how it looks. there are N
species and R possible reactions between them. each chemical species has
an associated molecule number signal. each reaction has an associated
event. the event triggers updates of the number signals of the
reaction's input and output species; this is implemented using E.select
and S.accum. the waiting time to the next reaction event of any of the R
reactions is a (random) function of the molecule numbers of its inputs;
after the molecule numbers are updated, the waiting times for all
affected reactions are updated using E.merge. the waiting times for
future events are managed in a separate heap. to initiate the next
update step, the earliest scheduled waiting time is popped from the
heap, and the associated reaction event is started.

reactions are relatively sparse, ie. the number of reactions R=O(N),
where N could be from 1e1 to 1e4 or so. so, in each update step only a
small number <10 of dependent molecule numbers and waiting times need to
be updated, even for large systems. in other words, the React-managed
dependency graph is wide (N) but not tall (relatively short dependecy
chains, see above), and it has some lateral connections. (side note, it would
be helpful to be able to output the graph somehow for debugging purposes)

when profiling, i find that indeed the runtime for a fixed number of
steps does not increase much as N increases. this is good. most of the
time seems to be spent within the H.up and H.down functions in React,
not in the random number generation. this is not ideal since it
indicates dependency handling overhead. (correct?)

for small N, the simulation runs about 10x slower than an
alternative version which instead of using React, keeps molecule numbers
in lists and functionally updates the list manually in each step.

the list version scales linearly with N, which disqualifies it; but i
think that a 'classical' version that stores molecule numbers in an
array would solve that problem. the reason i would still prefer a React
based scheme is that this would allow to dynamically update the network
of reactions at runtime, allowing for flexible, extensible rule-based
simulation.

bottom line: can a React based simulation possibly compete with manual,
in-place update of arrays, let's say within a factor of 2? if yes what
do i need to pay attention to?

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] using React for -- reactions
  2014-08-19  8:34 [Caml-list] using React for -- reactions becker.nils
@ 2014-08-19 12:27 ` Daniel Bünzli
  2014-08-19 13:13   ` Nils Becker
  2014-08-20  9:11   ` Nils Becker
  0 siblings, 2 replies; 4+ messages in thread
From: Daniel Bünzli @ 2014-08-19 12:27 UTC (permalink / raw)
  To: becker.nils; +Cc: caml-list

Le mardi, 19 août 2014 à 09:34, becker.nils@gmx.net a écrit :
> is there a run-time cost for large dependency graphs?

The runtime cost is depends on the topology of the graph rather than its largeness. Basically given a primitive event/signal update you need in the worst case to do a topological sort of its recursive dependencies on each update step. But that's really the worst case, as for example if a signal doesn't change its value in the step it's dependents are not updated.  

> reactions are relatively sparse, ie. the number of reactions R=O(N),
> where N could be from 1e1 to 1e4 or so. so, in each update step only a
> small number <10 of dependent molecule numbers and waiting times need to
> be updated, even for large systems. in other words, the React-managed
> dependency graph is wide (N) but not tall (relatively short dependecy
> chains, see above),

If it's wide at the *primitive* level then this is a good property for performance since an update will have little needless work to do. If it is wide just *after* the primitive for no good reason, i.e. single primitive with huge number of dependents that filter the primitive and then short dependencies, you may want to express the first level of dependents as a primitive if the filtering nodes entails that only a few of them trigger on an update. That is sometimes you can gain performance by putting more logic/transformation in your primitives, as it allows you to cut down on the number of nodes or needless updates.

> bottom line: can a React based simulation possibly compete with manual,
> in-place update of arrays, let's say within a factor of 2? if yes what
> do i need to pay attention to?

No idea. Basically you need to pay attention to the topology of your graph. Is your code published somewhere ?  

Best,

Daniel

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] using React for -- reactions
  2014-08-19 12:27 ` Daniel Bünzli
@ 2014-08-19 13:13   ` Nils Becker
  2014-08-20  9:11   ` Nils Becker
  1 sibling, 0 replies; 4+ messages in thread
From: Nils Becker @ 2014-08-19 13:13 UTC (permalink / raw)
  To: Daniel Bünzli, becker.nils; +Cc: caml-list

thanks daniel,

> If it's wide at the *primitive* level then this is a good property
> for performance since an update will have little needless work to do.
> If it is wide just *after* the primitive for no good reason, i.e.
> single primitive with huge number of dependents that filter the
> primitive and then short dependencies, you may want to express the
> first level of dependents as a primitive if the filtering nodes
> entails that only a few of them trigger on an update. 

yes, it's wide at the primitive level. initially i did have an E.select
over O(N) events somewhere which slowed it down a lot, but that is
already repaired, as you explain. right now, i should not have any
combinators that filter out a lot of unneccessary dependencies, so that
should be fine.

> 
>> bottom line: can a React based simulation possibly compete with
>> manual, in-place update of arrays, let's say within a factor of 2?
>> if yes what do i need to pay attention to?
> 
> No idea. Basically you need to pay attention to the topology of your
> graph. 

so i guess i need to make the update chains a short as possible. at the
moment, i update the external heap by events which produce (heap ->
heap) updater functions; they each have exactly one daughter event in
which the function is applied to the actual heap. this structure is just
a by-product of how i initialize the simulation. i guess by streamlining
this into just one event i could potentially shorten the dependency
chain and speed it up?

> Is your code published somewhere ?

no, it's pretty messy still. if you'd like to have a look i could put it
on bitbucket..

n.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [Caml-list] using React for -- reactions
  2014-08-19 12:27 ` Daniel Bünzli
  2014-08-19 13:13   ` Nils Becker
@ 2014-08-20  9:11   ` Nils Becker
  1 sibling, 0 replies; 4+ messages in thread
From: Nils Becker @ 2014-08-20  9:11 UTC (permalink / raw)
  To: Daniel Bünzli, becker.nils; +Cc: caml-list

>> bottom line: can a React based simulation possibly compete with
>> manual, in-place update of arrays, let's say within a factor of 2?
>> if yes what do i need to pay attention to?
> 
> No idea. Basically you need to pay attention to the topology of your
> graph. 

one data point: i removed one linear dependency and saw 15% speedup, so
this seems to be the way to go.

the code was similar to this:

let signal_1 = S.merge f init_val {bunch of other signals} in
let signal_2 = S.map g signal_1

actually i'm not interested in signal_1 in itself. it is the result of a
list fold with f. its values are pairs that still contain an auxiliary
accumulator variable. in the second step i post-process this result to
give the meaningful signal_2. nothing else depends on signal_1.

i now got rid of signal_2 by putting the post-processing into the
downstream dependents of signal_1. this makes the structure of the
program a bit less nice since i now use the half-finished signals, but
it removes one trivial dependency. i think this is the cause of the speedup.

this is actually a general pattern i find myself using: a combinator
does not quite do all of the transformations i need, so i make another
signal/event directly downstream which completes the processing. another
example is

let evt_1 = E.select {bunch of events} (* not interested in evt_1 *)
let signal_2 = S.accum evt_1 init_val  (* except for making this *)

it now turns out this pattern is not really for free since it makes
dependency chains longer. it would be cool to have a way to 'contract'
these trivial links in the dependency graph. ie. to produce the result
of signal_2 directly, with only a single node in the graph. or to
'block' them together in the dependency resolution in some way. is there
already a way to do this that i missed?

n.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2014-08-20  9:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-19  8:34 [Caml-list] using React for -- reactions becker.nils
2014-08-19 12:27 ` Daniel Bünzli
2014-08-19 13:13   ` Nils Becker
2014-08-20  9:11   ` Nils Becker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).