[Caml-list] Keeping A big data optimization problem functional

caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed

* [Caml-list] Keeping A big data optimization problem functional
@ 2014-04-14  5:25 Arthur Breitman
  2014-04-14  5:43 ` Francois Berenger
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Arthur Breitman @ 2014-04-14  5:25 UTC (permalink / raw)
  To: caml-list

[-- Attachment #1: Type: text/plain, Size: 2720 bytes --]

Hello all,

This is my first post to the mailing list, I hope it will be relevant and
in line with the posting guidelines. If not, please let me know and pardon
my ignorance.

I am working on a large problem that I'm attempting to tackle in Ocaml. It
requires me to optimize a function over a tree

- the tree has on the order of 100,000 to 1,000,000 nodes
- each node on the tree contains on the order of 100kB to 1MB of data

So total, I'm looking at 10GB to 1TB one data which is typically stored in
a database on disk. The tree is loaded in memory using keys that represent
entries in the database.

Another structure I'm dealing with is a key value store, whose size is on
the same order of magnitude as the size of the data above. Initially, this
key-value store is empty.

I am given an "apply" function that, for a given key-value store and for a
given node in the tree produces a new key-value store. Typically by
modifying a few keys in the key-value store. The amount of bits changed is
on the same order of magnitude as the size of the node. When I reach a leaf
of the tree, I can compute the optimality of that leaf from the state of
the key-value store. I want to find the best leaf.

Here's my problem...
If I write the following in a imperative fashion, it seems straightforward
enough. I should be doing a DFS over my tree, mutating my key=value store
as I go, and saving on disk "backup" information to backtrack through the
tree. This doesn't feel very elegant. I've been considering two solutions

1) since there is a one to one mapping between nodes in the tree and
key-value stores, I could make my key-value store type to actually be the
same as a node in the tree. I would put it in a module along with a
reference to the node representing which key-value store is being
represented on disk. The "apply" function would be straightforward and
lazy, but whenever the keys of a key-value store are to be accessed, the
following would happen

a) Check if the key-value store on disk is the same as the one requested,
if so read the key
b) If that's not the case, find the least common ancestor in the tree,
migrate the disk cache to that version then down to the one requested. This
would require me to have an "undo" function to complement the "apply"
function, which is an acceptable compromise.

2) the second solution would be to represent the key-value store as a
binary tree. Since each operation changes only a few keys, I wouldn't be
making a dumb copy of the entire key-value store at each step, only of the
keys that matter. The problem is that though that works great in memory,
I'm not sure how to make it work with an on-disk storage, a necessity given
the size of it.

Thanks,
Arthurto

[-- Attachment #2: Type: text/html, Size: 3168 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] Keeping A big data optimization problem functional
  2014-04-14  5:25 [Caml-list] Keeping A big data optimization problem functional Arthur Breitman
@ 2014-04-14  5:43 ` Francois Berenger
  2014-04-14  6:07   ` Arthur Breitman
  2014-04-14  8:12 ` Goswin von Brederlow
  2014-04-14 20:38 ` Richard W.M. Jones
  2 siblings, 1 reply; 10+ messages in thread
From: Francois Berenger @ 2014-04-14  5:43 UTC (permalink / raw)
  To: caml-list

And that's when I realize that "Oh! No! We don't have bindings to 
Berkeley DB in OPAM!". That's unfortunate. :(

On 2014年04月14日 14:25, Arthur Breitman wrote:
> Hello all,
>
> This is my first post to the mailing list, I hope it will be relevant and
> in line with the posting guidelines. If not, please let me know and pardon
> my ignorance.
>
> I am working on a large problem that I'm attempting to tackle in Ocaml. It
> requires me to optimize a function over a tree
>
> - the tree has on the order of 100,000 to 1,000,000 nodes
> - each node on the tree contains on the order of 100kB to 1MB of data
>
> So total, I'm looking at 10GB to 1TB one data which is typically stored in
> a database on disk. The tree is loaded in memory using keys that represent
> entries in the database.
>
> Another structure I'm dealing with is a key value store, whose size is on
> the same order of magnitude as the size of the data above. Initially, this
> key-value store is empty.
>
> I am given an "apply" function that, for a given key-value store and for a
> given node in the tree produces a new key-value store. Typically by
> modifying a few keys in the key-value store. The amount of bits changed is
> on the same order of magnitude as the size of the node. When I reach a leaf
> of the tree, I can compute the optimality of that leaf from the state of
> the key-value store. I want to find the best leaf.
>
> Here's my problem...
> If I write the following in a imperative fashion, it seems straightforward
> enough. I should be doing a DFS over my tree, mutating my key=value store
> as I go, and saving on disk "backup" information to backtrack through the
> tree. This doesn't feel very elegant. I've been considering two solutions
>
> 1) since there is a one to one mapping between nodes in the tree and
> key-value stores, I could make my key-value store type to actually be the
> same as a node in the tree. I would put it in a module along with a
> reference to the node representing which key-value store is being
> represented on disk. The "apply" function would be straightforward and
> lazy, but whenever the keys of a key-value store are to be accessed, the
> following would happen
>
> a) Check if the key-value store on disk is the same as the one requested,
> if so read the key
> b) If that's not the case, find the least common ancestor in the tree,
> migrate the disk cache to that version then down to the one requested. This
> would require me to have an "undo" function to complement the "apply"
> function, which is an acceptable compromise.
>
> 2) the second solution would be to represent the key-value store as a
> binary tree. Since each operation changes only a few keys, I wouldn't be
> making a dumb copy of the entire key-value store at each step, only of the
> keys that matter. The problem is that though that works great in memory,
> I'm not sure how to make it work with an on-disk storage, a necessity given
> the size of it.
>
> Thanks,
> Arthurto
>


-- 
Best regards,
Francois Berenger.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] Keeping A big data optimization problem functional
  2014-04-14  5:43 ` Francois Berenger
@ 2014-04-14  6:07   ` Arthur Breitman
  0 siblings, 0 replies; 10+ messages in thread
From: Arthur Breitman @ 2014-04-14  6:07 UTC (permalink / raw)
  To: Francois Berenger; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 3669 bytes --]

No but it does have bindings for leveldb which does just what I want. The
question is, how do I hide those dirty db mutations.


On Sun, Apr 13, 2014 at 11:43 PM, Francois Berenger <berenger@riken.jp>wrote:

> And that's when I realize that "Oh! No! We don't have bindings to Berkeley
> DB in OPAM!". That's unfortunate. :(
>
>
> On 2014年04月14日 14:25, Arthur Breitman wrote:
>
>> Hello all,
>>
>> This is my first post to the mailing list, I hope it will be relevant and
>> in line with the posting guidelines. If not, please let me know and pardon
>> my ignorance.
>>
>> I am working on a large problem that I'm attempting to tackle in Ocaml. It
>> requires me to optimize a function over a tree
>>
>> - the tree has on the order of 100,000 to 1,000,000 nodes
>> - each node on the tree contains on the order of 100kB to 1MB of data
>>
>> So total, I'm looking at 10GB to 1TB one data which is typically stored in
>> a database on disk. The tree is loaded in memory using keys that represent
>> entries in the database.
>>
>> Another structure I'm dealing with is a key value store, whose size is on
>> the same order of magnitude as the size of the data above. Initially, this
>> key-value store is empty.
>>
>> I am given an "apply" function that, for a given key-value store and for a
>> given node in the tree produces a new key-value store. Typically by
>> modifying a few keys in the key-value store. The amount of bits changed is
>> on the same order of magnitude as the size of the node. When I reach a
>> leaf
>> of the tree, I can compute the optimality of that leaf from the state of
>> the key-value store. I want to find the best leaf.
>>
>> Here's my problem...
>> If I write the following in a imperative fashion, it seems straightforward
>> enough. I should be doing a DFS over my tree, mutating my key=value store
>> as I go, and saving on disk "backup" information to backtrack through the
>> tree. This doesn't feel very elegant. I've been considering two solutions
>>
>> 1) since there is a one to one mapping between nodes in the tree and
>> key-value stores, I could make my key-value store type to actually be the
>> same as a node in the tree. I would put it in a module along with a
>> reference to the node representing which key-value store is being
>> represented on disk. The "apply" function would be straightforward and
>> lazy, but whenever the keys of a key-value store are to be accessed, the
>> following would happen
>>
>> a) Check if the key-value store on disk is the same as the one requested,
>> if so read the key
>> b) If that's not the case, find the least common ancestor in the tree,
>> migrate the disk cache to that version then down to the one requested.
>> This
>> would require me to have an "undo" function to complement the "apply"
>> function, which is an acceptable compromise.
>>
>> 2) the second solution would be to represent the key-value store as a
>> binary tree. Since each operation changes only a few keys, I wouldn't be
>> making a dumb copy of the entire key-value store at each step, only of the
>> keys that matter. The problem is that though that works great in memory,
>> I'm not sure how to make it work with an on-disk storage, a necessity
>> given
>> the size of it.
>>
>> Thanks,
>> Arthurto
>>
>>
>
> --
> Best regards,
> Francois Berenger.
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>



-- 
Arthur Breitman

[-- Attachment #2: Type: text/html, Size: 4656 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] Keeping A big data optimization problem functional
  2014-04-14  5:25 [Caml-list] Keeping A big data optimization problem functional Arthur Breitman
  2014-04-14  5:43 ` Francois Berenger
@ 2014-04-14  8:12 ` Goswin von Brederlow
  2014-04-14 19:32   ` Arthur Breitman
  2014-04-14 20:38 ` Richard W.M. Jones
  2 siblings, 1 reply; 10+ messages in thread
From: Goswin von Brederlow @ 2014-04-14  8:12 UTC (permalink / raw)
  To: caml-list

On Sun, Apr 13, 2014 at 11:25:18PM -0600, Arthur Breitman wrote:
> Hello all,
> 
> This is my first post to the mailing list, I hope it will be relevant and
> in line with the posting guidelines. If not, please let me know and pardon
> my ignorance.
> 
> I am working on a large problem that I'm attempting to tackle in Ocaml. It
> requires me to optimize a function over a tree
> 
> - the tree has on the order of 100,000 to 1,000,000 nodes
> - each node on the tree contains on the order of 100kB to 1MB of data
> 
> So total, I'm looking at 10GB to 1TB one data which is typically stored in
> a database on disk. The tree is loaded in memory using keys that represent
> entries in the database.
> 
> Another structure I'm dealing with is a key value store, whose size is on
> the same order of magnitude as the size of the data above. Initially, this
> key-value store is empty.

You probably don't expect to keep 1TB data in ram. Does the key-value
store just reference values in the DB?
 
> I am given an "apply" function that, for a given key-value store and for a
> given node in the tree produces a new key-value store. Typically by
> modifying a few keys in the key-value store. The amount of bits changed is
> on the same order of magnitude as the size of the node. When I reach a leaf
> of the tree, I can compute the optimality of that leaf from the state of
> the key-value store. I want to find the best leaf.

Not sure what you mean there. Maybe you could give an example with a
small data set (maybe 10 keys).
 
> Here's my problem...
> If I write the following in a imperative fashion, it seems straightforward
> enough. I should be doing a DFS over my tree, mutating my key=value store
> as I go, and saving on disk "backup" information to backtrack through the
> tree. This doesn't feel very elegant. I've been considering two solutions
> 
> 1) since there is a one to one mapping between nodes in the tree and
> key-value stores, I could make my key-value store type to actually be the
> same as a node in the tree. I would put it in a module along with a
> reference to the node representing which key-value store is being
> represented on disk. The "apply" function would be straightforward and
> lazy, but whenever the keys of a key-value store are to be accessed, the
> following would happen
> 
> a) Check if the key-value store on disk is the same as the one requested,
> if so read the key
> b) If that's not the case, find the least common ancestor in the tree,
> migrate the disk cache to that version then down to the one requested. This
> would require me to have an "undo" function to complement the "apply"
> function, which is an acceptable compromise.
> 
> 2) the second solution would be to represent the key-value store as a
> binary tree. Since each operation changes only a few keys, I wouldn't be
> making a dumb copy of the entire key-value store at each step, only of the
> keys that matter. The problem is that though that works great in memory,
> I'm not sure how to make it work with an on-disk storage, a necessity given
> the size of it.
> 
> Thanks,
> Arthurto

Wouldn't a simple Map work as your key-value store? The apply function
would then modify the map, creating a new map that shares most of the
unchanged data with the original map. That way keeping the results of
many apply functions in memory should be possible and applying should
be fast.

But if memory becomes a problem then you probably want to use some
form of B-Tree to.

MfG
	Goswin

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] Keeping A big data optimization problem functional
  2014-04-14  8:12 ` Goswin von Brederlow
@ 2014-04-14 19:32   ` Arthur Breitman
  2014-04-15  9:05     ` Goswin von Brederlow
  2014-04-15 13:50     ` Thomas Gazagnaire
  0 siblings, 2 replies; 10+ messages in thread
From: Arthur Breitman @ 2014-04-14 19:32 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 2326 bytes --]

Responses in-line

On Mon, Apr 14, 2014 at 2:12 AM, Goswin von Brederlow <goswin-v-b@web.de>wrote:

> On Sun, Apr 13, 2014 at 11:25:18PM -0600, Arthur Breitman wrote:
> > Hello all,
>
> Another structure I'm dealing with is a key value store, whose size is on
> > the same order of magnitude as the size of the data above. Initially,
> this
> > key-value store is empty.
>
> You probably don't expect to keep 1TB data in ram. Does the key-value
> store just reference values in the DB?


Indeed it does. That's part of the difficulty.


 > I am given an "apply" function that, for a given key-value store and for
> a
> > given node in the tree produces a new key-value store. Typically by
> > modifying a few keys in the key-value store. The amount of bits changed
> is
> > on the same order of magnitude as the size of the node. When I reach a
> leaf
> > of the tree, I can compute the optimality of that leaf from the state of
> > the key-value store. I want to find the best leaf.
>
> Not sure what you mean there. Maybe you could give an example with a
> small data set (maybe 10 keys).
>

It would be hard to make it a small example, but abstractly I have:

type node = {value: string; children: node list option}
module type Context = begin
  type t
  val apply: t -> node -> t
  val get: t -> string -> string option
  val set: t -> string -> string -> t
  val fitness -> int
  val empty -> t
end

Given a tree of nodes, and starting with Context.empty find the leaf node
with the highest fitness. You can think of every node in the tree as an
operation acting on the context.


>
> Wouldn't a simple Map work as your key-value store? The apply function
> would then modify the map, creating a new map that shares most of the
> unchanged data with the original map. That way keeping the results of
> many apply functions in memory should be possible and applying should
> be fast.
>

It would if I didn't potentially have a terabyte of data.



>
> But if memory becomes a problem then you probably want to use some
> form of B-Tree to.
>
> MfG
>         Goswin
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>



-- 
Arthur Breitman

[-- Attachment #2: Type: text/html, Size: 4022 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] Keeping A big data optimization problem functional
  2014-04-14  5:25 [Caml-list] Keeping A big data optimization problem functional Arthur Breitman
  2014-04-14  5:43 ` Francois Berenger
  2014-04-14  8:12 ` Goswin von Brederlow
@ 2014-04-14 20:38 ` Richard W.M. Jones
  2 siblings, 0 replies; 10+ messages in thread
From: Richard W.M. Jones @ 2014-04-14 20:38 UTC (permalink / raw)
  To: Arthur Breitman; +Cc: caml-list

On Sun, Apr 13, 2014 at 11:25:18PM -0600, Arthur Breitman wrote:
> So total, I'm looking at 10GB to 1TB one data which is typically stored in
> a database on disk. The tree is loaded in memory using keys that represent
> entries in the database.
> 
> Another structure I'm dealing with is a key value store, whose size is on
> the same order of magnitude as the size of the data above. Initially, this
> key-value store is empty.

It might not be suitable, but ocaml-ancient would let you keep certain
OCaml structs on disk, but mmapped so you can just use them as if they
are in memory.  (I believe it won't actually work with Map unless you
write a special comparison function, but anyway you'll need to read
the README file & possibly the source closely if you go down this
route.)

Back in the day I used to use ocaml-ancient to do analysis on large
structs of the order of 30+ GB -- large in those days ...

Rich.

-- 
Richard Jones
Red Hat

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] Keeping A big data optimization problem functional
  2014-04-14 19:32   ` Arthur Breitman
@ 2014-04-15  9:05     ` Goswin von Brederlow
  2014-04-15 13:16       ` Arthur Breitman
  2014-04-15 13:50     ` Thomas Gazagnaire
  1 sibling, 1 reply; 10+ messages in thread
From: Goswin von Brederlow @ 2014-04-15  9:05 UTC (permalink / raw)
  To: caml-list

On Mon, Apr 14, 2014 at 01:32:18PM -0600, Arthur Breitman wrote:
> Responses in-line
> 
> On Mon, Apr 14, 2014 at 2:12 AM, Goswin von Brederlow <goswin-v-b@web.de>wrote:
> 
> > On Sun, Apr 13, 2014 at 11:25:18PM -0600, Arthur Breitman wrote:
> > > Hello all,
> >
> > Another structure I'm dealing with is a key value store, whose size is on
> > > the same order of magnitude as the size of the data above. Initially,
> > this
> > > key-value store is empty.
> >
> > You probably don't expect to keep 1TB data in ram. Does the key-value
> > store just reference values in the DB?
> 
> 
> Indeed it does. That's part of the difficulty.
> 
> 
>  > I am given an "apply" function that, for a given key-value store and for
> > a
> > > given node in the tree produces a new key-value store. Typically by
> > > modifying a few keys in the key-value store. The amount of bits changed
> > is
> > > on the same order of magnitude as the size of the node. When I reach a
> > leaf
> > > of the tree, I can compute the optimality of that leaf from the state of
> > > the key-value store. I want to find the best leaf.
> >
> > Not sure what you mean there. Maybe you could give an example with a
> > small data set (maybe 10 keys).
> >
> 
> It would be hard to make it a small example, but abstractly I have:
> 
> type node = {value: string; children: node list option}

A list can be [], no need for an option usualy. Or is there a
difference between None and Some []?

> module type Context = begin
>   type t
>   val apply: t -> node -> t
>   val get: t -> string -> string option
>   val set: t -> string -> string -> t
>   val fitness -> int
>   val empty -> t
> end

Shouldn't that be:

module type Context = begin
  type t
  type db_ref
  val apply: t -> node -> t
  val get: t -> string -> db_ref option
  val set: t -> string -> db_ref -> t
  val fitness -> int
  val empty -> t
end
 
> Given a tree of nodes, and starting with Context.empty find the leaf node
> with the highest fitness. You can think of every node in the tree as an
> operation acting on the context.
> 
> 
> >
> > Wouldn't a simple Map work as your key-value store? The apply function
> > would then modify the map, creating a new map that shares most of the
> > unchanged data with the original map. That way keeping the results of
> > many apply functions in memory should be possible and applying should
> > be fast.
> >
> 
> It would if I didn't potentially have a terabyte of data.

But you wouldn't put the data into the map, only the reference to the
DB. And maybe have a cache of recently used DB references so you don't
have to lookup the same object over and over.

> > But if memory becomes a problem then you probably want to use some
> > form of B-Tree to.
> >
> > MfG
> >         Goswin
> -- 
> Arthur Breitman

MfG
	Goswin

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] Keeping A big data optimization problem functional
  2014-04-15  9:05     ` Goswin von Brederlow
@ 2014-04-15 13:16       ` Arthur Breitman
  0 siblings, 0 replies; 10+ messages in thread
From: Arthur Breitman @ 2014-04-15 13:16 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml-list

[-- Attachment #1: Type: text/plain, Size: 3613 bytes --]

On Apr 15, 2014 5:06 AM, "Goswin von Brederlow" <goswin-v-b@web.de> wrote:
>
> On Mon, Apr 14, 2014 at 01:32:18PM -0600, Arthur Breitman wrote:
> > Responses in-line
> >
> > On Mon, Apr 14, 2014 at 2:12 AM, Goswin von Brederlow <goswin-v-b@web.de
>wrote:
> >
> > > On Sun, Apr 13, 2014 at 11:25:18PM -0600, Arthur Breitman wrote:
> > > > Hello all,
> > >
> > > Another structure I'm dealing with is a key value store, whose size
is on
> > > > the same order of magnitude as the size of the data above.
Initially,
> > > this
> > > > key-value store is empty.
> > >
> > > You probably don't expect to keep 1TB data in ram. Does the key-value
> > > store just reference values in the DB?
> >
> >
> > Indeed it does. That's part of the difficulty.
> >
> >
> >  > I am given an "apply" function that, for a given key-value store and
for
> > > a
> > > > given node in the tree produces a new key-value store. Typically by
> > > > modifying a few keys in the key-value store. The amount of bits
changed
> > > is
> > > > on the same order of magnitude as the size of the node. When I
reach a
> > > leaf
> > > > of the tree, I can compute the optimality of that leaf from the
state of
> > > > the key-value store. I want to find the best leaf.
> > >
> > > Not sure what you mean there. Maybe you could give an example with a
> > > small data set (maybe 10 keys).
> > >
> >
> > It would be hard to make it a small example, but abstractly I have:
> >
> > type node = {value: string; children: node list option}
>
> A list can be [], no need for an option usualy. Or is there a
> difference between None and Some []?

1) yes the "option" part is useless, I missed it in the wrote

> > module type Context = begin
> >   type t
> >   val apply: t -> node -> t
> >   val get: t -> string -> string option
> >   val set: t -> string -> string -> t
> >   val fitness -> int
> >   val empty -> t
> > end
>
> Shouldn't that be:
>
> module type Context = begin
>   type t
>   type db_ref
>   val apply: t -> node -> t
>   val get: t -> string -> db_ref option
>   val set: t -> string -> db_ref -> t
>   val fitness -> int
>   val empty -> t
> end
>

2) I'm not sure what you mean by a "db_ref", but I don't think the module
signature you describe is what I want. I stand by the signature I wrote.
What I want is a functional map that just happens to be so large it needs a
disk database to be stored.


> > Given a tree of nodes, and starting with Context.empty find the leaf
node
> > with the highest fitness. You can think of every node in the tree as an
> > operation acting on the context.
> >
> >
> > >
> > > Wouldn't a simple Map work as your key-value store? The apply function
> > > would then modify the map, creating a new map that shares most of the
> > > unchanged data with the original map. That way keeping the results of
> > > many apply functions in memory should be possible and applying should
> > > be fast.
> > >
> >
> > It would if I didn't potentially have a terabyte of data.
>
> But you wouldn't put the data into the map, only the reference to the
> DB.

That would work if I had a small number of keys with large value types.
Instead I have on the order of a billion keyvalue pairs.

>
> > > But if memory becomes a problem then you probably want to use some
> > > form of B-Tree to.
> > >
> > > MfG
> > >         Goswin
> > --
> > Arthur Breitman
>
> MfG
>         Goswin
>
> --
> Caml-list mailing list.  Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs

[-- Attachment #2: Type: text/html, Size: 5264 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] Keeping A big data optimization problem functional
  2014-04-14 19:32   ` Arthur Breitman
  2014-04-15  9:05     ` Goswin von Brederlow
@ 2014-04-15 13:50     ` Thomas Gazagnaire
  2014-04-16 11:02       ` Arthur Breitman
  1 sibling, 1 reply; 10+ messages in thread
From: Thomas Gazagnaire @ 2014-04-15 13:50 UTC (permalink / raw)
  To: Arthur Breitman; +Cc: Goswin von Brederlow, caml-list

> type node = {value: string; children: node list option}  
> module type Context = begin
>   type t
>   val apply: t -> node -> t 
>   val get: t -> string -> string option 
>   val set: t -> string -> string -> t
>   val fitness -> int
>   val empty -> t
> end

What's the link between nodes and elements of your store ?

--
Thomas

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Caml-list] Keeping A big data optimization problem functional
  2014-04-15 13:50     ` Thomas Gazagnaire
@ 2014-04-16 11:02       ` Arthur Breitman
  0 siblings, 0 replies; 10+ messages in thread
From: Arthur Breitman @ 2014-04-16 11:02 UTC (permalink / raw)
  To: Thomas Gazagnaire; +Cc: caml-list, Goswin von Brederlow

[-- Attachment #1: Type: text/plain, Size: 801 bytes --]

On Apr 15, 2014 9:50 AM, "Thomas Gazagnaire" <thomas@gazagnaire.org> wrote:
>
> > type node = {value: string; children: node list option}
> > module type Context = begin
> >   type t
> >   val apply: t -> node -> t
> >   val get: t -> string -> string option
> >   val set: t -> string -> string -> t
> >   val fitness -> int
> >   val empty -> t
> > end
>
> What's the link between nodes and elements of your store ?

The only link between the two is the existence of the parametric "apply"
function.

For all intent and purposes, you can think of the nodes as functions
operating on elements of type "register".*

--
Arthur

* in reality it's a little more complicated as the apply function can
return a register with a different apply function, but for our purpose it's
irrelevant

>
> --
> Thomas

[-- Attachment #2: Type: text/html, Size: 1192 bytes --]

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2014-04-16 11:02 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-04-14  5:25 [Caml-list] Keeping A big data optimization problem functional Arthur Breitman
2014-04-14  5:43 ` Francois Berenger
2014-04-14  6:07   ` Arthur Breitman
2014-04-14  8:12 ` Goswin von Brederlow
2014-04-14 19:32   ` Arthur Breitman
2014-04-15  9:05     ` Goswin von Brederlow
2014-04-15 13:16       ` Arthur Breitman
2014-04-15 13:50     ` Thomas Gazagnaire
2014-04-16 11:02       ` Arthur Breitman
2014-04-14 20:38 ` Richard W.M. Jones

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).