caml-list - the Caml user's mailing list
* [Caml-list] swapping large data structures from/to files
@ 2004-04-08 14:51 Sebastien Ferre
  2004-04-08 15:57 ` Basile Starynkevitch
From: Sebastien Ferre @ 2004-04-08 14:51 UTC (permalink / raw)
  To: caml-list

Hi Caml-ists,

I am interested in handling data structures so large
that they don't fit in main memory. I need two things:

1. Persistence of the data structure, preferably in
a file (similar to NDBM, say).

2. A customized swapping strategy for elements of the data
structure, which should be more efficient than relying on
virtual memory.

Typically, my data structure is a DAG, and I wish to
keep only a limited number of nodes in memory at a time.
Hence the necessity for swapping. It is also important
to keep as much as possible in memory, rather than merely
accessing the file, for efficiency reasons.

Has anything been done in this direction?
The Dbm library is fine for me as far as persistence goes,
but it does not work on every platform :-(.
(Would Dbm be difficult to rewrite in OCaml?)

Thanks,
Sébastien Ferré

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners



* Re: [Caml-list] swapping large data structures from/to files
  2004-04-08 14:51 [Caml-list] swapping large data structures from/to files Sebastien Ferre
@ 2004-04-08 15:57 ` Basile Starynkevitch
From: Basile Starynkevitch @ 2004-04-08 15:57 UTC (permalink / raw)
  To: Sebastien Ferre, caml-list

On Thu, Apr 08, 2004 at 02:51:11PM +0000, Sebastien Ferre wrote:

> I am interested in handling data structures so large
> that they don't fit in main memory.

I am curious - what are your huge DAGs? (bio-informatics
applications??)

> I need 2 things:

> 1. Persistence of the data structure, preferably in
> a file (similar to NDBM, say).

Did you look at Persil on my home page (see my sig)? It provides
persistence either in small segmented files (which works reasonably for
small data, since the whole file gets copied at the end of the process)
or with MySQL4. If you need it, I could add another persistent store
(but I think that using a transactional database with Persil is much
better for big persistent data).

The most important issue is: do you need some kind of transaction
mechanism? I could write a better file-based persistent store iff
you don't need [nested] transactions (with commit & abort ability)!

You might also use Bigarrays, which can be mapped to files.
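
A minimal sketch of the file-mapped Bigarray idea, assuming a recent OCaml
where the mapping function is Unix.map_file (older releases exposed it as
Bigarray.Array1.map_file). The helper name map_floats and the choice of
float64 elements are only illustrative; the unix library must be linked.

```ocaml
(* Requires the unix library (for Unix.openfile and Unix.map_file). *)

(* Map [len] floats of the file at [path] into a one-dimensional Bigarray.
   The mapping is shared, so writes to the array go to the file.
   If the file is shorter than needed, it is grown to fit. *)
let map_floats path len =
  let fd = Unix.openfile path [Unix.O_RDWR; Unix.O_CREAT] 0o644 in
  let ga =
    Unix.map_file fd Bigarray.float64 Bigarray.c_layout true [| len |] in
  (* The mapping stays valid after the descriptor is closed. *)
  Unix.close fd;
  Bigarray.array1_of_genarray ga
```

The data then lives outside the OCaml heap and is paged in and out by the
operating system, so this gives persistence but not a customized swapping
strategy. It also only suits flat numeric data, so a DAG would have to be
encoded as arrays of indices.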

> 2. A customized swapping strategy for elements of the data
> structure, which should be more efficient than relying on
> virtual memory.

I'm not sure I fully understand your point. Persil does give the
ability to unload & reload persistent values on (explicit) demand.
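
To make "explicit unload & reload" concrete, here is a rough sketch. It is
not Persil's API: the types node and store and the functions create, add,
unload and get are hypothetical names chosen for illustration. It assumes
nodes refer to their children by integer identifier rather than by pointer,
so that each node can be marshalled and dropped on its own.

```ocaml
(* A DAG node names its children by identifier, not by pointer. *)
type node = { label : string; children : int list }

type store = {
  file : string;                     (* backing file *)
  resident : (int, node) Hashtbl.t;  (* nodes currently in memory *)
  offsets : (int, int) Hashtbl.t;    (* file position of unloaded nodes *)
}

let create file =
  close_out (open_out_bin file);     (* create or truncate the file *)
  { file; resident = Hashtbl.create 64; offsets = Hashtbl.create 64 }

let add st id n = Hashtbl.replace st.resident id n

(* The application says explicitly that node [id] is no longer needed:
   append it to the file, remember where, and drop it from memory. *)
let unload st id =
  match Hashtbl.find_opt st.resident id with
  | None -> ()
  | Some n ->
    let oc =
      open_out_gen [Open_wronly; Open_append; Open_binary] 0o644 st.file in
    let pos = out_channel_length oc in
    Marshal.to_channel oc n [];
    close_out oc;
    Hashtbl.replace st.offsets id pos;
    Hashtbl.remove st.resident id

(* Return node [id], reloading it from the file if it was unloaded.
   Raises Not_found if the identifier was never added. *)
let get st id =
  match Hashtbl.find_opt st.resident id with
  | Some n -> n
  | None ->
    let pos = Hashtbl.find st.offsets id in
    let ic = open_in_bin st.file in
    seek_in ic pos;
    let (n : node) = Marshal.from_channel ic in
    close_in ic;
    Hashtbl.replace st.resident id n;
    n
```

The offset index here lives only in memory, and the file only ever grows, so
a real store would also have to persist the index and reclaim space.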

Are you willing to state explicitly in your application (through
appropriate calls) which values you no longer need? Or do you want the
system to guess them by itself?
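
If the system is to guess, one common heuristic (my assumption here, not
something Persil provides) is to bound the number of resident values and
evict the least recently used one. A small sketch, where the names cache,
make, evict_oldest and get, and the load and evict callbacks, are all
hypothetical:

```ocaml
(* At most [capacity] values are resident at a time. *)
type ('k, 'v) cache = {
  capacity : int;
  table : ('k, 'v * int ref) Hashtbl.t;  (* value and last-use stamp *)
  mutable clock : int;                   (* logical time, one tick per get *)
  load : 'k -> 'v;                       (* fetch a non-resident value *)
  evict : 'k -> 'v -> unit;              (* called before a value is dropped *)
}

let make ~capacity ~load ~evict =
  { capacity; table = Hashtbl.create capacity; clock = 0; load; evict }

(* Drop the entry with the smallest stamp. This scan is linear in the
   number of resident values; a real cache would keep them ordered. *)
let evict_oldest c =
  let oldest =
    Hashtbl.fold
      (fun k (_, stamp) acc ->
         match acc with
         | Some (_, s) when s <= !stamp -> acc
         | _ -> Some (k, !stamp))
      c.table None in
  match oldest with
  | None -> ()
  | Some (k, _) ->
    let (v, _) = Hashtbl.find c.table k in
    c.evict k v;
    Hashtbl.remove c.table k

(* Return the value for [k], loading it (and evicting another value
   if the cache is full) when it is not resident. *)
let get c k =
  c.clock <- c.clock + 1;
  match Hashtbl.find_opt c.table k with
  | Some (v, stamp) -> stamp := c.clock; v
  | None ->
    if Hashtbl.length c.table >= c.capacity then evict_oldest c;
    let v = c.load k in
    Hashtbl.replace c.table k (v, ref c.clock);
    v
```

In this sketch load would read a node back from the file and evict would
write it out if it had been modified.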

(For completeness: you can give hints to the VM system with the madvise
system call, but that won't work with OCaml, because values may be
moved by the GC.)

> 
> Typically, my data structure is a DAG, and I wish to
> keep only a limited number of nodes in memory at a time.

Is schema evolution a concern for you? I.e., if you change the types
implementing your DAG, how do you deal with the huge persistent data
in that case? (Persil does not handle this issue, since it uses the
Marshal module.)
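
To illustrate why Marshal makes this delicate: marshalled data carries no
type information, so reading a value back at a changed type is not detected.
One possible mitigation, sketched below under my own assumptions (the names
format_version, save and load are hypothetical and not part of Persil), is
to write a version number in front of the data and refuse to load a file
whose version differs.

```ocaml
(* Hypothetical format version; bump it whenever the stored type changes. *)
let format_version = 2

(* Write the version number, then the marshalled value. *)
let save path v =
  let oc = open_out_bin path in
  output_binary_int oc format_version;
  Marshal.to_channel oc v [];
  close_out oc

(* Read the value back only if the version matches. The caller must
   annotate the expected type: Marshal cannot check it. *)
let load path =
  let ic = open_in_bin path in
  let version = input_binary_int ic in
  if version <> format_version then begin
    close_in ic;
    Error (Printf.sprintf "format version %d, expected %d"
             version format_version)
  end else begin
    let v = Marshal.from_channel ic in
    close_in ic;
    Ok v
  end
```

This only detects a mismatch. Converting old data to the new types still
needs a hand-written migration step that knows both versions.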


> Hence the necessity for swapping. It is also important
> to keep as much as possible in memory, rather than merely
> accessing the file, for efficiency reasons.

> Has anything been done in this direction?
> The Dbm library is fine for me as far as persistence goes,
> but it does not work on every platform :-(.
> (Would Dbm be difficult to rewrite in OCaml?)

I think that there are quite portable versions of Dbm (or BSD DB).

> Sébastien Ferré

(You can answer me in French if you wish; if you CC the list, let's
continue in English.)


-- 
Basile STARYNKEVITCH -- basile dot starynkevitch at inria dot fr
Project cristal.inria.fr - phone +33 1 3963 5197 - mobile 6 8501 2359
http://cristal.inria.fr/~starynke --- all opinions are only mine 


