caml-list - the Caml user's mailing list
From: Jesper Louis Andersen <jesper.louis.andersen@gmail.com>
To: Joel Reymont <joelr1@gmail.com>
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] appending data to a mmap-ed file
Date: Thu, 16 Dec 2010 13:38:57 +0100
Message-ID: <AANLkTi=Mc3ac=V=mVpWrQUdozzvSkfnT4jJ+VdAXuFAy@mail.gmail.com>
In-Reply-To: <8D1E9D5B-188C-4AF3-980B-EF229BA98FB4@gmail.com>

On Thu, Dec 16, 2010 at 12:31, Joel Reymont <joelr1@gmail.com> wrote:
> I'm constantly appending to a file of stock quotes (ints, longs, doubles, etc.). I have this file mapped into memory with mmap.

Ok, this helps a bit with what you are trying to do. (You asked almost
the same question on the Erlang mailing list, but the details of
getting a foothold for the same thing in Erlang are subtly different.)

My approach would be simple, noting that you have two kinds of data
and some peculiar behaviour:
  * "Newly generated data"
  * "Old data for archeology"
  * Data are almost never deleted

So:
  * If the data is below a size threshold (preferably less than a
couple of PAGE_SIZE pages), keep it in memory and serve it from
there. Simply have an OCaml array of bytes or some such to store the
data in. (My knowledge of OCaml's data representation is not up to
par at the moment, but arrange it so that the byte array has a
C representation underneath. I know that OCaml strings have this.)
This is the newly generated data.
  * Once in a while, you write(2) this string to the file on disk,
then reopen the mmap() (which, as an effect, is now READ-ONLY. There
might be sharing tricks to play here should you go multi-process).
  * Lookup is handled by checking whether the data is archeology or
recent, and then making the right lookup. Everything is hidden by
bundling it up in a module.
  * You can play with the threshold for when to write data to disk.
Too large, and you risk losing too much data on failure. Too small,
and the approach dies of syscall overhead.
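
A rough sketch of the scheme above, in case it helps. Everything in
it is an assumption made for illustration: the module name Store, the
field names and the byte-offset lookup are made up. To keep it free
of dependencies, the archived part is read back with plain seek/read
where the real thing would go through the read-only mmap() (in OCaml,
a Bigarray over the file).

```ocaml
(* Two-tier append-only store: recent bytes live in memory, older
   bytes live in the file. All names here are hypothetical. *)
module Store = struct
  type t = {
    path : string;
    threshold : int;         (* flush once the tail reaches this size *)
    recent : Buffer.t;       (* "newly generated data" *)
    mutable archived : int;  (* bytes already written to the file *)
  }

  let create ~path ~threshold =
    (* Start from an empty file; a real store would keep old data. *)
    close_out (open_out_bin path);
    { path; threshold; recent = Buffer.create threshold; archived = 0 }

  (* Append the in-memory tail to the file and empty it. This is the
     point where the read-only mapping would be reopened. *)
  let flush t =
    if Buffer.length t.recent > 0 then begin
      let flags = [Open_wronly; Open_append; Open_binary] in
      let oc = open_out_gen flags 0o644 t.path in
      Buffer.output_buffer oc t.recent;
      close_out oc;
      t.archived <- t.archived + Buffer.length t.recent;
      Buffer.clear t.recent
    end

  let append t s =
    Buffer.add_string t.recent s;
    if Buffer.length t.recent >= t.threshold then flush t

  (* Stand-in for a read through the read-only mapping. *)
  let read_archived t pos len =
    let ic = open_in_bin t.path in
    Fun.protect ~finally:(fun () -> close_in ic) (fun () ->
      seek_in ic pos;
      really_input_string ic len)

  (* Read [len] bytes at absolute offset [pos], from whichever tier
     holds them; a range may straddle both tiers. *)
  let read t ~pos ~len =
    let total = t.archived + Buffer.length t.recent in
    if pos < 0 || len < 0 || pos + len > total then
      invalid_arg "Store.read";
    if pos >= t.archived then
      Buffer.sub t.recent (pos - t.archived) len
    else if pos + len <= t.archived then
      read_archived t pos len
    else begin
      let n = t.archived - pos in
      read_archived t pos n ^ Buffer.sub t.recent 0 (len - n)
    end
end
```

The caller only sees append and read; whether a given offset is
archeology or recent is decided inside the module. The threshold
field is the knob from the last point: it bounds how much data can be
lost on a crash, and it sets how often you pay for the syscalls.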

You may have additional constraints, so spill them, please.


-- 
J.

