caml-list - the Caml user's mailing list
From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: Goswin von Brederlow <goswin-v-b@web.de>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Bigarray question: Detecting subarrays, overlaps, ...
Date: Tue, 28 Feb 2012 22:09:27 +0100
Message-ID: <1330463367.2826.96.camel@thinkpad>
In-Reply-To: <87y5rmemr6.fsf@frosties.localnet>

On Tuesday, 2012-02-28 at 21:17 +0100, Goswin von Brederlow wrote:
> Hi,
> 
> I'm implementing a RAID in userspace using OCaml and NBD (Network Block
> Device) as protocol to export the device. For this I'm using
> Bigarray.Array1 as buffer for data and wrote myself the right read(),
> write(), pread() and pwrite() stubs. The advantage of this (over
> strings) is that Bigarray data is not moved around by the GC, so the C
> stubs don't need to copy data between the Bigarray and a temporary
> buffer.

I used the same approach for PlasmaFS. The bigarray-based reads and
writes are really missing in the stdlib. (If anybody wants to
experiment, look into the Netsys_mem module of Ocamlnet, and use the
functions mem_read and mem_write.)
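
(Just to illustrate what such stubs buy you: without them, every read has
to go through a temporary string and be copied by hand. A minimal sketch
of that workaround, assuming a char bigarray as buffer:)

  (* Read up to (dim buf) bytes from fd into a char bigarray by going
     through a temporary string - exactly the extra copy that a
     bigarray-aware read() stub avoids. Returns the number of bytes
     actually read. *)
  let read_into_bigarray fd buf =
    let len = Bigarray.Array1.dim buf in
    let tmp = String.create len in
    let n = Unix.read fd tmp 0 len in
    for i = 0 to n - 1 do
      buf.{i} <- tmp.[i]
    done;
    n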

Btw, you should try to allocate the bigarrays in a page-aligned way.
This makes the syscalls even more efficient.
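
(One way to get page-aligned buffers without writing extra C code,
sketched under the assumption that mapping an unlinked temporary file is
acceptable for you: memory obtained via map_file is page-aligned by
construction.)

  (* Allocate a page-aligned char bigarray by mapping an unlinked
     temporary file. map_file grows the file to [len] bytes, and the
     resulting mapping starts on a page boundary, which
     Bigarray.Array1.create does not guarantee. *)
  let alloc_page_aligned len =
    let name = Filename.temp_file "nbd_buf" ".mem" in
    let fd = Unix.openfile name [Unix.O_RDWR] 0o600 in
    Unix.unlink name;         (* storage lives as long as the mapping *)
    let buf =
      Bigarray.Array1.map_file fd Bigarray.char Bigarray.c_layout true len
    in
    Unix.close fd;
    buf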

In my use case, I did not write to devices directly, and could assume
that the blocks are backed by the page cache. So I did not run into the
problem you describe below.

> For efficiency each request stores all its data in a single
> Bigarray.Array1. For reasons of the RAID implementation large requests
> are split into 4k chunks using Bigarray.Array1.sub and grouped into
> stripes. The stripes are then acted upon independently until all stripes
> involved in a request are finished. Then the request is completed.
> 
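
(Splitting a request buffer into such 4k chunks is just Array1.sub in a
loop; sub creates views into the same buffer and copies nothing. A small
sketch, assuming c_layout char buffers and a 4096-byte chunk size:)

  let chunk_size = 4096

  (* Split [buf] into sub-arrays of at most [chunk_size] bytes. Each
     chunk is a view into [buf] (c_layout, offsets start at 0); no data
     is copied. *)
  let split_into_chunks buf =
    let dim = Bigarray.Array1.dim buf in
    let rec loop pos acc =
      if pos >= dim then List.rev acc
      else
        let len = min chunk_size (dim - pos) in
        loop (pos + len) (Bigarray.Array1.sub buf pos len :: acc)
    in
    loop 0 []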
> Now I get a problem when 2 requests come in that overlap. Say I get a
> write request for blocks 0-6 and a read request for blocks 4-9 in
> that order. Since I already have the data for blocks 4-6 in memory from
> the write request I do not want to reread them from disk. On the stripe
> level the data looks like this:
> 
> |W0 W1 W2 W3 W4 W5 W6 .  .  .  | write 0-6
> |            W4 W5 W6 R7 R8 R9 | read  4-9
> 
> For the read request I need to copy W4-6 to R4-6 or send out W4-6 + R7-9
> in two writes. I think I would like to avoid sending each stripe chunk
> out separately.

Why not? This seems to be an elegant solution, and I don't see why this
should make the accesses slower. The time for the extra context switches
is negligible compared to the disk accesses.

>  On the other hand I could implement (p)readv() and
> (p)writev() stubs.
> 
> Anyway, my question now is how to detect which subarrays in the stripes
> are part of a common larger array? Do I need to write my own C stub that
> looks into the internals of the arrays to see if they share a common
> ancestor?  I think that would be preferable to tracking the origin of
> each subarray myself.

Yes, subarrays are tracked, but this feature exists only to catch the
right moment for unmapping bigarrays (if needed). As far as I remember,
this is not tracked as a sub/super relationship; instead, all bigarrays
accessing the same buffer share the same buffer descriptor, and when its
use count drops to 0, the buffer is finally destroyed. So you cannot
tell which bigarray is the original one, and it can even happen that the
original bigarray is already deallocated while the backing buffer is not
yet.

> On a similar note how does Bigarray.Array1.blit handle arrays that are
> part of the same larger array and overlap?
> 
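
(Regarding the copy variant mentioned above: copying blocks 4-6 from the
write buffer into the read buffer needs no C stub either, it is just sub
plus blit. A sketch, where block_size and the fixed block positions are
assumptions of the example:)

  let block_size = 4096

  (* Copy blocks 4-6 of the write request (which starts at block 0) into
     the first three blocks of the read request (which starts at block 4).
     Both buffers are char bigarrays in c_layout; blit copies the first
     view into the second. *)
  let copy_overlap ~wbuf ~rbuf =
    let src = Bigarray.Array1.sub wbuf (4 * block_size) (3 * block_size) in
    let dst = Bigarray.Array1.sub rbuf 0 (3 * block_size) in
    Bigarray.Array1.blit src dst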
> ----------------------------------------------------------------------
> I'm missing functions like:
> 
> val super : ('a, 'b, 'c) t -> ('a, 'b, 'c) t -> ('a, 'b, 'c) t
>     Create a merged sub-array of 2 adjacent sub-arrays of the same
>     big array.

This function would be possible to implement. The requirement would be
that both bigarrays use the same buffer descriptor.

> val same_parent : ('a, 'b, 'c) t -> ('a, 'b, 'c) t -> bool
>     True if the 2 (sub-)arrays are part of the same big array.

I would not call it "same_parent", but "same_backing_buffer".

> val offset : ('a, 'b, 'c) t -> int
>     Offset of the sub-array in the underlying big array.

I think this information is not available in the current implementation.
As buffer sharing is also possible after reshaping, an offset is not
even meaningful in the general case.
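
If you really need offsets, I think you have to track them yourself when
you create the sub-arrays, e.g. by wrapping them. A minimal sketch of
that bookkeeping (all names made up):

  (* A sub-array together with the buffer it was cut from and its
     position in that buffer. *)
  type ('a, 'b, 'c) view = {
    parent : ('a, 'b, 'c) Bigarray.Array1.t;  (* original request buffer *)
    offset : int;                             (* position within parent *)
    data   : ('a, 'b, 'c) Bigarray.Array1.t;  (* the sub-array itself *)
  }

  let sub_view parent offset len =
    { parent = parent;
      offset = offset;
      data = Bigarray.Array1.sub parent offset len }

  (* Two views stem from the same request iff their parents are
     physically the same bigarray; their offsets are then directly
     comparable. *)
  let same_parent v1 v2 = v1.parent == v2.parent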

Gerd

> MfG
>         Goswin
> 

-- 
------------------------------------------------------------
Gerd Stolpmann, Darmstadt, Germany    gerd@gerd-stolpmann.de
Creator of GODI and camlcity.org.
Contact details:        http://www.camlcity.org/contact.html
Company homepage:       http://www.gerd-stolpmann.de
*** Searching for new projects! Need consulting for system
*** programming in Ocaml? Gerd Stolpmann can help you.
------------------------------------------------------------

