caml-list - the Caml user's mailing list
From: Goswin von Brederlow <goswin-v-b@web.de>
To: Xavier Leroy <Xavier.Leroy@inria.fr>
Cc: Goswin von Brederlow <goswin-v-b@web.de>, caml-list@inria.fr
Subject: Re: [Caml-list] How to read different ints from a Bigarray?
Date: Thu, 29 Oct 2009 18:05:37 +0100
Message-ID: <87pr86b066.fsf@frosties.localdomain>
In-Reply-To: <4AE87AB9.5020607@inria.fr> (Xavier Leroy's message of "Wed, 28 Oct 2009 18:09:13 +0100")

Xavier Leroy <Xavier.Leroy@inria.fr> writes:

> Goswin von Brederlow wrote:
>
>> I'm working on bindings for the Linux libaio library (asynchronous I/O) with
>> a sharp eye on efficiency. That means no copying must be done on the
>> data, which in turn means I can not use string as buffer type.
>> 
>> The best type for this seems to be a (int, int8_unsigned_elt,
>> c_layout) Bigarray.Array1.t. So far so good.
>
> That's a reasonable choice.
>
>> Now I define helper functions:
>> 
>> let get_uint8 buf off = buf.{off}
>> let set_uint8 buf off x = buf.{off} <- x
>> 
>> But I want more:
>> 
>> get/set_int8 - do I use Obj.magic to "convert" to int8_signed_elt?
>
> Not at all.  If you ask OCaml's typechecker to infer the type of
> get_uint8, you'll see that it returns a plain OCaml "int" (in the
> 0...255 range). Likewise, the "x" parameter to "set_uint8" has type
> "int" (of which only the 8 low bits are used).
>
> Repeat after me: "Obj.magic is not part of the OCaml language".
>
>> And endian correcting access for larger ints:
>> 
>> get/set_big_uint16
>> get/set_big_int16
>> get/set_little_uint16
>> get/set_little_int16
>> get/set_big_uint24
>> ...
>> get/set_little_int56
>> get/set_big_int64
>> get/set_little_int64
>
> The "56" functions look like a bit of overkill to me :-)
>
>> What is the best way there? For uintXX I can get_uint8 each byte and
>> shift and add them together. But that feels inefficient as each access
>> will range check
>
> Not necessarily.  OCaml 3.11 introduced unchecked accesses to
> bigarrays, so you can range-check yourself once, then perform
> unchecked accesses.  Use with caution...
>
>> and the shifting generates a lot of code while cpus
>> can usually endian correct an int more elegantly.
>> 
>> Is it worth the overhead of calling a C function to write optimized
>> stubs for this?
>
> The only way to know is to benchmark both approaches :-(  My guess is
> that for 16-bit accesses, you're better off with a pure Caml solution,
> but for 64-bit accesses, a C function could be faster.
>
> - Xavier Leroy

Here are some benchmark results:

get an int out of a string:
                C               OCaml
  uint8  le     19.496          17.433
   int8  le     19.298          17.850
  uint16 le     19.427          25.046
   int16 le     19.383          27.664
  uint16 be     20.502          23.200
   int16 be     20.350          27.535

get an int out of a Bigarray.Array1.t:
                safe            unsafe
  uint8  le     55.194s         54.508s
  uint64 le     80.51s          81.46s

Now, to be fair, the C code is unsafe as it does no bounds checking. I
intend to get/set larger structures, so I only need to check once that
the whole structure fits. So most of the time I want unsafe calls, and
String does not offer any documented ones. Also, storing an int64 or
int32 should not need an overflow check in char_of_int for every single
byte written.
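
To illustrate the "check once, then access unchecked" idea, here is a
minimal sketch on a Bigarray buffer. It is not the code that was
benchmarked; the names check_range, unsafe_get_little_uint16,
unsafe_get_big_uint16 and get_pair are made up for this example, and it
assumes an OCaml version where Bigarray.Array1.unsafe_get is available
(3.11 or later; recent versions ship Bigarray in the standard library).

```ocaml
open Bigarray

type buffer = (int, int8_unsigned_elt, c_layout) Array1.t

(* Check once that [len] bytes starting at [off] lie inside [buf].
   Written as [off > dim - len] so the test cannot overflow. *)
let check_range (buf : buffer) off len =
  if off < 0 || len < 0 || off > Array1.dim buf - len then
    invalid_arg "check_range"

(* Unchecked accessors: the caller must have validated the range. *)
let unsafe_get_little_uint16 (buf : buffer) off =
  (Array1.unsafe_get buf off)
  lor ((Array1.unsafe_get buf (off + 1)) lsl 8)

let unsafe_get_big_uint16 (buf : buffer) off =
  ((Array1.unsafe_get buf off) lsl 8)
  lor (Array1.unsafe_get buf (off + 1))

(* Hypothetical 4-byte record: a little-endian uint16 followed by a
   big-endian uint16, read with a single range check. *)
let get_pair (buf : buffer) off =
  check_range buf off 4;
  (unsafe_get_little_uint16 buf off, unsafe_get_big_uint16 buf (off + 2))
```

Whether this is actually faster than the checked accessors has to be
measured; the numbers above suggest the gain may be small.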

The Bigarray unsafe_get is really disappointing. Note that uint64 is so
much slower because of allocating the result (my guess). Array1.set
runs at the same speed for uint8 and uint64.

Overall it looks like C calls just aren't that expensive, and endian
and sign conversions in OCaml plain suck.
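
For reference, the kind of pure OCaml conversions meant here can be
sketched as follows. These are illustrative helpers with made-up names
(int8_of_uint8, int16_of_uint16, bswap16), not the benchmarked code;
they work on plain OCaml ints holding an unsigned value.

```ocaml
(* Reinterpret the low 8 bits of [x] as a signed two's-complement value:
   flipping the sign bit and subtracting it maps 0x80..0xFF to -128..-1. *)
let int8_of_uint8 x = ((x land 0xFF) lxor 0x80) - 0x80

(* Same trick for the low 16 bits. *)
let int16_of_uint16 x = ((x land 0xFFFF) lxor 0x8000) - 0x8000

(* Swap the two bytes of a 16-bit value (endian correction). *)
let bswap16 x = ((x land 0xFF) lsl 8) lor ((x lsr 8) land 0xFF)
```

Each conversion costs a few extra arithmetic instructions per access,
which is consistent with the int16 rows being slower than the uint16
rows in the OCaml column.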

I can not use an OCaml string as my buffer must be aligned and
unmovable (required by the Linux kernel). A string manually created
outside the GC heap will never be freed by the GC, so that is out of
the question too. And Bigarray is plain too slow.

So a well defined custom type with access functions from both C and
OCaml seems to be the way to go. As needed one can then also write a
stub for get/set of e.g. struct { uint64_t kind : 8; uint64_t inode;
uint64_t data; } <-> type key = Meta of int64 * int64 | Inode of
inode_t | Block of inode_t * block_t.
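
To make the struct <-> variant mapping concrete, here is a rough pure
OCaml sketch of the decoding direction, using a Bigarray as a stand-in
for the custom buffer type. Everything not in the struct above is an
assumption made for illustration: inode_t and block_t being int64, the
kind values 0/1/2, the 24-byte little-endian layout (kind in byte 0,
inode at offset 8, data at offset 16), and the names get_little_int64
and key_of_buffer. A real stub would be written in C against the actual
layout.

```ocaml
open Bigarray

type buffer = (int, int8_unsigned_elt, c_layout) Array1.t

type inode_t = int64 (* assumption: not defined in this message *)
type block_t = int64 (* assumption: not defined in this message *)

type key =
  | Meta of int64 * int64
  | Inode of inode_t
  | Block of inode_t * block_t

(* Read 8 bytes at [off] as a little-endian int64 (bounds-checked). *)
let get_little_int64 (buf : buffer) off =
  let r = ref 0L in
  for i = 7 downto 0 do
    r := Int64.logor (Int64.shift_left !r 8) (Int64.of_int buf.{off + i})
  done;
  !r

(* Decode one key; the kind numbering is hypothetical. *)
let key_of_buffer (buf : buffer) off =
  let inode = get_little_int64 buf (off + 8) in
  let data = get_little_int64 buf (off + 16) in
  match buf.{off} with
  | 0 -> Meta (inode, data)
  | 1 -> Inode inode
  | 2 -> Block (inode, data)
  | k -> invalid_arg (Printf.sprintf "key_of_buffer: unknown kind %d" k)
```

Doing the whole structure in one stub means one bounds check and one
call per key instead of one per field.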

So much for the idea to get rid of the custom buffer type in libaio.

MfG
        Goswin

