caml-list - the Caml user's mailing list
* How to read different ints from a Bigarray?
@ 2009-10-28 13:54 Goswin von Brederlow
  2009-10-28 14:16 ` Sylvain Le Gall
                   ` (3 more replies)
  0 siblings, 4 replies; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-28 13:54 UTC (permalink / raw)
  To: caml-list

Hi,

I'm working on bindings for the Linux libaio library (asynchronous I/O) with
a sharp eye on efficiency. That means the data must never be copied,
which in turn means I cannot use string as the buffer type.

The best type for this seems to be a (int, int8_unsigned_elt,
c_layout) Bigarray.Array1.t. So far so good.

Now I define helper functions:

let get_uint8 buf off = buf.{off}
let set_uint8 buf off x = buf.{off} <- x

But I want more:

get/set_int8 - do I use Obj.magic to "convert" to int8_signed_elt?

And endian correcting access for larger ints:

get/set_big_uint16
get/set_big_int16
get/set_little_uint16
get/set_little_int16
get/set_big_uint24
...
get/set_little_int56
get/set_big_int64
get/set_little_int64

What is the best way there? For uintXX I can get_uint8 each byte and
shift and add them together. But that feels inefficient as each access
will be range-checked and the shifting generates a lot of code, while CPUs
can usually endian-correct an int more elegantly.
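
For illustration, the byte-wise "shift and add" approach described above
could look roughly like the sketch below. The type alias buf and the exact
function bodies are assumptions made for this example, not code from the
libaio bindings; every access is still range-checked by the runtime.

```ocaml
open Bigarray

(* Hypothetical alias for the byte buffer type discussed in the thread. *)
type buf = (int, int8_unsigned_elt, c_layout) Array1.t

(* Read an unsigned 16-bit value, most significant byte first. *)
let get_big_uint16 (buf : buf) off =
  (buf.{off} lsl 8) lor buf.{off + 1}

(* Read an unsigned 16-bit value, least significant byte first. *)
let get_little_uint16 (buf : buf) off =
  buf.{off} lor (buf.{off + 1} lsl 8)

(* Write the low 16 bits of x, most significant byte first. *)
let set_big_uint16 (buf : buf) off x =
  buf.{off} <- (x lsr 8) land 0xFF;
  buf.{off + 1} <- x land 0xFF
```

The little-endian setter would simply swap the two stores.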

Is it worth the overhead of calling a C function to write optimized
stubs for this?

And last:

get/set_string, blit_from/to_string

Do I create a string where needed and then loop over every char
calling s.[i] <- char_of_int buf.{off+i}? Or is a C function using
memcpy better?
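
As a rough sketch of the pure-OCaml variant of this question: the names
get_string and blit_from_string follow the wish list above, but their
signatures and the type alias buf are assumptions for this example. It copies
byte by byte, so it says nothing about how a memcpy-based C stub would
compare.

```ocaml
(* Hypothetical alias for the byte buffer type discussed in the thread. *)
type buf =
  (int, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Copy len bytes from buf, starting at off, into a fresh string. *)
let get_string (buf : buf) off len =
  if off < 0 || len < 0 || off > Bigarray.Array1.dim buf - len then
    invalid_arg "get_string";
  String.init len (fun i -> Char.chr buf.{off + i})

(* Copy the whole string s into buf, starting at off. *)
let blit_from_string s (buf : buf) off =
  let len = String.length s in
  if off < 0 || off > Bigarray.Array1.dim buf - len then
    invalid_arg "blit_from_string";
  String.iteri (fun i c -> buf.{off + i} <- Char.code c) s
```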

What do you think?

Regards,
        Goswin

PS: Does batteries have a better module for this than Bigarray?



* Re: How to read different ints from a Bigarray?
  2009-10-28 13:54 How to read different ints from a Bigarray? Goswin von Brederlow
@ 2009-10-28 14:16 ` Sylvain Le Gall
  2009-10-28 15:00   ` [Caml-list] " Goswin von Brederlow
  2009-10-28 15:37 ` [Caml-list] " Olivier Andrieu
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 38+ messages in thread
From: Sylvain Le Gall @ 2009-10-28 14:16 UTC (permalink / raw)
  To: caml-list

Hello,

On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Hi,
>
> I'm working on binding s for linux libaio library (asynchron IO) with
> a sharp eye on efficiency. That means no copying must be done on the
> data, which in turn means I can not use string as buffer type.
>
> The best type for this seems to be a (int, int8_unsigned_elt,
> c_layout) Bigarray.Array1.t. So far so good.
>
> Now I define helper functions:
>
> let get_uint8 buf off = buf.{off}
> let set_uint8 buf off x = buf.{off} <- x
>
> But I want more:
>
> get/set_int8 - do I use Obj.magic to "convert" to int8_signed_elt?
>
> And endian correcting access for larger ints:
>
> get/set_big_uint16
> get/set_big_int16
> get/set_little_uint16
> get/set_little_int16
> get/set_big_uint24
> ...
> get/set_little_int56
> get/set_big_int64
> get/set_little_int64
>
> What is the best way there? For uintXX I can get_uint8 each byte and
> shift and add them together. But that feels inefficient as each access
> will range check and the shifting generates a lot of code while cpus
> can usualy endian correct an int more elegantly.
>
> Is it worth the overhead of calling a C function to write optimized
> stubs for this?
>
> And last:
>
> get/set_string, blit_from/to_string
>
> Do I create a string where needed and then loop over every char
> calling s.(i) <- char_of_int buf.{off+i}? Or better a C function using
> memcpy?
>
> What do you think?
>

Well, we talked about this a little, but here is my opinion:
- calling a C function to add a single int will generate a big overhead
- OCaml strings are quite fast to modify

So to my mind the best option is to have a buffer string (say 16/32
chars) where you put the data and then flush it to the Bigarray in a
single C call.

E.g.:
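(* t.buffer is a 64-byte Bytes staging buffer; flush copies it to t.bigarray. *)
let append_char t c =
  if t.idx >= 64 then
    (
      flush t.bigarray t.buffer;
      t.idx <- 0
    );
  Bytes.set t.buffer t.idx (Char.chr c);
  t.idx <- t.idx + 1

(* Little-endian: the least significant byte goes first. *)
let append_little_uint16 t i =
  append_char t ((i lsr 0) land 0xFF);
  append_char t ((i lsr 8) land 0xFF)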
  

I have used this kind of technique and it seems as fast as C, and a lot
less C coding.

Regards,
Sylvain Le Gall



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-28 14:16 ` Sylvain Le Gall
@ 2009-10-28 15:00   ` Goswin von Brederlow
  2009-10-28 15:17     ` Sylvain Le Gall
  2009-10-29 20:40     ` Florian Weimer
  0 siblings, 2 replies; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-28 15:00 UTC (permalink / raw)
  To: caml-list

Sylvain Le Gall <sylvain@le-gall.net> writes:

> Hello,
>
> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>> Hi,
>>
>> I'm working on binding s for linux libaio library (asynchron IO) with
>> a sharp eye on efficiency. That means no copying must be done on the
>> data, which in turn means I can not use string as buffer type.
>>
>> The best type for this seems to be a (int, int8_unsigned_elt,
>> c_layout) Bigarray.Array1.t. So far so good.
>>
>> Now I define helper functions:
>>
>> let get_uint8 buf off = buf.{off}
>> let set_uint8 buf off x = buf.{off} <- x
>>
>> But I want more:
>>
>> get/set_int8 - do I use Obj.magic to "convert" to int8_signed_elt?
>>
>> And endian correcting access for larger ints:
>>
>> get/set_big_uint16
>> get/set_big_int16
>> get/set_little_uint16
>> get/set_little_int16
>> get/set_big_uint24
>> ...
>> get/set_little_int56
>> get/set_big_int64
>> get/set_little_int64
>>
>> What is the best way there? For uintXX I can get_uint8 each byte and
>> shift and add them together. But that feels inefficient as each access
>> will range check and the shifting generates a lot of code while cpus
>> can usualy endian correct an int more elegantly.
>>
>> Is it worth the overhead of calling a C function to write optimized
>> stubs for this?
>>
>> And last:
>>
>> get/set_string, blit_from/to_string
>>
>> Do I create a string where needed and then loop over every char
>> calling s.(i) <- char_of_int buf.{off+i}? Or better a C function using
>> memcpy?
>>
>> What do you think?
>>
>
> Well, we talk about this a little bit, but here is my opinion:
> - calling a C function to add a single int will generate a big overhead
> - OCaml string are quite fast to modify values
>
> So to my mind the best option is to have a buffer string (say 16/32
> char) where you put data inside and flush it in a single C call to
> Bigarray. 
>
> E.g.:
> let append_char t c =
>   if t.idx >= 64 then
>     (
>       flush t.bigarray t.buffer;
>       t.idx <- 0
>     );
>   t.buffer.(t.idx) <- c;
>   t.idx <- t.idx + 1
>
> let append_little_uint16 t i =
>   append_char t ((i lsr 8) land 0xFF);
>   append_char t ((i lsr 0) land 0xFF)
>   
>
> I have used this kind of technique and it seems as fast as C, and a lot
> less C coding.
>
> Regards,
> Sylvain Le Gall

This won't work so nicely:

- Writes are not always in sequence. I want to offer stream access
  too, where this could be very effective. But the plain buffer is
  more for random / known-offset access. At a minimum you would have
  holes for alignment.

- It makes read/write buffers complicated, as you need to flush or peek
  the string in case of uncommitted changes. I can't do write-only
  buffers, as I want to be able to write a buffer and then add a
  checksum to it in my application. The lib should not block that.

- The data is passed to libaio and needs to be kept alive and unmoved
  as long as libaio knows about it. I was hoping I could use the pointer
  to the data to register/unregister GC roots without having to add
  another custom header and indirections.


I also still wonder how bad a C function call really is. Consider the
case of writing an int64.

Directly: You get one C call that does range check, endian convert and
write in one go.

Buffered: With your code you have 7 Int64 shifts, 8 Int64 lands, 8
conversions to int, at least one index check (more likely 8 to avoid
handling unaligned access) and 1/8 of a C call to blit the 64-byte buffer
string into the Bigarray.

Regards,
        Goswin

PS: Is a.{i} <- x a C call?



* Re: How to read different ints from a Bigarray?
  2009-10-28 15:00   ` [Caml-list] " Goswin von Brederlow
@ 2009-10-28 15:17     ` Sylvain Le Gall
  2009-10-28 17:57       ` [Caml-list] " Goswin von Brederlow
  2009-10-29 20:40     ` Florian Weimer
  1 sibling, 1 reply; 38+ messages in thread
From: Sylvain Le Gall @ 2009-10-28 15:17 UTC (permalink / raw)
  To: caml-list

On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Sylvain Le Gall <sylvain@le-gall.net> writes:
>
>> Hello,
>>
>> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>> Hi,
>>>
>>
>> Well, we talk about this a little bit, but here is my opinion:
>> - calling a C function to add a single int will generate a big overhead
>> - OCaml string are quite fast to modify values
>>
>> So to my mind the best option is to have a buffer string (say 16/32
>> char) where you put data inside and flush it in a single C call to
>> Bigarray. 
>>
>> E.g.:
>> let append_char t c =
>>   if t.idx >= 64 then
>>     (
>>       flush t.bigarray t.buffer;
>>       t.idx <- 0
>>     );
>>   t.buffer.(t.idx) <- c;
>>   t.idx <- t.idx + 1
>>
>> let append_little_uint16 t i =
>>   append_char t ((i lsr 8) land 0xFF);
>>   append_char t ((i lsr 0) land 0xFF)
>>   
>>
>> I have used this kind of technique and it seems as fast as C, and a lot
>> less C coding.
>>
>> Regards,
>> Sylvain Le Gall
>
> This wont work so nicely:
>
> - Writes are not always in sequence. I want to do a stream access
>   too where this could be verry effective. But the plain buffer is
>   more for random / known offset access. At a minimum you would have
>   holes for alignment.
>
> - It makes read/write buffers complicated as you need to flush or peek
>   the string in case of uncommited changes. I can't do write-only
>   buffers as I want to be able to write a buffer and then add a
>   checksum to it in my application. The lib should not block that.
>

I was thinking of a pure stream. The idea still works with random access,
but you don't save many C function calls. You just have to write less C
code.

>
> I also still wonder how bad a C function call really is. Consider the
> case of writing an int64.
>
> Directly: You get one C call that does range check, endian convert and
> write in one go.
>
> Bffered: With your code you have 7 Int64 shifts, 8 Int64 lands, 8
> conversions to int, at least one index check (more likely 8 to avoid
> handling unaligned access) and 1/8 C call to blit the 64 byte buffer
> string into the Bigarray.

Not at all: you first break your int64 into 3 ints (2 * 24 bits + 16 bits),
and then you need 7 int shifts and 8 int lands.

You can even manage to break it into only 1 or 2 ints.

And of course, you bypass the index check.
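
A minimal sketch of this idea, splitting the int64 into two native ints
rather than three: it assumes a 64-bit platform (so each 32-bit half fits in
an int), and the names buf and set_big_int64 are chosen for illustration. It
writes straight into the Bigarray, without the staging string, after a single
manual range check.

```ocaml
(* Hypothetical alias for the byte buffer type discussed in the thread. *)
type buf =
  (int, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Write x in big-endian order. Assumes native ints of at least 33 bits. *)
let set_big_int64 (buf : buf) off (x : int64) =
  (* One range check up front, so the stores below can be unchecked. *)
  if off < 0 || off > Bigarray.Array1.dim buf - 8 then
    invalid_arg "set_big_int64";
  (* Two 32-bit halves, each held in a native int. *)
  let hi = Int64.to_int (Int64.shift_right_logical x 32) in
  let lo = Int64.to_int (Int64.logand x 0xFFFF_FFFFL) in
  (* Store the four bytes of a 32-bit half, most significant first. *)
  let put32 pos v =
    for i = 0 to 3 do
      Bigarray.Array1.unsafe_set buf (pos + i)
        ((v lsr (8 * (3 - i))) land 0xFF)
    done
  in
  put32 off hi;
  put32 (off + 4) lo
```

Only the two Int64 operations at the top work on boxed values; the shifts
and masks on the halves are plain int operations.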

> PS: Is a.{i} <- x a C call?

Yes.

Regards,
Sylvain Le Gall



* Re: [Caml-list] How to read different ints from a Bigarray?
  2009-10-28 13:54 How to read different ints from a Bigarray? Goswin von Brederlow
  2009-10-28 14:16 ` Sylvain Le Gall
@ 2009-10-28 15:37 ` Olivier Andrieu
  2009-10-28 16:05   ` Sylvain Le Gall
  2009-10-28 15:43 ` [Caml-list] " Gerd Stolpmann
  2009-10-28 17:09 ` Xavier Leroy
  3 siblings, 1 reply; 38+ messages in thread
From: Olivier Andrieu @ 2009-10-28 15:37 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml-list

On Wed, Oct 28, 2009 at 14:54, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Hi,
>
> I'm working on binding s for linux libaio library (asynchron IO) with
> a sharp eye on efficiency. That means no copying must be done on the
> data, which in turn means I can not use string as buffer type.

hmm I think you could try with strings.
You need to allocate the storage yourself (with malloc) but then, as
long as you properly set up the header and last field of the block as
OCaml does for its native strings, the runtime will use it without
problems. The GC will see that the block is outside the Caml heap and
won't try to manage it.
And on the Caml side you can use it as a regular string.
The only caveat I know is that this disrupts the polymorphic
comparison function a bit.

That's worth a try IMO.



Sylvain Le Gall <sylvain@le-gall.net> wrote:
> > PS: Is a.{i} <- x a C call?
> Yes.

Really? Given the number of Pbigarray* constructors in the compiler
code, I'd be surprised :)
No, I think that some cases, like "accessing a 64-bit bigarray on a
32-bit arch", result in a C call, but otherwise it's handled by the
compiler.

-- 
  Olivier



* Re: [Caml-list] How to read different ints from a Bigarray?
  2009-10-28 13:54 How to read different ints from a Bigarray? Goswin von Brederlow
  2009-10-28 14:16 ` Sylvain Le Gall
  2009-10-28 15:37 ` [Caml-list] " Olivier Andrieu
@ 2009-10-28 15:43 ` Gerd Stolpmann
  2009-10-28 16:06   ` Sylvain Le Gall
  2009-10-28 18:09   ` [Caml-list] " Goswin von Brederlow
  2009-10-28 17:09 ` Xavier Leroy
  3 siblings, 2 replies; 38+ messages in thread
From: Gerd Stolpmann @ 2009-10-28 15:43 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml-list


On Wednesday, 28.10.2009, at 14:54 +0100, Goswin von Brederlow wrote:
> Hi,
> 
> I'm working on binding s for linux libaio library (asynchron IO) with
> a sharp eye on efficiency. That means no copying must be done on the
> data, which in turn means I can not use string as buffer type.
> 
> The best type for this seems to be a (int, int8_unsigned_elt,
> c_layout) Bigarray.Array1.t. So far so good.
> 
> Now I define helper functions:
> 
> let get_uint8 buf off = buf.{off}
> let set_uint8 buf off x = buf.{off} <- x
> 
> But I want more:
> 
> get/set_int8 - do I use Obj.magic to "convert" to int8_signed_elt?
> 
> And endian correcting access for larger ints:
> 
> get/set_big_uint16
> get/set_big_int16
> get/set_little_uint16
> get/set_little_int16
> get/set_big_uint24
> ...
> get/set_little_int56
> get/set_big_int64
> get/set_little_int64
> 
> What is the best way there? For uintXX I can get_uint8 each byte and
> shift and add them together. But that feels inefficient as each access
> will range check and the shifting generates a lot of code while cpus
> can usualy endian correct an int more elegantly.
> 
> Is it worth the overhead of calling a C function to write optimized
> stubs for this?
> 
> And last:
> 
> get/set_string, blit_from/to_string
> 
> Do I create a string where needed and then loop over every char
> calling s.(i) <- char_of_int buf.{off+i}? Or better a C function using
> memcpy?
> 
> What do you think?

A C call is too expensive for a single int (and ocamlopt). The runtime
needs to fix the stack and make it look C-compatible before it can do
the call. Maybe it's ok for an int64.

Can you ensure that you only access the ints at word boundaries? If so,
it would be an option to wrap the same malloc'ed block of memory with
several bigarrays, e.g. you use an (int, int8_unsigned_elt, c_layout)
Bigarray.Array1.t when you access on byte level, but an (int32,
int32_elt, c_layout) Bigarray.Array1.t when you access on int32
level, with both bigarrays pointing to the same block and sharing data.
This is trivial to do from C: just create several wrappers for the same
memory.

The nice thing about bigarrays is that the compiler can emit assembly
instructions for accessing them. That is much faster than picking bytes and
reconstructing the ints on the Caml side. However, if you cannot ensure
aligned ints, the latter is probably unavoidable.

Btw, I would be interested in your aio bindings if you release them as an
open source project.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------



* Re: How to read different ints from a Bigarray?
  2009-10-28 15:37 ` [Caml-list] " Olivier Andrieu
@ 2009-10-28 16:05   ` Sylvain Le Gall
  0 siblings, 0 replies; 38+ messages in thread
From: Sylvain Le Gall @ 2009-10-28 16:05 UTC (permalink / raw)
  To: caml-list

Hello,

On 28-10-2009, Olivier Andrieu <oandrieu@nerim.net> wrote:
> On Wed, Oct 28, 2009 at 14:54, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>> Yes.
>
> really ? Given the number of Pbigarray* constructors in the compiler
> code, I'd be surprised :)
> No I think that for some cases like "accessing a 64bits bigarray on a
> 32bits arch" result in a C call, but otherwise it's handled by the
> compiler.
>

Indeed, I just tested it and you are right. I must have experienced this
behavior with int64 or something like that.

Regards,
Sylvain Le Gall



* Re: How to read different ints from a Bigarray?
  2009-10-28 15:43 ` [Caml-list] " Gerd Stolpmann
@ 2009-10-28 16:06   ` Sylvain Le Gall
  2009-10-28 18:09   ` [Caml-list] " Goswin von Brederlow
  1 sibling, 0 replies; 38+ messages in thread
From: Sylvain Le Gall @ 2009-10-28 16:06 UTC (permalink / raw)
  To: caml-list

On 28-10-2009, Gerd Stolpmann <gerd@gerd-stolpmann.de> wrote:
> On Wednesday, 28.10.2009, at 14:54 +0100, Goswin von Brederlow wrote:
>
> Btw, I would be interested in your aio bindings if you do them as open
> source project.
>

Of course:
http://forge.ocamlcore.org/projects/libaio-ocaml/

Regards,
Sylvain Le Gall



* Re: [Caml-list] How to read different ints from a Bigarray?
  2009-10-28 13:54 How to read different ints from a Bigarray? Goswin von Brederlow
                   ` (2 preceding siblings ...)
  2009-10-28 15:43 ` [Caml-list] " Gerd Stolpmann
@ 2009-10-28 17:09 ` Xavier Leroy
  2009-10-28 19:05   ` Goswin von Brederlow
  2009-10-29 17:05   ` Goswin von Brederlow
  3 siblings, 2 replies; 38+ messages in thread
From: Xavier Leroy @ 2009-10-28 17:09 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml-list

Goswin von Brederlow wrote:

> I'm working on binding s for linux libaio library (asynchron IO) with
> a sharp eye on efficiency. That means no copying must be done on the
> data, which in turn means I can not use string as buffer type.
> 
> The best type for this seems to be a (int, int8_unsigned_elt,
> c_layout) Bigarray.Array1.t. So far so good.

That's a reasonable choice.

> Now I define helper functions:
> 
> let get_uint8 buf off = buf.{off}
> let set_uint8 buf off x = buf.{off} <- x
> 
> But I want more:
> 
> get/set_int8 - do I use Obj.magic to "convert" to int8_signed_elt?

Not at all.  If you ask OCaml's typechecker to infer the type of
get_uint8, you'll see that it returns a plain OCaml "int" (in the
0...255 range). Likewise, the "x" parameter to "set_uint8" has type
"int" (of which only the 8 low bits are used).

Repeat after me: "Obj.magic is not part of the OCaml language".
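
To make the point concrete, here is a minimal sketch of signed 8-bit access
on top of the unsigned buffer, with no Obj.magic involved. The alias buf and
the bodies of get_int8 and set_int8 are assumptions for illustration; only
get_uint8 appears in the original question.

```ocaml
(* Hypothetical alias for the byte buffer type discussed in the thread. *)
type buf =
  (int, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Returns a plain int in the 0..255 range. *)
let get_uint8 (buf : buf) off = buf.{off}

(* Sign-extend the stored byte, giving an int in the -128..127 range. *)
let get_int8 (buf : buf) off =
  let x = buf.{off} in
  if x >= 0x80 then x - 0x100 else x

(* Only the low 8 bits of x are stored (two's complement for negatives). *)
let set_int8 (buf : buf) off x = buf.{off} <- x land 0xFF
```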

> And endian correcting access for larger ints:
> 
> get/set_big_uint16
> get/set_big_int16
> get/set_little_uint16
> get/set_little_int16
> get/set_big_uint24
> ...
> get/set_little_int56
> get/set_big_int64
> get/set_little_int64

The "56" functions look like a bit of overkill to me :-)

> What is the best way there? For uintXX I can get_uint8 each byte and
> shift and add them together. But that feels inefficient as each access
> will range check

Not necessarily.  OCaml 3.11 introduced unchecked accesses to
bigarrays, so you can range-check yourself once, then perform
unchecked accesses.  Use with caution...
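
A sketch of that pattern, checking the range once and then using
Bigarray.Array1.unsafe_get for the individual bytes: the name get_big_uint32
and the alias buf are assumptions for this example, and the result only fits
in a native int on a 64-bit platform.

```ocaml
(* Hypothetical alias for the byte buffer type discussed in the thread. *)
type buf =
  (int, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

(* Read an unsigned 32-bit big-endian value with a single range check. *)
let get_big_uint32 (buf : buf) off =
  if off < 0 || off > Bigarray.Array1.dim buf - 4 then
    invalid_arg "get_big_uint32";
  (* All four offsets are known to be in range from here on. *)
  let get = Bigarray.Array1.unsafe_get buf in
  (get off lsl 24)
  lor (get (off + 1) lsl 16)
  lor (get (off + 2) lsl 8)
  lor get (off + 3)
```

If the manual check is wrong, the unchecked reads can touch memory outside
the buffer, which is why the caution above applies.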

> and the shifting generates a lot of code while cpus
> can usualy endian correct an int more elegantly.
> 
> Is it worth the overhead of calling a C function to write optimized
> stubs for this?

The only way to know is to benchmark both approaches :-(  My guess is
that for 16-bit accesses, you're better off with a pure Caml solution,
but for 64-bit accesses, a C function could be faster.

- Xavier Leroy



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-28 15:17     ` Sylvain Le Gall
@ 2009-10-28 17:57       ` Goswin von Brederlow
  2009-10-28 18:19         ` Sylvain Le Gall
  2009-10-28 22:48         ` [Caml-list] " blue storm
  0 siblings, 2 replies; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-28 17:57 UTC (permalink / raw)
  To: Sylvain Le Gall; +Cc: caml-list

Sylvain Le Gall <sylvain@le-gall.net> writes:

> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>> Sylvain Le Gall <sylvain@le-gall.net> writes:
>>
>>> Hello,
>>>
>>> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>>> Hi,
>>>>
>>>
>>> Well, we talk about this a little bit, but here is my opinion:
>>> - calling a C function to add a single int will generate a big overhead
>>> - OCaml string are quite fast to modify values
>>>
>>> So to my mind the best option is to have a buffer string (say 16/32
>>> char) where you put data inside and flush it in a single C call to
>>> Bigarray. 
>>>
>>> E.g.:
>>> let append_char t c =
>>>   if t.idx >= 64 then
>>>     (
>>>       flush t.bigarray t.buffer;
>>>       t.idx <- 0
>>>     );
>>>   t.buffer.(t.idx) <- c;
>>>   t.idx <- t.idx + 1
>>>
>>> let append_little_uint16 t i =
>>>   append_char t ((i lsr 8) land 0xFF);
>>>   append_char t ((i lsr 0) land 0xFF)
>>>   
>>>
>>> I have used this kind of technique and it seems as fast as C, and a lot
>>> less C coding.
>>>
>>> Regards,
>>> Sylvain Le Gall
>>
>> This wont work so nicely:
>>
>> - Writes are not always in sequence. I want to do a stream access
>>   too where this could be verry effective. But the plain buffer is
>>   more for random / known offset access. At a minimum you would have
>>   holes for alignment.
>>
>> - It makes read/write buffers complicated as you need to flush or peek
>>   the string in case of uncommited changes. I can't do write-only
>>   buffers as I want to be able to write a buffer and then add a
>>   checksum to it in my application. The lib should not block that.
>>
>
> I was thinking to pure stream. It still stand with random access but you
> don't get a lot less C function call. You just have to write less C
> code.

set_uint8 buf 5 1 -> read in 64 bytes from stream, skip to 5, set byte
set_uint8 buf 100 1 -> write 64 bytes, read other 64 bytes, set byte

That can become really expensive.

>> I also still wonder how bad a C function call really is. Consider the
>> case of writing an int64.
>>
>> Directly: You get one C call that does range check, endian convert and
>> write in one go.
>>
>> Bffered: With your code you have 7 Int64 shifts, 8 Int64 lands, 8
>> conversions to int, at least one index check (more likely 8 to avoid
>> handling unaligned access) and 1/8 C call to blit the 64 byte buffer
>> string into the Bigarray.
>
> Not at all, you begin to break your int64 into 3 int (24bit * 2 + 16bit)
> and then 7 int shift, 8 int land. 
>
> You can even manage to only break into 1 or 2 int.
>
> And off course, you bypass index check. 

Fun with unaligned writes.

>> PS: Is a.{i} <- x a C call?
>
> Yes.

That obviously sucks. I was hoping that since the compiler has a special
syntax for it, it would be built in. Bigarray being a separate module
should have clued me in.

That obviously speaks against splitting int64 into 8 bytes and calling
a.{i} <- x for each.

I think I will implement your method and C stubs for every set/get and
compare.

Maybe the ideal would be a format-string-based interface that calls C with
a format string and a record of values, because what I really need is
to read/write records in an architecture-independent way. Something
like

type t = { x:int; y:char; z:int64 }
let t_format = "%2u%c%8d"

put_formated buf t_format t

But how to get that type safe? Maybe a camlp4 module that generates
the format string and type from a single declaration so they always
match.

> Regards,
> Sylvain Le Gall

Regards,
        Goswin



* Re: [Caml-list] How to read different ints from a Bigarray?
  2009-10-28 15:43 ` [Caml-list] " Gerd Stolpmann
  2009-10-28 16:06   ` Sylvain Le Gall
@ 2009-10-28 18:09   ` Goswin von Brederlow
  1 sibling, 0 replies; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-28 18:09 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Goswin von Brederlow, caml-list

Gerd Stolpmann <gerd@gerd-stolpmann.de> writes:

> On Wednesday, 28.10.2009, at 14:54 +0100, Goswin von Brederlow wrote:
>> Hi,
>> 
>> I'm working on binding s for linux libaio library (asynchron IO) with
>> a sharp eye on efficiency. That means no copying must be done on the
>> data, which in turn means I can not use string as buffer type.
>> 
>> The best type for this seems to be a (int, int8_unsigned_elt,
>> c_layout) Bigarray.Array1.t. So far so good.
>> 
>> Now I define helper functions:
>> 
>> let get_uint8 buf off = buf.{off}
>> let set_uint8 buf off x = buf.{off} <- x
>> 
>> But I want more:
>> 
>> get/set_int8 - do I use Obj.magic to "convert" to int8_signed_elt?
>> 
>> And endian correcting access for larger ints:
>> 
>> get/set_big_uint16
>> get/set_big_int16
>> get/set_little_uint16
>> get/set_little_int16
>> get/set_big_uint24
>> ...
>> get/set_little_int56
>> get/set_big_int64
>> get/set_little_int64
>> 
>> What is the best way there? For uintXX I can get_uint8 each byte and
>> shift and add them together. But that feels inefficient as each access
>> will range check and the shifting generates a lot of code while cpus
>> can usualy endian correct an int more elegantly.
>> 
>> Is it worth the overhead of calling a C function to write optimized
>> stubs for this?
>> 
>> And last:
>> 
>> get/set_string, blit_from/to_string
>> 
>> Do I create a string where needed and then loop over every char
>> calling s.(i) <- char_of_int buf.{off+i}? Or better a C function using
>> memcpy?
>> 
>> What do you think?
>
> A C call is too expensive for a single int (and ocamlopt). The runtime
> needs to fix the stack and make it look C-compatible before it can do
> the call. Maybe it's ok for an int64.
>
> Can you ensure that you only access the int's at word boundaries? If so,
> it would be an option to wrap the same malloc'ed block of memory with
> several bigarrays, e.g. you use an (int, int8_unsigned_elt, c_layout)
> Bigarray.Array1.t when you access on byte level, but an (int32,
> int32_unsigned_elt, c_layout) Bigarray.Array1.t when you access on int32
> level, but both bigarrays would point to the same block and share data.
> This is trivial to do from C, just create several wrappers for the same
> memory.

I actually need 512-byte-aligned (better: page-aligned) data, so that is
definitely a possibility if only aligned access is required.

> The nice thing about bigarrays is that the compiler can emit assembly
> instructions for accessing them. Much faster than picking bytes and
> reconstructing the int's on the caml side. However, if you cannot ensure
> aligned int's the latter is probably unavoidable.

So a.{i} <- x is not a C call. That is good to know.

That leaves only the problem of endian conversion. I guess I could
live with reading the int and shifting the bytes around for the rare
cases where the endianness of CPU and data differ. I might not even bother
providing that since I don't need it at all.

> Btw, I would be interested in your aio bindings if you do them as open
> source project.

See the other mail. There is also a libfuse-ocaml that uses libaio-ocaml
(although that source is already in git instead of svn) if you want to
see some more extensive use than the test.ml.

> Gerd

Regards,
        Goswin



* Re: How to read different ints from a Bigarray?
  2009-10-28 17:57       ` [Caml-list] " Goswin von Brederlow
@ 2009-10-28 18:19         ` Sylvain Le Gall
  2009-10-28 21:05           ` [Caml-list] " Goswin von Brederlow
  2009-10-28 22:48         ` [Caml-list] " blue storm
  1 sibling, 1 reply; 38+ messages in thread
From: Sylvain Le Gall @ 2009-10-28 18:19 UTC (permalink / raw)
  To: caml-list

On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Sylvain Le Gall <sylvain@le-gall.net> writes:
>> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>> Sylvain Le Gall <sylvain@le-gall.net> writes:
>>>> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>
>>> PS: Is a.{i} <- x a C call?
>>
>> Yes.
>
> That obviously sucks. I was hoping since the compiler has a special
> syntax for it it would be built-in. Bigarray being a seperate module
> should have clued me in.
>
> That obviously speaks against splitting int64 into 8 bytes and calling
> a.{i} <- x for each.
>
> I think I will implement your method and C stubs for every set/get and
> compare.

This is in fact only the case with int64 arrays (I really have tested it,
and you don't need a C call in most cases).

Moreover, as Xavier suggests, Array1.unsafe_get/set seems nice.

I would however try to find a way to avoid writing too many
set/get_{uint*} functions. This can be a nightmare to maintain.

Regards,
Sylvain Le Gall



* Re: [Caml-list] How to read different ints from a Bigarray?
  2009-10-28 17:09 ` Xavier Leroy
@ 2009-10-28 19:05   ` Goswin von Brederlow
  2009-10-29 17:05   ` Goswin von Brederlow
  1 sibling, 0 replies; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-28 19:05 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: Goswin von Brederlow, caml-list

Xavier Leroy <Xavier.Leroy@inria.fr> writes:

> Goswin von Brederlow wrote:
>
>> I'm working on binding s for linux libaio library (asynchron IO) with
>> a sharp eye on efficiency. That means no copying must be done on the
>> data, which in turn means I can not use string as buffer type.
>> 
>> The best type for this seems to be a (int, int8_unsigned_elt,
>> c_layout) Bigarray.Array1.t. So far so good.
>
> That's a reasonable choice.

Actually, signed seems better. It is easier to get an int and mask out the
lower 8 bits to get unsigned than to sign-extend. Right?

>> Now I define helper functions:
>> 
>> let get_uint8 buf off = buf.{off}
>> let set_uint8 buf off x = buf.{off} <- x
>> 
>> But I want more:
>> 
>> get/set_int8 - do I use Obj.magic to "convert" to int8_signed_elt?
>
> Not at all.  If you ask OCaml's typechecker to infer the type of
> get_uint8, you'll see that it returns a plain OCaml "int" (in the
> 0...255 range). Likewise, the "x" parameter to "set_uint8" has type
> "int" (of which only the 8 low bits are used).

The point was to make get_int8 to return an int in the -128..127
range and get_uint8 in the 0..255 range. That both are int doesn't
matter.

> Repeat after me: "Obj.magic is not part of the OCaml language".

Somebody else suggested to create an (int, int8_unsigned_elt,
c_layout) Bigarray.Array1.t and (int, int8_signed_elt,
c_layout) Bigarray.Array1.t and (int, int16_unsigned_elt,
c_layout) Bigarray.Array1.t and (int, int16_signed_elt,
c_layout) Bigarray.Array1.t and ... that all point to the same block
of bits. As evil as Obj.magic, I guess, but it might work nicely.

>> And endian correcting access for larger ints:
>> 
>> get/set_big_uint16
>> get/set_big_int16
>> get/set_little_uint16
>> get/set_little_int16
>> get/set_big_uint24
>> ...
>> get/set_little_int56
>> get/set_big_int64
>> get/set_little_int64
>
> The "56" functions look like a bit of overkill to me :-)

For one part I am storing keys in there consisting of

struct Key {
  uint64_t type:8; // enum { TYPE1, TYPE2, TYPE3, ... };
  uint64_t inode:56;
  uint64_t data;
};

That gives a nice 16 bytes for a key but requires splitting the first
uint64_t into 8 and 56 bits.  I could provide only get_int64 and split
that in OCaml, but what the hell. One function more or less doesn't
kill me.
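
Splitting that first word on the OCaml side could look roughly like the
sketch below. The function names are invented for illustration, and it
assumes that "type" occupies the low 8 bits and "inode" the upper 56;
C leaves bit-field layout to the implementation, so that has to be
checked against the compiler actually used.

```ocaml
(* Split and rebuild the first 64-bit word of the key.
   ASSUMPTION: "type" sits in the low 8 bits and "inode" in the upper
   56 bits.  All names here are hypothetical. *)

let key_type (w : int64) : int =
  Int64.to_int (Int64.logand w 0xFFL)

let key_inode (w : int64) : int64 =
  Int64.shift_right_logical w 8

let make_key_word ~(typ : int) ~(inode : int64) : int64 =
  Int64.logor
    (Int64.shift_left inode 8)
    (Int64.of_int (typ land 0xff))
```

The inode stays an int64 because 56 bits do not fit in a native int on
a 32-bit system.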

>> What is the best way there? For uintXX I can get_uint8 each byte and
>> shift and add them together. But that feels inefficient as each access
>> will range check
>
> Not necessarily.  OCaml 3.11 introduced unchecked accesses to
> bigarrays, so you can range-check yourself once, then perform
> unchecked accesses.  Use with caution...

I'm always very cautious with such things. In the existing code I
already needed some unsafe_string that I really didn't like. I need to
add phantom types to get rid of them some day.
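
A minimal sketch of "range-check once, then access unchecked", assuming
a byte buffer of the type discussed above. The "buffer" alias and the
"check" helper are names made up for this example; only
Array1.unsafe_get, Array1.unsafe_set and Array1.dim come from the
Bigarray module.

```ocaml
open Bigarray

type buffer = (int, int8_unsigned_elt, c_layout) Array1.t

(* One explicit bounds check covering the whole field. *)
let check (buf : buffer) off len =
  if off < 0 || len < 0 || off > Array1.dim buf - len then
    invalid_arg "buffer access out of bounds"

let get_big_uint16 (buf : buffer) off =
  check buf off 2;
  ((Array1.unsafe_get buf off) lsl 8)
  lor (Array1.unsafe_get buf (off + 1))

let get_little_uint16 (buf : buffer) off =
  check buf off 2;
  (Array1.unsafe_get buf off)
  lor ((Array1.unsafe_get buf (off + 1)) lsl 8)

let set_big_uint16 (buf : buffer) off x =
  check buf off 2;
  Array1.unsafe_set buf off ((x lsr 8) land 0xff);
  Array1.unsafe_set buf (off + 1) (x land 0xff)
```

The unsafe calls stay private to the accessors, so callers never see an
unchecked operation.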

>> and the shifting generates a lot of code while cpus
>> can usualy endian correct an int more elegantly.
>> 
>> Is it worth the overhead of calling a C function to write optimized
>> stubs for this?
>
> The only way to know is to benchmark both approaches :-(  My guess is
> that for 16-bit accesses, you're better off with a pure Caml solution,
> but for 64-bit accesses, a C function could be faster.
>
> - Xavier Leroy

Writing benchmark code, writing, writing. Now where is that big endian
cpu to test converting from little endian? :)))

MfG
        Goswin


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-28 18:19         ` Sylvain Le Gall
@ 2009-10-28 21:05           ` Goswin von Brederlow
  2009-10-28 21:26             ` Sylvain Le Gall
  0 siblings, 1 reply; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-28 21:05 UTC (permalink / raw)
  To: Sylvain Le Gall; +Cc: caml-list

Sylvain Le Gall <sylvain@le-gall.net> writes:

> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>> Sylvain Le Gall <sylvain@le-gall.net> writes:
>>> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>>> Sylvain Le Gall <sylvain@le-gall.net> writes:
>>>>> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>
>>>> PS: Is a.{i} <- x a C call?
>>>
>>> Yes.
>>
>> That obviously sucks. I was hoping since the compiler has a special
>> syntax for it it would be built-in. Bigarray being a seperate module
>> should have clued me in.
>>
>> That obviously speaks against splitting int64 into 8 bytes and calling
>> a.{i} <- x for each.
>>
>> I think I will implement your method and C stubs for every set/get and
>> compare.
>
> This is only the case with int64 array in fact (I really have done test
> and you don't need a C call in most case).

Can I assume you tested on a 32bit cpu?

MfG
        Goswin


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: How to read different ints from a Bigarray?
  2009-10-28 21:05           ` [Caml-list] " Goswin von Brederlow
@ 2009-10-28 21:26             ` Sylvain Le Gall
  0 siblings, 0 replies; 38+ messages in thread
From: Sylvain Le Gall @ 2009-10-28 21:26 UTC (permalink / raw)
  To: caml-list

On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Sylvain Le Gall <sylvain@le-gall.net> writes:
>> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>> Sylvain Le Gall <sylvain@le-gall.net> writes:
>>>> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>>>> Sylvain Le Gall <sylvain@le-gall.net> writes:
>>>>>> On 28-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>>
>>> a.{i} <- x for each.
>>>
>>> I think I will implement your method and C stubs for every set/get and
>>> compare.
>>
>> This is only the case with int64 array in fact (I really have done test
>> and you don't need a C call in most case).
>
> Can I assume you tested on a 32bit cpu?
>

Yes. There are probably even fewer such cases on a 64-bit CPU.

Regards,
Sylvain Le Gall


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-28 17:57       ` [Caml-list] " Goswin von Brederlow
  2009-10-28 18:19         ` Sylvain Le Gall
@ 2009-10-28 22:48         ` blue storm
  2009-10-29  9:50           ` Goswin von Brederlow
  1 sibling, 1 reply; 38+ messages in thread
From: blue storm @ 2009-10-28 22:48 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: Sylvain Le Gall, caml-list

On Wed, Oct 28, 2009 at 6:57 PM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Maybe ideal would be a format string based interface that calls C with
> a format string and a record of values. Because what I really need is
> to read/write records in an architecture independend way. Something
> like
>
> type t = { x:int; y:char; z:int64 }
> let t_format = "%2u%c%8d"
>
> put_formated buf t_format t
>
> But how to get that type safe? Maybe a camlp4 module that generates
> the format string and type from a single declaration so they always
> match.

It's possibly off-topic, but you might be interested in Richard
Jones's Bitstring project [1] which deals with similar issues quite
nicely in my opinion.

[1] http://code.google.com/p/bitstring/


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-28 22:48         ` [Caml-list] " blue storm
@ 2009-10-29  9:50           ` Goswin von Brederlow
  2009-10-29 10:34             ` Goswin von Brederlow
  2009-10-29 12:20             ` Richard Jones
  0 siblings, 2 replies; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-29  9:50 UTC (permalink / raw)
  To: blue storm; +Cc: Goswin von Brederlow, Sylvain Le Gall, caml-list

blue storm <bluestorm.dylc@gmail.com> writes:

> On Wed, Oct 28, 2009 at 6:57 PM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>> Maybe ideal would be a format string based interface that calls C with
>> a format string and a record of values. Because what I really need is
>> to read/write records in an architecture independend way. Something
>> like
>>
>> type t = { x:int; y:char; z:int64 }
>> let t_format = "%2u%c%8d"
>>
>> put_formated buf t_format t
>>
>> But how to get that type safe? Maybe a camlp4 module that generates
>> the format string and type from a single declaration so they always
>> match.
>
> It's possibly off-topic, but you might be interested in Richard
> Jones's Bitstring project [1] wich deals with similar issues quite
> nicely in my opinion.
>
> [1] http://code.google.com/p/bitstring/

No, quite on-topic.

I glanced at the examples and code, and it looks to me as if this can
only parse bitstrings but not create them from a pattern.
You have

let parse_foo bits =
  bitmatch bits with
  | { x : 16 : littleendian; y : 16 : littleendian } -> fun x y -> (x, y)

but no

let unparse_foo (x, y) =
  bitmake { x : 16 : littleendian; y : 16 : littleendian } x y


Ideal would be something along these lines:

let pattern = make_pattern { x : 16 : littleendian; y : 16 : littleendian }
let parse_foo bits = parse pattern (fun x y -> (x, y))
let unparse_foo (x, y) = unparse pattern x y

But I know how to do that with CPS already. I just need the primitives
to get/set the basic types.
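
One possible shape for such a CPS pattern, as a rough sketch only: the
record type "pat", the field "u16le" and the operator "++" are all
invented for this illustration and are not bitstring's API. Because of
the value restriction the combined pattern is wrapped in a function, so
that parsing and unparsing can each pick their own result type.

```ocaml
open Bigarray

type buffer = (int, int8_unsigned_elt, c_layout) Array1.t

(* ('f, 'r) pat describes a run of fields; 'f is the curried type
   "field1 -> field2 -> ... -> 'r".  All names are hypothetical. *)
type ('f, 'r) pat = {
  size : int;                                    (* bytes covered *)
  parse : buffer -> int -> 'f -> 'r;             (* decode, feed fields to 'f *)
  unparse : buffer -> int -> (unit -> 'r) -> 'f; (* take fields, encode, continue *)
}

(* One unsigned 16-bit little-endian field. *)
let u16le : (int -> 'r, 'r) pat = {
  size = 2;
  parse = (fun buf off f -> f (buf.{off} lor (buf.{off + 1} lsl 8)));
  unparse = (fun buf off k x ->
    buf.{off} <- x land 0xff;
    buf.{off + 1} <- (x lsr 8) land 0xff;
    k ());
}

(* Sequence two patterns; the second starts where the first ends. *)
let ( ++ ) p q = {
  size = p.size + q.size;
  parse = (fun buf off f ->
    q.parse buf (off + p.size) (p.parse buf off f));
  unparse = (fun buf off k ->
    p.unparse buf off (fun () -> q.unparse buf (off + p.size) k));
}

(* A function rather than a value, so it stays polymorphic in 'r. *)
let pattern () = u16le ++ u16le

let parse_foo buf off =
  (pattern ()).parse buf off (fun x y -> (x, y))

let unparse_foo buf off (x, y) =
  (pattern ()).unparse buf off (fun () -> ()) x y
```

The pattern is written once and drives both directions; only the
primitive fields such as u16le touch the buffer.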

MfG
        Goswin


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29  9:50           ` Goswin von Brederlow
@ 2009-10-29 10:34             ` Goswin von Brederlow
  2009-10-29 12:20             ` Richard Jones
  1 sibling, 0 replies; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-29 10:34 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: blue storm, Sylvain Le Gall, caml-list

Goswin von Brederlow <goswin-v-b@web.de> writes:

> blue storm <bluestorm.dylc@gmail.com> writes:
>
>> On Wed, Oct 28, 2009 at 6:57 PM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>>> Maybe ideal would be a format string based interface that calls C with
>>> a format string and a record of values. Because what I really need is
>>> to read/write records in an architecture independend way. Something
>>> like
>>>
>>> type t = { x:int; y:char; z:int64 }
>>> let t_format = "%2u%c%8d"
>>>
>>> put_formated buf t_format t
>>>
>>> But how to get that type safe? Maybe a camlp4 module that generates
>>> the format string and type from a single declaration so they always
>>> match.
>>
>> It's possibly off-topic, but you might be interested in Richard
>> Jones's Bitstring project [1] wich deals with similar issues quite
>> nicely in my opinion.
>>
>> [1] http://code.google.com/p/bitstring/
>
> No, quite on-topic.
>
> I glanced at the examples and code and it looks to me though as if
> this can only parse bitstrings but not create them from a pattern.
> You have
>
> let parse_foo bits =
>   bitmatch bits with
>   | { x : 16 : littleendian; y : 16 : littleendian } -> fun x y -> (x, y)
>
> but no
>
> let unparse_foo (x, y) =
>   bitmake { x : 16 : littleendian; y : 16 : littleendian } x y
>
>
> Idealy would be something along
>
> let pattern = make_pattern { x : 16 : littleendian; y : 16 : littleendian }
> let parse_foo bits = parse pattern (fun x y -> (x, y))
> let unparse_foo (x, y) = unparse pattern x y
>
> But I know how to do that with CPS already. I just need the primitives
> to get/set the basic types.
>
> MfG
>         Goswin

And I was wrong. There is

http://code.google.com/p/bitstring/source/browse/trunk/examples/make_ipv4_header.ml

as an example. Not ideal, since parsing and unparsing will duplicate
the pattern definition, but the duplication will be local to each type.

MfG
        Goswin


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29  9:50           ` Goswin von Brederlow
  2009-10-29 10:34             ` Goswin von Brederlow
@ 2009-10-29 12:20             ` Richard Jones
  2009-10-29 17:07               ` Goswin von Brederlow
  1 sibling, 1 reply; 38+ messages in thread
From: Richard Jones @ 2009-10-29 12:20 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: blue storm, Sylvain Le Gall, caml-list

On Thu, Oct 29, 2009 at 10:50:31AM +0100, Goswin von Brederlow wrote:
> but no
> 
> let unparse_foo (x, y) =
>   bitmake { x : 16 : littleendian; y : 16 : littleendian } x y

See:

http://et.redhat.com/~rjones/bitstring/html/Bitstring.html#2_Constructingbitstrings

I don't necessarily think bitstring is suitable here though because
you still need to read your data into a string (or fake a string on
the C heap as Olivier Andrieu mentioned).  I think in this case you'd
be better off just writing this part of the code in C.

Rich.

-- 
Richard Jones
Red Hat


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] How to read different ints from a Bigarray?
  2009-10-28 17:09 ` Xavier Leroy
  2009-10-28 19:05   ` Goswin von Brederlow
@ 2009-10-29 17:05   ` Goswin von Brederlow
  2009-10-29 18:42     ` Christophe TROESTLER
  2009-10-29 18:48     ` Sylvain Le Gall
  1 sibling, 2 replies; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-29 17:05 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: Goswin von Brederlow, caml-list

Xavier Leroy <Xavier.Leroy@inria.fr> writes:

> Goswin von Brederlow wrote:
>
>> I'm working on binding s for linux libaio library (asynchron IO) with
>> a sharp eye on efficiency. That means no copying must be done on the
>> data, which in turn means I can not use string as buffer type.
>> 
>> The best type for this seems to be a (int, int8_unsigned_elt,
>> c_layout) Bigarray.Array1.t. So far so good.
>
> That's a reasonable choice.
>
>> Now I define helper functions:
>> 
>> let get_uint8 buf off = buf.{off}
>> let set_uint8 buf off x = buf.{off} <- x
>> 
>> But I want more:
>> 
>> get/set_int8 - do I use Obj.magic to "convert" to int8_signed_elt?
>
> Not at all.  If you ask OCaml's typechecker to infer the type of
> get_uint8, you'll see that it returns a plain OCaml "int" (in the
> 0...255 range). Likewise, the "x" parameter to "set_uint8" has type
> "int" (of which only the 8 low bits are used).
>
> Repeat after me: "Obj.magic is not part of the OCaml language".
>
>> And endian correcting access for larger ints:
>> 
>> get/set_big_uint16
>> get/set_big_int16
>> get/set_little_uint16
>> get/set_little_int16
>> get/set_big_uint24
>> ...
>> get/set_little_int56
>> get/set_big_int64
>> get/set_little_int64
>
> The "56" functions look like a bit of overkill to me :-)
>
>> What is the best way there? For uintXX I can get_uint8 each byte and
>> shift and add them together. But that feels inefficient as each access
>> will range check
>
> Not necessarily.  OCaml 3.11 introduced unchecked accesses to
> bigarrays, so you can range-check yourself once, then perform
> unchecked accesses.  Use with caution...
>
>> and the shifting generates a lot of code while cpus
>> can usualy endian correct an int more elegantly.
>> 
>> Is it worth the overhead of calling a C function to write optimized
>> stubs for this?
>
> The only way to know is to benchmark both approaches :-(  My guess is
> that for 16-bit accesses, you're better off with a pure Caml solution,
> but for 64-bit accesses, a C function could be faster.
>
> - Xavier Leroy

Here are some benchmark results:

get an int out of a string:
                C               Ocaml
  uint8  le     19.496          17.433
   int8  le     19.298          17.850
  uint16 le     19.427          25.046
   int16 le     19.383          27.664
  uint16 be     20.502          23.200
   int16 be     20.350          27.535

get an int out of a Bigarray.Array1.t:
                safe            unsafe
  uint8  le     55.194s         54.508s
  uint64 le     80.51s          81.46s

Now to be fair the C code is unsafe as it does no boundary check. I
intend to get/set larger structures so I only need to check if all of
the structure fits. So most of the time I want unsafe calls and String
does not have any. And storing an int64, int32 does not need to check
for overflow for every single byte written in char_of_int.

The Bigarray unsafe_get is really disappointing. Note that uint64 is so
much slower because of allocating the result (my guess). Array1.set
runs at the same speed for uint8 and uint64.

Overall it looks like C calls just aren't that expensive and endian
and sign conversions in ocaml plain suck.

I cannot use an OCaml string as my buffer must be aligned and
unmovable (required by the Linux kernel). A string manually created
outside the GC heap will never be freed by the GC, so that is out of
the question too. And Bigarray is just too slow.

So a well-defined custom type with access functions from both C and
OCaml seems to be the way to go. As needed one can then also write a
stub for get/set of e.g. struct { uint64_t kind : 8; uint64_t inode;
uint64_t data; } <-> type key = Meta of int64 * int64 | Inode of
inode_t | Block of inode_t * block_t.

So much for the idea to get rid of the custom buffer type in libaio.

MfG
        Goswin


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 12:20             ` Richard Jones
@ 2009-10-29 17:07               ` Goswin von Brederlow
  2009-10-30 20:30                 ` Richard Jones
  0 siblings, 1 reply; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-29 17:07 UTC (permalink / raw)
  To: caml-list

Richard Jones <rich@annexia.org> writes:

> On Thu, Oct 29, 2009 at 10:50:31AM +0100, Goswin von Brederlow wrote:
>> but no
>> 
>> let unparse_foo (x, y) =
>>   bitmake { x : 16 : littleendian; y : 16 : littleendian } x y
>
> See:
>
> http://et.redhat.com/~rjones/bitstring/html/Bitstring.html#2_Constructingbitstrings
>
> I don't necessarily think bitstring is suitable here though because
> you still need to read your data into a string (or fake a string on
> the C heap as Olivier Andrieu mentioned).  I think in this case you'd
> be better off just writing this part of the code in C.
>
> Rich.

I can still reuse a lot of this. Especially the syntax extension
seems like a good idea. Maybe reduced to bytes instead of bits,
though. I don't intend to use structures so fine-grained that they
need bit access.

MfG
        Goswin


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] How to read different ints from a Bigarray?
  2009-10-29 17:05   ` Goswin von Brederlow
@ 2009-10-29 18:42     ` Christophe TROESTLER
  2009-10-29 19:03       ` Goswin von Brederlow
  2009-10-29 18:48     ` Sylvain Le Gall
  1 sibling, 1 reply; 38+ messages in thread
From: Christophe TROESTLER @ 2009-10-29 18:42 UTC (permalink / raw)
  To: caml-list

On Thu, 29 Oct 2009 18:05:37 +0100, Goswin von Brederlow wrote:
> 
> get an int out of a string:
>                 C               Ocaml
>   uint8  le     19.496          17.433
>    int8  le     19.298          17.850
>   uint16 le     19.427          25.046
>    int16 le     19.383          27.664
>   uint16 be     20.502          23.200
>    int16 be     20.350          27.535
> 
> get an int out of a Bigarray.Array1.t:
> 		safe		unsafe
>   uint8  le	55.194s		54.508s
>   uint64 le     80.51s		81.46s
> 
> The Bigarray unsafe_get is really disapointing. Note that uint64 is so
> much slower because of allocating the result (my guess). Array1.set
> runs the same speed for uint8 and uint64.

This is likely because you used the polymorphic function to access
bigarrays (compile your code with -annot and press C-c C-t in Emacs
with the point on the variable).  For the compiler to be able to emit
fast code, you need to provide the monomorphic type of the bigarray:

  (a: (int, int8_unsigned_elt, c_layout) Array1.t)

(assuming you opened the Bigarray module).
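
To illustrate the difference, here is a small sketch with two otherwise
identical loops; the function names are made up. Only the second one
tells the compiler the element kind and layout at compile time.

```ocaml
open Bigarray

(* Polymorphic in kind and layout: inferred as
   (int, 'b, 'c) Array1.t -> int, so the element size is unknown here
   and each access goes through the generic routine. *)
let sum_poly a =
  let s = ref 0 in
  for i = 0 to Array1.dim a - 1 do
    s := !s + Array1.unsafe_get a i
  done;
  !s

(* Monomorphic: the annotation fixes kind and layout, which is what
   lets the native compiler emit a direct load. *)
let sum_mono (a : (int, int8_unsigned_elt, c_layout) Array1.t) =
  let s = ref 0 in
  for i = 0 to Array1.dim a - 1 do
    s := !s + Array1.unsafe_get a i
  done;
  !s
```

Both functions return the same result; any speed difference has to be
measured on the compiler version in use.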

Cheers,
C.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: How to read different ints from a Bigarray?
  2009-10-29 17:05   ` Goswin von Brederlow
  2009-10-29 18:42     ` Christophe TROESTLER
@ 2009-10-29 18:48     ` Sylvain Le Gall
  2009-10-29 23:25       ` [Caml-list] " Goswin von Brederlow
  1 sibling, 1 reply; 38+ messages in thread
From: Sylvain Le Gall @ 2009-10-29 18:48 UTC (permalink / raw)
  To: caml-list

On 29-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Xavier Leroy <Xavier.Leroy@inria.fr> writes:
>> Goswin von Brederlow wrote:
>
> Here are some benchmark results:
>
> get an int out of a string:
>                 C               Ocaml
>   uint8  le     19.496          17.433
>    int8  le     19.298          17.850
>   uint16 le     19.427          25.046
>    int16 le     19.383          27.664
>   uint16 be     20.502          23.200
>    int16 be     20.350          27.535
>
> get an int out of a Bigarray.Array1.t:
> 		safe		unsafe
>   uint8  le	55.194s		54.508s
>   uint64 le     80.51s		81.46s
>

Can you provide us with the corresponding code and benchmark? 

Maybe you can just commit this in libaio/test/bench.ml.

Regards,
Sylvain Le Gall


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] How to read different ints from a Bigarray?
  2009-10-29 18:42     ` Christophe TROESTLER
@ 2009-10-29 19:03       ` Goswin von Brederlow
  0 siblings, 0 replies; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-29 19:03 UTC (permalink / raw)
  To: Christophe TROESTLER; +Cc: caml-list

Christophe TROESTLER <Christophe.Troestler+ocaml@umons.ac.be> writes:

> On Thu, 29 Oct 2009 18:05:37 +0100, Goswin von Brederlow wrote:
>> 
>> get an int out of a string:
>>                 C               Ocaml
>>   uint8  le     19.496          17.433
>>    int8  le     19.298          17.850
>>   uint16 le     19.427          25.046
>>    int16 le     19.383          27.664
>>   uint16 be     20.502          23.200
>>    int16 be     20.350          27.535
>> 
>> get an int out of a Bigarray.Array1.t:
>> 		safe		unsafe
>>   uint8  le	55.194s		54.508s
>>   uint64 le     80.51s		81.46s
>> 
>> The Bigarray unsafe_get is really disapointing. Note that uint64 is so
>> much slower because of allocating the result (my guess). Array1.set
>> runs the same speed for uint8 and uint64.
>
> This is likely because you used the polymorphic function to access
> bigarrays (compile your code with -annot and press C-c C-t in Emacs
> with the point on the variable).  For the compiler to be able to emit
> fast code, you need to provide the monomorphic type of the bigarray:
>
>   (a: (int, int8_unsigned_elt, c_layout) Array1.t)
>
> (assuming you opened the Bigarray module).
>
> Cheers,
> C.

Wow, you are right. Down to 14.919s. String.unsafe_get, which actually
does exist despite not being documented, gets 14.843s. So basically the
same.

MfG
        Goswin


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-28 15:00   ` [Caml-list] " Goswin von Brederlow
  2009-10-28 15:17     ` Sylvain Le Gall
@ 2009-10-29 20:40     ` Florian Weimer
  2009-10-29 21:04       ` Gerd Stolpmann
  2009-10-29 23:38       ` Goswin von Brederlow
  1 sibling, 2 replies; 38+ messages in thread
From: Florian Weimer @ 2009-10-29 20:40 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml-list

* Goswin von Brederlow:

> - The data is passed to libaio and needs to be kept alive and unmoved
>   as long as libaio knows it.

It also has to be aligned to a 512-byte boundary, so you can use
O_DIRECT.  Linux does not support truly asynchronous I/O without
O_DIRECT AFAIK, which rarely makes it worth the trouble.


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 20:40     ` Florian Weimer
@ 2009-10-29 21:04       ` Gerd Stolpmann
  2009-10-29 23:43         ` Goswin von Brederlow
  2009-10-29 23:38       ` Goswin von Brederlow
  1 sibling, 1 reply; 38+ messages in thread
From: Gerd Stolpmann @ 2009-10-29 21:04 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Goswin von Brederlow, caml-list


Am Donnerstag, den 29.10.2009, 21:40 +0100 schrieb Florian Weimer:
> * Goswin von Brederlow:
> 
> > - The data is passed to libaio and needs to be kept alive and unmoved
> >   as long as libaio knows it.
> 
> It also has to be aligned to a 512-byte boundary, so you can use
> O_DIRECT.  Linux does not support truely asynchronous I/O without
> O_DIRECT AFAIK, which rarely makes it worth the trouble.

Right. There is also the question of whether aio for regular files
(i.e. files backed by the page cache) will continue to be supported at
all - it is well known that Linus Torvalds doesn't like it. It may
happen that some day aio will be restricted to block devices only.

So I wouldn't use it for production code, but it is of course still an
interesting interface.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 18:48     ` Sylvain Le Gall
@ 2009-10-29 23:25       ` Goswin von Brederlow
  0 siblings, 0 replies; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-29 23:25 UTC (permalink / raw)
  To: Sylvain Le Gall; +Cc: caml-list

Sylvain Le Gall <sylvain@le-gall.net> writes:

> On 29-10-2009, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>> Xavier Leroy <Xavier.Leroy@inria.fr> writes:
>>> Goswin von Brederlow wrote:
>>
>> Here are some benchmark results:
>>
>> get an int out of a string:
>>                 C               Ocaml
>>   uint8  le     19.496          17.433
>>    int8  le     19.298          17.850
>>   uint16 le     19.427          25.046
>>    int16 le     19.383          27.664
>>   uint16 be     20.502          23.200
>>    int16 be     20.350          27.535
>>
>> get an int out of a Bigarray.Array1.t:
>> 		safe		unsafe
>>   uint8  le	55.194s		54.508s
>>   uint64 le     80.51s		81.46s
>>
>
> Can you provide us with the corresponding code and benchmark? 
>
> Maybe you can just commit this in libaio/test/bench.ml.
>
> Regards,
> Sylvain Le Gall

As Christophe guessed, the problem was polymorphic functions. If I
specify a fixed Array1 type then the compiler uses the optimized
access functions. That actually makes unsafe Bigarray slightly faster
than unsafe string (the compiler apparently does not optimize
int_of_char/Char.unsafe_chr away), and on set that holds independent
of argument size. On get, allocating the int32/int64 result costs
time, so those are slower.

So Bigarray is the fastest, but getting different types out of a
Bigarray will be tricky. Unaligned access even more so, if not
impossible.

I have to sleep on this. Maybe in my use case I can have all
structures int64 aligned and then split the int64 up in OCaml where
structures have smaller members. It would have been too much to ask
for a Bigarray with access functions for any type. Maybe some little
wrapper with Obj.magic will do *hide*.
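
The int64-aligned variant could be sketched like this, without
Obj.magic. The names are hypothetical, and fields are numbered by
significance within the word (field 0 is the lowest 16 bits); which
byte offset that corresponds to in memory depends on the host's
endianness.

```ocaml
open Bigarray

type words = (int64, int64_elt, c_layout) Array1.t

(* Extract the n-th 16-bit field of a 64-bit word, n = 0 being the
   least significant.  Names are hypothetical. *)
let field16 (w : int64) n =
  (Int64.to_int (Int64.shift_right_logical w (16 * n))) land 0xffff

let get_field16 (buf : words) ~word ~field =
  field16 buf.{word} field

(* Read-modify-write: clear the field, then OR in the new value. *)
let set_field16 (buf : words) ~word ~field x =
  let shift = 16 * field in
  let mask = Int64.shift_left 0xffffL shift in
  let cleared = Int64.logand buf.{word} (Int64.lognot mask) in
  buf.{word} <-
    Int64.logor cleared
      (Int64.shift_left (Int64.of_int (x land 0xffff)) shift)
```

Every get still allocates the boxed int64 unless the compiler manages
to unbox it, which matches the slower int64 reads seen in the
benchmark.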

As for libaio it should be easy to make it create and use any Bigarray
type.

MfG
        Goswin


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 20:40     ` Florian Weimer
  2009-10-29 21:04       ` Gerd Stolpmann
@ 2009-10-29 23:38       ` Goswin von Brederlow
  1 sibling, 0 replies; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-29 23:38 UTC (permalink / raw)
  To: Florian Weimer; +Cc: caml-list

Florian Weimer <fw@deneb.enyo.de> writes:

> * Goswin von Brederlow:
>
>> - The data is passed to libaio and needs to be kept alive and unmoved
>>   as long as libaio knows it.
>
> It also has to be aligned to a 512-byte boundary, so you can use
> O_DIRECT.  Linux does not support truely asynchronous I/O without
> O_DIRECT AFAIK, which rarely makes it worth the trouble.

True. But the libaio bindings can provide an Aio.Buffer.make that
returns an aligned Bigarray (or string or whatever, currently a custom
type).

If you write to files on a filesystem without O_DIRECT it will block
when submitting the requests till they have completed. Not sure what
happens on block devices without O_DIRECT.

My use case is for a Fuse Filesystem and writing to disks. O_DIRECT is
quite alright there. If you can't use O_DIRECT then you are left with
going multithreaded.

MfG
        Goswin


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 21:04       ` Gerd Stolpmann
@ 2009-10-29 23:43         ` Goswin von Brederlow
  2009-10-30  0:48           ` Gerd Stolpmann
  0 siblings, 1 reply; 38+ messages in thread
From: Goswin von Brederlow @ 2009-10-29 23:43 UTC (permalink / raw)
  To: Gerd Stolpmann; +Cc: Florian Weimer, Goswin von Brederlow, caml-list

Gerd Stolpmann <gerd@gerd-stolpmann.de> writes:

> Am Donnerstag, den 29.10.2009, 21:40 +0100 schrieb Florian Weimer:
>> * Goswin von Brederlow:
>> 
>> > - The data is passed to libaio and needs to be kept alive and unmoved
>> >   as long as libaio knows it.
>> 
>> It also has to be aligned to a 512-byte boundary, so you can use
>> O_DIRECT.  Linux does not support truely asynchronous I/O without
>> O_DIRECT AFAIK, which rarely makes it worth the trouble.
>
> Right. There is also the question whether aio for regular files (i.e.
> files backed by page cache) is continued to be supported at all - it is
> well known that Linus Torvalds doesn't like it. It can happen that at
> some day aio will be restricted to block devices only.
>
> So I wouldn't use it for production code, but it is of course still an
> interesting interface.
>
> Gerd

Damn. That seems so stupid. Then writing asynchronously will only be
possible by creating a pot full of worker threads, each one writing
one chunk. So you get all those chunks submitted to the kernel in
random order, the kernel has to reorder them, fit them back together,
write them and then wake up the right thread for each piece
completed. So much extra work, while libaio already has all the data
in perfect structures for the kernel.

And how will you do barriers when writing with threads? Wait for all
threads to complete every time you hit a barrier and thereby stalling
the pipeline?

MfG
        Goswin


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 23:43         ` Goswin von Brederlow
@ 2009-10-30  0:48           ` Gerd Stolpmann
  0 siblings, 0 replies; 38+ messages in thread
From: Gerd Stolpmann @ 2009-10-30  0:48 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: Florian Weimer, caml-list


Am Freitag, den 30.10.2009, 00:43 +0100 schrieb Goswin von Brederlow:
> Gerd Stolpmann <gerd@gerd-stolpmann.de> writes:
> 
> > Am Donnerstag, den 29.10.2009, 21:40 +0100 schrieb Florian Weimer:
> >> * Goswin von Brederlow:
> >> 
> >> > - The data is passed to libaio and needs to be kept alive and unmoved
> >> >   as long as libaio knows it.
> >> 
> >> It also has to be aligned to a 512-byte boundary, so you can use
> >> O_DIRECT.  Linux does not support truely asynchronous I/O without
> >> O_DIRECT AFAIK, which rarely makes it worth the trouble.
> >
> > Right. There is also the question whether aio for regular files (i.e.
> > files backed by page cache) is continued to be supported at all - it is
> > well known that Linus Torvalds doesn't like it. It can happen that at
> > some day aio will be restricted to block devices only.
> >
> > So I wouldn't use it for production code, but it is of course still an
> > interesting interface.
> >
> > Gerd
> 
> Damn. That seems so stupid. Then writing asynchronous will only be
> possible with creating a pot full of worker thread, each one writing
> one chunk. So you get all those chunks in random order submitted to
> the kernel, the kernel has to reorder them, fit them back together,
> write them and then wake up the right thread for each piece
> completed. So much extra work while libaio has all the data already in
> perfect structures for the kernel.

Well, this is exactly the implementation of the POSIX aio functions in
glibc. They are mapped to a bunch of threads.

> And how will you do barriers when writing with threads? Wait for all
> threads to complete every time you hit a barrier and thereby stalling
> the pipeline?

You can't implement barriers. When you have page-cache backed I/O (i.e.
non-direct I/O, regardless of whether it is aio or sync I/O) there is
no control over when data is written. OK, there is fsync, but this is
very coarse-grained control.

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann, Bad Nauheimer Str.3, 64289 Darmstadt,Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
Phone: +49-6151-153855                  Fax: +49-6151-997714
------------------------------------------------------------


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-29 17:07               ` Goswin von Brederlow
@ 2009-10-30 20:30                 ` Richard Jones
  2009-11-01 15:11                   ` Goswin von Brederlow
  0 siblings, 1 reply; 38+ messages in thread
From: Richard Jones @ 2009-10-30 20:30 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml-list

On Thu, Oct 29, 2009 at 06:07:59PM +0100, Goswin von Brederlow wrote:
> I still can reuse a lot of this. Esspecially the syntax extension
> seems like a good idea. Maybe reduced to bytes instead of bits
> though. I don't intend to use such fine grained structures to need bit
> access.

Take a close look at bitstring.  In all the cases where it can
*statically* determine that accesses are on byte or larger boundaries,
it does *not* do any bitfiddling but uses the most efficient, direct C
calls possible.

We really did spend a lot of time optimizing the bitmatch case.

Rich.

-- 
Richard Jones
Red Hat


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-10-30 20:30                 ` Richard Jones
@ 2009-11-01 15:11                   ` Goswin von Brederlow
  2009-11-01 19:57                     ` Richard Jones
  0 siblings, 1 reply; 38+ messages in thread
From: Goswin von Brederlow @ 2009-11-01 15:11 UTC (permalink / raw)
  To: Richard Jones; +Cc: Goswin von Brederlow, caml-list

Richard Jones <rich@annexia.org> writes:

> On Thu, Oct 29, 2009 at 06:07:59PM +0100, Goswin von Brederlow wrote:
>> I still can reuse a lot of this. Especially the syntax extension
>> seems like a good idea. Maybe reduced to bytes instead of bits
>> though. I don't intend to use such fine grained structures to need bit
>> access.
>
> Take a close look at bitstring.  In all the cases where it can
> *statically* determine that accesses are on byte or larger boundaries,
> it does *not* do any bitfiddling but uses the most efficient, direct C
> calls possible.
>
> We really did spend a lot of time optimizing the bitmatch case.
>
> Rich.

But C calls are still 33% slower than direct access in ocaml (if one
doesn't use the polymorphic functions).

What would be great would be to use whatever Bigarray uses to get the
compiler to emit direct access to the data instead of C calls. Time to
hit the source.

MfG
        Goswin



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-11-01 15:11                   ` Goswin von Brederlow
@ 2009-11-01 19:57                     ` Richard Jones
  2009-11-02 16:11                       ` Goswin von Brederlow
  0 siblings, 1 reply; 38+ messages in thread
From: Richard Jones @ 2009-11-01 19:57 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: caml-list

On Sun, Nov 01, 2009 at 04:11:52PM +0100, Goswin von Brederlow wrote:
> But C calls are still 33% slower than direct access in ocaml (if one
> doesn't use the polymorphic functions).

Are you using noalloc calls?

http://camltastic.blogspot.com/2008/08/tip-calling-c-functions-directly-with.html

I would love to see inline assembler supported by the compiler.

Rich.

-- 
Richard Jones
Red Hat



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-11-01 19:57                     ` Richard Jones
@ 2009-11-02 16:11                       ` Goswin von Brederlow
  2009-11-02 16:33                         ` Mauricio Fernandez
  0 siblings, 1 reply; 38+ messages in thread
From: Goswin von Brederlow @ 2009-11-02 16:11 UTC (permalink / raw)
  To: Richard Jones; +Cc: Goswin von Brederlow, caml-list

Richard Jones <rich@annexia.org> writes:

> On Sun, Nov 01, 2009 at 04:11:52PM +0100, Goswin von Brederlow wrote:
>> But C calls are still 33% slower than direct access in ocaml (if one
>> doesn't use the polymorphic functions).
>
> Are you using noalloc calls?
>
> http://camltastic.blogspot.com/2008/08/tip-calling-c-functions-directly-with.html

Yes. And I looked at the bigarray module and couldn't figure out how
they differ from my own external function. Only difference I see is
the leading "%" on the external name. What does that do?

> I would love to see inline assembler supported by the compiler.
>
> Rich.

And some primitive operations on integers like sign extending and byte
swapping in the Pervasives module where the compiler emits cpu
specific code instead of a caml/C call.
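
Without such primitives, both operations can still be written in plain
OCaml. A minimal sketch for the 16-bit case, assuming the value is held
in an ordinary int; the names sign_extend16 and bswap16 are made up for
illustration, and Sys.int_size requires a newer OCaml than the one
discussed in this thread:

```ocaml
(* Hypothetical helpers, not part of any library. *)

(* Interpret the low 16 bits of [x] as a signed value. *)
let sign_extend16 x =
  let shift = Sys.int_size - 16 in
  (x lsl shift) asr shift

(* Swap the two low bytes of [x]; any higher bits are discarded. *)
let bswap16 x =
  ((x land 0xff) lsl 8) lor ((x lsr 8) land 0xff)
```

Each helper compiles to a few shifts and masks, which is still more
code than the single cpu instruction a compiler primitive could emit.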

MfG
        Goswin



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-11-02 16:11                       ` Goswin von Brederlow
@ 2009-11-02 16:33                         ` Mauricio Fernandez
  2009-11-02 20:27                           ` Richard Jones
  2009-11-02 20:48                           ` Goswin von Brederlow
  0 siblings, 2 replies; 38+ messages in thread
From: Mauricio Fernandez @ 2009-11-02 16:33 UTC (permalink / raw)
  To: caml-list

On Mon, Nov 02, 2009 at 05:11:27PM +0100, Goswin von Brederlow wrote:
> Richard Jones <rich@annexia.org> writes:
> 
> > On Sun, Nov 01, 2009 at 04:11:52PM +0100, Goswin von Brederlow wrote:
> >> But C calls are still 33% slower than direct access in ocaml (if one
> >> doesn't use the polymorphic functions).
> >
> > Are you using noalloc calls?
> >
> > http://camltastic.blogspot.com/2008/08/tip-calling-c-functions-directly-with.html
> 
> Yes. And I looked at the bigarray module and couldn't figure out how
> they differ from my own external function. Only difference I see is
> the leading "%" on the external name. What does that do?

That means that it is using a hardcoded OCaml primitive, whose code can be
generated by the compiler via C--. See asmcomp/cmmgen.ml.
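
As an illustration, the same kind of declaration can be written in user
code. The sketch below assumes the primitive name
%caml_ba_unsafe_ref_1, which is the name the Bigarray sources use for
Array1.unsafe_get; my_unsafe_get and first_byte are made-up names.

```ocaml
open Bigarray

(* A "%"-prefixed name refers to a compiler primitive rather than a C
   symbol, so no stub file is needed for this declaration. *)
external my_unsafe_get : ('a, 'b, 'c) Array1.t -> int -> 'a
  = "%caml_ba_unsafe_ref_1"

(* The element kind is fixed by the type annotation; with the kind
   known statically, ocamlopt can emit a direct load. *)
let first_byte (buf : (int, int8_unsigned_elt, c_layout) Array1.t) =
  my_unsafe_get buf 0
```

When the element kind is not known at the call site, the compiler has
to fall back to the generic C function instead.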

> > I would love to see inline assembler supported by the compiler.

It might be possible to hack support for C-- expressions in external
declarations. That'd be a sort of portable assembler.

-- 
Mauricio Fernandez  -   http://eigenclass.org



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-11-02 16:33                         ` Mauricio Fernandez
@ 2009-11-02 20:27                           ` Richard Jones
  2009-11-03 13:18                             ` Goswin von Brederlow
  2009-11-02 20:48                           ` Goswin von Brederlow
  1 sibling, 1 reply; 38+ messages in thread
From: Richard Jones @ 2009-11-02 20:27 UTC (permalink / raw)
  To: caml-list

On Mon, Nov 02, 2009 at 05:33:24PM +0100, Mauricio Fernandez wrote:
> It might be possible to hack support for C-- expressions in external
> declarations. That'd be a sort of portable assembler.

To be honest I'm far more interested in x86-64-specific instructions
(SSE3/4 in particular).  There are only two processor architectures
that matter in the world in any practical sense, x86-64 and ARM.

Rich.

-- 
Richard Jones
Red Hat



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-11-02 16:33                         ` Mauricio Fernandez
  2009-11-02 20:27                           ` Richard Jones
@ 2009-11-02 20:48                           ` Goswin von Brederlow
  1 sibling, 0 replies; 38+ messages in thread
From: Goswin von Brederlow @ 2009-11-02 20:48 UTC (permalink / raw)
  To: caml-list

Mauricio Fernandez <mfp@acm.org> writes:

> On Mon, Nov 02, 2009 at 05:11:27PM +0100, Goswin von Brederlow wrote:
>> Richard Jones <rich@annexia.org> writes:
>> 
>> > On Sun, Nov 01, 2009 at 04:11:52PM +0100, Goswin von Brederlow wrote:
>> >> But C calls are still 33% slower than direct access in ocaml (if one
>> >> doesn't use the polymorphic functions).
>> >
>> > Are you using noalloc calls?
>> >
>> > http://camltastic.blogspot.com/2008/08/tip-calling-c-functions-directly-with.html
>> 
>> Yes. And I looked at the bigarray module and couldn't figure out how
>> they differ from my own external function. Only difference I see is
>> the leading "%" on the external name. What does that do?
>
> That means that it is using a hardcoded OCaml primitive, whose code can be
> generated by the compiler via C--. See asmcomp/cmmgen.ml.
>
>> > I would love to see inline assembler supported by the compiler.
>
> It might be possible to hack support for C-- expressions in external
> declarations. That'd be a sort of portable assembler.

This brings me a lot closer to a fast buffer structure. I now have
this code:

(* buffer.ml: Buffer module for libaio-ocaml
 * Copyright (C) 2009 Goswin von Brederlow
 *
 * This program is free software: you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as
 * published by the Free Software Foundation, either version 3 of the
 * License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program.  If not, see <http://www.gnu.org/licenses/>.
 * Under Debian a copy can be found in /usr/share/common-licenses/LGPL-3.
 *)

open Bigarray

type buffer = (int, int8_unsigned_elt, c_layout) Array1.t

exception Unaligned

let create size = (Array1.create int8_unsigned c_layout size : buffer)

let unsafe_get_uint8 (buf : buffer) off = Array1.unsafe_get buf off

let unsafe_get_uint16 (buf : buffer) off =
  let off = off asr 1 in
  let buf = ((Obj.magic buf) : (int, int16_unsigned_elt, c_layout) Array1.t) 
  in
    Array1.unsafe_get buf off

let unsafe_get_int31 (buf : buffer) off =
  let off = off asr 2 in
  let buf = ((Obj.magic buf) : (int32, int32, c_layout) Array1.t) in
  let x = Array1.unsafe_get buf off
  in
    Int32.to_int x

let unsafe_get_int63 (buf : buffer) off =
  let off = off asr 3 in
  let buf = ((Obj.magic buf) : (int, int, c_layout) Array1.t)
  in
    Array1.unsafe_get buf off


Looking at the generated code I see that this works nicely for 8 and
16bit:

0000000000404a50 <camlBuffer__unsafe_get_uint8_131>:
  404a50:       48 d1 fb                sar    %rbx
  404a53:       48 8b 40 08             mov    0x8(%rax),%rax
  404a57:       48 0f b6 04 18          movzbq (%rax,%rbx,1),%rax
  404a5c:       48 8d 44 00 01          lea    0x1(%rax,%rax,1),%rax
  404a61:       c3                      retq   

0000000000404a90 <camlBuffer__unsafe_get_uint16_137>:
  404a90:       48 d1 fb                sar    %rbx
  404a93:       48 83 cb 01             or     $0x1,%rbx
  404a97:       48 d1 fb                sar    %rbx
  404a9a:       48 8b 40 08             mov    0x8(%rax),%rax
  404a9e:       48 0f b7 04 58          movzwq (%rax,%rbx,2),%rax
  404aa3:       48 8d 44 00 01          lea    0x1(%rax,%rax,1),%rax
  404aa8:       c3                      retq   

But for 31/63 bits I get:

0000000000404b90 <camlBuffer__unsafe_get_int31_145>:
  404b90:       48 83 ec 08             sub    $0x8,%rsp
  404b94:       48 c1 fb 02             sar    $0x2,%rbx
  404b98:       48 83 cb 01             or     $0x1,%rbx
  404b9c:       48 89 c7                mov    %rax,%rdi
  404b9f:       48 89 de                mov    %rbx,%rsi
  404ba2:       48 8b 05 5f bc 21 00    mov    0x21bc5f(%rip),%rax        # 620808 <_DYNAMIC+0x7e0>
  404ba9:       e8 92 2a 01 00          callq  417640 <caml_c_call>
  404bae:       48 63 40 08             movslq 0x8(%rax),%rax
  404bb2:       48 d1 e0                shl    %rax
  404bb5:       48 83 c8 01             or     $0x1,%rax
  404bb9:       48 83 c4 08             add    $0x8,%rsp
  404bbd:       c3                      retq   

0000000000404ca0 <camlBuffer__unsafe_get_int63_154>:
  404ca0:       48 83 ec 08             sub    $0x8,%rsp
  404ca4:       48 c1 fb 03             sar    $0x3,%rbx
  404ca8:       48 83 cb 01             or     $0x1,%rbx
  404cac:       48 89 c7                mov    %rax,%rdi
  404caf:       48 89 de                mov    %rbx,%rsi
  404cb2:       48 8b 05 4f bb 21 00    mov    0x21bb4f(%rip),%rax        # 620808 <_DYNAMIC+0x7e0>
  404cb9:       e8 82 29 01 00          callq  417640 <caml_c_call>
  404cbe:       48 83 c4 08             add    $0x8,%rsp
  404cc2:       c3                      retq   

At least in the int63 case I would have thought the compiler would
emit asm code to read the int instead of a function call. In the 31bit
case I would have hoped it would optimize the intermediate int32 away.

Is there something I can do better in get_int31? I was hoping for code
like this:

0000000000404a90 <camlBuffer__unsafe_get_uint31_137>:
  404a90:       48 c1 fb 02             sar    $0x2,%rbx
  404a94:       48 83 cb 01             or     $0x1,%rbx
  404a98:       48 d1 fb                sar    %rbx
  404a9b:       48 8b 40 08             mov    0x8(%rax),%rax
  404a9f:       xx xx xx xx xx          movslq (%rax,%rbx,4),%rax
  404aa4:       48 8d 44 00 01          lea    0x1(%rax,%rax,1),%rax
  404aa9:       c3                      retq   
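
One possible cause of the caml_c_call in the 31 and 63 bit cases (an
assumption, not something confirmed in this thread): the compiler
appears to choose the inline access from the second type parameter of
the array, and int32 and int are not element kinds there, whereas
int32_elt and int_elt are. A sketch with the kinds spelled that way,
using arrays that really have these kinds so that no Obj.magic is
involved; get_int31 and get_int63 are illustrative names:

```ocaml
open Bigarray

(* The second parameter is the element kind (int32_elt, int_elt); the
   first is the OCaml-side type (int32 and int respectively). *)
type buf32 = (int32, int32_elt, c_layout) Array1.t
type buf63 = (int, int_elt, c_layout) Array1.t

(* Read the 32-bit element at index [i] and convert it to an int. *)
let get_int31 (buf : buf32) i = Int32.to_int (Array1.unsafe_get buf i)

(* Read the native-int element at index [i]. *)
let get_int63 (buf : buf63) i = Array1.unsafe_get buf i
```

Whether ocamlopt then avoids the intermediate boxed int32 is best
checked in the generated assembly. The Obj.magic cast is left out on
purpose: any access that goes through the generic C function uses the
kind stored in the array at runtime, not the one claimed by the type.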

MfG
        Goswin



* Re: [Caml-list] Re: How to read different ints from a Bigarray?
  2009-11-02 20:27                           ` Richard Jones
@ 2009-11-03 13:18                             ` Goswin von Brederlow
  0 siblings, 0 replies; 38+ messages in thread
From: Goswin von Brederlow @ 2009-11-03 13:18 UTC (permalink / raw)
  To: Richard Jones; +Cc: caml-list

Richard Jones <rich@annexia.org> writes:

> On Mon, Nov 02, 2009 at 05:33:24PM +0100, Mauricio Fernandez wrote:
>> It might be possible to hack support for C-- expressions in external
>> declarations. That'd be a sort of portable assembler.
>
> To be honest I'm far more interested in x86-64-specific instructions
> (SSE3/4 in particular).  There are only two processor architectures
> that matter in the world in any practical sense, x86-64 and ARM.
>
> Rich.

And mips.

MfG
        Goswin


