caml-list - the Caml user's mailing list
* [Caml-list] String, Array, Bigarray.char
@ 2013-05-09 13:32 Tom Ridge
  2013-05-09 13:44 ` Anil Madhavapeddy
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Tom Ridge @ 2013-05-09 13:32 UTC (permalink / raw)
  To: caml-list

Dear caml-list,

Quick question: I am working a lot with arrays of byte, which at
various points I want to view as strings, and at various points I want
to view as arrays. The exact types involved should be discernible from
the code below.

I have some conversion functions e.g.:

  module A = Bigarray.Array1

  type myfusebuffer =
    (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) A.t

  (* convenience only; don't use in production code *)
  let array_of_string (bs : string) : myfusebuffer =
    (* intermediate char array, then a second pass into the Bigarray *)
    let arr = Array.init (String.length bs) (String.get bs) in
    A.of_array Bigarray.char Bigarray.c_layout arr

This presumably takes O(n) time (where n is the length of the string
bs). My question is: is there functionality to move values between
these types at cost O(1)? Basically, I'm hoping that String is
implemented as A.of_array Bigarray.char Bigarray.c_layout or
similar...

Thanks


* Re: [Caml-list] String, Array, Bigarray.char
  2013-05-09 13:32 [Caml-list] String, Array, Bigarray.char Tom Ridge
@ 2013-05-09 13:44 ` Anil Madhavapeddy
  2013-05-09 14:07   ` Tom Ridge
  2013-05-09 14:25 ` Markus Mottl
  2013-05-10 23:42 ` Goswin von Brederlow
  2 siblings, 1 reply; 10+ messages in thread
From: Anil Madhavapeddy @ 2013-05-09 13:44 UTC (permalink / raw)
  To: Tom Ridge; +Cc: caml-list

On 9 May 2013, at 09:32, Tom Ridge <tom.j.ridge+caml@googlemail.com> wrote:
> 
> Quick question: I am working a lot with arrays of byte, which at
> various points I want to view as strings, and at various points I want
> to view as arrays. The exact types involved should be discernible from
> the code below.
> 
> I have some conversion functions e.g.:
> 
>  type myfusebuffer = (char, Bigarray.int8_unsigned_elt,
> Bigarray.c_layout) Bigarray.Array1.t
> 
>  module A = Bigarray.Array1
> 
>  (* convenience only; don't use in production code *)
>  let array_of_string bs = (
>    let arr = (Array.init (String.length bs) (String.get bs)) in
>    let contents = A.of_array Bigarray.char Bigarray.c_layout arr in
>    contents)
>  let (_:string -> myfusebuffer) = array_of_string
> 
> This presumably takes O(n) time (where n is the length of the string
> bs). My question is: is there functionality to move values between
> these types at cost O(1)? Basically, I'm hoping that String is
> implemented as A.of_array Bigarray.char Bigarray.c_layout or
> similar...

Strings are represented as normal OCaml values within the OCaml heap,
whereas Bigarrays are simply pointers to externally allocated memory
(via malloc).  You do therefore need to copy across them in most cases.
One quick solution is to define a subset of the String module that uses
the Bigarray accessor functions, but this isn't ideal (especially when
external libraries that use strings are involved).
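
A minimal sketch of the "subset of the String module" idea, assuming
only the standard Bigarray library. The module name Bstr and every
function in it are hypothetical, chosen here for illustration; they are
not part of any library mentioned in this thread. Only of_string and
to_string copy; the other functions work on the Bigarray in place.

```ocaml
(* Hypothetical String-like module over a char Bigarray.
   All names here are illustrative assumptions. *)
module Bstr = struct
  module A = Bigarray.Array1

  type t = (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) A.t

  let create n : t = A.create Bigarray.char Bigarray.c_layout n

  let length (b : t) = A.dim b

  (* Bounds-checked, like String.get. *)
  let get (b : t) i = A.get b i

  (* O(1): the result shares storage with [b], unlike String.sub. *)
  let sub (b : t) pos len : t = A.sub b pos len

  (* First index >= [pos] holding [c]; raises Not_found if there is none. *)
  let index_from (b : t) pos c =
    let n = length b in
    if pos < 0 || pos > n then invalid_arg "Bstr.index_from";
    let rec go i =
      if i >= n then raise Not_found
      else if A.unsafe_get b i = c then i
      else go (i + 1)
    in
    go pos

  (* The only copying operations: crossing to or from a real string. *)
  let of_string s : t =
    let b = create (String.length s) in
    String.iteri (fun i c -> A.unsafe_set b i c) s;
    b

  let to_string (b : t) = String.init (length b) (fun i -> A.unsafe_get b i)
end
```

Note that Bstr.sub returns a view that shares storage with its
argument, whereas String.sub returns a fresh copy.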

Your fusebuffer type probably means that you're working with filesystem
data.  Can you just use Bigarrays for everything, with copies to strings
only when you absolutely need to?  We haven't released this out of beta
yet, but the cstruct camlp4 extension helps map C structures to OCaml:
https://github.com/mirage/ocaml-cstruct

-anil





* Re: [Caml-list] String, Array, Bigarray.char
  2013-05-09 13:44 ` Anil Madhavapeddy
@ 2013-05-09 14:07   ` Tom Ridge
  2013-05-09 14:14     ` Tom Ridge
  0 siblings, 1 reply; 10+ messages in thread
From: Tom Ridge @ 2013-05-09 14:07 UTC (permalink / raw)
  To: Anil Madhavapeddy; +Cc: caml-list

Thanks for this information.

I guess I will probably end up using arrays as much as possible. In
various places I have used strings as though they were immutable
arrays of byte. I guess the advantage of this approach is that strings
seem more familiar than arrays (especially Bigarrays). But it is
probably not much of a big deal to move to using arrays everywhere.

Thanks once again

Tom



* Re: [Caml-list] String, Array, Bigarray.char
  2013-05-09 14:07   ` Tom Ridge
@ 2013-05-09 14:14     ` Tom Ridge
  2013-05-09 14:21       ` Anil Madhavapeddy
  2013-05-09 14:29       ` Markus Mottl
  0 siblings, 2 replies; 10+ messages in thread
From: Tom Ridge @ 2013-05-09 14:14 UTC (permalink / raw)
  To: Anil Madhavapeddy; +Cc: caml-list

Although I see that this won't be so easy because various functions
such as Unix.write have the buffer argument being of type string :(

So at various points I seem to be forced to use strings. I suppose one
alternative is to reimplement the functions I use (such as Unix.write)
to work with arrays. Does anyone know if this has been done elsewhere?
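
One stopgap, short of rebinding the system calls, is to stage the data
through a small reusable buffer so that only a bounded chunk is copied
at a time. The sketch below is only an illustration under stated
assumptions: output_bigarray and its chunk parameter are hypothetical
names, and it targets a standard out_channel rather than Unix.write so
that it needs nothing beyond the standard library. It still copies
every byte once.

```ocaml
module A = Bigarray.Array1

type buffer = (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) A.t

(* Hypothetical helper: write [len] bytes of [ba], starting at [pos], to
   [oc], staging them through a reusable Bytes buffer of at most [chunk]
   bytes. *)
let output_bigarray ?(chunk = 65536) oc (ba : buffer) pos len =
  if pos < 0 || len < 0 || pos > A.dim ba - len then
    invalid_arg "output_bigarray";
  if chunk <= 0 then invalid_arg "output_bigarray: chunk";
  let staging = Bytes.create (min chunk len) in
  let rec loop pos remaining =
    if remaining > 0 then begin
      let n = min remaining (Bytes.length staging) in
      (* Copy one chunk out of the Bigarray into the staging buffer. *)
      for i = 0 to n - 1 do
        Bytes.unsafe_set staging i (A.unsafe_get ba (pos + i))
      done;
      output oc staging 0 n;
      loop (pos + n) (remaining - n)
    end
  in
  loop pos len
```

A descriptor-based variant would call a Unix writing function in place
of output; bindings that accept a Bigarray directly avoid the staging
copy altogether.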



* Re: [Caml-list] String, Array, Bigarray.char
  2013-05-09 14:14     ` Tom Ridge
@ 2013-05-09 14:21       ` Anil Madhavapeddy
  2013-05-09 14:30         ` Tom Ridge
  2013-05-09 16:29         ` ygrek
  2013-05-09 14:29       ` Markus Mottl
  1 sibling, 2 replies; 10+ messages in thread
From: Anil Madhavapeddy @ 2013-05-09 14:21 UTC (permalink / raw)
  To: Tom Ridge; +Cc: caml-list

You should look at either Lwt or Core/Async, which both provide wrappers
over Bigarray, and asynchronous alternatives to using Unix and threads.

Lwt_bytes: http://ocsigen.org/lwt/api/Lwt_bytes
Bigstring: https://ocaml.janestreet.com/ocaml-core/latest/doc/core/Bigstring.html

The choice between them depends on your situation.  Lwt is an add-on
library that interops with existing code well, whereas Core is a more
complete stdlib replacement (with vastly more features/datastructures).

-anil


* Re: [Caml-list] String, Array, Bigarray.char
  2013-05-09 13:32 [Caml-list] String, Array, Bigarray.char Tom Ridge
  2013-05-09 13:44 ` Anil Madhavapeddy
@ 2013-05-09 14:25 ` Markus Mottl
  2013-05-10 23:42 ` Goswin von Brederlow
  2 siblings, 0 replies; 10+ messages in thread
From: Markus Mottl @ 2013-05-09 14:25 UTC (permalink / raw)
  To: Tom Ridge; +Cc: caml-list

On Thu, May 9, 2013 at 9:32 AM, Tom Ridge
<tom.j.ridge+caml@googlemail.com> wrote:
> This presumably takes O(n) time (where n is the length of the string
> bs). My question is: is there functionality to move values between
> these types at cost O(1)? Basically, I'm hoping that String is
> implemented as A.of_array Bigarray.char Bigarray.c_layout or
> similar...

This is unfortunately impossible, because OCaml-strings live in the
OCaml-heap, whereas the contents of bigstrings (i.e. bigarrays of
chars) live in the C-heap.

The Jane Street Core library has a Bigstring module in
Core.Std.Bigstring, which builds on bigarrays of the same type that
you are using here.  This Bigstring-module also features functions for
converting and blitting ordinary OCaml strings to bigstrings.  The
"workhorse" function for that purpose is written in C and uses memcpy,
which is as fast as one could hope for.  It also supports all
functions that the standard String-module offers so you shouldn't have
to convert back and forth between strings and bigstrings in the first
place.
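
To illustrate what such a blit does, here is a plain-OCaml sketch with
bounds checks. The function names are hypothetical and this is not
Core's implementation: a loop like this is much slower than a
memcpy-based C primitive and is only meant to show the semantics.

```ocaml
module A = Bigarray.Array1

type bigstring = (char, Bigarray.int8_unsigned_elt, Bigarray.c_layout) A.t

(* Hypothetical helpers; the argument order mimics String.blit. *)

(* Copy [len] chars of [src], from [src_pos], into [dst] at [dst_pos]. *)
let blit_string_to_bigstring src src_pos (dst : bigstring) dst_pos len =
  if len < 0 || src_pos < 0 || src_pos > String.length src - len
     || dst_pos < 0 || dst_pos > A.dim dst - len
  then invalid_arg "blit_string_to_bigstring";
  for i = 0 to len - 1 do
    A.unsafe_set dst (dst_pos + i) (String.unsafe_get src (src_pos + i))
  done

(* The reverse direction, into a mutable Bytes value. *)
let blit_bigstring_to_bytes (src : bigstring) src_pos dst dst_pos len =
  if len < 0 || src_pos < 0 || src_pos > A.dim src - len
     || dst_pos < 0 || dst_pos > Bytes.length dst - len
  then invalid_arg "blit_bigstring_to_bytes";
  for i = 0 to len - 1 do
    Bytes.unsafe_set dst (dst_pos + i) (A.unsafe_get src (src_pos + i))
  done
```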

Regards,
Markus

--
Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com


* Re: [Caml-list] String, Array, Bigarray.char
  2013-05-09 14:14     ` Tom Ridge
  2013-05-09 14:21       ` Anil Madhavapeddy
@ 2013-05-09 14:29       ` Markus Mottl
  1 sibling, 0 replies; 10+ messages in thread
From: Markus Mottl @ 2013-05-09 14:29 UTC (permalink / raw)
  To: Tom Ridge; +Cc: Anil Madhavapeddy, caml-list

The Core.Std.Bigstring-module offers a large number of Unix-I/O
functions for bigstrings, even for vectorized I/O (e.g. writev).  I am
sure you will find everything you need there.




--
Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com


* Re: [Caml-list] String, Array, Bigarray.char
  2013-05-09 14:21       ` Anil Madhavapeddy
@ 2013-05-09 14:30         ` Tom Ridge
  2013-05-09 16:29         ` ygrek
  1 sibling, 0 replies; 10+ messages in thread
From: Tom Ridge @ 2013-05-09 14:30 UTC (permalink / raw)
  To: Anil Madhavapeddy, markus.mottl; +Cc: caml-list

Ah! These are probably solutions to my problem. I shall investigate further.

Thanks


* Re: [Caml-list] String, Array, Bigarray.char
  2013-05-09 14:21       ` Anil Madhavapeddy
  2013-05-09 14:30         ` Tom Ridge
@ 2013-05-09 16:29         ` ygrek
  1 sibling, 0 replies; 10+ messages in thread
From: ygrek @ 2013-05-09 16:29 UTC (permalink / raw)
  To: caml-list

On Thu, 9 May 2013 10:21:39 -0400
Anil Madhavapeddy <anil@recoil.org> wrote:

> You should look at either Lwt or Core/Async, which both provide wrappers
> over Bigarray, and asynchronous alternatives to using Unix and threads.
> 
> Lwt_bytes: http://ocsigen.org/lwt/api/Lwt_bytes
> Bigstring: https://ocaml.janestreet.com/ocaml-core/latest/doc/core/Bigstring.html

There are also write/pwrite wrappers with bigarray argument in extunix - http://extunix.forge.ocamlcore.org/

-- 
 ygrek
 http://ygrek.org.ua


* Re: [Caml-list] String, Array, Bigarray.char
  2013-05-09 13:32 [Caml-list] String, Array, Bigarray.char Tom Ridge
  2013-05-09 13:44 ` Anil Madhavapeddy
  2013-05-09 14:25 ` Markus Mottl
@ 2013-05-10 23:42 ` Goswin von Brederlow
  2 siblings, 0 replies; 10+ messages in thread
From: Goswin von Brederlow @ 2013-05-10 23:42 UTC (permalink / raw)
  To: caml-list

On Thu, May 09, 2013 at 02:32:20PM +0100, Tom Ridge wrote:
> Dear caml-list,
> 
> Quick question: I am working a lot with arrays of byte, which at
> various points I want to view as strings, and at various points I want
> to view as arrays. The exact types involved should be discernible from
> the code below.
> 
> I have some conversion functions e.g.:
> 
>   type myfusebuffer = (char, Bigarray.int8_unsigned_elt,
> Bigarray.c_layout) Bigarray.Array1.t
> 
>   module A = Bigarray.Array1
> 
>   (* convenience only; don't use in production code *)
>   let array_of_string bs = (
>     let arr = (Array.init (String.length bs) (String.get bs)) in
>     let contents = A.of_array Bigarray.char Bigarray.c_layout arr in
>     contents)
>   let (_:string -> myfusebuffer) = array_of_string
> 
> This presumably takes O(n) time (where n is the length of the string
> bs). My question is: is there functionality to move values between
> these types at cost O(1)? Basically, I'm hoping that String is
> implemented as A.of_array Bigarray.char Bigarray.c_layout or
> similar...
> 
> Thanks

A Bigarray is an OCaml block holding the dimensions, size, a proxy
object (for sub-arrays) and a pointer to the data.

A string, on the other hand, is an OCaml block with the bytes stored
directly in it.

Now, if you allocate the memory for a Bigarray yourself, you can add a
bit extra in front so that the data can also be accessed as a string.
But you run into problems with the GC, because it might want to free
the Bigarray while something still holds a reference to the string. So
it is really not a good idea.
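
The sharing that Bigarrays do support safely is between a Bigarray and
its sub-arrays. A small sketch, assuming only the standard Bigarray
library (the names whole, window and copy are illustrative): a
sub-array is a view onto the same data, while a string built from the
data is an independent copy.

```ocaml
module A = Bigarray.Array1

(* Eight bytes of externally allocated storage, filled with '-'. *)
let whole = A.create Bigarray.char Bigarray.c_layout 8
let () = A.fill whole '-'

(* [A.sub] does not copy: [window] is a view of bytes 2..5 of [whole]. *)
let window = A.sub whole 2 4
let () = A.set window 0 'X'   (* also visible as whole.{2} *)

(* Building a string copies the bytes into the OCaml heap. *)
let copy = String.init (A.dim whole) (A.get whole)

(* Later writes reach the view but not the string copy. *)
let () = A.set whole 2 'Y'
```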



Since you seem to want to use fuse and bigarrays to implement a
filesystem you might want to take a look at:

- ExtUnix:
type buffer = (int, Bigarray.int8_unsigned_elt, Bigarray.c_layout) Bigarray.Array1.t

- ExtUnix.BA:
val pread : Unix.file_descr -> int -> ExtUnixSpecific.buffer -> int
val pwrite : Unix.file_descr -> int -> ExtUnixSpecific.buffer -> int
val read : Unix.file_descr -> ExtUnixSpecific.buffer -> int
val write : Unix.file_descr -> ExtUnixSpecific.buffer -> int
val get_substr : ExtUnixSpecific.buffer -> int -> int -> string
val set_substr : ExtUnixSpecific.buffer -> int -> string -> unit

- ExtUnix.BA.BigEndian
- ExtUnix.BA.LittleEndian
- ExtUnix.BA.HostEndian


- libaio-ocaml: http://git.ocamlcore.org/cgi-bin/gitweb.cgi?p=libaio-ocaml/libaio-ocaml.git;a=summary
	pkg-ocaml-maint/packages/libaio-ocaml.git on git.debian.org
	Bindings for Linux async IO library (libaio)

- libfuse-ocaml: http://git.ocamlcore.org/cgi-bin/gitweb.cgi?p=libfuse-ocaml/libfuse-ocaml.git;a=summary
	pkg-ocaml-maint/packages/libfuse-ocaml.git on git.debian.org
	Bindings for libfuse
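
As a rough illustration of what byte-order-aware accessors over such a
buffer involve, here is a stdlib-only sketch. The helper names below
are hypothetical and are not the ExtUnix API; the buffer type merely
has the same shape as the ExtUnix type quoted above, with elements read
and written as ints in the range 0..255.

```ocaml
module A = Bigarray.Array1

(* Same shape as the ExtUnix buffer type quoted above. *)
type buffer = (int, Bigarray.int8_unsigned_elt, Bigarray.c_layout) A.t

(* Hypothetical helpers, written against the standard library only. *)

(* Read an unsigned 16-bit integer, most significant byte first. *)
let get_uint16_be (b : buffer) off =
  (A.get b off lsl 8) lor A.get b (off + 1)

(* Read an unsigned 16-bit integer, least significant byte first. *)
let get_uint16_le (b : buffer) off =
  A.get b off lor (A.get b (off + 1) lsl 8)

(* Store the low 16 bits of [v], most significant byte first. *)
let set_uint16_be (b : buffer) off v =
  A.set b off ((v lsr 8) land 0xff);
  A.set b (off + 1) (v land 0xff)

(* Copy [len] bytes starting at [off] out into a fresh string. *)
let sub_string (b : buffer) off len =
  String.init len (fun i -> Char.chr (A.get b (off + i)))
```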

The fuse bindings are for the low-level interface and geared towards
async IO. I already convert filenames to strings, stat structures to
Bigarrays on replies, and so on. All of those are comparatively small,
so copying seems OK. The read callback always uses Bigarray, and the
write callback replies with string, string array, Bigarray or Bigarray
array.


I've managed to merge the Bigarray and endian stuff from libaio-ocaml
and libfuse-ocaml into extunix, but I haven't yet updated libaio-ocaml
and libfuse-ocaml to make use of that. So if you plan to use either of
those for real, let me know and I will find some time to update them. I
plan to do that soon and get them added to Debian proper now that
wheezy is released anyway.

Also, if you want to add bindings for the high-level fuse interface,
implement a high-level interface in OCaml directly, or add a
multithreaded main loop, I wouldn't mind patches to that effect.

MfG
	Goswin
