caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Raj Bandyopadhyay <rajb@rice.edu>
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Array copying in OCaml
Date: Sun, 27 Jul 2008 21:03:58 -0500	[thread overview]
Message-ID: <488D290E.4070001@rice.edu> (raw)
In-Reply-To: <666572260807241246m4ef1c4aclc6078db154d68c7a@mail.gmail.com>

As a follow up to this: I got a 3x performance improvement by writing my 
own version of the Array.sub function in C, patterned after the 
caml_make_vect function in the OCaml compiler.

I don't think that C is the main reason for the improvement, it's more 
fundamental:

1) The OCaml library version of Array.sub creates an array, 
***initializes it***, and then copies to it. The initialization is quite 
unnecessary, and algorithmically makes the function about twice as 
expensive as it should be.

2) From reading OCaml source code, it looks like caml_initialize is a 
much cheaper function to use than caml_modify due to GC issues. Yet, the 
OCaml library version uses caml_modify by initializing the target array 
with a default value first, instead of using the source array to 
initialize. I guess this is why on profiling, caml_modify shows up as 
really expensive, paired with lots of GC calls.

I do believe that this calls for a better version of the Array.sub 
function in the next version of OCaml. Here is my C code  below (I am 
only using it for an array of records, so the code is specialized to 
that). I would be glad to submit a more general version if people want 
it (and someone tells me where to submit it).

Thanks!
Raj


CAMLprim value caml_my_array_sub(value source, value ofs, value len)
{
    CAMLparam3 (source,ofs,len);
    CAMLlocal1 (res);
    mlsize_t size, wsize, i,offset;

    size = Long_val(len);
    offset = Long_val(ofs);
    if (size == 0)
    {
        res = Atom(0);
    }
    else
    {
        if (size > Max_wosize) caml_invalid_argument("Array.sub");
        if (size < Max_young_wosize)
        {
            res = caml_alloc_small(size, 0);
            for (i = 0; i < size; i++)
            {
                Field(res, i) = Field(source,i+offset);
            }
        }
        else
        {
            caml_minor_collection();
            res = caml_alloc_shr(size, 0);
            for (i = 0; i < size; i++)
            {
                caml_initialize(&Field(res, i), Field(source,i+offset));
            }
            res = caml_check_urgent_gc (res);
        }
    }
    CAMLreturn (res);
}


Adrien wrote:
> You can compile with the -unsafe flag. If you're using arrays a lot,
> it can easily speed your program by 50%. If your program is array
> based, the speedup can be even more important.
>
>
>
> 2008/7/24 Raj Bandyopadhyay <rajb@rice.edu>:
>   
>> Hi
>>
>> I have an application which copies a lot of (small) OCaml arrays using the
>> Array library (Array.sub and Array.blit) functions. This is turning out to
>> be extremely expensive.
>>
>> Is there any general way/trick to reduce the cost of this kind of operation?
>> I haven't found a way not to copy as much, because the program semantics
>> seems to demand it.
>>
>> Thanks
>> Raj
>>
>> _______________________________________________
>> Caml-list mailing list. Subscription management:
>> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
>> Archives: http://caml.inria.fr
>> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
>> Bug reports: http://caml.inria.fr/bin/caml-bugs
>>
>>     


  reply	other threads:[~2008-07-28  2:04 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-07-24 16:31 Raj Bandyopadhyay
2008-07-24 16:37 ` [Caml-list] " Dr. Thomas Fischbacher
2008-07-24 17:25 ` Christophe TROESTLER
2008-07-24 19:46 ` Adrien
2008-07-28  2:03   ` Raj Bandyopadhyay [this message]
2008-07-28  7:52     ` Richard Jones
2008-07-28 17:34   ` Raj Bandyopadhyay
2008-07-28 21:11     ` Richard Jones
2008-07-29  9:37     ` Michel Mauny

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=488D290E.4070001@rice.edu \
    --to=rajb@rice.edu \
    --cc=caml-list@yquem.inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).