caml-list - the Caml user's mailing list
From: Jon Harrop <jon@jdh30.plus.com>
To: caml users <caml-list@inria.fr>
Cc: Damien Doligez <damien.doligez@inria.fr>
Subject: Re: [Caml-list] Gripes with array
Date: Thu, 9 Sep 2004 17:58:05 +0100
Message-ID: <200409091758.05679.jon@jdh30.plus.com>
In-Reply-To: <EA60BA7F-0243-11D9-AAE1-00039310CAE8@inria.fr>

On Thursday 09 September 2004 10:37, Damien wrote:
> ...
> > Am I right in thinking that the maximum
> > non-float array size on a 64-bit machine is 18,014,398,509,481,983?
>
> That's correct.  It's a good thing that 32-bitters are on their way out.
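
For reference, the figure quoted above is 2^54 - 1. A quick check, assuming a
64-bit OCaml in which the block header keeps 54 bits for the size in words
(the other 10 going to the tag and the GC colour); that layout is an
assumption about the classic runtime, not something stated in this thread:

```ocaml
(* Assumes a 64-bit OCaml, where native ints are 63 bits wide. *)
let header_size_bits = 54   (* assumed: 64 - 8 tag bits - 2 colour bits *)
let max_words = (1 lsl header_size_bits) - 1

let () =
  Printf.printf "2^54 - 1             = %d\n" max_words;
  (* What the running system itself reports for non-float arrays. *)
  Printf.printf "Sys.max_array_length = %d\n" Sys.max_array_length
```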

If I upgrade I'll surely end up playing Doom 3 and not get any work 
done... :-)

> > Also, can Array.init be made to fill the elements only once?
>
> No, that's impossible without breaking the GC invariants.

Could it not even be done by dragging Array.init inside the compiler, giving 
it the same status as Array.make?
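
The invariant at stake, as far as can be told from the thread: the filling
function may allocate, and so trigger a collection, while the array is only
partly filled; the collector then scans every slot, so every slot must already
hold a valid value. A minimal sketch of that situation, with a hypothetical
initialiser "noisy" that forces a collection part-way through:

```ocaml
(* Hypothetical initialiser: allocates a string and occasionally forces a
   full major collection while the array is still being filled. *)
let noisy i =
  if i mod 100 = 0 then Gc.full_major ();
  string_of_int i

let () =
  let a = Array.init 1000 noisy in
  (* Each collection saw only valid values in the array, so the contents
     are intact afterwards. *)
  Printf.printf "%s %s\n" a.(0) a.(999)
```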

> >  This would make quite a few things twice as fast
>
> Twice?  I doubt it very much.

That was an estimate based on the assumption that, on large arrays (well, 
~4M elements ;-), the time taken is dominated by filling the elements, which 
is currently done twice but only needs to be done once. I believe this is 
justified by the high cost of memory writes (to main memory for arrays that 
do not fit in cache), even sequential ones, compared with (trivial, 
inlineable) function calls, a single heap allocation etc.

What is the bottleneck in the asymptotic limit?
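
One way to look for the asymptotic bottleneck is to print the cost per element
as the array grows. A rough sketch, using Sys.time (CPU time) so that it needs
no extra library; the helper names and the list of sizes are arbitrary choices
made for illustration:

```ocaml
(* CPU seconds taken by [f ()]; the result is kept opaque so that the
   call cannot be optimised away. *)
let cpu_time f =
  let t = Sys.time () in
  ignore (Sys.opaque_identity (f ()));
  Sys.time () -. t

(* Nanoseconds per element for Array.make and Array.init at size [n]. *)
let per_element n =
  let ns t = 1e9 *. t /. float_of_int n in
  (ns (cpu_time (fun () -> Array.make n 0)),
   ns (cpu_time (fun () -> Array.init n (fun i -> 1 + i))))

let () =
  List.iter
    (fun n ->
       let (make, init) = per_element n in
       Printf.printf "%8d elements: make %.2f ns/elt, init %.2f ns/elt\n"
         n make init)
    [1_000; 10_000; 100_000; 1_000_000; 4_000_000]
```

Single runs like this are noisy, particularly at the small sizes, so the
numbers are only indicative unless each size is repeated and averaged.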

For measurements on 4,000,000 element int arrays (using the code at the end of 
this mail) I get:

Array.make took 0.131528823272 secs.
Array.init took 0.311059344899 secs.
array_init took 0.179279577732 secs.

Measuring memset from C gives me 0.0311 secs, so setting the elements once 
must account for roughly 10% of the time taken by Array.init. Also, the 
array_init function is surprisingly fast, presumably because "f" is inlined 
into array_init but not into Array.init.
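
The ratios behind those remarks can be recomputed from the figures above. A
small sketch; the constants are copied from this mail and the names are made
up for the example:

```ocaml
(* Timings in seconds, copied from the measurements quoted above. *)
let make_t = 0.131528823272
let init_t = 0.311059344899
let array_init_t = 0.179279577732
let memset_t = 0.0311

(* [x] as a percentage of [y]. *)
let percent x y = 100. *. x /. y

let () =
  Printf.printf "memset / Array.init     = %.1f%%\n" (percent memset_t init_t);
  Printf.printf "Array.make / Array.init = %.1f%%\n" (percent make_t init_t);
  Printf.printf "array_init / Array.init = %.1f%%\n"
    (percent array_init_t init_t)
```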

This came up because my wavelet transform code in OCaml is within 15% of the 
performance of my equivalent C version excluding the cost of creating the 
array. Including that cost (even with calloc), the C version is twice as 
fast. Admittedly calloc will use memset, and not set the elements "properly", 
but even so...

> > let copy a = init (length a) (fun i -> a.(i))
>
> Exactly how it's written now, except that it's inlined by hand for
> performance reasons.

Array.copy took 0.298557505888 secs.
array_copy took 0.315200943696 secs.

This optimisation gives a <6% performance improvement (and this is really 
the best case for large arrays, because the filling function is trivial 
here). I'd have gone for five times less code in the array module and more 
code in the compiler... ;-)
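
For comparison, inlining by hand presumably amounts to writing the loop of
init directly into copy. A sketch of that shape under the hypothetical name
copy_by_hand; it is an illustration, not the library source:

```ocaml
(* Copy [a] without going through Array.init: create the result filled
   with the first element, then overwrite the remaining slots. *)
let copy_by_hand a =
  let l = Array.length a in
  if l = 0 then [||] else begin
    let res = Array.make l (Array.unsafe_get a 0) in
    for i = 1 to pred l do
      Array.unsafe_set res i (Array.unsafe_get a i)
    done;
    res
  end
```

The unsafe accesses are acceptable here only because every index used is
below Array.length a by construction.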

Perhaps the current versions are significantly faster on smaller data 
structures...

Cheers,
Jon.

-----

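(* Trivial filling function used by both Array.init and array_init. *)
let f i = 1+i

(* Fill-twice scheme as discussed above, specialised to "f": the array is
   created filled with (f 0), then elements 1..l-1 are overwritten. *)
let array_init l =
  if l = 0 then [||] else
  let res = Array.make l (f 0) in
  for i = 1 to pred l do
    Array.unsafe_set res i (f i)
  done;
  res

(* Array.copy written in terms of Array.init. *)
let array_copy a = Array.init (Array.length a) (fun i -> Array.unsafe_get a i)

(* Wall-clock seconds taken by (f ()); needs the Unix library. *)
let time f =
  let time = Unix.gettimeofday in
  let t = time () in
  ignore (f ());
  (time ()) -. t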

let _ =
  let timings = Array.make 5 (0., 0) in
  let l = 4000000 in
  let a = Array.make l 0 in
  for i=0 to 100 do
    let entry = Random.int 5 in
    let t =
      time (match entry with
      | 0 -> fun () -> Array.make l 0
      | 1 -> fun () -> Array.init l f
      | 2 -> fun () -> array_init l
      | 3 -> fun () -> Array.copy a
      | _ -> fun () -> array_copy a)  (* entry = 4 *)
    in
    timings.(entry) <-
      let (ot, n) = timings.(entry) in
      (ot +. t, n+1);
  done;
  let entry = [| "Array.make";
                 "Array.init";
                 "array_init";
                 "Array.copy";
                 "array_copy" |] in
  for i=0 to 4 do
    print_endline (entry.(i)^": "^(string_of_int (snd timings.(i))))
  done;
  let timings =
    Array.map (fun (t, n) -> string_of_float (t /. float_of_int n)) timings in
  for i=0 to 4 do
    print_endline (entry.(i)^" took "^timings.(i)^" secs.")
  done

-----


