caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
* string vs vect - some results
@ 1996-02-16 14:27 Jocelyn Serot
  1996-02-19 10:06 ` string vs vect + GC question Xavier Leroy
  0 siblings, 1 reply; 2+ messages in thread
From: Jocelyn Serot @ 1996-02-16 14:27 UTC (permalink / raw)
  To: caml-list


Hello,

In the process of finding an efficient implementation for an image abstract
datatype i've been led to compare the <vect> and <string> provided by the
Caml-Light core library.
I feel that some may be interested by the results.
I used the time profiler written by C. Raffalli (raffalli@cs.chalmers.se) and
posted a few months ago in this list.

I wrote several versions of fns doing read-write access to 1D and 2D arrays.
For each program, results are given like this (this is taken from C. Raffali
post):

  f                5.32    7.10
  g                4.00    4.00
  main             0.12    9.44
  total           -9.44    0.00

- The first column is the name of the function.
- The third column give the time (utime + stime) spend inside the function.
- The second column give the time spend inside the function minus the
  time spend in other profiled functions called by it

Times are given in seconds. They should not be interpreted in an absolute
manner (since they may vary slightly depending on the host load) but relatively
(the ratios appear to remain constant for several runs of the same pgm).

(***************************** 1st VERSION ***********************************)
let s = make_string 65536 `a`;;
let f n = for i = 0 to n do set_nth_char s i (nth_char s i) done;; 
let f = profile "f" f;;
let v = make_vect 65536 `a`;;
let g n = for i = 0 to n do vect_assign v i (vect_item v i) done;; 
let g = profile "g" g;;
f 65535;;
g 65535;;
print_profile ();;
(*--------------------------------------------------- RESULTS ------------------
g                        9.83     9.83
f                        9.68     9.68
------------------------------------------------------------------------------*)

	COMMENT: the cost of accessing a vect and a string cell is approximately
			 the same (with a slight advantage for strings). 

(***************************** 2nd VERSION ***********************************)
let s = make_string 65536 `a`;;
let f n = for i = 0 to n do s.[i] <- s.[i] done;; 
let f = profile "f" f;;
let v = make_vect 65536 `a`;;
let g n = for i = 0 to n do v.(i) <- v.(i) done;; 
let g = profile "g" g;;
f 65535;;
g 65535;;
print_profile ();;
(*--------------------------------------------------- RESULTS ------------------
g                        9.87     9.87
f                        9.52     9.52
------------------------------------------------------------------------------*)

	COMMENT: the .(), .[] and <- notations are only syntactics sugars.
			 Good.

(***************************** 3rd VERSION ***********************************)
let s = make_string 65536 `a`;;
let f n =
	for i = 0 to n do for j = 0 to n do s.[i*256+j] <- s.[i*256+j] done done;; 
let f = profile "f" f;;
let v = make_matrix 256 256 `a`;;
let g n =
	for i = 0 to n do for j = 0 to n do v.(i).(j) <- v.(i).(j) done done;;
let g = profile "g" g;;
f 255;;
g 255;;
print_profile ();;
(*--------------------------------------------------- RESULTS ------------------
g                       16.77    16.77
f                       11.58    11.58
------------------------------------------------------------------------------*)

	COMMENT: For 2D arrays, strings win the game. This is quite surprising
			 since Caml Light is not supposed to be very good in doing the
			 explicit arithmetic involved in the index computation. Vects seem
			 to be penalized by the double indirection. 

(***************************** 4th VERSION ***********************************)
let s = make_string 65536 `a`;;
let f n =
	for i = 0 to n do for j = 0 to n do s.[i*256+j] <- s.[i*256+j] done done;; 
let f = profile "f" f;;
let v = make_vect 256 (make_string 256 `a`);;
let g n =
	for i = 0 to n do for j = 0 to n do v.(i).[j] <- v.(i).[j] done done;;
let g = profile "g" g;;
f 255;;
g 255;;
print_profile ();;
(*--------------------------------------------------- RESULTS ------------------
g                       16.52    16.52
f                       11.62    11.62
------------------------------------------------------------------------------*)

	COMMENT: This "hybrid" version (vect of strings) does not bring any 
			 improvment

CONCLUSION:

For 1D arrays, string and vect-based implementation have similar efficiency.
For 2D arrays (like image representions for ex) strings seems more efficient
than vects of vects.

However, when defining operations on 2D-arrays more work has to be done if
these are represented by strings (in the Caml-Light core library the <string>
module is much smaller than the <vect> module (which in turn is smaller than
the list module..)).

I also have to say, from ny current experiences, that reading/writting
a 2D-array from/to a file can be _much_ more efficient 
with a string-based representation (specially if elements are stored in 
row-major order on disk)

Any comment, addition, negation ?

	Jocelyn S.

E-mail: Jocelyn.Serot@lasmea.univ-bpclermont.fr .............................
S-mail: LASMEA - URA 1793 CNRS, Universite Blaise Pascal, 63177 Aubiere cedex
Tel: (33) 73.40.73.30 - Fax: (33) 73.40.72.62 ...............................
.... http://wwwlasmea.univ-bpclermont.fr/Personnel/Jocelyn.Serot/Welcome.html





^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: string vs vect + GC question
  1996-02-16 14:27 string vs vect - some results Jocelyn Serot
@ 1996-02-19 10:06 ` Xavier Leroy
  0 siblings, 0 replies; 2+ messages in thread
From: Xavier Leroy @ 1996-02-19 10:06 UTC (permalink / raw)
  To: Jocelyn Serot; +Cc: caml-list


Concerning your performance comparison between arrays and strings:

> let s = make_string 65536 `a`;;
> let f n =
>   for i = 0 to n do for j = 0 to n do s.[i*256+j] <- s.[i*256+j] done done;; 
> let v = make_matrix 256 256 `a`;;
> let g n =
>   for i = 0 to n do for j = 0 to n do v.(i).(j) <- v.(i).(j) done done;;

> COMMENT: For 2D arrays, strings win the game. This is quite surprising
>          since Caml Light is not supposed to be very good in doing the
>          explicit arithmetic involved in the index computation. Vects seem
>          to be penalized by the double indirection. 

The main reason why arrays of arrays are less efficient is that each
access performs two bound checks, instead of one for the string-based
version. With array bound checking turned off (option -O fast), both
versions run at the same speed.

Also, if you're really doing matrices of bytes, the string-based
encoding is 4 times more compact (8 times on an Alpha) than the
array-based representation, which improves greatly cache behavior and
I/O speed.

As for your memory management question:

> I have a basic question regarding the behaviour of the Caml-Light Garbage
> collector: when an object becomes hidden, because of a its name is rebound at
> the top level, is it automatically reclaimed by the gc ?...
> For example:
>	#let m = make_vect 1024 0;;
>	m : int vect = [|0; 0; 0; 0; 0; ... |]
>	#let m = 0;;
>	m : int = 0   (* <- Will the space allocated for the vector be reclaimed ?*)

No, it will not be reclaimed. That's because all toplevel definitions
are entered in a global table of values, and always referenced through
that table. Rebinding an identifier does not guarantee that its
previous value is no longer needed, e.g.:

        let m = make_vect 1024 0;;
        let f x = m.(x);;
        let m = 0;;

"f" still accesses the previous value of m in the global table. 

Note that this behavior is specific to the toplevel: locally-bound
structures will always become garbage when leaving the scope of their
definition.

- Xavier Leroy





^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~1996-02-19 19:15 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1996-02-16 14:27 string vs vect - some results Jocelyn Serot
1996-02-19 10:06 ` string vs vect + GC question Xavier Leroy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).