* [Caml-list] Profiling a function execution
@ 2003-11-13 22:34 Daniel Bünzli
2003-11-13 23:53 ` Brian Hurt
2003-11-25 18:05 ` Xavier Leroy
0 siblings, 2 replies; 11+ messages in thread
From: Daniel Bünzli @ 2003-11-13 22:34 UTC (permalink / raw)
To: caml-list
Hello,
To compare different implementations of a function I would like to
profile its execution in time and heap memory usage (at least orders of
magnitude). To do so, I use the code below, relying on the Gc and Unix
libraries.
I have the following questions :
1) When I start profiling should I prefer a Gc.compact to Gc.full_major
?
2) Unix.times seems to have a low resolution, which means that my
timings are often 0.0 (unless I execute the function a lot of times). I
don't want to use Unix.gettimeofday because this prevents me to make
the difference between user and system time. Has anybody bindings to
the getrusage function or another idea ?
3) What is the accuracy of these results ? E.g. I read in the
documentation of the Gc module that the field minor_words is only an
approximation in programs compiled to native code. Is it also true for
the other fields ? How much can I trust the figures I get ?
4) Is it possible to know at runtime whether we are running native code
or interpreted bytecode ?
Thanks for your answers,
Daniel
-- profile.ml --
type t = { minor_bytes : float;
promoted_bytes : float;
major_bytes : float;
allocated_bytes : float;
minor_collections : float;
major_collections : float;
user_time : float;
system_time : float }
(* Bytes per words *)
let bpw = float_of_int (Sys.word_size / 8)
(* Heap allocation overhead due to profiling *)
let heap_overhead =
let s = Gc.stat() in
ignore(Unix.times());
ignore(Unix.times());
let s' = Gc.stat() in
((s'.Gc.minor_words +. s'.Gc.major_words -. s'.Gc.promoted_words) -.
(s.Gc.minor_words +. s.Gc.major_words -. s.Gc.promoted_words)) *. bpw
let execution_mean (f : 'a -> 'b) (arg : 'a) n =
Gc.full_major ();
let s = Gc.stat () in
let t = Unix.times () in
for i = 1 to n do
ignore(f arg)
done;
let t' = Unix.times () in
let s' = Gc.stat () in
let mi, pro, ma =
((s'.Gc.minor_words -. s.Gc.minor_words) *. bpw) -. heap_overhead,
(s'.Gc.promoted_words -. s.Gc.promoted_words) *. bpw,
(s'.Gc.major_words -. s.Gc.major_words) *. bpw in
let n' = float_of_int n in
{ minor_bytes = mi /. n';
promoted_bytes = pro /. n';
major_bytes = ma /. n';
allocated_bytes = (mi +. ma -. pro) /. n';
minor_collections =
(float_of_int (s'.Gc.minor_collections - s.Gc.minor_collections))
/. n';
major_collections =
(float_of_int (s'.Gc.major_collections - s.Gc.major_collections))
/. n';
user_time = (t'.Unix.tms_utime -. t.Unix.tms_utime) /. n';
system_time = (t'.Unix.tms_stime -. t.Unix.tms_stime) /. n'
}
let execution (f : 'a -> 'b) (arg : 'a) = execution_mean f arg 10_000
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Profiling a function execution
2003-11-13 22:34 [Caml-list] Profiling a function execution Daniel Bünzli
@ 2003-11-13 23:53 ` Brian Hurt
2003-11-14 12:23 ` Ville-Pertti Keinonen
2003-11-14 12:33 ` Richard Jones
2003-11-25 18:05 ` Xavier Leroy
1 sibling, 2 replies; 11+ messages in thread
From: Brian Hurt @ 2003-11-13 23:53 UTC (permalink / raw)
To: Daniel Bünzli; +Cc: Ocaml Mailing List
On Thu, 13 Nov 2003, Daniel Bünzli wrote:
> 2) Unix.times seems to have a low resolution, which means that my
> timings are often 0.0 (unless I execute the function a lot of times). I
> don't want to use Unix.gettimeofday because this prevents me to make
> the difference between user and system time. Has anybody bindings to
> the getrusage function or another idea ?
Clocks on computers have a 10ms quantiziation- i.e., they measure
everything as multiples of 10 milliseconds. All measurements are accurate
only up to about +/- 10 milliseconds. I.e. if times reports that
something takes 50 milliseconds, it probably takes more than 40 but less
than 60 milliseconds.
My recommendation if you want to time something that takes less than
seconds to do, is to do it lots of times. You want the amount of time
between two calls to times to be at least seconds, if not tens of seconds
or minutes. You then divide the total amount of time taken by the number
of times the operation was executed, to get the time per execution.
Is there a wrapper to the clock function kicking around?
--
"Usenet is like a herd of performing elephants with diarrhea -- massive,
difficult to redirect, awe-inspiring, entertaining, and a source of
mind-boggling amounts of excrement when you least expect it."
- Gene Spafford
Brian
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Profiling a function execution
2003-11-13 23:53 ` Brian Hurt
@ 2003-11-14 12:23 ` Ville-Pertti Keinonen
2003-11-14 12:33 ` Richard Jones
1 sibling, 0 replies; 11+ messages in thread
From: Ville-Pertti Keinonen @ 2003-11-14 12:23 UTC (permalink / raw)
To: Brian Hurt; +Cc: Daniel Bünzli, Ocaml Mailing List
On Thu, Nov 13, 2003 at 05:53:01PM -0600, Brian Hurt wrote:
> Clocks on computers have a 10ms quantiziation- i.e., they measure
> everything as multiples of 10 milliseconds. All measurements are accurate
No they don't, hardware timers usually run at at more than a MHz. Most
operating systems provide interfaces with limited precision (this usually
varies between 100 and 1000 Hz), but on systems that actually read the
clock and don't just count interrupts, getrusage is much more accurate
than times.
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Profiling a function execution
2003-11-13 23:53 ` Brian Hurt
2003-11-14 12:23 ` Ville-Pertti Keinonen
@ 2003-11-14 12:33 ` Richard Jones
2003-11-15 11:11 ` Wolfgang Lux
1 sibling, 1 reply; 11+ messages in thread
From: Richard Jones @ 2003-11-14 12:33 UTC (permalink / raw)
Cc: Ocaml Mailing List
If you're using a Pentium-class machine, then a nice trick is to wrap
the RDTSC assembly instruction and call that. It returns number of
clock cycles elapsed (a 64 bit number IIRC). Here's some code to get
you started:
#ifdef linux
#include <sys/bitypes.h>
#endif
extern inline u_int64_t
rdtsc (void)
{
u_int32_t hi, lo;
#ifdef linux
__asm__ ("rdtsc" : "=d" (hi), "=a" (lo));
#else
__asm__ (".byte 0x0f,0x31" : "=d" (hi), "=a" (lo));
#endif
return ((u_int64_t) hi) << 32 | lo;
}
Rich.
--
Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
"My karma ran over your dogma"
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Profiling a function execution
2003-11-14 12:33 ` Richard Jones
@ 2003-11-15 11:11 ` Wolfgang Lux
2003-11-15 12:21 ` David MENTRE
2003-11-15 12:54 ` Richard Jones
0 siblings, 2 replies; 11+ messages in thread
From: Wolfgang Lux @ 2003-11-15 11:11 UTC (permalink / raw)
To: Richard Jones; +Cc: Ocaml Mailing List
Richard Jones wrote:
> If you're using a Pentium-class machine, then a nice trick is to wrap
> the RDTSC assembly instruction and call that. It returns number of
> clock cycles elapsed (a 64 bit number IIRC). Here's some code to get
> you started:
This looks like complete nonsense to me. This will neither distinguish
user and system mode execution times. Even worse, clock cycles will
not have any meaning if the operating system chooses to perform a
task switch while the profiled function is executed. It is for such
reasons that Unix systems have an getrusage call.
Regards
Wolfgang
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Profiling a function execution
2003-11-15 11:11 ` Wolfgang Lux
@ 2003-11-15 12:21 ` David MENTRE
2003-11-15 12:54 ` Richard Jones
1 sibling, 0 replies; 11+ messages in thread
From: David MENTRE @ 2003-11-15 12:21 UTC (permalink / raw)
To: Wolfgang Lux; +Cc: Richard Jones, Ocaml Mailing List
Hello,
Wolfgang Lux <wlux@uni-muenster.de> writes:
> This looks like complete nonsense to me. This will neither distinguish
> user and system mode execution times. Even worse, clock cycles will
> not have any meaning if the operating system chooses to perform a
> task switch while the profiled function is executed.
You're right. And the better approach might be to use profiling
architectures like OProfile (http://oprofile.sourceforge.net/news/).
Yours,
d.
--
David Mentré <dmentre@linux-france.org>
http://www.linux-france.org/~dmentre/david-mentre-public-key.asc
GnuPG key fingerprint: A7CD 7357 3EC4 1163 745B 7FD3 FB3E AD7C 2A18 BE9E
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Profiling a function execution
2003-11-15 11:11 ` Wolfgang Lux
2003-11-15 12:21 ` David MENTRE
@ 2003-11-15 12:54 ` Richard Jones
1 sibling, 0 replies; 11+ messages in thread
From: Richard Jones @ 2003-11-15 12:54 UTC (permalink / raw)
Cc: Ocaml Mailing List
On Sat, Nov 15, 2003 at 12:11:17PM +0100, Wolfgang Lux wrote:
> Richard Jones wrote:
>
> >If you're using a Pentium-class machine, then a nice trick is to wrap
> >the RDTSC assembly instruction and call that. It returns number of
> >clock cycles elapsed (a 64 bit number IIRC). Here's some code to get
> >you started:
>
> This looks like complete nonsense to me. This will neither distinguish
> user and system mode execution times. Even worse, clock cycles will
> not have any meaning if the operating system chooses to perform a
> task switch while the profiled function is executed.
Obviously. It was only a suggestion after all ...
Rich.
--
Richard Jones. http://www.annexia.org/ http://freshmeat.net/users/rwmj
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
'There is a joke about American engineers and French engineers. The
American team brings a prototype to the French team. The French team's
response is: "Well, it works fine in practice; but how will it hold up
in theory?"'
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Profiling a function execution
2003-11-13 22:34 [Caml-list] Profiling a function execution Daniel Bünzli
2003-11-13 23:53 ` Brian Hurt
@ 2003-11-25 18:05 ` Xavier Leroy
2003-11-25 21:54 ` Daniel Bünzli
` (2 more replies)
1 sibling, 3 replies; 11+ messages in thread
From: Xavier Leroy @ 2003-11-25 18:05 UTC (permalink / raw)
To: Daniel Bünzli; +Cc: caml-list
Just to complement Damien's answers:
> 2) Unix.times seems to have a low resolution, which means that my
> timings are often 0.0 (unless I execute the function a lot of times). I
> don't want to use Unix.gettimeofday because this prevents me to make
> the difference between user and system time. Has anybody bindings to
> the getrusage function or another idea ?
At least under Linux, getrusage() doesn't provide more accuracy than
times(), since the kernel maintains user and system time as an
integral number of clock ticks. I haven't looked at other Unix
kernels, but I suspect that the additional precision made possible by
the getrusage() syscall is simply not exploited.
(Besides, times() is part of the POSIX standard while getrusage() is not,
meaning less portability.)
> 4) Is it possible to know at runtime whether we are running native code
> or interpreted bytecode ?
Some clever hacks involving external C functions were given on this list.
I'm not sure I would endorse them :-) But your question begs another
question: why would you need to do distinguish native code from
bytecode? We're working hard on removing the last discrepancies
between the two compilers...
- Xavier Leroy
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Profiling a function execution
2003-11-25 18:05 ` Xavier Leroy
@ 2003-11-25 21:54 ` Daniel Bünzli
2003-11-25 22:38 ` Kim Nguyen
2003-11-26 12:23 ` Ville-Pertti Keinonen
2 siblings, 0 replies; 11+ messages in thread
From: Daniel Bünzli @ 2003-11-25 21:54 UTC (permalink / raw)
To: Xavier Leroy; +Cc: Damien Doligez, caml-list
Le 25 nov. 03, à 19:05, Xavier Leroy a écrit :
> I haven't looked at other Unix
> kernels, but I suspect that the additional precision made possible by
> the getrusage() syscall is simply not exploited.
I have done tests using the C code appended to this email.
Indeed with the Darwin Kernel 7.0.0 (MacOS X), you don't get extra
precision with the figures returned by getrusage(). I was mistaken by
the potential of the data structure described in the man page (there's
no info about the actual granularity there).
You can remove my wish (id=1937) for getrusage bindings from your
database.
> your question begs another
> question: why would you need to do distinguish native code from
> bytecode?
I just wanted a quick way to distinguish, on output, my bytecode
profiles from native ones. This can of course be handled very easily
with a preprocessor, I was just wondering if I had missed a flag in the
style of Sys.interactive.
> We're working hard on removing the last discrepancies
> between the two compilers...
One could argue that since there exists some discrepancies such a flag
should exist. However I doubt its usefulness in common practice (which
is why I didn't file a feature wish for that one). Furthermore, if it
doesn't exist, removing the last discrepancies becomes a must...
Thanks for your answers,
Daniel
--- test.c ---
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <time.h>
#include <sys/types.h>
#include <sys/times.h>
#include <sys/time.h>
#include <sys/resource.h>
#define NUM_OPS ULONG_MAX / 100
void idle (void)
{
unsigned long i;
int op;
for (i = 0; i < NUM_OPS; i++) op = 1+1;
}
void diff_times (void)
{
struct tms t1, t2;
times(&t1);
idle();
times(&t2);
printf("times: %fs user %fs sys\n",
(double)(t2.tms_utime-t1.tms_utime) / CLK_TCK,
(double)(t2.tms_stime-t1.tms_stime) / CLK_TCK);
}
void diff_getrusage(void)
{
struct rusage t1, t2;
getrusage(RUSAGE_SELF, &t1);
idle();
getrusage(RUSAGE_SELF, &t2);
printf("getrusage: %fs user %fs sys\n",
(double)t2.ru_utime.tv_sec + (double)t2.ru_utime.tv_usec / 1e6 -
((double)t1.ru_utime.tv_sec + (double)t1.ru_utime.tv_usec / 1e6),
(double)t2.ru_stime.tv_sec + (double)t2.ru_stime.tv_usec / 1e6 -
((double)t1.ru_stime.tv_sec + (double)t1.ru_stime.tv_usec / 1e6));
}
int main(int argc, char **argv)
{
diff_times();
diff_getrusage();
return 0;
}
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Profiling a function execution
2003-11-25 18:05 ` Xavier Leroy
2003-11-25 21:54 ` Daniel Bünzli
@ 2003-11-25 22:38 ` Kim Nguyen
2003-11-26 12:23 ` Ville-Pertti Keinonen
2 siblings, 0 replies; 11+ messages in thread
From: Kim Nguyen @ 2003-11-25 22:38 UTC (permalink / raw)
To: caml-list; +Cc: Xavier Leroy
[-- Attachment #1: Type: text/plain, Size: 665 bytes --]
On Tue, 25 Nov 2003 19:05:53 +0100
Xavier Leroy <xavier.leroy@inria.fr> wrote:
Hello,
> Some clever hacks involving external C functions were given on this list.
> I'm not sure I would endorse them :-)
Is this for a "political" reason (this 'side effect' of the compiler should not be
exposed to the casual user or can this hack lead to an unsafety of some kind ?
(in code generation or something else ?)
> We're working hard on removing the last discrepancies
> between the two compilers...
Does this include a Dynlink module for native code ? ( or is such a module
incompatible with the design of the compiler or with type safety ? )
Regards,
Kim Nguyen.
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [Caml-list] Profiling a function execution
2003-11-25 18:05 ` Xavier Leroy
2003-11-25 21:54 ` Daniel Bünzli
2003-11-25 22:38 ` Kim Nguyen
@ 2003-11-26 12:23 ` Ville-Pertti Keinonen
2 siblings, 0 replies; 11+ messages in thread
From: Ville-Pertti Keinonen @ 2003-11-26 12:23 UTC (permalink / raw)
To: Xavier Leroy; +Cc: Daniel Bünzli, caml-list
On Tue, Nov 25, 2003 at 07:05:53PM +0100, Xavier Leroy wrote:
> At least under Linux, getrusage() doesn't provide more accuracy than
> times(), since the kernel maintains user and system time as an
> integral number of clock ticks. I haven't looked at other Unix
> kernels, but I suspect that the additional precision made possible by
> the getrusage() syscall is simply not exploited.
4.4 BSD derivatives (except Darwin/MacOS X).
O^HD^HTru64.
> (Besides, times() is part of the POSIX standard while getrusage() is not,
> meaning less portability.)
You can always use something like #ifdef HAS_GETRUSAGE
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2003-11-26 12:23 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-11-13 22:34 [Caml-list] Profiling a function execution Daniel Bünzli
2003-11-13 23:53 ` Brian Hurt
2003-11-14 12:23 ` Ville-Pertti Keinonen
2003-11-14 12:33 ` Richard Jones
2003-11-15 11:11 ` Wolfgang Lux
2003-11-15 12:21 ` David MENTRE
2003-11-15 12:54 ` Richard Jones
2003-11-25 18:05 ` Xavier Leroy
2003-11-25 21:54 ` Daniel Bünzli
2003-11-25 22:38 ` Kim Nguyen
2003-11-26 12:23 ` Ville-Pertti Keinonen
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).