[Caml-list] O'Caml vs C++: a little benchmark

caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed

* [Caml-list] O'Caml vs C++: a little benchmark
@ 2002-08-18 17:17 Oleg
  2002-08-18 18:00 ` William Chesters
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Oleg @ 2002-08-18 17:17 UTC (permalink / raw)
  To: caml-list

Hi

I wrote a few simple benchmarks [1] assessing binaries generated by "ocamlopt 
-unsafe -noassert" vs binaries generated by "g++-3.2 -O2 -fomit-frame-pointer 
-pedantic" on P3 Xeon, and the results were quite surprising to me.

Firstly, I expected iteration over O'Caml lists and integer arrays to be as 
fast as iteration over std::list and std::vector<int>, respectively. Instead, 
the benchmark gave me a speed difference of about 10x and 100x in favor of 
C++ for lists and arrays, respectively.

Secondly, a little benchmark comparing mutable binary trees of 64 bit floats 
also showed g++-3.2 to be about an order of magnitude faster.

What was even more surprising was that O'Caml turned out to be about 10 times 
faster than C++ for reversing lines in a file. I did not use explicit buffers 
of any kind in either version, and in C++ program, I used "getline", reading 
into std::string which should provide about the same level of abstraction and 
overflow protection as O'Caml string.

I'm curious as to where these huge differences for these small programs come 
from.

Thanks
Oleg

[1] http://www.columbia.edu/~ot14/ocaml_vs_cpp/
Sample output while running ./demo_all_root.sh on P3 Xeon 800 MHz w/ 256 Mb 
RAM:

lists
1.323
0.904 (* std::deque *)
16.330
arrays
0.258    <- C++ 
16.930   <- ocamlopt 
rev
0.063
0.009
memory
0.728
1.606
tree
1.338
12.307
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-18 17:17 [Caml-list] O'Caml vs C++: a little benchmark Oleg
@ 2002-08-18 18:00 ` William Chesters
  2002-08-18 19:06   ` Oleg
  2002-08-19 13:02   ` Xavier Leroy
  2002-08-18 19:16 ` Markus Mottl
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 23+ messages in thread
From: William Chesters @ 2002-08-18 18:00 UTC (permalink / raw)
  To: caml-list

Oleg writes:
 > I'm curious as to where these huge differences for these small programs come 
 > from.

>From a very quick look at your array example, the point is very simply
that in your "abstract" fold_left idiom

    Array.fold_left (fun x a -> x +. float a) sum ar

the compiler doesn't inline Array.fold_left, and hence not your
anonymous "fun" either, because it doesn't inline anything between
compilation units.  You would see this clearly in the assembler output
if you used ocamlopt -S.

This is an example of the general truth that you can get very good
performance out of ocaml, but only as long as you ensure that your
inner loops are coded in a very down-to-earth style.

Whether this is a problem depends on your point of view.  If you think
the aim of good programming is to find beautiful abstract formulations
(including the inner loops) which nevertheless compile to optimal
machine code, it's bad.  If you think that languages and programming
idioms should stay more or less isomorphic to the execution model of
the CPU then you won't feel a need for it.
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-18 18:00 ` William Chesters
@ 2002-08-18 19:06   ` Oleg
  2002-08-18 21:37     ` William Chesters
  2002-08-19 13:02   ` Xavier Leroy
  1 sibling, 1 reply; 23+ messages in thread
From: Oleg @ 2002-08-18 19:06 UTC (permalink / raw)
  To: caml-list

On Sunday 18 August 2002 02:00 pm, William Chesters wrote:
> the compiler [...] doesn't inline anything between
> compilation units.  

I was always curious about this. If I were to change Array module, but not 
its interface, ocamlopt would ask me to recompile (not just re-link) 
everything that depends on Array, which means that ocamlopt looks at Array 
_implementation_, while compiling (not linking) code that depends on it.
In view of this, what stopped O'Caml creators from letting ocamlopt inline 
functions across module boundaries (especially if it's true that this could 
be responsible for a 100x speed boost)?

> You would see this clearly in the assembler output
> if you used ocamlopt -S.

Is there a tutorial for reading those *.s files ocamlopt produces?

Cheers
Oleg
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-18 17:17 [Caml-list] O'Caml vs C++: a little benchmark Oleg
  2002-08-18 18:00 ` William Chesters
@ 2002-08-18 19:16 ` Markus Mottl
  2002-08-18 19:58   ` Oleg
  2002-08-19 13:12 ` malc
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Markus Mottl @ 2002-08-18 19:16 UTC (permalink / raw)
  To: Oleg; +Cc: caml-list

On Sun, 18 Aug 2002, Oleg wrote:
> Firstly, I expected iteration over O'Caml lists and integer arrays to
> be as 
> fast as iteration over std::list and std::vector<int>, respectively.
> Instead, > the benchmark gave me a speed difference of about 10x and
> 100x in favor of 
> C++ for lists and arrays, respectively.

My timings differ considerably (AMD Athlon 800 MHz 256 MB RAM; g++-2.96; 
demo_all.sh instead of demo_all_root.sh):

  lists
  0.680
  1.030
  2.740
  arrays
  0.250
  3.190
  rev
  0.200
  0.070
  memory
  0.840
  2.120
  tree
  7.280
  8.810

Note, btw., that I have measured user time: real time, which you have
chosen is just too unstable on my machine. Maybe this explains some of
your extreme measurements. I have also used a different file for reversal
(/etc/termcap) and have not run things with your root-script.

OCaml never inlines and/or unrolls recursive functions, giving iterative
solutions in C++ a significant edge concerning optimizations. The same
is true for the array solution, where you are even using higher-order
functions with "fold_left".

> Secondly, a little benchmark comparing mutable binary trees of 64 bit
> floats also showed g++-3.2 to be about an order of magnitude faster.

Not on my machine / with my compiler. Btw., not very fair of you to 
compare ephemeral and persistent datastructures... ;-)

> What was even more surprising was that O'Caml turned out to be about
> 10 times > faster than C++ for reversing lines in a file. I did not
> use explicit buffers > of any kind in either version, and in C++
> program, I used "getline", reading 
> into std::string which should provide about the same level of
> abstraction and > overflow protection as O'Caml string.

Because the C++-iostream library just sucks (at least the implementation 
used by g++). OCaml-I/O as provided by the Pervasives-module is way 
faster.

> I'm curious as to where these huge differences for these small
> programs come from.

Look at the assembler output for details... ;-)

Regards,
Markus Mottl

-- 
Markus Mottl                                             markus@oefai.at
Austrian Research Institute
for Artificial Intelligence                  http://www.oefai.at/~markus
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-18 19:16 ` Markus Mottl
@ 2002-08-18 19:58   ` Oleg
  2002-08-18 22:59     ` Markus Mottl
  0 siblings, 1 reply; 23+ messages in thread
From: Oleg @ 2002-08-18 19:58 UTC (permalink / raw)
  To: caml-list

On Sunday 18 August 2002 03:16 pm, Markus Mottl wrote:

> My timings differ considerably (AMD Athlon 800 MHz 256 MB RAM; g++-2.96;
> demo_all.sh instead of demo_all_root.sh):

g++-3.2 removes abstraction penalty related to iterators, etc.

[...]
> Note, btw., that I have measured user time: real time, which you have
> chosen is just too unstable on my machine. 

On my machine/OS (Linux 2.4), user and real time are usually the same for 
ocaml, but can differ somewhat for C++ (probably because malloc/free is done 
by the kernel or something, I wouldn't know). Had I used user time, it would 
have steered the results in favor of C++ a little more in some cases.

[...]
> Not on my machine / with my compiler. Btw., not very fair of you to
> compare ephemeral and persistent datastructures... ;-)

I'm not! Both tree_mutable_ml.ml and tree_cpp.cpp contain mutable binary 
trees. I think your C++ tree is slower than mine because of the old compiler
(Or maybe it's the OS: tree allocates a lot of small objects).

[...]
> Look at the assembler output for details... ;-)

IANAAP (I am not an assembly programmer :)

Cheers,
Oleg

P.S. It looks like List and Array iteration is somehow much faster on Athlon 
than P3 Xeon.
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-18 19:06   ` Oleg
@ 2002-08-18 21:37     ` William Chesters
  0 siblings, 0 replies; 23+ messages in thread
From: William Chesters @ 2002-08-18 21:37 UTC (permalink / raw)
  To: caml-list

Oleg writes:
 > > You would see this clearly in the assembler output
 > > if you used ocamlopt -S.
 > 
 > Is there a tutorial for reading those *.s files ocamlopt produces?

Um.  "*.s files" are a human-readable representation of the basic
machine codes on which the CPU in your computer actually operates.
This so-called "assembler language" is standardised by the CPU
manufacturer.  For instance, if you are using Linux the CPU is
probably a Pentium and you can read about the assembler language at

    http://www.intel.com/design/intarch/techinfo/pentium/instsum.htm

or try

    http://www.google.com/search?q=%22intel+assembler

The advantage of looking at these files is that they tell
unambiguously what the CPU will be doing when it executes your
program, in a way which is comparable across languages.  If you are
interested in performance, programming idioms, and compilers, you
should definitely learn about these things.  (With modern CPUs it's no
longer so easy to relate the assembler code directly to performance
but it's still a fair guide to what the compiler is doing.)

 > In view of this, what stopped O'Caml creators from letting ocamlopt inline 
 > functions across module boundaries (especially if it's true that this could 
 > be responsible for a 100x speed boost)?

Well, it's not quite as simple as that.  You could say there are two
ways to achieve high performance.  The simple way is to use a nice
clean, but effective, compiler, and use programming styles which you
can see directly will lead to efficient code if compiled in the
obvious way.  The tricky way is to use programming styles which are,
on the face of it, highly inefficient, and rely on a funky compiler to
"see how to make them efficient".  You have attempted the latter
approach, but ocaml follows the former philosophy, so your program
works badly.

Of course the "funky compiler" approach seems very attractive.  The
point, though, is that no (current) compiler is capable of
automatically doing a great job under *all* circumstances.  In fact if
you try to use high-level abstractions in performance-sensitive code,
you very quickly get into a complicated and fairly pointless game of
trying to figure out how to fine-tune your code to make the compiler
do what you want.  Some compilers (like the KAI C++ compiler) let you
go considerably further than ocaml, but it's still entirely
satisfactory for moderately complex applications.  The problem is that
the more you rely on the compiler to bridge the gap between "what you
say" in the source and "what you mean" in the machine code, the more
uncertainty there is about what it will really do.

In practice what you end up doing is using abstractions in the outer
loops where performance is less important, and low-level idioms in the
inner loops.  You have to do that with ocaml, but in most cases even
with the best C++ compilers.

So ocaml's fairly minimalist approach is more defensible than you
might think.

Having said that I for one would like to see inter-module inlining in
the compiler and I don't think it would be hard to do.  One day I may
find the time ...
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-18 19:58   ` Oleg
@ 2002-08-18 22:59     ` Markus Mottl
  0 siblings, 0 replies; 23+ messages in thread
From: Markus Mottl @ 2002-08-18 22:59 UTC (permalink / raw)
  To: Oleg; +Cc: caml-list

On Sun, 18 Aug 2002, Oleg wrote:
> On my machine/OS (Linux 2.4), user and real time are usually the same for 
> ocaml, but can differ somewhat for C++ (probably because malloc/free is done 
> by the kernel or something, I wouldn't know). Had I used user time, it would 
> have steered the results in favor of C++ a little more in some cases.

And deteriorated it in others:

  lists
  1.480
  0.650
  2.400
  arrays
  0.380
  3.040
  rev
  0.490
  0.040
  memory
  0.600
  2.000
  tree
  1.280
  2.750

Btw., the last timing of the tree is a result of adding a type restriction
to floats to the "insert" function ("(x : float)"). This way we benefit
from unboxing.

> > Not on my machine / with my compiler. Btw., not very fair of you to
> > compare ephemeral and persistent datastructures... ;-)
> 
> I'm not! Both tree_mutable_ml.ml and tree_cpp.cpp contain mutable binary 
> trees. I think your C++ tree is slower than mine because of the old compiler
> (Or maybe it's the OS: tree allocates a lot of small objects).

Sorry, I had mistakenly thought that you used the persistent version.

Regards,
Markus Mottl

-- 
Markus Mottl                                             markus@oefai.at
Austrian Research Institute
for Artificial Intelligence                  http://www.oefai.at/~markus
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-18 18:00 ` William Chesters
  2002-08-18 19:06   ` Oleg
@ 2002-08-19 13:02   ` Xavier Leroy
  2002-08-19 13:58     ` [Caml-list] Inlining across functors (was: O'Caml vs C++: a little benchmark) Thorsten Ohl
                       ` (2 more replies)
  1 sibling, 3 replies; 23+ messages in thread
From: Xavier Leroy @ 2002-08-19 13:02 UTC (permalink / raw)
  To: caml-list

> the compiler doesn't inline Array.fold_left, and hence not your
> anonymous "fun" either, because it doesn't inline anything between
> compilation units.

Just for the record: ocamlopt does perform inlining across compilation
units (via the information stored in .cmx files).  What it doesn't do,
however, is inlining and specialization of recursive function definitions.  

Two more points before I stop reading this "yet another stupid
micro-benchmark" thread:

* http://caml.inria.fr/ocaml/speed.html is mandatory reading before
  making any performance claim about OCaml.
* http://www.bagley.org/~doug/shootout/ is a (rare) example of a
  non-stupid inter-language performance comparison.

- Xavier Leroy
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-18 17:17 [Caml-list] O'Caml vs C++: a little benchmark Oleg
  2002-08-18 18:00 ` William Chesters
  2002-08-18 19:16 ` Markus Mottl
@ 2002-08-19 13:12 ` malc
  2002-08-19 13:22 ` malc
  2002-08-23 21:05 ` John Max Skaller
  4 siblings, 0 replies; 23+ messages in thread
From: malc @ 2002-08-19 13:12 UTC (permalink / raw)
  To: Oleg; +Cc: caml-list

On Sun, 18 Aug 2002, Oleg wrote:

> Hi
> 
> I wrote a few simple benchmarks [1] assessing binaries generated by "ocamlopt 
> -unsafe -noassert" vs binaries generated by "g++-3.2 -O2 -fomit-frame-pointer 
> -pedantic" on P3 Xeon, and the results were quite surprising to me.
> 
> Firstly, I expected iteration over O'Caml lists and integer arrays to be as 
> fast as iteration over std::list and std::vector<int>, respectively. Instead, 
> the benchmark gave me a speed difference of about 10x and 100x in favor of 
> C++ for lists and arrays, respectively.
> 
> Secondly, a little benchmark comparing mutable binary trees of 64 bit floats 
> also showed g++-3.2 to be about an order of magnitude faster.
> 
> What was even more surprising was that O'Caml turned out to be about 10 times 
> faster than C++ for reversing lines in a file. I did not use explicit buffers 
> of any kind in either version, and in C++ program, I used "getline", reading 
> into std::string which should provide about the same level of abstraction and 
> overflow protection as O'Caml string.
> 
> I'm curious as to where these huge differences for these small programs come 
> from.
You made an unlucky choice of summing floats. So what you measure(at least 
in list, array cases) is speed of float boxing.
open Printf;;

let n = int_of_string (Sys.argv.(1));;
let r = int_of_string (Sys.argv.(2));;

type _s = { mutable s : float }
let result =
  let s = { s = 0.0 } in
  let list = Array.to_list (Array.init n (fun i -> i)) in
  let rec f = function [] -> () | x :: xs -> s.s <- s.s +. float x; f xs in
  for i = 0 to pred r do
    f list
  done;
  s.s

Is what you would probably came to, after reading:
http://caml.inria.fr/ocaml/numerical.html
  
I'll try to analyze memory, tree etc, if time permits.

P.S.
lists
1.142
0.510
1.298

(Athlon 1.4G(running at 1G), compiled with -inline 20)

-- 
mailto:malc@pulsesoft.com

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-18 17:17 [Caml-list] O'Caml vs C++: a little benchmark Oleg
                   ` (2 preceding siblings ...)
  2002-08-19 13:12 ` malc
@ 2002-08-19 13:22 ` malc
  2002-08-23 21:05 ` John Max Skaller
  4 siblings, 0 replies; 23+ messages in thread
From: malc @ 2002-08-19 13:22 UTC (permalink / raw)
  To: Oleg; +Cc: caml-list

On Sun, 18 Aug 2002, Oleg wrote:

> Hi
> 
> I wrote a few simple benchmarks [1] assessing binaries generated by "ocamlopt 
> -unsafe -noassert" vs binaries generated by "g++-3.2 -O2 -fomit-frame-pointer 
> -pedantic" on P3 Xeon, and the results were quite surprising to me.
> 
> Firstly, I expected iteration over O'Caml lists and integer arrays to be as 
> fast as iteration over std::list and std::vector<int>, respectively. Instead, 
> the benchmark gave me a speed difference of about 10x and 100x in favor of 
> C++ for lists and arrays, respectively.
> 
> Secondly, a little benchmark comparing mutable binary trees of 64 bit floats 
> also showed g++-3.2 to be about an order of magnitude faster.
> 
> What was even more surprising was that O'Caml turned out to be about 10 times 
> faster than C++ for reversing lines in a file. I did not use explicit buffers 
> of any kind in either version, and in C++ program, I used "getline", reading 
> into std::string which should provide about the same level of abstraction and 
> overflow protection as O'Caml string.
> 
> I'm curious as to where these huge differences for these small programs come 
> from.

Oh btw:
[ocaml_vs_cpp]$ g++-3.2 -fomit-frame-pointer -Os list_cpp.cpp -o list_cpp
[ocaml_vs_cpp]$ time ./list_cpp 10000 10000

real    0m1.145s
user    0m1.150s
sys     0m0.000s
[ocaml_vs_cpp]$ icc -O3 -xi list_cpp.cpp -o list_cpp
list_cpp.cpp
[ocaml_vs_cpp]$ time ./list_cpp 10000 10000

real    0m0.518s
user    0m0.510s
sys     0m0.010s

Its been noted elsewhere that GCC tends to produce better code(at least 
for AMD's) when invoked with -Os instead of -O2 (where -Os is really an
-O2 _plus_ optimize for size). And Intel's compiler (6.0) really shines
here. Draw your own conclusions.

-- 
mailto:malc@pulsesoft.com

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Caml-list] Inlining across functors (was: O'Caml vs C++: a little benchmark)
  2002-08-19 13:02   ` Xavier Leroy
@ 2002-08-19 13:58     ` Thorsten Ohl
  2002-08-19 21:16       ` malc
  2002-08-19 14:39     ` [Caml-list] O'Caml vs C++: a little benchmark Oleg
  2002-08-19 15:15     ` William Chesters
  2 siblings, 1 reply; 23+ messages in thread
From: Thorsten Ohl @ 2002-08-19 13:58 UTC (permalink / raw)
  To: caml-list

Xavier Leroy <xavier.leroy@inria.fr> writes:

> Just for the record: ocamlopt does perform inlining across
> compilation units (via the information stored in .cmx files).  What
> it doesn't do, however, is inlining and specialization of recursive
> function definitions.

However, it appears that it doesn't inline across functors.  For
example, in

  module type M = 
    sig
      type t
      val op : t -> t -> t
      val of_int : int -> t
      val to_int : t -> int
    end
  
  module M : M =
    struct
      type t = int
      let op = ( + )
      let of_int n = n
      let to_int n = n
    end
  
  module F (A : M) =
    struct
      let f a b = A.to_int (A.op  (A.of_int a) (A.of_int b))
    end
  
  module M1 =
    struct
      let f1 a b = M.to_int (M.op  (M.of_int a) (M.of_int b))
    end
      
  module M2 =
    struct
      module FM = F(M)
      let f2 a b = FM.f a b
    end

all functions used in M1.f1 are expanded inline

  *** Linearized code
  Opt_f1_66:
    n/10[%eax] := a/8[%eax] + b/9[%ebx] + -1
    return R/0[%eax]

while M2.f2 retains a lot of auxiliary code:
  
  *** Linearized code
  Opt_f2_72:
    spilled-a/30[s0] := a/8[%eax] (spill)
    b/9[%eax] := R/1[%ebx]
    A/11[%ebx] := [env/10[%ecx] + 12]
    env/12[%ecx] := [A/11[%ebx]]
    spilled-env/29[s1] := env/12[%ecx] (spill)
    A/13[%ebx] := [env/12[%ecx] + 12]
    fun/14[%ebx] := [A/13[%ebx] + 8]
    spilled-fun/27[s3] := fun/14[%ebx] (spill)
    A/15[%ebx] := [env/12[%ecx] + 12]
    fun/16[%ebx] := [A/15[%ebx] + 4]
    A/17[%ecx] := [fun/16[%ebx]]
    {spilled-fun/27[s3]* spilled-env/29[s1]* spilled-a/30[s0]*}
    R/0[%eax] := call A/17[%ecx]
    R/0[%eax]
    R/1[%ebx]
    A/28[s2] := A/18[%eax] (spill)
    env/31[%eax] := spilled-env/29[s1] (reload)
    A/19[%eax] := [env/31[%eax] + 12]
    fun/20[%ebx] := [A/19[%eax] + 4]
    A/21[%ecx] := [fun/20[%ebx]]
    a/32[%eax] := spilled-a/30[s0] (reload)
    {spilled-fun/27[s3]* A/28[s2]* spilled-env/29[s1]*}
    R/0[%eax] := call A/21[%ecx]
    R/0[%eax]
    R/1[%ebx]
    env/33[%ebx] := spilled-env/29[s1] (reload)
    A/23[%ebx] := [env/33[%ebx] + 12]
    A/24[%ecx] := [A/23[%ebx]]
    A/34[%ebx] := A/28[s2] (reload)
    {spilled-fun/27[s3]*}
    R/0[%eax] := call "caml_apply2" R/0[%eax]
    R/1[%ebx]
    R/2[%ecx]
    fun/35[%ebx] := spilled-fun/27[s3] (reload)
    A/26[%ecx] := [fun/35[%ebx]]
    tailcall A/26[%ecx]
    R/0[%eax]
    R/1[%ebx]

Is this because the signature M can make no guarantee that op is never
a recursive function? Do all functor applications fall under the `no
inlining and specialization of recursive function definitions' clause?

Cheers,
-Thorsten
-- 
Thorsten Ohl, Physics Dept., Wuerzburg Univ. -- ohl@physik.uni-wuerzburg.de
http://theorie.physik.uni-wuerzburg.de/~ohl/     [<=== PGP public key here]
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-19 13:02   ` Xavier Leroy
  2002-08-19 13:58     ` [Caml-list] Inlining across functors (was: O'Caml vs C++: a little benchmark) Thorsten Ohl
@ 2002-08-19 14:39     ` Oleg
  2002-08-19 15:15     ` William Chesters
  2 siblings, 0 replies; 23+ messages in thread
From: Oleg @ 2002-08-19 14:39 UTC (permalink / raw)
  To: Xavier Leroy, caml-list

On Monday 19 August 2002 09:02 am, Xavier Leroy wrote:
> * http://www.bagley.org/~doug/shootout/ is a (rare) example of a
>   non-stupid inter-language performance comparison.

What?! The cluelessness of the "shootout" is beyond belief. Let me quote e.g. 
http://www.bagley.org/~doug/shootout/bench/lists/ 

<quote>
Please Note: this test is due for an overhaul. I would like to have 2 lists 
tests, one that tests single-linked list functions, and one that tests 
double-linked list (deque) functions. 
</quote>

Are there people here who do not know that deque == double-ended queue <> 
"double-linked list" ?

If you were to look at the O'Caml and C++ programs for the "list" comparison 
in the shootout, you'd see that the C++ version uses bona fide doubly-linked 
lists (the ones that allow fast inserstion in the middle), while the O'Caml 
version uses pre-allocated Array.t in this "list" comparison.

Oleg
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-19 13:02   ` Xavier Leroy
  2002-08-19 13:58     ` [Caml-list] Inlining across functors (was: O'Caml vs C++: a little benchmark) Thorsten Ohl
  2002-08-19 14:39     ` [Caml-list] O'Caml vs C++: a little benchmark Oleg
@ 2002-08-19 15:15     ` William Chesters
  2 siblings, 0 replies; 23+ messages in thread
From: William Chesters @ 2002-08-19 15:15 UTC (permalink / raw)
  To: caml-list

Xavier Leroy writes:
 > > the compiler doesn't inline Array.fold_left, and hence not your
 > > anonymous "fun" either, because it doesn't inline anything between
 > > compilation units.
 > 
 > Just for the record: ocamlopt does perform inlining across compilation
 > units (via the information stored in .cmx files).  What it doesn't do,
 > however, is inlining and specialization of recursive function definitions.  

Thanks for correcting that misapprehension, and apologies for
propagating it.  (I did do some tests, a long time ago---did I just
get it wrong, or is this a relatively recent feature??)
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] Inlining across functors (was: O'Caml vs C++: a little benchmark)
  2002-08-19 13:58     ` [Caml-list] Inlining across functors (was: O'Caml vs C++: a little benchmark) Thorsten Ohl
@ 2002-08-19 21:16       ` malc
  2002-08-19 22:06         ` [Caml-list] Specialization (was: Inlining across functors) Thorsten Ohl
  2002-08-20  6:25         ` [Caml-list] Inlining across functors (was: O'Caml vs C++: a little benchmark) malc
  0 siblings, 2 replies; 23+ messages in thread
From: malc @ 2002-08-19 21:16 UTC (permalink / raw)
  To: Thorsten Ohl; +Cc: caml-list

On Mon, 19 Aug 2002, Thorsten Ohl wrote:

> Xavier Leroy <xavier.leroy@inria.fr> writes:
> 
> > Just for the record: ocamlopt does perform inlining across
> > compilation units (via the information stored in .cmx files).  What
> > it doesn't do, however, is inlining and specialization of recursive
> > function definitions.
> 
> However, it appears that it doesn't inline across functors.  For
> example, in
snip
> 
> Is this because the signature M can make no guarantee that op is never
> a recursive function? Do all functor applications fall under the `no
> inlining and specialization of recursive function definitions' clause?

Yes. With http://algol.prosalg.no/~malc/code/patches/specfun.tar.gz 
(patch against 3.04) you will get this instead:

*** Linearized code                                                             
Opt_f2_72:                                                                      
  A/11[%ecx] := [env/10[%ecx] + 12]                                             
  A/12[%ecx] := [A/11[%ecx]]                                                    
  tailcall "Opt_f_62" R/0[%eax]                                                 
  R/1[%ebx]                                                                     
  R/2[%ecx]   

-- 
mailto:malc@pulsesoft.com

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Caml-list] Specialization (was: Inlining across functors)
  2002-08-19 21:16       ` malc
@ 2002-08-19 22:06         ` Thorsten Ohl
  2002-08-20  6:35           ` [Caml-list] " malc
  2002-08-20  6:25         ` [Caml-list] Inlining across functors (was: O'Caml vs C++: a little benchmark) malc
  1 sibling, 1 reply; 23+ messages in thread
From: Thorsten Ohl @ 2002-08-19 22:06 UTC (permalink / raw)
  To: caml-list; +Cc: malc

malc  <malc@pulsesoft.com> writes:

> With http://algol.prosalg.no/~malc/code/patches/specfun.tar.gz 
> (patch against 3.04) you will get this instead:
> 
> *** Linearized code                                                             
> Opt_f2_72:                                                                      
>   A/11[%ecx] := [env/10[%ecx] + 12]                                             
>   A/12[%ecx] := [A/11[%ecx]]                                                    
>   tailcall "Opt_f_62" R/0[%eax]                                                 
>   R/1[%ebx]                                                                     
>   R/2[%ecx]   

Neat!

> What will be specialized: frist order non-curried functors

Unfortunately, the cases where my code would benefit most are all
curried and or higher-order functors.  E.g., I have beauties like

    module Tagged (Tagger : Tagger) (PT : Tuple.Poly)
	(Stat : Stat_Maker) (T : Topology.T with type 'a children = 'a PT.t)
	(P : Momentum.T) (M : Model.T) =
      struct 
	...
      end

where the signature Momentum.T can be implemented by simple bitmask
operations and since it is used _very_ often, specialization would
help a great deal.  The situation for symoblic algebra is similar.

I guess that the major obstacle for generalizing your approach to
specialization is in preventing code bloat.  Or am I wrong?

Fine-grained control for specialization (like your syntax extension)
at the point of functor application would be very useful.  The above
code could probably gain a constant factor larger than 10, if I could
specialize curried and higher-order functors. [At crunch time, I can
do this by hand of course, but--also for educational reasons--it would
be nice to let the compiler take care of this.]

Cheers,
-Thorsten
-- 
Thorsten Ohl, Physics Dept., Wuerzburg Univ. -- ohl@physik.uni-wuerzburg.de
http://theorie.physik.uni-wuerzburg.de/~ohl/     [<=== PGP public key here]
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] Inlining across functors (was: O'Caml vs C++: a little benchmark)
  2002-08-19 21:16       ` malc
  2002-08-19 22:06         ` [Caml-list] Specialization (was: Inlining across functors) Thorsten Ohl
@ 2002-08-20  6:25         ` malc
  1 sibling, 0 replies; 23+ messages in thread
From: malc @ 2002-08-20  6:25 UTC (permalink / raw)
  To: Thorsten Ohl; +Cc: caml-list

On Tue, 20 Aug 2002, malc wrote:

> On Mon, 19 Aug 2002, Thorsten Ohl wrote:
> 
> > Xavier Leroy <xavier.leroy@inria.fr> writes:
> > 
> > > Just for the record: ocamlopt does perform inlining across
> > > compilation units (via the information stored in .cmx files).  What
> > > it doesn't do, however, is inlining and specialization of recursive
> > > function definitions.
> > 
> > However, it appears that it doesn't inline across functors.  For
> > example, in
> snip
> > 
> > Is this because the signature M can make no guarantee that op is never
> > a recursive function? Do all functor applications fall under the `no
> > inlining and specialization of recursive function definitions' clause?
> 
> Yes. With http://algol.prosalg.no/~malc/code/patches/specfun.tar.gz 
> (patch against 3.04) you will get this instead:
> 
> *** Linearized code                                                             
> Opt_f2_72:                                                                      
>   A/11[%ecx] := [env/10[%ecx] + 12]                                             
>   A/12[%ecx] := [A/11[%ecx]]                                                    
>   tailcall "Opt_f_62" R/0[%eax]                                                 
>   R/1[%ebx]                                                                     
>   R/2[%ecx]   
Whoops.. Again a reminder to think before posting.. The real linearized 
code with specfun would be:

*** Linearized code                                                             
Opt_f_76:                                                                       
  n/10[%eax] := a/8[%eax] + b/9[%ebx] + -1                                      
  return R/0[%eax]                                                              
                                                                                
*** Linearized code                                                             
Opt_f2_84:                                                                      
  n/10[%eax] := a/8[%eax] + b/9[%ebx] + -1                                      
  return R/0[%eax]

(Forgot to specify -specialize on command line.. doh)

-- 
mailto:malc@pulsesoft.com

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Caml-list] Re: Specialization (was: Inlining across functors)
  2002-08-19 22:06         ` [Caml-list] Specialization (was: Inlining across functors) Thorsten Ohl
@ 2002-08-20  6:35           ` malc
  0 siblings, 0 replies; 23+ messages in thread
From: malc @ 2002-08-20  6:35 UTC (permalink / raw)
  To: Thorsten Ohl; +Cc: caml-list

On Tue, 20 Aug 2002, Thorsten Ohl wrote:

> malc  <malc@pulsesoft.com> writes:
> 
> > With http://algol.prosalg.no/~malc/code/patches/specfun.tar.gz 
> > (patch against 3.04) you will get this instead:
> > 
> > *** Linearized code                                                             
> > Opt_f2_72:                                                                      
> >   A/11[%ecx] := [env/10[%ecx] + 12]                                             
> >   A/12[%ecx] := [A/11[%ecx]]                                                    
> >   tailcall "Opt_f_62" R/0[%eax]                                                 
> >   R/1[%ebx]                                                                     
> >   R/2[%ecx]   
> 
> Neat!
> 
> > What will be specialized: frist order non-curried functors
> 
> Unfortunately, the cases where my code would benefit most are all
> curried and or higher-order functors.  E.g., I have beauties like
> 
>     module Tagged (Tagger : Tagger) (PT : Tuple.Poly)
> 	(Stat : Stat_Maker) (T : Topology.T with type 'a children = 'a PT.t)
> 	(P : Momentum.T) (M : Model.T) =
>       struct 
> 	...
>       end
> 
> where the signature Momentum.T can be implemented by simple bitmask
> operations and since it is used _very_ often, specialization would
> help a great deal.  The situation for symoblic algebra is similar.

If i remember correctly curried functors werent implemented because they 
cant be mapped neatly into current code, with hacks here and there it was 
possible, but i wanted to create half-decent and semi-readable code. Maybe 
there are other obstacles as well, it's been a while.

> 
> I guess that the major obstacle for generalizing your approach to
> specialization is in preventing code bloat.  Or am I wrong?

More than code bloat(which is tunable in case of my patch) lack of 
interest on part of Inria team has much bigger impact.

> 
> Fine-grained control for specialization (like your syntax extension)
> at the point of functor application would be very useful.  The above
> code could probably gain a constant factor larger than 10, if I could
> specialize curried and higher-order functors. [At crunch time, I can
> do this by hand of course, but--also for educational reasons--it would
> be nice to let the compiler take care of this.]

There's one thing i should mention, im not a type teorist, lambda calculus 
expert and so on. Specialization like it is implemented(simple rewriting 
of functor argument in functor body with some care to preserve typing) is
what SML/NJ does as well. Then again, types that specfun produces are a 
bit off w.r.t vanilla Caml (they used to be 1:1 but, then one bump was 
noticed and they stoped to be). I belive that if Jacques or Xavier or 
anyone else on the team was interested in this feature it could have been 
added in a matter of days, alas, they dont, and there is little that can 
be done.

-- 
mailto:malc@pulsesoft.com

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-18 17:17 [Caml-list] O'Caml vs C++: a little benchmark Oleg
                   ` (3 preceding siblings ...)
  2002-08-19 13:22 ` malc
@ 2002-08-23 21:05 ` John Max Skaller
  2002-08-23 21:35   ` Oleg
  4 siblings, 1 reply; 23+ messages in thread
From: John Max Skaller @ 2002-08-23 21:05 UTC (permalink / raw)
  To: Oleg; +Cc: caml-list

Oleg wrote:

> I'm curious as to where these huge differences for these small programs come 
> from.


Perhaps .. here


	single_pass xb (sum +. float x)


you have a float conversion -- creates a new heap element.
Try changing the tests to do integer operations,
they should be comparable then.


-- 
John Max Skaller, mailto:skaller@ozemail.com.au
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.
voice:61-2-9660-0850


-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-23 21:05 ` John Max Skaller
@ 2002-08-23 21:35   ` Oleg
  2002-08-28 13:47     ` John Max Skaller
  0 siblings, 1 reply; 23+ messages in thread
From: Oleg @ 2002-08-23 21:35 UTC (permalink / raw)
  To: John Max Skaller; +Cc: caml-list

On Friday 23 August 2002 05:05 pm, John Max Skaller wrote:
> Oleg wrote:
> > I'm curious as to where these huge differences for these small programs
> > come from.
>
> Perhaps .. here
>
>
> 	single_pass xb (sum +. float x)
>
>
> you have a float conversion -- creates a new heap element.
> Try changing the tests to do integer operations,
> they should be comparable then.

The C++ version contains the same conversion (with the exception that int is 
native, but that's the price O'Caml is willing to pay, right?)  And why heap? 

BTW does O'Caml inline tail-recursive functions?

Cheers
Oleg
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-23 21:35   ` Oleg
@ 2002-08-28 13:47     ` John Max Skaller
  2002-08-28 14:34       ` Alain Frisch
  2002-08-28 17:23       ` inlining tail-recursive functions (Re: [Caml-list] O'Caml vs C++: a little benchmark) Oleg
  0 siblings, 2 replies; 23+ messages in thread
From: John Max Skaller @ 2002-08-28 13:47 UTC (permalink / raw)
  To: Oleg; +Cc: caml-list

Oleg wrote:


>>
>>
>>	single_pass xb (sum +. float x)
>>
>>
>>you have a float conversion -- creates a new heap element.
>>Try changing the tests to do integer operations,
>>they should be comparable then.
>>
> 
> The C++ version contains the same conversion (with the exception that int is 
> native, but that's the price O'Caml is willing to pay, right?)  And why heap? 


Because Ocaml only has two (runtime) data types:
integer and pointer to  heap object (boxed object).
Heap allocations inside a loop are bound to be slower
than the same loop without them. So if you try an integer
above, you can see the price of boxing.


> BTW does O'Caml inline tail-recursive functions?


Do you mean loop unrolling? I hear that it doesn't
do loop unrolling. [There's nothing to gain from
a simple inlining, unless the loop is only executed
once or twice - you'd only save a single function call]


-- 
John Max Skaller, mailto:skaller@ozemail.com.au
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.
voice:61-2-9660-0850


-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Caml-list] O'Caml vs C++: a little benchmark
  2002-08-28 13:47     ` John Max Skaller
@ 2002-08-28 14:34       ` Alain Frisch
  2002-08-28 17:23       ` inlining tail-recursive functions (Re: [Caml-list] O'Caml vs C++: a little benchmark) Oleg
  1 sibling, 0 replies; 23+ messages in thread
From: Alain Frisch @ 2002-08-28 14:34 UTC (permalink / raw)
  To: John Max Skaller; +Cc: Caml list

On Wed, 28 Aug 2002, John Max Skaller wrote:

> [There's nothing to gain from
> a simple inlining, unless the loop is only executed
> once or twice - you'd only save a single function call]

Why ?  You can unroll the call to the function in its own body, up to a
fixed depth n and divide the number of function calls by n (and allow
subsequent optimizations, such as boxing/unboxing removal).

-- Alain

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* inlining tail-recursive functions (Re: [Caml-list] O'Caml vs C++: a little benchmark)
  2002-08-28 13:47     ` John Max Skaller
  2002-08-28 14:34       ` Alain Frisch
@ 2002-08-28 17:23       ` Oleg
  2002-08-31  1:13         ` John Max Skaller
  1 sibling, 1 reply; 23+ messages in thread
From: Oleg @ 2002-08-28 17:23 UTC (permalink / raw)
  To: John Max Skaller; +Cc: caml-list

On Wednesday 28 August 2002 09:47 am, John Max Skaller wrote:
> > BTW does O'Caml inline tail-recursive functions?
>
> Do you mean loop unrolling? I hear that it doesn't
> do loop unrolling. [There's nothing to gain from
> a simple inlining, unless the loop is only executed
> once or twice - you'd only save a single function call]

It's been mentioned that O'Caml can inline functions that are not recursive 
(including inlining across module boundaries). Tail-recursive functions can 
be, basically, transformed into non-recursive functions by the compiler. So I 
was wondering if O'Caml inlined them. The benefits of inlining tail-recursive 
functions should thus be the same as the benefits of inlining non-recursive 
functions.

Cheers
Oleg
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: inlining tail-recursive functions (Re: [Caml-list] O'Caml vs C++: a little benchmark)
  2002-08-28 17:23       ` inlining tail-recursive functions (Re: [Caml-list] O'Caml vs C++: a little benchmark) Oleg
@ 2002-08-31  1:13         ` John Max Skaller
  0 siblings, 0 replies; 23+ messages in thread
From: John Max Skaller @ 2002-08-31  1:13 UTC (permalink / raw)
  To: Oleg; +Cc: caml-list

Oleg wrote:

> On Wednesday 28 August 2002 09:47 am, John Max Skaller wrote:
> 
>>> BTW does O'Caml inline tail-recursive functions?
>>>
>>Do you mean loop unrolling? I hear that it doesn't
>>do loop unrolling. [There's nothing to gain from
>>a simple inlining, unless the loop is only executed
>>once or twice - you'd only save a single function call]
>>
> 
> It's been mentioned that O'Caml can inline functions that are not recursive 
> (including inlining across module boundaries). Tail-recursive functions can 
> be, basically, transformed into non-recursive functions by the compiler. So I 
> was wondering if O'Caml inlined them. The benefits of inlining tail-recursive 
> functions should thus be the same as the benefits of inlining non-recursive 
> functions.

I don't think so. For a tail rec function transformed into
a loop, there is little difference between inlining
the loop, or just doing a standard call to a
function containing the loop .. assuming functional
code, and assuming the loop is executed many times.

If the loop is very short, there may be some advantage
in unrolling it.

The difference would only be a single function call,

which has negligible cost compared to a large number
of loop iterations.

There *might* be a small architectural advantage
inlining the loop, if that tends to localise
all the variables (more efficient caching of
memory blocks). This happens to be the case
in my Felix compiler, where a reference to a variable
from another function costs an extra indirection,
and probably requires two pages of virtual
memory instead of the one page and no indirection
required if the loop were inlined.

But Ocaml doesn't work the way Felix does, so I doubt
there would be any advantage (all values are separately
heap allocated anyhow, which is not true in Felix,
which uses 'frames' to reduce heap allocations]

-- 
John Max Skaller, mailto:skaller@ozemail.com.au
snail:10/1 Toxteth Rd, Glebe, NSW 2037, Australia.
voice:61-2-9660-0850

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2002-08-31  1:13 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-08-18 17:17 [Caml-list] O'Caml vs C++: a little benchmark Oleg
2002-08-18 18:00 ` William Chesters
2002-08-18 19:06   ` Oleg
2002-08-18 21:37     ` William Chesters
2002-08-19 13:02   ` Xavier Leroy
2002-08-19 13:58     ` [Caml-list] Inlining across functors (was: O'Caml vs C++: a little benchmark) Thorsten Ohl
2002-08-19 21:16       ` malc
2002-08-19 22:06         ` [Caml-list] Specialization (was: Inlining across functors) Thorsten Ohl
2002-08-20  6:35           ` [Caml-list] " malc
2002-08-20  6:25         ` [Caml-list] Inlining across functors (was: O'Caml vs C++: a little benchmark) malc
2002-08-19 14:39     ` [Caml-list] O'Caml vs C++: a little benchmark Oleg
2002-08-19 15:15     ` William Chesters
2002-08-18 19:16 ` Markus Mottl
2002-08-18 19:58   ` Oleg
2002-08-18 22:59     ` Markus Mottl
2002-08-19 13:12 ` malc
2002-08-19 13:22 ` malc
2002-08-23 21:05 ` John Max Skaller
2002-08-23 21:35   ` Oleg
2002-08-28 13:47     ` John Max Skaller
2002-08-28 14:34       ` Alain Frisch
2002-08-28 17:23       ` inlining tail-recursive functions (Re: [Caml-list] O'Caml vs C++: a little benchmark) Oleg
2002-08-31  1:13         ` John Max Skaller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).