Dear all,

Please find attached two simple Runge-Kutta iteration schemes. One is
written in C, the other in OCaml. I compared the runtime of both, and
gcc (-O2) produces an executable that is roughly 30% faster (to be more
precise: 3.52s for OCaml vs. 2.63s for C). That is in itself quite
pleasing, I think. I do not understand, however, what causes this
difference. Admittedly, the
generated assembly looks completely different, but both compilers inline
all functions and generate one big loop. OCaml generates a lot more
scaffolding, but that is to be expected.

There is, however, an interesting peculiarity: OCaml generates 6 calls
to cos, while gcc only needs 3 (and one direct jump). Surprisingly,
there are also calls to cosh, acos and pretty much any other
trigonometric function (initialization of constants, maybe?).

However, the true culprit seems to be an excess of instructions between
the different calls to cos. This is what happens between the first two
calls to cos:

gcc:
jmpq   400530 <cos@plt>
nop
nopw   %cs:0x0(%rax,%rax,1)

sub    $0x38,%rsp
movsd  %xmm0,0x10(%rsp)
movapd %xmm1,%xmm0
movsd  %xmm2,0x18(%rsp)
movsd  %xmm1,0x8(%rsp)
callq  400530 <cos@plt>

ocamlopt:

callq  401a60 <cos@plt>
mulsd  (%r12),%xmm0
movsd  %xmm0,0x10(%rsp)
sub    $0x10,%r15
lea    0x25c7b6(%rip),%rax
cmp    (%rax),%r15
jb     404a8a <dlerror@plt+0x2d0a>
lea    0x8(%r15),%rax
movq   $0x4fd,-0x8(%rax)

movsd  0x32319(%rip),%xmm1

movapd %xmm1,%xmm2
mulsd  %xmm0,%xmm2
addsd  0x0(%r13),%xmm2
movsd  %xmm2,(%rax)
movapd %xmm1,%xmm0
mulsd  (%r12),%xmm0
addsd  (%rbx),%xmm0
callq  401a60 <cos@plt>
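
One detail I tried to decode myself: the constant 0x4fd stored by the
movq looks like a block header. The small C sketch below splits it into
fields, assuming the usual 64-bit OCaml header layout (size in words in
the upper bits, two color bits, eight tag bits) and assuming that tag
253 is the runtime's Double_tag. The type and function names are my own
and not part of any OCaml API.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an OCaml block header, under the assumed 64-bit layout:
   bits 10 and up = size in words, bits 8-9 = GC color, bits 0-7 = tag. */
typedef struct {
    uint64_t wosize; /* block size in words, excluding the header */
    unsigned color;  /* GC color bits */
    unsigned tag;    /* block tag */
} header_fields;

/* Assumption: 253 is Double_tag, the tag of a boxed float. */
enum { OCAML_DOUBLE_TAG = 253 };

static header_fields decode_header(uint64_t hd) {
    header_fields f;
    f.wosize = hd >> 10;
    f.color = (unsigned)((hd >> 8) & 0x3);
    f.tag = (unsigned)(hd & 0xff);
    return f;
}

/* True if the header describes a one-word block tagged as a double. */
static int is_boxed_float_header(uint64_t hd) {
    header_fields f = decode_header(hd);
    return f.wosize == 1 && f.tag == OCAML_DOUBLE_TAG;
}

/* Sanity check against the constant from the listing. */
static void check_listing_constant(void) {
    header_fields f = decode_header(0x4fd);
    assert(f.wosize == 1 && f.color == 0 && f.tag == OCAML_DOUBLE_TAG);
}
```

If that reading is right, the sub/cmp/jb sequence on %r15 before it
would be a heap allocation for a one-word float block between the two
calls to cos, but I may well be misreading the listing.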


Is this caused by some underlying difference in the representation of
numeric values (e.g. tagged ints), or is it reasonable to attack this
issue as a hobby experiment?


Thanks for any advice,

Christoph
-- 
Christoph Höger

Technische Universität Berlin
Fakultät IV - Elektrotechnik und Informatik
Übersetzerbau und Programmiersprachen

Sekr. TEL12-2, Ernst-Reuter-Platz 7, 10587 Berlin

Tel.: +49 (30) 314-24890
E-Mail: christoph.hoeger@tu-berlin.de