Re: [Caml-list] Closing the performance gap to C

caml-list - the Caml user's mailing list

From: "Christoph Höger" <christoph.hoeger@tu-berlin.de>
To: caml-list@inria.fr
Subject: Re: [Caml-list] Closing the performance gap to C
Date: Sat, 17 Dec 2016 14:02:57 +0100
Message-ID: <adf19464-c995-0e02-48e9-100f0efd26b6@tu-berlin.de>
In-Reply-To: <7bc766a2-d460-524b-35ca-89609a34b719@tu-berlin.de>


Oops, I forgot the actual examples.

On 17.12.2016 at 14:01, Christoph Höger wrote:
> Dear all,
> 
> Please find attached two simple Runge-Kutta iteration schemes. One is
> written in C, the other in OCaml. I compared the runtime of both, and gcc
> (-O2) produces an executable that is roughly 30% faster (to be more
> precise: 3.52s for ocamlopt vs. 2.63s for gcc). That is in itself quite
> pleasing, I think. However, I do not understand what causes this
> difference. Admittedly, the generated assembly looks completely different,
> but both compilers inline all functions and generate one big loop. OCaml
> generates a lot more scaffolding, but that is to be expected.
> 
> There is, however, an interesting peculiarity: OCaml generates 6 calls
> to cos, while gcc only needs 3 (and one direct jump). Surprisingly,
> there are also calls to cosh, acos and pretty much every other
> trigonometric function (initialization of constants, maybe?).
> 
> However, the true culprit seems to be an excess of instructions between
> the different calls to cos. This is what happens between the first two
> calls to cos:
> 
> gcc:
> jmpq   400530 <cos@plt>
> nop
> nopw   %cs:0x0(%rax,%rax,1)
> 
> sub    $0x38,%rsp
> movsd  %xmm0,0x10(%rsp)
> movapd %xmm1,%xmm0
> movsd  %xmm2,0x18(%rsp)
> movsd  %xmm1,0x8(%rsp)
> callq  400530 <cos@plt>
> 
> ocamlopt:
> 
> callq  401a60 <cos@plt>
> mulsd  (%r12),%xmm0
> movsd  %xmm0,0x10(%rsp)
> sub    $0x10,%r15
> lea    0x25c7b6(%rip),%rax
> cmp    (%rax),%r15
> jb     404a8a <dlerror@plt+0x2d0a>
> lea    0x8(%r15),%rax
> movq   $0x4fd,-0x8(%rax)
> 
> movsd  0x32319(%rip),%xmm1
> 
> movapd %xmm1,%xmm2
> mulsd  %xmm0,%xmm2
> addsd  0x0(%r13),%xmm2
> movsd  %xmm2,(%rax)
> movapd %xmm1,%xmm0
> mulsd  (%r12),%xmm0
> addsd  (%rbx),%xmm0
> callq  401a60 <cos@plt>
> 
> 
> Is this caused by some underlying difference in the representation of
> numeric values (e.g. tagged ints), or is it reasonable to attack this
> issue as a hobby experiment?
> 
> 
> Thanks for any advice,
> 
> Christoph
> 
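
A note on the ocamlopt listing quoted above: the sequence "sub $0x10,%r15",
"cmp (%rax),%r15", "jb ..." followed by "movq $0x4fd,-0x8(%rax)" looks like an
inline minor-heap allocation of 16 bytes whose header word is 0x4fd. The sketch
below decodes that constant, assuming the usual 64-bit OCaml 4.x header layout
(tag in bits 0-7, GC color in bits 8-9, size in words from bit 10 upwards) and
the runtime's tag value 253 for boxed floats. The function names are made up
for this illustration and are not part of the OCaml runtime API.

```c
#include <stdint.h>

/* Assumed layout of a 64-bit OCaml block header (OCaml 4.x, default
   configuration): bits 0-7 = tag, bits 8-9 = GC color, bits 10+ = size
   of the block in words. These helpers are hypothetical illustrations. */
#define DOUBLE_TAG 253u /* tag used by the runtime for boxed floats */

static uint64_t header_wosize(uint64_t hd) { return hd >> 10; }

static unsigned header_color(uint64_t hd) { return (unsigned)((hd >> 8) & 0x3u); }

static unsigned header_tag(uint64_t hd) { return (unsigned)(hd & 0xffu); }

/* A boxed float is a one-word block carrying the double tag. */
static int header_is_boxed_float(uint64_t hd)
{
    return header_tag(hd) == DOUBLE_TAG && header_wosize(hd) == 1;
}
```

Under these assumptions 0x4fd decodes to a one-word block with tag 253, i.e.
a freshly allocated boxed double, which would suggest that the extra
instructions between the two calls to cos come from boxing a float
intermediate rather than from tagged integers.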


-- 
Christoph Höger

Technische Universität Berlin
Fakultät IV - Elektrotechnik und Informatik
Übersetzerbau und Programmiersprachen

Sekr. TEL12-2, Ernst-Reuter-Platz 7, 10587 Berlin

Tel.: +49 (30) 314-24890
E-Mail: christoph.hoeger@tu-berlin.de

[-- Attachment #1.1.2: rk4.c --]
[-- Type: text/plain, Size: 843 bytes --]

#include <stdio.h>
#include <math.h>

double exact(double t) { return sin(t); }

double dy(double t, double y) { return cos(t); }

double rk4_step(double y, double t, double h) {
    double k1 = h * dy(t, y);
    double k2 = h * dy(t + 0.5 * h, y + 0.5 * k1);
    double k3 = h * dy(t + 0.5 * h, y + 0.5 * k2);
    double k4 = h * dy(t + h, y + k3);
    return y + (k1 + k4) / 6.0 + (k2 + k3) / 3.0;
}

double loop (int steps, double h, int n, double y, double t) {
    if (n < steps)
        return loop(steps, h, n+1, rk4_step(y,t,h), t+h);
    else return y;
}

int main() {
    double h = 0.1;
    double y = loop(102, h, 1, 1.0, 0.0);
    double err = fabs(y - exact(102 * h));
    int large = 10000000;
    double y2 = loop(large, h, 1, 1.0, 0.0);
    printf("%d\n",
           (fabs(y2 - (exact(large * h))) < 2. * err));
    return 0;
}

[-- Attachment #1.1.3: testrk4.ml --]
[-- Type: text/plain, Size: 653 bytes --]

let y' t y = cos t
let exact t = sin t

let rk4_step y t h =
  let k1 = h *. y' t y in
  let k2 = h *. y' (t +. 0.5*.h) (y +. 0.5*.k1) in
  let k3 = h *. y' (t +. 0.5*.h) (y +. 0.5*.k2) in
  let k4 = h *. y' (t +. h) (y +. k3) in
  y +. (k1+.k4)/.6.0 +. (k2+.k3)/.3.0

let rec loop steps h n y t =
  if n < steps then loop steps h (n+1) (rk4_step y t h) (t +. h)
  else y

let _ =
  let h = 0.1 in
  let y = loop 102 h 1 1.0 0.0 in
  let err = abs_float (y -. (exact ((float_of_int 102) *. h))) in
  let large = 10000000 in
  let y = loop large h 1 1.0 0.0 in
  Printf.printf "%b\n"
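    (* function application binds tighter than *., so the product must be
       parenthesized to match exact(large * h) in rk4.c *)
    (abs_float (y -. exact (float_of_int large *. h)) < 2. *. err)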


