caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Erik de Castro Lopo <ocaml-erikd@mega-nerd.com>
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Need for a built in round_to_int function
Date: Tue, 22 Feb 2005 18:23:07 +1100	[thread overview]
Message-ID: <20050222182307.39ae9854.ocaml-erikd@mega-nerd.com> (raw)
In-Reply-To: <20050221160023.GA4759@yquem.inria.fr>

On Mon, 21 Feb 2005 17:00:23 +0100
Xavier Leroy <Xavier.Leroy@inria.fr> wrote:

> On the other hand, according to the P4 optimization manuals, the P4 is
> supposed to special-case this particular use of fnstcw / fldcw, so
> perhaps the situation is no worse than on the P3.  

OK, I've just tested this. On P4 the performance hit of fnstcw / fldcw 
is not as bad as it is with P3, but its still significant:

Using this test program (compiled with my hacked version of ocamlopt):

     http://www.mega-nerd.com/tmp/round_to_int.ml

On a 450MHz P3:

    Time int_of_float : 5.970000
    Time round_to_int : 2.360000

On a 2.8GHz P4:

    Time int_of_float : 0.420000
    Time round_to_int : 0.260000

> Essentially zero :-(  Basically, this is a case where additional stuff is
> introduced in the machine-independent parts of ocamlopt and in every
> code generator just to work around the brain-dead x87 floating-point
> instruction set.

Obviously it is your decision, but I think round_to_int is a common 
enough operation to warrant its own function. The ISO C Standards
committee thought so.

>  Every other processor (as well as the SSE2 instr.set

Quite honestly I think the value of SSE and SSE2 are over sold.
There are certain algorthims which simply can't be made to run
as fast on SSE/SSE2 as they run on the x87 FPU.

For instance, my audio sample rate converter:

    http://www.mega-nerd.com/SRC/

If I compile this on a P3 with gcc-3.4 using -mfpmath=sse -msse,
the highest quality (and hence slowest) converter runs 50% slower 
than the x87 FPU version. I have also tried re-writing the algorithm 
in hand optimised SSE code. The best I could get (I'm not an assembler 
expert) was still 10% slower than the x87 FPU.

I have just now repeated my experiment by compiling SRC on a P4 
with -msse2 (-mfpmath=sse2 doesn't work), the converter runs 75% 
slower than the x87 FPU version.

> I spent a lot of time in the past trying to extract decent float
> performance out of the x87 instruction set,

<snip>

> Nowadays, I no longer care about
> performance for x87: users who want good float performance should
> simply use the x86-64 architecture (with SSE2 floats), 

I'd love to get my hands on one of these, but I really do doubt
that its performance will be much better than that of the P4. 
The main problem is that generating good SSE/SSE2 code from
a high level language is an order of magnitude more difficult
than generating code for the x87 FPU.

Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo  nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
"Projects promoting programming in natural language are intrinsically
doomed to fail." -- Edsger Dijkstra


  reply	other threads:[~2005-02-22  7:23 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-02-20 20:22 Erik de Castro Lopo
2005-02-21  0:00 ` [Caml-list] " Robert Roessler
2005-02-21  1:51   ` Erik de Castro Lopo
2005-02-21  4:34     ` Robert Roessler
2005-02-21  6:50       ` Erik de Castro Lopo
2005-02-21 11:54 ` Erik de Castro Lopo
2005-02-21 16:00   ` Xavier Leroy
2005-02-22  7:23     ` Erik de Castro Lopo [this message]
2005-02-21 16:01   ` Hendrik Tews

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20050222182307.39ae9854.ocaml-erikd@mega-nerd.com \
    --to=ocaml-erikd@mega-nerd.com \
    --cc=caml-list@yquem.inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).