[Caml-list] How hard would more inlining, more unboxed floats be?


From: William Chesters <williamc@paneris.org>
To: caml-list@inria.fr
Subject: [Caml-list] How hard would more inlining, more unboxed floats be?
Date: Sun, 13 May 2001 23:29:24 +0200 (CEST)
Message-ID: <15102.64692.46611.769408@beertje.william.bogus>

ocaml is nearly a marvellous tool for abstract numerical programming.
By that I mean the ability to write Matlab or Fortran 90-style
expressions, and express the natural abstractness of algorithms using
functors.
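As a rough illustration of what "abstractness using functors" can mean here, the sketch below writes a dot product once against a small scalar signature and then instantiates it for floats. The names SCALAR, Dot, FloatScalar and FloatDot are made up for this example and are not from any library.

```ocaml
(* Hypothetical signature: the scalar operations the algorithm needs *)
module type SCALAR = sig
  type t
  val zero : t
  val add : t -> t -> t
  val mul : t -> t -> t
end

(* Dot product written once, over any scalar type *)
module Dot (S : SCALAR) = struct
  let dot (x : S.t array) (y : S.t array) =
    let n = min (Array.length x) (Array.length y) in
    let acc = ref S.zero in
    for i = 0 to n - 1 do
      acc := S.add !acc (S.mul x.(i) y.(i))
    done;
    !acc
end

(* Instantiation for ordinary floats *)
module FloatScalar = struct
  type t = float
  let zero = 0.
  let add = ( +. )
  let mul = ( *. )
end

module FloatDot = Dot (FloatScalar)

let () =
  Printf.printf "%g\n" (FloatDot.dot [| 1.; 2.; 3. |] [| 4.; 5.; 6. |])
```

Whether the calls to S.add and S.mul get inlined down to plain float instructions depends on the compiler and its settings, which is exactly the question this message is about.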

   In C++ these highly desirable goals _can_ be achieved with
intricate template techniques (Blitz++, PETE/POOMA, MTL, ...), but
they are only marginally feasible, and in practice it's questionable
whether they save more in expressiveness than they cause in trouble.

   Now, ocaml is very good---within a compilation unit---at
inlining, partial evaluation, and elimination of temporary objects,
which are the essential optimisations required.

   And the backend code generator can do amazing things with a bit of
help.  For example, I tried the following code for a dot product:

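    (* All-float record: the field is stored unboxed and updated in place *)
    type floatref = { mutable it: float }

    (* Dot product of x.(x0 .. x1-1) with y.(y0 ..) *)
    let dot x x0 x1 y y0 =
      (* j indexes y, starting at y0 *)
      let j = ref y0 and acc = { it = 0. } in
      for i = x0 to x1 - 1 do
        acc.it <- acc.it +. Array.unsafe_get x i *. Array.unsafe_get y !j;
        incr j
      done;
      acc.it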

Note the use of a specialised all-"float" record (record_representation =
Record_float).  Otherwise it's impossible to avoid allocating a boxed
float in the inner loop, whether the loop is expressed imperatively or
tail-recursively.  On a Pentium, it compiles to

    .L101:
            fldl    -4(%ebp, %ebx, 4)
            fmull   -4(%edx, %esi, 4)
            faddl   (%edi)
            fstpl   (%edi)
            addl    $2, %esi
            addl    $2, %ebx
            cmpl    %ecx, %ebx
            jle     .L101

By contrast, the best gcc can do is this:

    .L6:
            fldl (%esi,%eax,8)
            fmull (%ebx,%edx,8)
            incl %eax
            incl %edx
            faddp %st,%st(1)
            cmpl %ecx,%eax
            jl .L6

which runs maybe 10% faster in my tests.  Actually I think on a Pentium
one can get a few more percent by using direct pointer increments, but
most people don't do that, not least because it's actually slower on
most RISCs.
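For anyone wanting to check the allocation claim made for the all-float record, here is a minimal sketch that counts minor-heap words around a call to the dot product. It assumes a compiler recent enough to provide Gc.minor_words (OCaml 4.04 or later, so not ocaml-3.01). The helper minor_words_used is a name invented for this example, and the dot function is repeated, starting j at y0, so that the block stands alone.

```ocaml
type floatref = { mutable it : float }

let dot x x0 x1 y y0 =
  let j = ref y0 and acc = { it = 0. } in
  for i = x0 to x1 - 1 do
    acc.it <- acc.it +. Array.unsafe_get x i *. Array.unsafe_get y !j;
    incr j
  done;
  acc.it

(* Hypothetical helper: minor-heap words allocated while running f.
   The counter itself is returned as a boxed float, so a few words of
   overhead show up even when f allocates nothing. *)
let minor_words_used f =
  let before = Gc.minor_words () in
  let result = f () in
  let after = Gc.minor_words () in
  (result, after -. before)

let () =
  let n = 1000 in
  let x = Array.make n 1.5 and y = Array.make n 2.0 in
  let (r, words) = minor_words_used (fun () -> dot x 0 n y 0) in
  Printf.printf "dot = %g, minor words allocated = %g\n" r words
```

The number reported depends heavily on the compiler version and on whether the code is built with ocamlopt or ocamlc, so treat it as a way to compare variants on one setup rather than as an absolute figure.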

   (Glasgow Haskell is impressive in many ways, but completely misses
this level of performance---its own code generator is not very
ambitious, and the C code it can alternatively feed to gcc is too
low-level, tries to micro-manage the stack and ends up obscuring
what is really going on, so that gcc ends up using no register
variables at all ...)

   Arguably, all that's standing between ocaml-3.01 and a killer
language for scientific computing is:

     -- inlining and partial evaluation _across_ compilation
        boundaries, and in particular through modules/functors

     -- explicit user control of inlining

     -- facilities for handling unboxed floats directly, and/or
        elision of box/unbox in tail calls, so that recursive
        "loops" don't incur an allocation penalty

How hard would it be to get these things happening?
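To make the third item concrete, this is a sketch of the kind of recursive "loop" meant there: the same dot product with the accumulator passed as a float argument in a tail call. The name dot_rec is made up for this example.

```ocaml
(* Tail-recursive dot product of x.(x0 .. x1-1) with y.(y0 ..).
   The running sum acc is a float argument of the recursive call. *)
let dot_rec x x0 x1 y y0 =
  let rec loop i j acc =
    if i >= x1 then acc
    else
      loop (i + 1) (j + 1)
        (acc +. Array.unsafe_get x i *. Array.unsafe_get y j)
  in
  loop x0 y0 0.
```

If the compiler boxes acc on every call, each iteration allocates; whether it does is a property of the particular compiler version, not of the source.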