caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Jon Harrop <jon@ffconsultancy.com>
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Comparison of OCaml and MLton for numerics
Date: Fri, 1 Jun 2007 18:13:48 +0100	[thread overview]
Message-ID: <200706011813.48515.jon@ffconsultancy.com> (raw)
In-Reply-To: <f8560b80706010914s15fc75acl9290e3d42e259e20@mail.gmail.com>

On Friday 01 June 2007 17:14:36 Markus Mottl wrote:
> Absolutely!  E.g. we had to specialize hash tables for integer and
> string keys, because the generic implementation calls a function for
> each key comparison rather than generating specialized code for e.g.
> integer comparisons.  This has a noticable impact in production
> systems.

Sets are another example. In that case, function call in trivial comparison is 
a side effect of using functors.

> ...
> I'd surely be happy to see the addition of some (optional)
> higher-level code transformations to OCaml.  Not just inlining, maybe
> some partial evaluation of the resulting code, which could also reduce
> code size if the compiler can prove that certain branches will not be
> taken.

The stdlib has a lot of scope for optimization. The Set implementation 
currently does not specialize 1-element nodes. Doing this improves 
performance by ~30%, partly by relieving the GC and partly by avoiding 
branches in common cases. I believe some compilers automate this optimization 
(trimming the bottom layer from trees).

Non tail-recursive functions in the stdlib is another example. You can easily 
get quadratic instead of linear behaviour without realising. Hash tables can 
have big hidden performance costs, especially for soft real-time work.

Actually, while we're here. I've long thought that the stdlib should provide 
abstract implementations of concrete data structures like RB trees and AVL 
trees, and functorize the Set and Map modules over the tree type they use. 
This would let people add new abstract data structures (I like purely 
functional sequences based on AVL trees) built upon solid concrete data 
structures from the stdlib rather than cutting and pasting code (one of the 
more embarassing OCaml FAQs).

Making this feasible by optimizing away the abstractions requires more than 
just defunctorizing though. You need to partially specialize by type, as 
Markus says. You also need to do whole-program transformations to flatten 
data structures. For example, a set would require:

  Node of 'a t * 'a * 'a t * height

and a sequence would require:

  Node of 'a t * 'a * 'a t * height * size

So a generic OCaml solution would add an indirection to the metadata that 
could be flattened out.

Not being hardcore enough to tinker with the OCaml compiler itself, I'd write 
an OCaml program that generated OCaml implementations of data structures with 
the necessary specialization. Indeed, I've already done this for 
low-dimensional vectors and matrices, but doing it for trees would be more 
interesting.

Just out of curiosity, how many of the optimizations being discussed can be 
done with camlp4?

Anyway, we should try to build a coherent list of optimizations we'd like and 
then try to prioritize them.

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
OCaml for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists/?e


  parent reply	other threads:[~2007-06-01 17:19 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-31  5:50 Yuanchen Zhu
2007-05-31  6:17 ` [Caml-list] " Jon Harrop
2007-05-31  6:32   ` skaller
2007-05-31  7:31   ` Yuanchen Zhu
2007-05-31  9:08     ` Jon Harrop
2007-05-31  9:22       ` Yuanchen Zhu
2007-05-31 10:27         ` Jon Harrop
2007-05-31 21:30           ` Alain Frisch
2007-06-01  1:22             ` skaller
2007-06-01  1:36               ` Erik de Castro Lopo
2007-06-01  2:21                 ` skaller
2007-06-01  2:49                   ` Erick Tryzelaar
2007-06-01  3:05                     ` skaller
2007-06-01  5:30               ` Alain Frisch
2007-06-01  5:39                 ` Jon Harrop
2007-06-01  6:36                   ` Yuanchen Zhu
2007-06-01  8:09                 ` skaller
2007-06-01  8:53                   ` Richard Jones
2007-06-01  8:59                     ` Richard Jones
2007-06-01  9:22                       ` Stephan Tolksdorf
2007-06-01  9:49                         ` Richard Jones
2007-06-01  9:32                       ` Stephan Tolksdorf
2007-06-01 10:02                     ` skaller
2007-06-01 11:29                 ` Yaron Minsky
2007-06-01 11:43                   ` Erik de Castro Lopo
2007-06-01 11:58                     ` Jon Harrop
2007-06-01 13:49                       ` Julien Signoles
2007-06-01 14:18                         ` Stephen Weeks
2007-06-01 14:43                           ` Julien Signoles
2007-06-01 14:57                           ` Brian Hurt
2007-06-01 15:40                             ` Alain Frisch
2007-06-01 15:58                               ` Brian Hurt
2007-06-01 16:25                                 ` Markus Mottl
2007-06-01 16:47                               ` Jon Harrop
2007-06-01 23:26                             ` skaller
2007-06-01 23:49                               ` Brian Hurt
2007-06-02  3:26                                 ` skaller
2007-06-01 12:40                     ` Erik de Castro Lopo
2007-06-01 13:56                       ` Julien Signoles
2007-06-01 11:49                   ` David MENTRE
2007-06-01 14:41                     ` skaller
2007-06-01 16:52                       ` Jon Harrop
2007-06-01 23:33                         ` skaller
2007-06-01 16:14                     ` Markus Mottl
2007-06-01 16:46                       ` Jon Harrop
2007-06-01 17:13                       ` Jon Harrop [this message]
2007-06-04 14:03                         ` Mike Furr
2007-06-04 14:39                           ` Jon Harrop
2007-06-04 15:33                             ` Mike Furr
2007-06-04 18:08                             ` skaller
     [not found]                               ` <9d3ec8300706041518y115d22bdsa120d4010261d841@mail.gmail.com>
2007-06-04 22:19                                 ` Fwd: " Till Varoquaux
2007-06-04 23:40                                   ` Jon Harrop
2007-06-05  2:24                                   ` skaller
2007-06-04 22:44                               ` Pierre Etchemaïté
2007-06-05  1:42                                 ` Jon Harrop
2007-06-05 10:30                                   ` Jon Harrop
2007-06-10 12:10                           ` Jon Harrop
2007-06-10 12:58                             ` skaller
2007-06-01 14:15                 ` Stephen Weeks
2007-06-01 14:37                   ` Brian Hurt
2007-06-01 14:39                   ` Eric Cooper
2007-05-31  9:24       ` Yuanchen Zhu
2007-05-31 10:25       ` Loup Vaillant
2007-05-31 10:30         ` Jon Harrop
2007-05-31 12:12     ` skaller
2007-05-31  7:11 ` Daniel Bünzli
2007-05-31 15:15 ` Christophe Raffalli
2007-05-31 15:23   ` Jon Harrop
2007-05-31 15:35     ` Christophe Raffalli
     [not found]       ` <604682010705310923o5a1ee0eiee5ae697da9d3c60@mail.gmail.com>
2007-05-31 20:14         ` Stephen Weeks
2007-05-31 15:16 ` Christophe Raffalli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200706011813.48515.jon@ffconsultancy.com \
    --to=jon@ffconsultancy.com \
    --cc=caml-list@yquem.inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).