From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jon@ffconsultancy.com>
X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr
X-Spam-Level: 
X-Spam-Status: No, score=0.2 required=5.0 tests=AWL autolearn=disabled 
	version=3.1.3
X-Original-To: caml-list@yquem.inria.fr
Delivered-To: caml-list@yquem.inria.fr
Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105])
	by yquem.inria.fr (Postfix) with ESMTP id C5D3DBBC4
	for <caml-list@yquem.inria.fr>; Thu,  5 Mar 2009 00:12:17 +0100 (CET)
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: AjMGAFOZrknUnwckZmdsb2JhbACCHZJbDQsDBQkPBsMZhAgG
X-IronPort-AV: E=Sophos;i="4.38,304,1233529200"; 
   d="scan'208";a="36108527"
Received: from relay.ptn-ipout02.plus.net ([212.159.7.36])
  by mail4-smtp-sop.national.inria.fr with ESMTP/TLS/RC4-SHA; 05 Mar 2009 00:12:16 +0100
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-Anti-Spam-Result: An4FAAyarknUnw4S/2dsb2JhbACCHdY7hAgG
Received: from pih-relay05.plus.net ([212.159.14.18])
  by relay.ptn-ipout02.plus.net with ESMTP; 04 Mar 2009 23:12:16 +0000
Received: from [87.112.234.120] (helo=leper.local)
	 by pih-relay05.plus.net with esmtp (Exim) id 1Lf0G7-0008L0-Mv; Wed, 04 Mar 2009 23:12:16 +0000
From: Jon Harrop <jon@ffconsultancy.com>
Organization: Flying Frog Consultancy Ltd.
To: Brian Hurt <bhurt@spnz.org>
Subject: Re: [Caml-list] stl?
Date: Wed, 4 Mar 2009 23:17:38 +0000
User-Agent: KMail/1.9.9
Cc: caml-list@yquem.inria.fr
References: <91a2ba3e0903031340wcdc976cp52522eb35f7ccb73@mail.gmail.com> <200903041945.23034.jon@ffconsultancy.com> <alpine.DEB.2.00.0903041623450.10051@beast>
In-Reply-To: <alpine.DEB.2.00.0903041623450.10051@beast>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200903042317.39079.jon@ffconsultancy.com>
X-Plusnet-Relay: b757ef9058ceafe7761895834b3d92ce
X-Spam: no; 0.00; stl:01 vectors:01 vectors:01 functor:01 functor:01 ocaml:01 lacks:01 compilation:01 abstraction:01 haskell:01 renders:01 renders:01 haskell:01 compiler:01 abstraction:01 

On Wednesday 04 March 2009 21:23:53 Brian Hurt wrote:
> On Wed, 4 Mar 2009, Jon Harrop wrote:
> > Depends how big your vectors are.
>
> Less than it might seem.  First of all, you have the allocation costs to
> hold the new vector- there's 3-5 clocks right there.  Let's say the
> vectors are 3 elements long.  So you have six loads and three stores-
> assume stores are free, and loads cost 1 clock apeice- they're in L1
> cache, and they're getting preloaded.  Plus the 3 clocks for the floating
> point adds.  So that's 12-15 clock cycles right there.  Say I'm being
> pessimistic by a factor of 2, and it's really only 6-8 clock cycles.  Vr.s
> 10 clock cycles or so for the call via functor.  Now we're down into the
> realm of only doubling the cost of the operation, not an order of
> magnitude increase.  A single mispredicted branch or L1 cache miss that
> has to go out to L2 cache- not even main memory, just L2!- would blow the
> overhead of calling via functor out of the water.

2x slower, yes.

> Premature optimization is the root of all evil.

That is not a premature optimization.

> >> And it's expensive only because Ocaml lacks an obvious optimization
> >> (defunctorization).
> >
> > If you neglect the enormous disadvantages: whole program analysis => no
> > incremental compilation and no DLLs.
> >
> > You can resolve the abstraction as soon as possible, as Haskell, does but
> > then you're on the slippery slope to wildly unpredictable performance
> > that renders the language too risky for a lot of real work.
>
> That explains why no one uses C++ for real work- since it resolves
> template abstractions as soon as possible, it' performance is so 
> unpredictable that it renders the language too risky for a lot of real
> work.

The unpredictability stems from allowing the Haskell compiler to choose 
whether it resolves the abstraction statically or dynamically. C++ prohibits 
dynamic resolution (greatly limiting its capabilities) so it is entirely 
predictable and your analogy is invalid.

> >> But for the vast bulk of people
> >> programming, performance (at least on the clock cycle level) simply
> >> isn't that important
> >
> > For some definition of "programming" that includes website design, maybe.
>
> Compare the number of people writing CRUD web applications in Ruby on
> Rails to the *entirity* of the professional Ocaml, SML, Haskell, and F#
> communities.  I dare you.

Irrelevant. This was about performance not specific functional programming. 
Compare the number of people using Ruby and Python for anything at all to the 
number of professional C++, Java and C# programmers.

> But no, not just web applications.  Most business logic applications.
> Large chunks of the embedded world.  Most desktop applications.  Most code
> doesn't need to be "as fast as possible", it simply needs to be "fast
> enough".

Ruby and Python are not common there either.

> >> - as demonstrated by all the "real work" being done in hideously slow
> >> languages like Ruby, Python, etc...
> >
> > Real work is done mostly in C++, Java and C# and proven predictable
> > efficiency is a huge part of that.
>
> You should have warned me before mentioning "Java" in the same sentence as
> "proven predictable efficiency".  I was drinking when I read that, and now
> I have to clean my keyboard.
>
> It's a negligable part of that.

Not true.

> Try this experiment.  Go to J. Random software manager in Java/.Net/C++
> shop. 

Like me.

> Ask them to list the reasons they 
> use the language of choice.  Don't prompt them, just let them list things
> in the order that comes to mind.  Wait and see how long it is before
> "performance" comes up as an issue.

Ask them why they don't use Ruby and Python and they'll say "performance".

> Java currently has a decent- not great, but decent- performance story.
> Now.  But Java became popular, and got into all of these coding shops,
> back when Java was either hideously slow, or radically unpredictable in
> performance.

On a scale with Ruby and Python, Java has never been "hideously" slow.

> > Operator overloading is about being able to reuse an identifier to refer
> > to different functions where the resolution depends upon the argument
> > types.
>
> OK.  So now we have a definition of what operator overloading is.  Now, my
> question: what's it good for?  It's obviously not necessary to write
> generic code- for example, a Newton's method that works on both floats and
> ratios.

Overloading is not about abstraction, it is about syntactic convenience. That 
is precisely why functors are not a viable alternative.

Type inference is closely related to overloading in that sense: both are 
designed to improve clarity and brevity.

> The only advantage I see is that you're able to use + instead of +. in
> many places.  OK, this is an advantage- but hardly a major one.

That is a huge advantage when you have a lot of numerical types.

> > Look at another example, Knuth's algorithm for numerically robust
> > variance:
> >
> >  let f (m, s, k) x =
> >    let m' = m + (x - m) / float k
> >    m', s + (x - m) * (x - m'), k + 1
> >
> > The "+" operator is used to mean both floating point addition and integer
> > addition in the same expression. Functors cannot do that. You can replace
> > "+" with float and int versions but only when they are in separate
> > modules, so you not only end up breaking this simple expression into
> > separate functions but even into separate modules. That's obviously crazy
> > useless and, indeed, is precisely the motivation behind Christophe
> > Troestlers's delimited overloading syntax extension.
>
> So, let's rewrite the above code to use different operators for floating
> point and integer operations:
>
>    let f (m, s, k) x =
>      let m' = m +. (x -. m) /. float k
>      m', s +. (x -. m) *. (x -. m'), k + 1
>
> Oh yes, I see now.  How much better it is if only we had operator
> overloading.

Now consider changing one of the types:

. I replace "float" with "float32" and I'm done.

. You have to replace every instance of every float operator with a different 
one, such as a function call to Float32.add.

> > So OCaml adopted + for int and +. for float. Sounds good but it scales
> > really badly when you introduce 2D and 3D homogeneous vectors and
> > matrices, arbitrary-dimensional vectors and matrices, quaternions,
> > complex numbers and even a 32-bit float type. You need something closer
> > to genuine overloading.
>
> My experience in general has been that operators don't scale in complex
> cases.  Number operators for number types scale better, but in any
> complicated case, and especially in most non-numeric cases, I prefer
> pseudo-english names.

I prefer "+" whereever you see "+" in mathematics.

> And I note that by far the most popular language for writing numeric code
> in, the all time champ for numerics, is Fortran.  A language that didn't
> allow operator overloading or redefinition at all, and which forced you to  
> use pseudo-english named functions if you wanted to add two matrixs
> together.  For a definition of pseudo-english which is loose enough to
> include names like "dgemm".

Ironically, Fortran has not only had overloaded operators and functions for 50 
years but they are used in the Fortran 77 definition of the DGEMM function 
and have even been user-defineable since Fortran 90.

Regardless, if you want to see overloading done right in the context of ML, 
look at F# and not Fortran.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e