From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <bhurt@spnz.org>
Delivered-To: caml-list@yquem.inria.fr
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39])
	by yquem.inria.fr (Postfix) with ESMTP id D787EBC88
	for <caml-list@yquem.inria.fr>; Mon,  7 Feb 2005 06:31:04 +0100 (CET)
Received: from herd.plethora.net (herd.plethora.net [205.166.146.1])
	by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j175V2UU010211
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL)
	for <caml-list@yquem.inria.fr>; Mon, 7 Feb 2005 06:31:04 +0100
Received: from bhurt.plethora.net (bhurt.plethora.net [205.166.146.49])
	by herd.plethora.net (8.13.1/8.13.1) with ESMTP id j175UgWe008808
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sun, 6 Feb 2005 23:30:46 -0600 (CST)
Date: Sun, 6 Feb 2005 23:34:02 -0600 (CST)
From: Brian Hurt <bhurt@spnz.org>
X-X-Sender: bhurt@localhost.localdomain
To: skaller <skaller@users.sourceforge.net>
Cc: Jon <jdh30@cam.ac.uk>, <caml-list@yquem.inria.fr>
Subject: Re: [Caml-list] The boon of static type checking
In-Reply-To: <1107741266.6363.362.camel@pelican.wigram>
Message-ID: <Pine.LNX.4.44.0502062248540.5563-100000@localhost.localdomain>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Miltered: at concorde with ID 4206FD16.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)!
X-Spam: no; 0.00; caml-list:01 wrote:01 wrote:01 stl:01 stl:01 ocaml:01 val:01 const:01 const:01 syntax:01 compiler:01 ocamlopt:01 gcc:01 ocaml:01 gcc:01 
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr
X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled 
	version=3.0.2
X-Spam-Level: 


Probably a bad idea, but I've got to jump in here.

Full disclosure: I *hate* C++.  Mainly because I've actually written real 
programs in it.  The next time I have to use C++ in any sort of serious 
way I'm registering c++sucks.com and starting a website to catalog all the 
different ways C++ sucks.  Feel free to stop reading at this point.

On 7 Feb 2005, skaller wrote:

> On Mon, 2005-02-07 at 04:28, Jon wrote:
> > On Feb 6 2005, Radu Grigore wrote:
> > > On Fri, 4 Feb 2005 10:26:55 +0000, Jon Harrop <jon@jdh30.plus.com> wrote:
> > > > The code to implement a simple tree is unwieldy in C++ (e.g. 1975 LOC
> > > > in stl_tree.h and stl_map.h vs 198 LOC in map.ml).
> > > 
> > > The implementation in stl_tree is not a "simple" tree. The difference
> > > in size is also because the STL implementation is more efficient.
> > 
> > This is a much smaller effect compared to the verbosity imposed by inherent 
> > inefficiencies of the C++ language though.
> 
> Curious what you think those are. 

Let's see.

Ocaml:
val map : ('a -> 'b) -> 'a list -> 'b list

C++:

template<class a, class b> list<b>& list::map(b& (*)(const a&), 
const list<a>&);

I'm not 100% sure that's correct- I think there needs to be a typename in 
there somewhere.  C++'s syntax has gotten so horrid even the compiler has 
a hard time figuring out what's a return type vr.s what's a function name.

Using templates for universal types is a bad idea, even ignoring the code 
duplication evils.

> 
> g++ seems to generate better
> code than ocamlopt for similar simple problems
> (see Alioth for quantitative evidence given silly
> set of sample 'problems')

Yep.  And, conservatively, 10 times as much effort has gone into the gcc
optimizer as the Ocaml optimizer.  Possibly 100 times.  For, according to
Alioth, about a 10% improvement.  It's only with gcc 3.x that C++ managed
to beat Ocaml on performance.

I note with humor that the big new optimization in GCC is the SSA-Tree 
form- "Single Static Assignment".  The idea behind SSA is that you can 
only assign a variable a value when you create it, you can not change it 
latter.  Once you express C++ in SSA, it's a lot easier to apply a lot of 
optimizations to it.  Of course, the more I look at SSA, the more it looks 
like a functional language to me.  So, in effect, the GCC optimization 
strategy is to convert the C++ into a functional language, and then 
optimize the functional language.  

> 
> IMHO the single major inefficiency in C++ is also a source
> of efficiency -- lack of a garbage collector.

It's a source of efficiency on the small scale- it's easy to write a 1,000 
line program with hand allocation.  Rather harder to write a 10,000 line 
program, and a major bitch to write a 100,000 line program without garbage 
collection.

> 
> OTOH Ocaml has a massive inefficiency C++ does not,
> namely boxing, and ocaml tools currently do not 
> seem to be very good at unboxing.

This is also a big efficiency for Ocaml, for two reasons.  First, it 
encourages code reuse.  Second, Ocaml only needs one implementation of a 
function for all types.  C++'s template system, to unbox everything, 
duplicates code like crazy- generally, a new implementation for every 
type.

> 
> Again, C++ provides inline functions which
> provide a way Ocaml does not have so easily 
> for optimising.

Don't assume that inlining is optimization.  Actually, it generally isn't.  
Having actually timed it on modern hardware, a function call costs like 
2-3 clock cycles these days.  Plus 1-2 clock cycles per argument.  This is 
compared to the 10-30 clock cycles a mispredicted branch costs, the 20+ 
clock cycles an L1 cache miss/L2 cache hit costs, and the 100-350+ clock 
cycles of an L2 cache miss/memory fetch.

The model people use to guess what makes code go faster is still based on
the 8086.  Count up the number of instructions you'll execute (possibly
making an allowance for "expensive" instructions like divide and square
root), and that's an approximation for how much time the program will
take.  Fewer instructions = faster program.  The fact that this model 
hasn't been true for two decades hasn't detracted from it's popularity.

Inlining code increases code size, vastly increasing the number of cache
misses.  I've seen the executable size of C++ code *triple* by adding two
words to the source code (virtual inline on a constructor).  I agree
that's an extreme case- but in most cases inlining is just avoiding a 2-3
clock cycle cost by instead incurring a 100-350+ clock cycle cost.

What little inlining is a good idea can be done by the compiler, 
especially by Ocaml which will inline across module boundaries.  But Mel 
is alive and well and probably coding in C++, and what fun is the compiler 
performing these optimizations when we can perfom them as well.  Most 
programmers still think they're working in Turbo C++ 1.0 which didn't 
optimize.  Hint: I've yet to meet a compiler that couldn't turn x/32 into 
x >> 5.

> So I'm quite interested in what inefficiencies C++ might
> have compared to ocaml -- feel free to identify not only
> 'intrinsic' inefficiency of the languages, but also
> problems in the compilation models, and difficulties
> in generating good code that are not necessarily
> intrinsic. (EG I would like to hear a comment
> that tail-rec optimisation whilst not 'intrinsically impossible'
> in C++ is hard to do, if that is your belief)

Actually, g++ does do tail call optimization (in the 3.x tree).  It's 
actually not that hard to do, just not that beneficial (for C++).

Brian