From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id D787EBC88 for ; Mon, 7 Feb 2005 06:31:04 +0100 (CET) Received: from herd.plethora.net (herd.plethora.net [205.166.146.1]) by concorde.inria.fr (8.13.0/8.13.0) with ESMTP id j175V2UU010211 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=FAIL) for ; Mon, 7 Feb 2005 06:31:04 +0100 Received: from bhurt.plethora.net (bhurt.plethora.net [205.166.146.49]) by herd.plethora.net (8.13.1/8.13.1) with ESMTP id j175UgWe008808 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sun, 6 Feb 2005 23:30:46 -0600 (CST) Date: Sun, 6 Feb 2005 23:34:02 -0600 (CST) From: Brian Hurt X-X-Sender: bhurt@localhost.localdomain To: skaller Cc: Jon , Subject: Re: [Caml-list] The boon of static type checking In-Reply-To: <1107741266.6363.362.camel@pelican.wigram> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Miltered: at concorde with ID 4206FD16.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 wrote:01 wrote:01 stl:01 stl:01 ocaml:01 val:01 const:01 const:01 syntax:01 compiler:01 ocamlopt:01 gcc:01 ocaml:01 gcc:01 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.2 X-Spam-Level: Probably a bad idea, but I've got to jump in here. Full disclosure: I *hate* C++. Mainly because I've actually written real programs in it. The next time I have to use C++ in any sort of serious way I'm registering c++sucks.com and starting a website to catalog all the different ways C++ sucks. Feel free to stop reading at this point. On 7 Feb 2005, skaller wrote: > On Mon, 2005-02-07 at 04:28, Jon wrote: > > On Feb 6 2005, Radu Grigore wrote: > > > On Fri, 4 Feb 2005 10:26:55 +0000, Jon Harrop wrote: > > > > The code to implement a simple tree is unwieldy in C++ (e.g. 1975 LOC > > > > in stl_tree.h and stl_map.h vs 198 LOC in map.ml). > > > > > > The implementation in stl_tree is not a "simple" tree. The difference > > > in size is also because the STL implementation is more efficient. > > > > This is a much smaller effect compared to the verbosity imposed by inherent > > inefficiencies of the C++ language though. > > Curious what you think those are. Let's see. Ocaml: val map : ('a -> 'b) -> 'a list -> 'b list C++: template list& list::map(b& (*)(const a&), const list&); I'm not 100% sure that's correct- I think there needs to be a typename in there somewhere. C++'s syntax has gotten so horrid even the compiler has a hard time figuring out what's a return type vr.s what's a function name. Using templates for universal types is a bad idea, even ignoring the code duplication evils. > > g++ seems to generate better > code than ocamlopt for similar simple problems > (see Alioth for quantitative evidence given silly > set of sample 'problems') Yep. And, conservatively, 10 times as much effort has gone into the gcc optimizer as the Ocaml optimizer. Possibly 100 times. For, according to Alioth, about a 10% improvement. It's only with gcc 3.x that C++ managed to beat Ocaml on performance. I note with humor that the big new optimization in GCC is the SSA-Tree form- "Single Static Assignment". The idea behind SSA is that you can only assign a variable a value when you create it, you can not change it latter. Once you express C++ in SSA, it's a lot easier to apply a lot of optimizations to it. Of course, the more I look at SSA, the more it looks like a functional language to me. So, in effect, the GCC optimization strategy is to convert the C++ into a functional language, and then optimize the functional language. > > IMHO the single major inefficiency in C++ is also a source > of efficiency -- lack of a garbage collector. It's a source of efficiency on the small scale- it's easy to write a 1,000 line program with hand allocation. Rather harder to write a 10,000 line program, and a major bitch to write a 100,000 line program without garbage collection. > > OTOH Ocaml has a massive inefficiency C++ does not, > namely boxing, and ocaml tools currently do not > seem to be very good at unboxing. This is also a big efficiency for Ocaml, for two reasons. First, it encourages code reuse. Second, Ocaml only needs one implementation of a function for all types. C++'s template system, to unbox everything, duplicates code like crazy- generally, a new implementation for every type. > > Again, C++ provides inline functions which > provide a way Ocaml does not have so easily > for optimising. Don't assume that inlining is optimization. Actually, it generally isn't. Having actually timed it on modern hardware, a function call costs like 2-3 clock cycles these days. Plus 1-2 clock cycles per argument. This is compared to the 10-30 clock cycles a mispredicted branch costs, the 20+ clock cycles an L1 cache miss/L2 cache hit costs, and the 100-350+ clock cycles of an L2 cache miss/memory fetch. The model people use to guess what makes code go faster is still based on the 8086. Count up the number of instructions you'll execute (possibly making an allowance for "expensive" instructions like divide and square root), and that's an approximation for how much time the program will take. Fewer instructions = faster program. The fact that this model hasn't been true for two decades hasn't detracted from it's popularity. Inlining code increases code size, vastly increasing the number of cache misses. I've seen the executable size of C++ code *triple* by adding two words to the source code (virtual inline on a constructor). I agree that's an extreme case- but in most cases inlining is just avoiding a 2-3 clock cycle cost by instead incurring a 100-350+ clock cycle cost. What little inlining is a good idea can be done by the compiler, especially by Ocaml which will inline across module boundaries. But Mel is alive and well and probably coding in C++, and what fun is the compiler performing these optimizations when we can perfom them as well. Most programmers still think they're working in Turbo C++ 1.0 which didn't optimize. Hint: I've yet to meet a compiler that couldn't turn x/32 into x >> 5. > So I'm quite interested in what inefficiencies C++ might > have compared to ocaml -- feel free to identify not only > 'intrinsic' inefficiency of the languages, but also > problems in the compilation models, and difficulties > in generating good code that are not necessarily > intrinsic. (EG I would like to hear a comment > that tail-rec optimisation whilst not 'intrinsically impossible' > in C++ is hard to do, if that is your belief) Actually, g++ does do tail call optimization (in the 3.x tree). It's actually not that hard to do, just not that beneficial (for C++). Brian