From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id C87E3D45F for ; Tue, 1 Nov 2005 01:26:33 +0100 (CET) Received: from cenn.mc.mpls.visi.com (cenn.mc.mpls.visi.com [208.42.156.9]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id jA10QWqS007705 for ; Tue, 1 Nov 2005 01:26:33 +0100 Received: from [192.168.42.2] (bhurt.dsl.visi.com [208.42.141.66]) by cenn.mc.mpls.visi.com (Postfix) with ESMTP id 45918815E; Mon, 31 Oct 2005 18:26:32 -0600 (CST) Date: Mon, 31 Oct 2005 18:29:50 -0600 (CST) From: Brian Hurt X-X-Sender: bhurt@localhost.localdomain To: Tato Thetza Cc: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Re: OCaml efficiency/optimization? In-Reply-To: <1130495033.8413.246225066@webmail.messagingengine.com> Message-ID: References: <1130494903.8339.246224558@webmail.messagingengine.com> <1130495033.8413.246225066@webmail.messagingengine.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Miltered: at nez-perce with ID 4366B638.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 ocaml:01 ocaml:01 usefull:01 notation:01 replacing:01 replacing:01 applicative:01 okasaki's:01 red-black:01 citeseer:01 hinze:01 tweak:01 recursive:01 2005,:98 X-Spam-Checker-Version: SpamAssassin 3.0.3 (2005-04-27) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.3 On Fri, 28 Oct 2005, Tato Thetza wrote: > sorry I meant > Ocaml for Experienced Programmers: > http://www.bogonomicon.com/bblog/book/html/book1.html > > On Fri, 28 Oct 2005 03:21:43 -0700, "Tato Thetza" > said: >> I've been reading over >> http://caml.inria.fr/pub/docs/manual-ocaml/index.html and have learned >> two things: >> -lists are immutable and singly linked, which explains why 1::[2;3] is >> valid while [2,3]::1 is not, and why its efficient. >> -the proper way to ensure tail-recursive optimization >> >> question: are these and other optimizations documented somewhere >> officially? I find it a little uncomfortable I've been learning OCaml >> without knowning such internal details. Any secrets I should definitely >> know if I were to use this language in production? Answering the original question (somewhat belatedly), I present Brian's steps to optimizing code: 1) Get it to work. I don't care how fast your program returns the wrong answer, get it returning the right answer in all cases before doing anything else. This also includes making the code maintainable- if you can't figure out how the code works, you won't be able to figure out how to make the code work faster. And in general, optimizers work better on clean, maintainable code- generally maintainable code is also faster code. 2) Measure. Is the code fast enough? If so, you can stop now. 3) Optimize. This includes compiling to native, and using ocamldefun, the Ocaml Defunctorizor. This is stuff that is "easy" to do (comparatively), and can have dramatic effects on performance. 4) Buy a faster computer. No, really- what's the programmer time worth? >>From here on out it'll start being really easy to rack of tens, if not hundreds, of thousands of dollars of programmer time improving the code. At which point simply buying a faster machine might be signifigantly cheaper. This isn't always the case, but generally it is- you definately shouldn't assume it isn't the case. 5) Profile. If the code isn't fast enough, what's taking the time? Making a program 10x faster can be completely worthless- if the program takes 100ms to fire off a database query that takes 15 minutes to complete, making that program only take 10ms to fire off the same database query isn't going to help. Also, optimizing initilization loops is generally not usefull. Where is the time going? 6) Look for large-scale optimizations. Are you doing things you don't need to do? No operation completes faster than the operation you don't do. What is the big-O notation of your core algorithms? Replacing an O(n^2) algorithm with an O(n log n) algorithm can yield huge performance gains (10,000x or more). This is where maintainable code comes in real handy. This is also the point where I consider adding imperitive code to my program, replacing O(log N) tree lookups with O(1) array lookups. Note that this is an *optimization*- applicative datastructures are more maintainable (IMHO), and thus should be the default approach taken until it is proven that performance requirements demand imperitive data structures. But I'd definately go through Okasaki's book before making that decision. One comment here: implicit in this step is you being conversant with the current literature- which generally means doing new searches on a regular basis. You might think "Hey, I've been using red-black trees for decades, and it's not like they're exactly new technology. Why do a literature search on them?" But you'd miss this paper, which was only published six years ago: http://citeseer.ist.psu.edu/hinze99constructing.html 7) Measure again. Actually, I take that back- you should be measuring constantly (to show that you are, in fact, making things go faster), and profiling regularly, from here on out. 8) Now you're into the tweak stage. Generally (not always, but generally), using recursive functions and arguments, if it's less than five arguments, will be faster than loops and references (the arguments will just live in registers, while the reference assignments have to get written out to memory). Replace tuples of all floats with structures of all floats, and unbox the floats. 9) If it still isn't fast enough, rewrite core routines in C, or even assembly language. But these are Hail Mary passes of optimization- if they aren't enough, you're screwed. Brian