Date: Wed, 25 Sep 2002 21:22:18 +0200
From: Markus Mottl
To: Brian Hurt
Cc: Ocaml Mailing List
Subject: Re: [Caml-list] Probably FAQ: Why is list-append (list :: elem) so expensive?
Message-ID: <20020925192218.GA20345@fichte.ai.univie.ac.at>
References: <20020925161703.GC31883@fichte.ai.univie.ac.at>
Organization: Austrian Research Institute for Artificial Intelligence

On Wed, 25 Sep 2002, Brian Hurt wrote:
> I think I'm asking the wrong question. Is there any circumstances where
> Ocaml will reuse the cells from one list to make another list?
>
> Consider the following Ocaml code:
>
> let dup_list lst =
>     let rec reverse_list l accum =
>         match l with
>             [] -> accum
>             | head :: tail -> reverse_list tail (head :: accum)
>     in
>     reverse_list (reverse_list lst)
> ;;

Thanks to referential transparency of purely functional code it is easy
to prove by induction that twice reversing a list will yield the original
list again (home exercise :-). However, you shouldn't expect that current
compiler technology is strong enough to exploit such equivalences even
though some automated theorem provers manage to prove them without human
help. Most real world high-level optimizers already struggle with doing
partial evaluation correctly and efficiently, not even to mention trickier
stuff like above.

> Ok, the first time through reverse_list the compiler has to allocate a
> whole new list, including a whole new set of list cells, for the
> accumulation list. This is because the code doesn't dare modify lst. But
> the second time reverse_list is called, it's obvious that l is the list
> allocated the first time through, and it's garbage as soon as the call
> completes. In fact, each cell of the list is garbage after it's been
> prepended to accum. So why not reuse the cells from l to form the accum?
> Yes, I know this violates the immutability of l. But it seems to me that,
> with the exception of garbage creation rates, the two are identical.

Yes, the two are indeed equivalent. And since our lists are purely
functional, there is no danger of overwriting them (it) during reuse.
The only noticeable computational effect is that things will run faster
after your transformation.

> Phrased this way, I start wondering how much of an optimization this
> actually would be.

It's pretty demanding! Search space size for finding suitable high-level
transformations usually grows exponentially with the number of definitions
to consider. There are, however, certain classes of transformations that
can be found and applied somewhat efficiently.
The Haskell-compiler implements some of them (e.g. deforestation, which
concerns the elimination of certain intermediate data structures).

One question is an economic one: where do compiler writers invest their
time? As the OCaml-compiler shows, you can really gain a lot by forgetting
about high-level optimizations (it doesn't even perform common
subexpression elimination) and by writing good backends for machine-code
generation instead.

The problem is that the "easy" high-level transformations usually do not
give you much speedup, whereas the tricky ones (some can even yield
super-exponential speedups) increase compilation times by some orders of
magnitude. Few developers want to spend their lifetime waiting for their
compilers to finish...

In case you want to learn more about automating functional program
transformations, you might find a hopefully somewhat understandable
introduction in my MSc-thesis (just skip the mistakes ;-)

  http://www.oefai.at/~markus/msc_thesis

> Heh. One of the dangers of Ocaml is that it makes the algorithm clear
> enough that it encourages you to over-optimize your algorithms in
> the same way that C encourages you to cycle count and over-optimize
> your implementations.

This may happen at times. It is generally a good idea to just write the
simplest implementation you can imagine, profile your program and then
optimize it. Very often the simplest algorithm is also among the best, so
it seldom pays to start optimizing before your program actually works.

Regards,
Markus Mottl

--
Markus Mottl                                             markus@oefai.at
Austrian Research Institute for Artificial Intelligence
http://www.oefai.at/~markus
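
For reference, the dup_list quoted in this message applies reverse_list
without its accumulator argument, so it would not type-check as written.
Below is a minimal sketch of what was presumably intended, assuming both
calls start from an empty accumulator. The [] arguments, the hoisting of
reverse_list to the top level, and the small check at the end are
additions for illustration and are not part of the original post.

```ocaml
(* Tail-recursive reversal: prepend each element of [l] onto [accum]. *)
let rec reverse_list l accum =
  match l with
  | [] -> accum
  | head :: tail -> reverse_list tail (head :: accum)

(* Assumption: both calls were meant to start from an empty accumulator.
   Reversing twice yields a list equal to the input. *)
let dup_list lst = reverse_list (reverse_list lst []) []

let () =
  let original = [1; 2; 3] in
  let copy = dup_list original in
  (* Structural equality: same elements in the same order. *)
  assert (copy = original);
  (* Physical inequality: the result is built from newly allocated cells. *)
  assert (copy != original)
```

The last assertion uses physical inequality (!=) to show that the result
is a fresh copy of a non-empty input: each pass allocates new cells, and
the intermediate reversed list simply becomes garbage instead of being
reused.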