From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id XAA19826; Sat, 3 May 2003 23:13:52 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id XAA19530 for ; Sat, 3 May 2003 23:13:51 +0200 (MET DST) Received: from post.it.helsinki.fi (post.it.helsinki.fi [128.214.205.24]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id h43LDoH16231 for ; Sat, 3 May 2003 23:13:50 +0200 (MET DST) Received: from la.iki.fi (root@kruuna.helsinki.fi [128.214.205.14]) by post.it.helsinki.fi (8.12.9/8.12.2-SPAMmers-sod-off) with ESMTP id h43LDn5X005369; Sun, 4 May 2003 00:13:49 +0300 (EEST) Received: from la by la.iki.fi with local (Exim 3.36 #1 (Debian)) id 19C4KH-0003Br-00; Sun, 04 May 2003 00:13:45 +0300 Date: Sun, 4 May 2003 00:13:43 +0300 From: Lauri Alanko To: caml-list@inria.fr Subject: Re: [Caml-list] Efficiency of 'a list Message-ID: <20030503211343.GC700@la.iki.fi> References: <200305022227.20136.exa@kablonet.com.tr> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200305022227.20136.exa@kablonet.com.tr> User-Agent: Mutt/1.5.3i X-Spam: no; 0.00; lauri:01 alanko:01 caml-list:01 eray:01 ozkural:01 immutable:01 quicksort:01 mergesort:01 non-empty:01 topological:01 incarnations:01 operand:01 allocates:01 ocaml's:01 deque:01 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk On Fri, May 02, 2003 at 10:27:20PM +0300, Eray Ozkural wrote: > Could somebody please point to an explanation of ocaml linked list > implementation or summarize its performance characteristics? This > might seem like a trivial question but having used many functional > languages I know that it's easy to commit performance genocide using > linked lists. It's easy to commit performance genocide with any data structure if you use it for things for which it is not efficient. Singly linked immutable lists have some nice characteristics: O(1) for cons, head and tail, all purely functionally so you can safely share structure between multiple lists. For example, lists are perfect for representing several alternate paths in a path search. > For instance, a naive implementation of an optimal comparison sorting > algorithm in LISP almost invariably results in an O(n^2logn) routine :) Err? You mean something like quicksort? Who would use quicksort on a linked list? Mergesort is O(n log n) as it should be. > Therefore, it would be a good start to explain whether ocaml lists are > in fact LISP lists and if not in what aspects they differ. The main aspects are that the tail of non-empty list _must_ be another list, and that neither the head or tail (car or cdr) of a list cell can be altered. > The motivation for this question comes from trying to understand the use of > linked lists in an efficient algorithm, such as graph algorithms (say we are > implementing topological sort) > > Assume I'm using the following structure that is far from handsome: > type x = (int list) array > > Let a's type be x. Consider codes as the following > a.(i) <- a.(i) @ [x;y;z] > a.(i) <- [x] :: a.(i) > > What travesty results in execution of such codes with i coming from an > arbitrary sequence? Do using such constructs result in unholy > incarnations of space leaks or gross inefficiencies? The first line does an append. So it creates a new list which ends with the list [x;y;z] (the same one you gave @ as an operand) and has all the elements of a.(i) prepended to it. The operation allocates as many cells as a.(i) had, and thus also has to spend time proportional to a.(i)'s length. (@)'s implementation seems to be non-tail-recursive (it would require "cheating" to do it otherwise) so with long lists you might blow the stack. For the second line I think you mean either a.(i) <- x :: a.(i) or a.(i) <- [x] @ a.(i) Both of these are O(1). The first allocates a single cell, the second theoretically two (one of which is discarded immediately) unless the compiler is smart. If you really need to add stuff to both ends of a data structure, ocaml's primitive lists aren't the thing. You might consider some sort of a deque. > Another question, does ocaml make any effort to place members of a > list close to each other? Or, more naturally, the list element is > allocated using a global model and then simply linked inside the > structure? The list _element_ is just the thing which is placed in the list. Its allocation has nothing to do with the list's allocation. The list cells are allocated from the heap like everything else, so temporally closely allocated cells are likely to be physically close as well. However, the copying garbage collector is quite likely to enhance locality of lists when it runs. Someone more knowledgeable can probably tell more on this. Lauri Alanko la@iki.fi ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners