From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id XAA19826; Sat, 3 May 2003 23:13:52 +0200 (MET DST)
X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f
Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id XAA19530 for <caml-list@pauillac.inria.fr>; Sat, 3 May 2003 23:13:51 +0200 (MET DST)
Received: from post.it.helsinki.fi (post.it.helsinki.fi [128.214.205.24])
	by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id h43LDoH16231
	for <caml-list@inria.fr>; Sat, 3 May 2003 23:13:50 +0200 (MET DST)
Received: from la.iki.fi (root@kruuna.helsinki.fi [128.214.205.14])
	by post.it.helsinki.fi (8.12.9/8.12.2-SPAMmers-sod-off) with ESMTP id h43LDn5X005369;
	Sun, 4 May 2003 00:13:49 +0300 (EEST)
Received: from la by la.iki.fi with local (Exim 3.36 #1 (Debian))
	id 19C4KH-0003Br-00; Sun, 04 May 2003 00:13:45 +0300
Date: Sun, 4 May 2003 00:13:43 +0300
From: Lauri Alanko <la@iki.fi>
To: caml-list@inria.fr
Subject: Re: [Caml-list] Efficiency of 'a list
Message-ID: <20030503211343.GC700@la.iki.fi>
References: <200305022227.20136.exa@kablonet.com.tr>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200305022227.20136.exa@kablonet.com.tr>
User-Agent: Mutt/1.5.3i
X-Spam: no; 0.00; lauri:01 alanko:01 caml-list:01 eray:01 ozkural:01 immutable:01 quicksort:01 mergesort:01 non-empty:01 topological:01 incarnations:01 operand:01 allocates:01 ocaml's:01 deque:01 
Sender: owner-caml-list@pauillac.inria.fr
Precedence: bulk

On Fri, May 02, 2003 at 10:27:20PM +0300, Eray Ozkural wrote:
> Could somebody please point to an explanation of ocaml linked list
> implementation or summarize its performance characteristics? This
> might seem like a trivial question but having used many functional
> languages I know that it's easy to commit performance genocide using
> linked lists.

It's easy to commit performance genocide with any data structure if you
use it for things for which it is not efficient. Singly linked immutable
lists have some nice characteristics: O(1) for cons, head and tail, all
purely functionally so you can safely share structure between multiple
lists. For example, lists are perfect for representing several alternate
paths in a path search.

> For instance, a naive implementation of an optimal comparison sorting 
> algorithm in LISP almost invariably results in an O(n^2logn) routine :)

Err? You mean something like quicksort? Who would use quicksort on a
linked list? Mergesort is O(n log n) as it should be.

> Therefore, it would be a good start to explain whether ocaml lists are
> in fact LISP lists and if not in what aspects they differ.

The main aspects are that the tail of non-empty list _must_ be another
list, and that neither the head or tail (car or cdr) of a list cell can
be altered.

> The motivation for this question comes from trying to understand the use of 
> linked lists in an efficient algorithm, such as graph algorithms (say we are 
> implementing topological sort)
> 
> Assume I'm using the following structure that is far from handsome:
> type x = (int list) array
> 
> Let a's type be x. Consider codes as the following
> a.(i) <- a.(i) @ [x;y;z]
> a.(i) <- [x] :: a.(i)
> 
> What travesty results in execution of such codes with i coming from an
> arbitrary sequence? Do using such constructs result in unholy
> incarnations of space leaks or gross inefficiencies?

The first line does an append. So it creates a new list which ends with
the list [x;y;z] (the same one you gave @ as an operand) and has all the
elements of a.(i) prepended to it. The operation allocates as many cells
as a.(i) had, and thus also has to spend time proportional to a.(i)'s
length. (@)'s implementation seems to be non-tail-recursive (it would
require "cheating" to do it otherwise) so with long lists you might blow
the stack.

For the second line I think you mean either
a.(i) <- x :: a.(i)
or
a.(i) <- [x] @ a.(i)

Both of these are O(1). The first allocates a single cell, the second
theoretically two (one of which is discarded immediately) unless the
compiler is smart.

If you really need to add stuff to both ends of a data structure,
ocaml's primitive lists aren't the thing. You might consider some sort
of a deque.

> Another question, does ocaml make any effort to place members of a
> list close to each other? Or, more naturally, the list element is
> allocated using a global model and then simply linked inside the
> structure?

The list _element_ is just the thing which is placed in the list. Its
allocation has nothing to do with the list's allocation. The list cells
are allocated from the heap like everything else, so temporally closely
allocated cells are likely to be physically close as well.

However, the copying garbage collector is quite likely to enhance
locality of lists when it runs. Someone more knowledgeable can probably
tell more on this.


Lauri Alanko
la@iki.fi

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners