caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Jon Harrop <jon@ffconsultancy.com>
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] The OCaml Summer Project
Date: Fri, 26 Jan 2007 10:45:17 +0000	[thread overview]
Message-ID: <200701261045.17495.jon@ffconsultancy.com> (raw)
In-Reply-To: <45B8E03F.7040802@janestcapital.com>

On Thursday 25 January 2007 16:52, Yaron M. Minsky wrote:
> I am pleased to announce the OCaml Summer Project.

Sounds like an excellent idea and the projects all look fascinating. However, 
I do have some comments on the "Binary tree library" project:

OCaml currently has two separate implementations of AVL trees in Map and Set 
functors. Set already has fast union and split operations.

Having two separate implementations is wasteful but more efficient. The 
underlying tree code could be factored out into another functor but this is 
costly in terms of performance. Also, the OCaml stdlib has used an odd choice 
of optimisations: inlining height calculation (which is quite a small benefit 
in the context of functors and polymorphism) but not amortising single-child 
trees into a separate constructor (which can remove up to 50% of the GC's 
effort). So the code can be made shorter and faster.

I've already implemented my own AVL set using the node-specialisation trick. 
Performance is ~30% faster, IIRC. I've also wanted to write a functional 
array based on AVL trees (O(log n) lookup but fast sub, append, insert, 
delete etc.) and a camlp4 extension to support pattern matching over this 
type. Lists and arrays are rather priviledged containers in OCaml, having 
pattern matching and literals, but trees are better in many respects and 
would make an excellent general-purpose container.

Finally, having to use functors does obfuscate OCaml code that deal with Sets 
and Maps in many cases, particularly because there are no built-in Int and 
Float modules so you must write your own. I often find that this superfluous 
code is as long as all of the code using the Sets/Maps. Although it would 
be "dangerous", Sets and Maps implemented without functors are much easier to 
use. After all, Hashtbl is typically used in that way.

It is also worth noting that several people (Diego, Jean-Christophe) have 
written other tree libraries using various data structures (RB, trie, splay 
etc.). As far as I can tell, AVL trees are a good all-rounder.

Best of luck with the projects!

-- 
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
Objective CAML for Scientists
http://www.ffconsultancy.com/products/ocaml_for_scientists


  parent reply	other threads:[~2007-01-26 10:52 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-01-25 16:52 Yaron M. Minsky
2007-01-25 20:26 ` Yaron M. Minsky
2007-01-26  1:29   ` [Caml-list] " Nathaniel Gray
2007-01-26  1:47     ` Gabriel Kerneis
     [not found]       ` <891bd3390701251931r433ed7ebu15700cd751cae492@mail.gmail.com>
2007-01-26  3:33         ` Fwd: " Yaron Minsky
2007-01-26  3:33         ` Yaron Minsky
2007-01-26  4:43       ` Markus Mottl
2007-01-26 12:51         ` Gerd Stolpmann
2007-01-26 14:39           ` Markus Mottl
2007-01-26  9:38 ` [Caml-list] " Dário Abdulrehman
2007-01-26 10:45 ` Jon Harrop [this message]
2007-01-26 18:26 ` Xavier Leroy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200701261045.17495.jon@ffconsultancy.com \
    --to=jon@ffconsultancy.com \
    --cc=caml-list@yquem.inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).