caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Mike Furr <furr@cs.umd.edu>
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Comparison of OCaml and MLton for numerics
Date: Mon, 04 Jun 2007 10:03:45 -0400	[thread overview]
Message-ID: <46641BC1.9090305@cs.umd.edu> (raw)
In-Reply-To: <200706011813.48515.jon@ffconsultancy.com>

Jon Harrop wrote:
> Actually, while we're here. I've long thought that the stdlib should provide 
> abstract implementations of concrete data structures like RB trees and AVL 
> trees, and functorize the Set and Map modules over the tree type they use. 
> This would let people add new abstract data structures (I like purely 
> functional sequences based on AVL trees) built upon solid concrete data 
> structures from the stdlib rather than cutting and pasting code (one of the 
> more embarassing OCaml FAQs).

My OCaml summer project to build an improved data structure library will 
(hopefully) address many of the issues you have raised.  Although I'm 
just getting started, I have already started using your suggestion of 
including a 1-element constructor for my trees as it does indeed seem to 
give a noticeable speedup.

> Making this feasible by optimizing away the abstractions requires more than 
> just defunctorizing though. You need to partially specialize by type, as 
> Markus says. You also need to do whole-program transformations to flatten 
> data structures. For example, a set would require:
> 
>   Node of 'a t * 'a * 'a t * height
> 
> and a sequence would require:
> 
>   Node of 'a t * 'a * 'a t * height * size
> 
> So a generic OCaml solution would add an indirection to the metadata that 
> could be flattened out.

Indeed this would be preferable.  Right now, my tree implementations 
include a generic tree mixin with the type constructor:

    Node of ('data,'inv) t * 'data * ('data,'inv) t * 'inv

forcing all invariant information into the last cell.  Implementations 
that require more than a single value here must use an extra level of 
indirection (so in your example, 'inv = height * size).  I don't think 
this will be too bad for performance since these values only need to be 
retrieved for re-balancing leaving read-only operations unaffected. 
However, this has the huge benefit of allowing me to only write *one* 
version of find, min/max, fold, iter, etc. for all of the trees I 
implement, which in and of itself is a big win. :)

-Mike


  reply	other threads:[~2007-06-04 14:04 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-31  5:50 Yuanchen Zhu
2007-05-31  6:17 ` [Caml-list] " Jon Harrop
2007-05-31  6:32   ` skaller
2007-05-31  7:31   ` Yuanchen Zhu
2007-05-31  9:08     ` Jon Harrop
2007-05-31  9:22       ` Yuanchen Zhu
2007-05-31 10:27         ` Jon Harrop
2007-05-31 21:30           ` Alain Frisch
2007-06-01  1:22             ` skaller
2007-06-01  1:36               ` Erik de Castro Lopo
2007-06-01  2:21                 ` skaller
2007-06-01  2:49                   ` Erick Tryzelaar
2007-06-01  3:05                     ` skaller
2007-06-01  5:30               ` Alain Frisch
2007-06-01  5:39                 ` Jon Harrop
2007-06-01  6:36                   ` Yuanchen Zhu
2007-06-01  8:09                 ` skaller
2007-06-01  8:53                   ` Richard Jones
2007-06-01  8:59                     ` Richard Jones
2007-06-01  9:22                       ` Stephan Tolksdorf
2007-06-01  9:49                         ` Richard Jones
2007-06-01  9:32                       ` Stephan Tolksdorf
2007-06-01 10:02                     ` skaller
2007-06-01 11:29                 ` Yaron Minsky
2007-06-01 11:43                   ` Erik de Castro Lopo
2007-06-01 11:58                     ` Jon Harrop
2007-06-01 13:49                       ` Julien Signoles
2007-06-01 14:18                         ` Stephen Weeks
2007-06-01 14:43                           ` Julien Signoles
2007-06-01 14:57                           ` Brian Hurt
2007-06-01 15:40                             ` Alain Frisch
2007-06-01 15:58                               ` Brian Hurt
2007-06-01 16:25                                 ` Markus Mottl
2007-06-01 16:47                               ` Jon Harrop
2007-06-01 23:26                             ` skaller
2007-06-01 23:49                               ` Brian Hurt
2007-06-02  3:26                                 ` skaller
2007-06-01 12:40                     ` Erik de Castro Lopo
2007-06-01 13:56                       ` Julien Signoles
2007-06-01 11:49                   ` David MENTRE
2007-06-01 14:41                     ` skaller
2007-06-01 16:52                       ` Jon Harrop
2007-06-01 23:33                         ` skaller
2007-06-01 16:14                     ` Markus Mottl
2007-06-01 16:46                       ` Jon Harrop
2007-06-01 17:13                       ` Jon Harrop
2007-06-04 14:03                         ` Mike Furr [this message]
2007-06-04 14:39                           ` Jon Harrop
2007-06-04 15:33                             ` Mike Furr
2007-06-04 18:08                             ` skaller
     [not found]                               ` <9d3ec8300706041518y115d22bdsa120d4010261d841@mail.gmail.com>
2007-06-04 22:19                                 ` Fwd: " Till Varoquaux
2007-06-04 23:40                                   ` Jon Harrop
2007-06-05  2:24                                   ` skaller
2007-06-04 22:44                               ` Pierre Etchemaïté
2007-06-05  1:42                                 ` Jon Harrop
2007-06-05 10:30                                   ` Jon Harrop
2007-06-10 12:10                           ` Jon Harrop
2007-06-10 12:58                             ` skaller
2007-06-01 14:15                 ` Stephen Weeks
2007-06-01 14:37                   ` Brian Hurt
2007-06-01 14:39                   ` Eric Cooper
2007-05-31  9:24       ` Yuanchen Zhu
2007-05-31 10:25       ` Loup Vaillant
2007-05-31 10:30         ` Jon Harrop
2007-05-31 12:12     ` skaller
2007-05-31  7:11 ` Daniel Bünzli
2007-05-31 15:15 ` Christophe Raffalli
2007-05-31 15:23   ` Jon Harrop
2007-05-31 15:35     ` Christophe Raffalli
     [not found]       ` <604682010705310923o5a1ee0eiee5ae697da9d3c60@mail.gmail.com>
2007-05-31 20:14         ` Stephen Weeks
2007-05-31 15:16 ` Christophe Raffalli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=46641BC1.9090305@cs.umd.edu \
    --to=furr@cs.umd.edu \
    --cc=caml-list@yquem.inria.fr \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).