caml-list - the Caml user's mailing list
 help / color / mirror / Atom feed
From: Mike Furr <furr@cs.umd.edu>
To: caml-list@yquem.inria.fr
Cc: osp2007-ocaml-reins@googlegroups.com
Subject: Re: [Caml-list] Comparison of OCaml and MLton for numerics
Date: Mon, 04 Jun 2007 11:33:24 -0400	[thread overview]
Message-ID: <466430C4.7020806@cs.umd.edu> (raw)
In-Reply-To: <200706041539.03286.jon@ffconsultancy.com>

   For now,  I just want to get Jon Harrop wrote:
> On Monday 04 June 2007 15:03:45 Mike Furr wrote:
>> My OCaml summer project to build an improved data structure library
> 
> This is an absolutely incredible project and I am eagerly awaiting your 
> results. As I am sure you already know, Jean-Christophe Filliatre, Chris 
> Okasaki and Markus Mottl have done fantastic work in this area already. I 
> would be very happy to see this collated. Diego Fernandez also did some good 
> work with Baire but it seems to have evaporated.

Yes, their work is a major influence of the project.

> I'll gladly submit my stdlib2 if you're interested. Its main benefit is 
> autogenerated functions (like iter, map, fold) that act upon tuples of 
> different arities:

Great, I would love to see what you've done!  Also, if you interested, 
you can subscribe to the mailing list for the project (no posts yet as 
I'm just getting started, but perhaps we should move our conversation 
there... CC'd).

http://groups.google.com/group/osp2007-ocaml-reins

> You might also consider lazy rebalancing, as rebalancing is both expensive and 
> often unnecessary.

Initially, I plan on supporting lazy rebalancing via the use of zippers. 
   This will also allow splay-like find semantics for arbitrary tree 
structures (although not quite as efficient as a native splay tree which 
of course will also be available).

> A functional array implemented as a balanced binary tree needs the number of 
> elements (size) of a sub tree just to index into the tree. So that read-only 
> operation must indirect unnecessarily using your approach. However, the 
> indirection is a much smaller overhead compared to the cost of functors or 
> polymorphism when the comparison functions are trivial (which they normally 
> are) using the current approach.

Indeed, this example may require a custom implementation to be 
efficient.  However, I have lots to do for the summer and will likely 
leave such performance tweaking until the end of the project.

> My feeling is that the use of functors in Set and Map are more of a hindrance 
> than a benefit. The ubiquity of Hashtbl is testament to this.

I'm not exactly sure what you mean by this.  Are you suggesting the 
performance is a hindrance or that the need for a separate 'module M = 
SMap.Make(...)' line is annoying?  One goal of my library will be that 
it will allow you to change data structures very easily.  For instance 
you might write the following while initially developing your code

   module MyMap = Poly.Map

This map would use the standard polymorphic compare function with keys 
of type 'a and values of type 'b (for some a,b).  Later, once your code 
has matured, you might know your keys are always of type (int*int) list.
Then you would simply change the module def to:

  module MyMap = Mono.AVLMap(List(Pair(Int,Int)))

and now you have a specific data structure which takes advantage of the 
faster monomorphic compare function.  The library will include modules 
for all of the base types (int, float, etc...) as well as hoisting 
common type constructors into functors (such as List() above). 
Therefore, you don't have to write any annoying boilerplate code to use 
the functors.

As far as performance goes, I plan on using functors where it makes the 
code most elegant and easy to use.  With all of the mention of 
ocamldefun on this mailing list, I imagine it won't be too long before 
we have version for ocaml 3.10 (or else I may write it myself!)

>> However, this has the huge benefit of allowing me to only write *one*
>> version of find, min/max, fold, iter, etc. for all of the trees I
>> implement, which in and of itself is a big win. :)
> 
> Perhaps we should consider the alternative approach where those functions are 
> higher-order functions over the necessary low-level tree functions. This 
> would only need support for inlining to recover optimal performance and it 
> also lets you write the core functions only once.

Perhaps.  This would require passing around a dictionary of functions 
and I'm not sure the resulting code would be as clean or any faster than 
functors.  Another way would be to build them on top of the libraries 
zipper-based iterators (but this will likely be less efficient).  I plan 
on doing this anyway to support arbitrary traversal strategies.  Again, 
perhaps at the end of the summer I can look into the performance 
characteristics of these ideas.

Cheers,
-Mike


  reply	other threads:[~2007-06-04 15:33 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-31  5:50 Yuanchen Zhu
2007-05-31  6:17 ` [Caml-list] " Jon Harrop
2007-05-31  6:32   ` skaller
2007-05-31  7:31   ` Yuanchen Zhu
2007-05-31  9:08     ` Jon Harrop
2007-05-31  9:22       ` Yuanchen Zhu
2007-05-31 10:27         ` Jon Harrop
2007-05-31 21:30           ` Alain Frisch
2007-06-01  1:22             ` skaller
2007-06-01  1:36               ` Erik de Castro Lopo
2007-06-01  2:21                 ` skaller
2007-06-01  2:49                   ` Erick Tryzelaar
2007-06-01  3:05                     ` skaller
2007-06-01  5:30               ` Alain Frisch
2007-06-01  5:39                 ` Jon Harrop
2007-06-01  6:36                   ` Yuanchen Zhu
2007-06-01  8:09                 ` skaller
2007-06-01  8:53                   ` Richard Jones
2007-06-01  8:59                     ` Richard Jones
2007-06-01  9:22                       ` Stephan Tolksdorf
2007-06-01  9:49                         ` Richard Jones
2007-06-01  9:32                       ` Stephan Tolksdorf
2007-06-01 10:02                     ` skaller
2007-06-01 11:29                 ` Yaron Minsky
2007-06-01 11:43                   ` Erik de Castro Lopo
2007-06-01 11:58                     ` Jon Harrop
2007-06-01 13:49                       ` Julien Signoles
2007-06-01 14:18                         ` Stephen Weeks
2007-06-01 14:43                           ` Julien Signoles
2007-06-01 14:57                           ` Brian Hurt
2007-06-01 15:40                             ` Alain Frisch
2007-06-01 15:58                               ` Brian Hurt
2007-06-01 16:25                                 ` Markus Mottl
2007-06-01 16:47                               ` Jon Harrop
2007-06-01 23:26                             ` skaller
2007-06-01 23:49                               ` Brian Hurt
2007-06-02  3:26                                 ` skaller
2007-06-01 12:40                     ` Erik de Castro Lopo
2007-06-01 13:56                       ` Julien Signoles
2007-06-01 11:49                   ` David MENTRE
2007-06-01 14:41                     ` skaller
2007-06-01 16:52                       ` Jon Harrop
2007-06-01 23:33                         ` skaller
2007-06-01 16:14                     ` Markus Mottl
2007-06-01 16:46                       ` Jon Harrop
2007-06-01 17:13                       ` Jon Harrop
2007-06-04 14:03                         ` Mike Furr
2007-06-04 14:39                           ` Jon Harrop
2007-06-04 15:33                             ` Mike Furr [this message]
2007-06-04 18:08                             ` skaller
     [not found]                               ` <9d3ec8300706041518y115d22bdsa120d4010261d841@mail.gmail.com>
2007-06-04 22:19                                 ` Fwd: " Till Varoquaux
2007-06-04 23:40                                   ` Jon Harrop
2007-06-05  2:24                                   ` skaller
2007-06-04 22:44                               ` Pierre Etchemaïté
2007-06-05  1:42                                 ` Jon Harrop
2007-06-05 10:30                                   ` Jon Harrop
2007-06-10 12:10                           ` Jon Harrop
2007-06-10 12:58                             ` skaller
2007-06-01 14:15                 ` Stephen Weeks
2007-06-01 14:37                   ` Brian Hurt
2007-06-01 14:39                   ` Eric Cooper
2007-05-31  9:24       ` Yuanchen Zhu
2007-05-31 10:25       ` Loup Vaillant
2007-05-31 10:30         ` Jon Harrop
2007-05-31 12:12     ` skaller
2007-05-31  7:11 ` Daniel Bünzli
2007-05-31 15:15 ` Christophe Raffalli
2007-05-31 15:23   ` Jon Harrop
2007-05-31 15:35     ` Christophe Raffalli
     [not found]       ` <604682010705310923o5a1ee0eiee5ae697da9d3c60@mail.gmail.com>
2007-05-31 20:14         ` Stephen Weeks
2007-05-31 15:16 ` Christophe Raffalli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=466430C4.7020806@cs.umd.edu \
    --to=furr@cs.umd.edu \
    --cc=caml-list@yquem.inria.fr \
    --cc=osp2007-ocaml-reins@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).