From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=AWL autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by yquem.inria.fr (Postfix) with ESMTP id A3397BBC1 for ; Wed, 23 Apr 2008 15:41:30 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AnABAMPZDkjBL1AZiGdsb2JhbACRUgEBAQ8mm1g X-IronPort-AV: E=Sophos;i="4.25,699,1199660400"; d="scan'208";a="11328837" Received: from gw.exalead.com (HELO exalead.com) ([193.47.80.25]) by mail1-smtp-roc.national.inria.fr with ESMTP; 23 Apr 2008 15:41:30 +0200 Received: from [192.168.204.148] (madpc064.exalead.com [192.168.204.148]) (authenticated bits=0) by exalead.com (8.14.2/8.14.0) with ESMTP id m3NDfTYJ006152; Wed, 23 Apr 2008 15:41:29 +0200 Message-ID: <480F3C89.5080203@exalead.com> Date: Wed, 23 Apr 2008 15:41:29 +0200 From: Berke Durak User-Agent: Thunderbird 1.5.0.10 (X11/20070221) MIME-Version: 1.0 To: Jon Harrop Cc: caml-list@yquem.inria.fr Subject: Re: [Caml-list] Announce: xsetxmap, unfunctorized, Sexp-lib aware versions of Set and Map References: <480C60DF.9010000@exalead.com> <480C739F.5080104@lri.fr> <200804231335.25708.jon@ffconsultancy.com> In-Reply-To: <200804231335.25708.jon@ffconsultancy.com> Content-Type: text/plain; charset=ISO-8859-15; format=flowed Content-Transfer-Encoding: 7bit X-Spam: no; 0.00; berke:01 durak:01 berke:01 durak:01 integers:01 ocaml:01 ocaml's:01 inlined:01 refactor:01 node:01 node:01 hash:01 hash:01 functors:01 incorrectly:01 Jon Harrop wrote: > On Monday 21 April 2008 12:19:09 Berke Durak wrote: >> Yes you are right, I should have put a warning. These modules are for >> peculiar cases. > > Actually I would say that your style is more useful than the built-in Set and > Map modules because you don't have to jump through hoops defining your > own "Int" module with its own "int" type and its own comparison function over > ints every time you want a set of integers. I would put the comparison > function in the set itself though. Well, thanks. > Your modules cover 99% of use cases with the lowest syntactic overhead > possible in OCaml. However, you have inherited a couple of design flaws from > OCaml's standard library: > > . The "height" function has been inlined which has no significant effect on > performance but makes it harder to refactor the code. > > . If you special case Node(Empty, v, Empty) to a new Node1(v) type constructor > then you can improve performance by 30% by alleviating the GC. That's a good idea. > As Jean-Christophe says, it is dangerous. Have you checked out the second version where I use phantom types to prevent the more severe cases of data corruption? Yes, you can still end up with multiple copies of the same "set", but at least the tree won't get inconsistent because you mistakenly called add with different comparison functions. So it's as safe/unsafe as using List.assoc or the generic hash table to store sets or hash tables. That's still a problem, but it's not that much worse than with functors. After all, nothing prevents you from incorrectly writing module StringSet = Set.Make(String) module SetSet = Set.Make(struct type t = StringSet.t let compare = compare) > However, the whole language is > already dangerous in this sense because it is so easy to erroneously apply > polymorphic comparison or equality to the existing sets and maps and get > silent data corruption. The only solution is to fix the language itself by > allowing types to override comparison, equality and hashing. Overriding equality etc. would require run-time type-based dispatch or implicitly carrying the comparison function (as Haskell apparently does with type classes) as an extra argument. Could be expensive. > Another inherent problem is the inefficiency of performing comparisons in this > way in OCaml. Type specialization (e.g. using a JIT) greatly accelerates this > kind of code by allowing the comparison function (which is almost always > trivial) to be inlined and optimized. Did anyone try the GNU lightning-based ocamljit in the Ocaml CVS? -- Berke DURAK