From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Delivered-To: caml-list@yquem.inria.fr Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by yquem.inria.fr (Postfix) with ESMTP id 16A19BC75 for ; Mon, 21 Feb 2005 21:17:50 +0100 (CET) Received: from pauillac.inria.fr (pauillac.inria.fr [128.93.11.35]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j1LKHnIZ012128 for ; Mon, 21 Feb 2005 21:17:49 +0100 Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id VAA14292 for ; Mon, 21 Feb 2005 21:17:49 +0100 (MET) Received: from mail-eur1.microsoft.com (mail-eur1.microsoft.com [213.199.128.139]) by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id j1LKHmSR012123 for ; Mon, 21 Feb 2005 21:17:49 +0100 Received: from EUR-MSG-03.europe.corp.microsoft.com ([65.53.192.44]) by mail-eur1.microsoft.com with Microsoft SMTPSVC(6.0.3790.1289); Mon, 21 Feb 2005 20:17:49 +0000 X-MimeOLE: Produced By Microsoft Exchange V6.5.7226.0 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Subject: RE: Re: [Caml-list] Persistence Date: Mon, 21 Feb 2005 20:17:29 -0000 Message-ID: <5DCA48FADB33FF4D8C32A164DF24F2B0027899E2@EUR-MSG-03.europe.corp.microsoft.com> Thread-Topic: Re: [Caml-list] Persistence thread-index: AcUVnw+MUl/zE81YRyeKUaIwNyUORgCsqZEA From: "Don Syme" To: "Jon Harrop" , "Caml mailing list" X-OriginalArrivalTime: 21 Feb 2005 20:17:49.0625 (UTC) FILETIME=[6A17B290:01C51852] X-Miltered: at nez-perce with ID 421A41ED.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Miltered: at nez-perce with ID 421A41EC.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; caml-list:01 syme:01 dsyme:01 intuitively:01 convincing:01 caml-list:01 wrote:01 stl:01 stl:01 ocaml's:01 ocaml:01 computes:01 ocaml:01 implements:01 reuse:01 X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on yquem.inria.fr X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.0.2 X-Spam-Level: Hi Jon, Could you send around the source code to this? I intuitively knew that such performance results were likely, but had never had such a convincing example to wave at people. =20 It would also be interesting to ask some C++ programmers how they would go about optimizing this code, for a couple of reasons (a) to see if they come up with anything faster; and=20 (b) to see if they would ever even think of using sharing-of-complex-immutable-internal-data-structure-nodes as an optimization technique. Don -----Original Message----- From: caml-list-admin@yquem.inria.fr [mailto:caml-list-admin@yquem.inria.fr] On Behalf Of Jon Harrop Sent: 18 February 2005 09:40 To: caml-list@yquem.inria.fr Subject: Fwd: Re: [Caml-list] Persistence Many moons ago: On Sunday 06 February 2005 22:26, Radu Grigore wrote: > I had written: > > The STL will probably take <<1ms because the STL's size() is probably > > O(1) whereas Map's fold is O(n). In contrast, forking lineages is > > probably O(n) with the STL (requiring a copy) and O(1) with ocaml's Map. > > I haven't actually tested with STL but the implementation of size() is > indeed a simple return. Do you have an example where forking lineages > is useful? I do now. :-) I've just finished converting one of my example scientific programs from OCaml to C++. The program basically computes the "n"th nearest neighbours of a vertex in a graph. The implementations are based upon sets of vertices. The OCaml version simply implements the obvious mathematical expression. The only difference is that reuse is implicit in the maths (which is purely functional) and has to be made explicit in the OCaml code by memoizing the function. My first stab at a C++ version did exactly the same thing. As you know, OCaml performance is supposed to to be within a factor of 2 of C/C++. Somewhat suprisingly, on my first timed run the C++ code turned out to be over 100 times slower than the OCaml. After some profiling it transpired that virtually all of the time was spent computing set unions which I had stupidly implemented using the STL set_union function. ;-) The reason is quite simple. Thanks to the purely functional implementation in OCaml, sets are immutable. OCaml's Set.union function exploits this by reusing pieces of tree from the input sets in the output ("copying" arbitrarily-large subsets in O(1) time), happily knowing that they cannot be altered afterwards. In contrast, the C++ implementation of set_union can make no such assumption and is forced to copy most of the elements from both of the input sets. The performance of set_union in the STL is, therefore, vastly worse than the performance of Set.union in OCaml, thanks to persistence. In summary, like me, you've probably been using persistence a lot without realising it. Every time you do set operations, for example. HTH. --=20 Dr Jon D Harrop, Flying Frog Consultancy Ltd. _______________________________________________ Caml-list mailing list. Subscription management: http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list Archives: http://caml.inria.fr Beginner's list: http://groups.yahoo.com/group/ocaml_beginners Bug reports: http://caml.inria.fr/bin/caml-bugs