From: Jon Harrop
Organization: Flying Frog Consultancy Ltd.
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] HLVM stuff
Date: Mon, 28 Sep 2009 00:14:58 +0100

On Sunday 27 September 2009 22:58:59 David McClain wrote:
> On Sep 27, 2009, at 12:25 PM, Jon Harrop wrote:
> > where the "kthSmallest" and "Array2D.parallelInit" functions are both
> > polymorphic. The former handles implicit sequences of any comparable
> > type and the latter handles 2D arrays of any element type. This use of
> > polymorphic
>
> But facing a situation with 2^26 pixels to process, I would never do
> that.

Here is a better one-line F# solution:

  images |> Array2D.map (fun xs -> Array.sortInPlaceWith compare xs; xs.[m/2])

This solves your problem from the REPL in 0.34s. Moreover, you can easily
parallelize it in F#:

  Parallel.For(0, n, fun y ->
    for x=0 to n-1 do
      Array.sortInPlaceWith compare images.[y, x])
  images |> Array2D.map (fun xs -> xs.[m/2])

On this 8-core box, the time taken is reduced to 0.039s (finally a
superlinear speedup on my Intel box, yay!).

Here is the OCaml equivalent:

  Array.map (Array.map (fun gs -> Array.sort compare gs; gs.(m/2))) images

This solves your problem non-interactively in 32s, which is 821x slower than
F#. This huge performance discrepancy is a direct result of the elegant
solution using polymorphic functions. HLVM's solution to polymorphism solves
this problem, offering polymorphism with no performance degradation
whatsoever.

> I would write a type-specific function to apply.

Why waste your time doing by hand what the compiler can do for you?

> Why dispatch of every pixel of the aggregate, when I could dispatch once at
> the top, to decide what kind of homogeneous array...

Why dispatch at all when a JIT compiler would already know all of the types
involved and could partially specialize your code for them?

FWIW, a completed HLVM would solve this problem extremely efficiently despite
having a naive garbage collector because the entire program only does a
single allocation. This is not at all uncommon in technical computing and is
exactly the characteristic I was referring to: these solutions leverage
features of the OCaml language like higher-order functions, currying and
partial application but they have completely different performance
requirements to those of Coq. In the context of technical computing, the
benefits of shared-memory parallelism far outweigh those of efficient
single-threaded allocation and collection of small values.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
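
Editorial sketch, not code from the thread: the two approaches being argued over can be put side by side in OCaml roughly as follows. The names `median_poly`, `median_float` and `medians` are hypothetical, and `images` is assumed to be an array of rows, each row holding one sample array of length `m` per pixel, as in the OCaml one-liner in the message. The float version stands in for the hand-written "type-specific function"; whether the compiler actually selects a float-specific comparison from the annotation depends on the compiler version, so treat that as an assumption.

```ocaml
(* Polymorphic median: works for any element type because [compare] is
   the generic structural comparison of type 'a -> 'a -> int.
   Sorts [gs] in place, like the one-liner in the message. *)
let median_poly (m : int) (gs : 'a array) : 'a =
  Array.sort compare gs;
  gs.(m / 2)

(* Hand-specialised median for float samples. The type annotation fixes
   the comparison at float -> float -> int, which gives the compiler the
   chance to avoid the generic comparison (assumption: version-dependent). *)
let median_float (m : int) (gs : float array) : float =
  Array.sort (compare : float -> float -> int) gs;
  gs.(m / 2)

(* Apply a per-pixel median function across a 2D image of sample arrays. *)
let medians (median : 'a array -> 'b) (images : 'a array array array)
    : 'b array array =
  Array.map (Array.map median) images
```

Both versions return the same medians; they differ only in which comparison function the sort calls. For example, `medians (median_float m) images` is the specialised call and `medians (median_poly m) images` the polymorphic one.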