From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.0 required=5.0 tests=none autolearn=disabled version=3.1.3 X-Original-To: caml-list@yquem.inria.fr Delivered-To: caml-list@yquem.inria.fr Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by yquem.inria.fr (Postfix) with ESMTP id CE5EEBC29 for ; Tue, 29 Aug 2006 20:37:24 +0200 (CEST) Received: from mtaout-w.tc.umn.edu (mtaout-w.tc.umn.edu [160.94.160.21]) by concorde.inria.fr (8.13.6/8.13.6) with ESMTP id k7TIbN3E020555 for ; Tue, 29 Aug 2006 20:37:24 +0200 Received: from [128.101.106.92] (glu02.cs.umn.edu [128.101.106.92]) by mtaout-w.tc.umn.edu with ESMTP for caml-list@yquem.inria.fr; Tue, 29 Aug 2006 13:37:20 -0500 (CDT) X-Umn-Remote-Mta: [N] glu02.cs.umn.edu [128.101.106.92] #+LO+TS+AU+HN Message-ID: <44F48960.6080908@cs.umn.edu> Date: Tue, 29 Aug 2006 13:37:20 -0500 From: Christopher Kauffman User-Agent: Mozilla Thunderbird 1.0.8 (X11/20060725) X-Accept-Language: en-us, en MIME-Version: 1.0 To: OCaml Subject: Re: [Caml-list] Garbage collection and caml_adjust_gc_speed References: <44F3BFA7.20202@cs.umn.edu> <8B65BEE5-D101-4B39-98DD-3B68608702AC@inria.fr> In-Reply-To: <8B65BEE5-D101-4B39-98DD-3B68608702AC@inria.fr> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Miltered: at concorde with ID 44F48963.001 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)! X-Spam: no; 0.00; surprising:01 computations:01 matrices:01 bigarrays:01 lacaml:01 blas:01 lapack:01 lacaml:01 arrays:01 matrices:01 bigarray:01 bigarray:01 garbage:01 garbage:01 caml-list:01 > What I find surprising is that this simple function would dominate the run > time of your program. I suspect some problem with gprof. Could you give us > more details about the computations performed by your program? The program is in the area of computational biology. It is intended to be used to study approximations to protein folding. The main operations involve altering N by 3 matrices representing the coordinates of atoms of the protein. I have used Bigarrays for the data structures for efficiency and the Lacaml package for most of these operations (multiply, add, etc.) as it leverages BLAS and LAPACK library calls which are machine-tuned matrix operations. Lacaml uses fortran_layout double arrays as the matrices on which it operates. The style of Lacaml is to use optional label arguments so that space may be re-used. For instance the following creates two N by 3 matrices for multiplication and a 3 by 3 matrix to store the results. The two large matrices are multiplied and the result is stored in the 3 by 3 matrix: open Lacaml.D (* Use double precision matrix elements *) let n = 100 in let a = Mat.random n 3 in (* Create n by 3 matrix - Bigarray *) let b = Mat.random n 3 in (* Create n by 3 matrix - Bigarray *) let corr = Mat.make 3 3 in (* Create 3 by 3 matrix - Bigarray *) let new_corr = gemm ~c:corr ~transa:`T a b in (* matrix multiply *) ... This snippet multiplies 'a' and 'b' and stores the result in 'corr'. The final line creates a new name for 'corr', 'new_corr' which should point to the same space as the original 'corr'. This allows a pseudo-functional style of writing the program while avoiding the need to create many new matrices. As an alternative, the final line could be changed to ignore(gemm ~c:corr ~transa:`T acpy bcpy); and the name 'corr' used later. For convenience I have followed the style of the first version in my program. I suppose it is possible that creating the aliased name could affect the garbage collector (it might look like the allocation of another Bigarray so I will experiment with the second style.