From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id IAA24610; Wed, 6 Jun 2001 08:25:40 +0200 (MET DST) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id IAA24579 for ; Wed, 6 Jun 2001 08:25:39 +0200 (MET DST) Received: from granger.mail.mindspring.net (granger.mail.mindspring.net [207.69.200.148]) by nez-perce.inria.fr (8.11.1/8.10.0) with ESMTP id f566Pcn20559 for ; Wed, 6 Jun 2001 08:25:38 +0200 (MET DST) Received: from dylan (1Cust39.tnt2.tucson.az.da.uu.net [63.11.142.39]) by granger.mail.mindspring.net (8.9.3/8.8.5) with SMTP id CAA24041 for ; Wed, 6 Jun 2001 02:25:32 -0400 (EDT) Message-ID: <000f01c0ee51$c955f7f0$210148bf@dylan> From: "David McClain" To: References: <002f01c0ecf9$d028a3b0$210148bf@dylan><15131.59080.327155.47983@beertje.william.bogus> <4.3.2.7.2.20010605002003.027deaa0@shell16.ba.best.com> Subject: Re: [Caml-list] OCaml Speed for Block Convolutions Date: Tue, 5 Jun 2001 23:27:38 -0700 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.00.2919.6600 X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6600 Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk > I think I'm missing something. The ocaml loops allocate memory for temps, but the C loops reuse storage? Did you do a version for ocaml that's got the same level of optimizations as the C version (not allocating, etc.)? I didn't write an OCaml program to solve this specific problem. Instead, I made use of my NML (numeric modeling language) built on top of OCaml. Hence, being a general purpose language, it had to make very conservative choices to protect the user. Whenever new data are generated, they are placed into freshly allocated arrays. Had I written code to specifically address this one problem it would probably have performed better. I was quite impressed with the performance I was getting from my NML which used this very general purpose OCaml kernel. I almost stopped at that point. But nagging questions, eager users, and a growing list of block convolution jobs convinced me to see what I could do in C, while still living primarily in the NML world. I didn't see any need to rewrite all the file I/O handling, the unwind-protect mechanisms, etc, etc, all in C when I have perfectly good versions of this stuff in NML thanks to OCaml. So I simply targeted the innermost loop where I knew much of the overhead was in fresh array creation at every intermediate step. I am doing block convolutions on GByte sized datasets, which is far more than NML was ever intended to support well. NML is a modeling and prototyping language. But adding C code is as easy as adding C code to any OCaml program and recompiling the augmented NML system. Hence targeting only the inner loops was very simple (in principle). In practice, even after nearly 30 years of heavy C/C++ experience I still cannot write even relatively simple programs without planting bugs in them. To OCaml's credit, I can easily write OCaml code that executes properly the first time out. No debugging. I wish that were the case for C as well. So to end this long winded explanation, had I written OCaml code specifically for the purpose of block convolutions I would have expected a performance around 70% of that demonstrated by C, not the factor of 4 that I had with NML. I could have written to use mutable arrays in that case, reaping much of the same advantages that I got when I rewrote the inner loops in C. - DM ----- Original Message ----- From: "Chris Hecker" To: "David McClain" ; Sent: Tuesday, June 05, 2001 12:22 AM Subject: Re: [Caml-list] OCaml Speed for Block Convolutions > > >I think the constant creation of new arrays to hold intermediate results > >(i.e., immutable data) is costing too much. Many of the intermediate values > >could be overwritten in place and save a lot of time. > >... of course... don't ask how long it took to write all this stuff in the > >inner loops of the convolution in C, nor how long it took to debug that > >stuff.... And immutable data definitely helps to get the code correct. OCaml > >still rules! > > I think I'm missing something. The ocaml loops allocate memory for temps, but the C loops reuse storage? Did you do a version for ocaml that's got the same level of optimizations as the C version (not allocating, etc.)? > > If the code's not to big, could you post it? > > Chris > > ------------------- To unsubscribe, mail caml-list-request@inria.fr. Archives: http://caml.inria.fr